Re: [load balancing] The GSLB Page of Shame

Firstly, the objective of this paper was to get information out. I'm not
selling anything, or competing with any of these products. If customers are
informed of these caveats/tradeoffs, and still decide to purchase a pair of
GSLB devices for $60K, great! Then the objective of this paper would be met.
The customer would have made an informed decision. Problem is, in most cases
customers are being told they can use multiple A records and simultaneously
have "control" (as you put it), and/or are not being told about the browser
caching issue. I'm guilty of personally telling customers this, many of whom
listen on this list. (Again, sorry about that, I didn't know!)

Secondly, after speaking with hundreds of customers about SLB/GSLB over the
years, I can't think of a single customer that is both in the market for a
multi-site H/A (or "business continuity") solution, and is also OK with
potentially every user having to restart their browser or reboot their
computer upon a site, Internet connection, power, or SLB-pair, failure -
especially a high-end customer that would be able to afford such a device.
Users just don't think "hey, the problem is probably my browser". Clearly
single sign-on is lost if the server/site blows up (that's in the paper).
For most sites, that session loss happens even if a single server in a farm
fails (not even considering multi-site). In that case the user is prompted
to log on again, and can recover transactions from a shared database.
Unfortunately, there's no practical way to prompt a user to restart their
browser, so without multiple A records, for all such users (maybe all
users), the entire global site appears to be off the Internet.
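
To make that failure mode concrete, here is a minimal sketch in Python. It
is only a model, not a description of any vendor's product: it assumes the
browser resolves the name once and keeps that answer until it is restarted,
regardless of TTL. The class and function names and the two addresses
(taken from the documentation ranges) are made up for illustration, and no
real DNS or network calls are made.

```python
from typing import Callable, Dict, List, Optional

SITE_A = "192.0.2.10"     # hypothetical primary site
SITE_B = "198.51.100.10"  # hypothetical backup site


class PinningClient:
    """Models a browser that resolves once and keeps the answer until restart."""

    def __init__(self, resolve: Callable[[], List[str]]) -> None:
        self._resolve = resolve
        self._pinned: Optional[List[str]] = None

    def fetch(self, is_up: Callable[[str], bool]) -> Optional[str]:
        # Resolve only on first use; later calls reuse the pinned answer.
        if self._pinned is None:
            self._pinned = list(self._resolve())
        # Try each pinned address in order; return the first one that answers.
        for addr in self._pinned:
            if is_up(addr):
                return addr
        return None  # the site looks like it is off the Internet


def simulate_failover() -> Dict[str, Optional[str]]:
    down: set = set()

    def is_up(addr: str) -> bool:
        return addr not in down

    def gslb_answer() -> List[str]:
        # Assumed GSLB behaviour: hand out a single "best" A record.
        return [SITE_A] if is_up(SITE_A) else [SITE_B]

    def multi_a_answer() -> List[str]:
        # Plain DNS: always hand out both A records.
        return [SITE_A, SITE_B]

    gslb_client = PinningClient(gslb_answer)
    multi_client = PinningClient(multi_a_answer)

    # Both clients connect (and pin their answers) while site A is healthy.
    gslb_client.fetch(is_up)
    multi_client.fetch(is_up)

    down.add(SITE_A)  # site A fails

    return {
        "gslb_pinned": gslb_client.fetch(is_up),
        "multi_a": multi_client.fetch(is_up),
        # A brand-new client stands in for a restarted browser.
        "gslb_after_restart": PinningClient(gslb_answer).fetch(is_up),
    }


if __name__ == "__main__":
    for name, reached in simulate_failover().items():
        print(name, reached)
```

In this model the client holding a single pinned A record reaches nothing
until it is "restarted", while the client holding both A records simply
moves on to the second address.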

Thirdly, multiple A records were in use well before Alteon or Cisco or F5 or
RadWare had GSLB products, and many marquee sites use multiple A records
(for both multi-site and multi-homed single site), so although what you say
about "some browsers" is probably technically correct, I think it may
obfuscate the point in this context - that multiple A records are a
fundamentally sound and well tested approach.

Fourthly, the "control" that can actually be achieved with a DNS based GSLB
device is way overrated (even ignoring the H/A issue). I purposely avoided
that topic in this paper, primarily because most GSLB customers these days
just want active/backup ("business continuity"), don't have a site in
Zimbabwe, and couldn't care less about proximity/load/etc., but also because
it's a different subject altogether - maybe a good reason to do a "Why DNS
Based GSLB Doesn't Work - Part II".

(score - 2 people guessed Tasmanian Devil, nobody guessed Bing)

>I've been following the discussion, and I feel the urge to jump into it :D
>Let's assume that the application is stateful.
>What happens in a multi a-record setup, when a single site has a disaster?
>- Well, any user currently connected to that site will have to relogon, and
>everything they did is probably gone.
>Let's assume we have some backend replication, so that in case of a site
>disaster the user is able to connect to a different site without having to
>relogon.
>Site fails, user doesn’t even know it.
>Let’s do the same with a GSLB setup.
>Site fails; user has to restart browser. And then relogon and redo what he
>/ she was doing.
>So why would anyone do GSLB? Simply put: Control
>Who needs control? Or why would anyone need that kind of control?
>Well the Internet spans the World, and in many countries, especially
>emerging economies, the connections to the rest of the world are painfully
>slow. If these countries are important to you, then you need control. If
>not then go with a multi a-record setup, and live happily ever after (with
>a bit of luck)
>One last thing, multi a-records aren’t all that good - Some browsers will
>happily connect simultaneously to two or more of the IP’s returned.
 > "P T" <pt_lbIZZAThotmail.com>
 > Sent by: owner-lb-lIZZATvegan.net
 > 05-04-2004 01:30
 > Please respond to lb-l
 > Subject: Re: [load balancing] The GSLB Page of Shame
>BGP convergence doesn't happen quickly enough to avoid connection
>failure, and therefore cannot replace multiple A records for H/A. That's
>why most customers use multiple A records. Problem is, multiple A records
>break GSLB, and most customers (or vendors) don't know that.
> >>It's obvious that GSLB works or vendors wouldn't be selling it,
>Vendors are making claims that are technically incorrect, exactly specific
>to this issue. The claims are on their Web sites, in their manuals, in
>sales pitches, etc. You are saying all this is well known, but you are
>wrong. Look for yourself.
> >Pete,
> >
> >First off, let me state quite simply that my response has nothing to do
> >with my current employment, however I'm (NetScaler) on the side of the
> >right in that NetScaler doesn't charge for the functionality that you
> >describe.
> >
 > >As a vendor system engineer my first statement is that in your examples
 > >of a catastrophic event (Internet connection, power, switching, and/or
 > >load balancing failures) without a doubt anyone would be best served by
 > >BGP4 for high availability.
> >
 > >Now let me back track a little so you can get a better understanding of
 > >my position. First, the feature/product is called Global Server Load
 > >Balancing (GSLB) not Global Site High Availability. Pointing to GSLB as
 > >a failed HA solution for catastrophic failures is equal to saying that
 > >SLB (Server Load Balancing) is a complete failure due to a site Database
 > >failure. So I feel that in general your document has very valid points
 > >on the High Availability aspects of GSLB, however I'm a firm believer
 > >that it should be combined with BGP4 for any sensitive transactional
 > >environment where time equals lots of money and I would bet that any
 > >ReallyBigWellTrustedFinancialSites have probably rolled out an EBGP
 > >routed environment before doing a GSLB installation.
> >
 > >The problem is that your document assumes that the "overwhelmingly most
 > >compelling reason that Internet sites are hosted in multiple locations
 > >is high availability". However there are several reasons that companies
> >choose GSLB: even load distribution across multiple links, risk
> >mitigation to spread Internet load across multiple POPs, site
> >(through planned fail-over), reduce data center costs (colocation
> >competition), site/application scalability, and of course site high
> >availability to name the majority of reasons but in general I wouldn't
> >call it the most compelling reason.
> >
> >The reality is most sites/companies/customers don't perform real time
> >synchronization (assuming the application requires it) between
> >geographically dispersed data centers without a BGP4 implementation. In
 > >the last 3 GSLB conversations I've had, customers are asking that GSLB
 > >provide automatic site fail-over and the majority of reasons stated is
 > >due to the fact that Database work (primarily synchronization) needs to
 > >be completed before bringing the cold site online. Anyone who's been
 > >quoted an EMC SRDF solution and the bandwidth requirements (especially
 > >across the country/globe) will probably know that real time data
 > >synchronization in addition to automagic site fail-over on catastrophic
 > >failure is the design goal, but often is more costly than the
 > >application revenue generation during the 30 minute black out of the
 > >existing IE customers with a DNS cache.
> >
> >Now for the business side. Your document states that it's "the
> >responsibility of the manufacturer, not the customer, to determine if a
> >product adds value remotely close to what is represented in marketing
 > >claims." Now in general I believe that in all cases marketing claims
 > >should be responsible. However, to think that "customers" should not be
 > >responsible for vendor product selection, when the sole purpose of
 > >the IT/Technical Operations personnel who are being paid by the company
 > >is to provide technical solutions to business problems, is a completely
 > >irresponsible statement on your part. A manufacturer cannot know every
 > >business requirement that will be needed for the entire market that they
 > >service. If the people that are hired by a company don't take the time
 > >to do due diligence to evaluate, speak with "like" customer references,
 > >and have business discussions with Vendor account executives, they
 > >(customer) should be liable for not doing their job. And if the vendor
 > >can't provide an evaluation or customer references then the customer
 > >should move on, but I can't feel responsible for people who make
 > >technical decisions off a marketing product feature list.
> >
 > >So for a little constructive criticism on your document: like I've said
 > >previously, the technical details are pretty tight in general but seem to
 > >be biased, and during my reading I called it a manifesto. Either way it
 > >comes across as a sort of personal vendetta or a technical sales pitch
 > >which isn't objective. The biggest problem overall is that there seems to
 > >be no continuity between the initial claim that no "DNS based GSLB
 > >solution can function also provide high availability" (which
 > >probably should be sidenoted/caveated with "during a catastrophic
 > >failure") and it continues to expand and try to address topics such as:
> >
> >Proximity based GSLB on the Internet
 > >Multiple A records for HA (but at the same time doesn't work for
 > >Active-Standby sites)
> >HTTP Redirects (which works fine, except someone COULD bookmark)
> >GSLB Advertisement Flapping
> >Triangulation
> >IP proxy
> >Backup redirection
> >
> >Now it's perfectly fine to discuss other topics, the problem is that
> >of these solutions weren't designed to address the catastrophic failures
> >that were mentioned in your document so it's entirely unfair to pull out
> >the almighty trump-card of "it doesn't work in a catastrophic failure"
> >in the case of GSLB Advertisement Flapping by changing the game to talk
> >about outages outside of catastrophic to discredit GSLB health checking.
> >
> >It's obvious that GSLB works or vendors wouldn't be selling it, or more
> >importantly, companies wouldn't be buying it. The old adage, if you're
 > >not part of the solution you're part of the problem, definitely applies
 > >to this document. I'd love to see it updated as an informative document
 > >that shows both sides and tries to propose solutions to a particular
 > >problem, but in its current state of "I'm going to pee on all the
 > >cheerios by dragging them into unrelated discussions", although
 > >informative, it is not helpful in general.
> >
> >
 > >>Jay, that's just it... there are _no_ workarounds that fix the issues! I
> >>thought the paper did a pretty thorough job of explaining why the state
> >>of the art workarounds are not sufficient. Did you read it?
> >>
> >>It's not like any one piece of this information is new, but the sum
> >>of it is definitely not widely known within SLB vendors. For example,
> >>several of them, including the current market share leader, have
> >>information on their Web sites explaining how they return DNS A records
> >>in a list with the "best" IP address first. That doesn't work on the
> >>Internet! The DNS gurus laugh in our faces. (Hey, I didn't know it
> >>either, how embarrassing!). It's OK to be wrong. As technologists we
> >>learn from mistakes, fix the stuff we build, sell stuff that works,
> >>right? So GSLB doesn't work as we all had hoped, and we can't fix it.
 > >>There's no sense pointing fingers or lamenting about how browsers
> >>should change to observe TTLs or SRV records or whatever... even if
> >>fixes were released in software today we would have to live with the
> >>installed client base for years. Now we know that, and move on. I hope
> >>your company, NetScaler, takes such higher ground.
> >>
> >>What would really be shameful is for vendors to, after realizing issues
 > >>such as these, continue promoting features and products that do not work
> >>as advertised.
> >>
> >>
> >>
> >>Pete.
> >>
> >>
 > >>>It's a shame the GSLB manifesto couldn't be more neutral, it makes a
> >>>very valid point. There are several reasons to use GSLB and several
> >>>workarounds for the specific problems, it just seems that the author
> >>>chose not to highlight or know them.
> >>>
> >>>
> >>>>A paper that describes why DNS based GSLB solutions cannot work
> >>>>reliably for browser based clients.
> >>>>
> >>>>Pete.
> >>>>
> >>>>

