RE: [load balancing] Alteon 2424 - problem with pbind on passive cookie

From: Mecklem, Timothy <tmecklemIZZATmedplus.com>
Date: Tue May 17 2005 - 09:30:20 EDT

    In my prior communication with Nortel support, they gave me
several issues that you might want to consider. Probably the largest
point to consider if you are placing this in a production environment is
that (according to the support technician I was working with) versions
prior to 22.0.6 had issues doing pbind passive cookie inspection
properly when the number of load balanced servers is greater than nine.
We were already running a newer version of the Alteon OS, so I did not
pay a great deal of attention to that information, although I couldn't
find it in the release notes of any of the versions of the OS on the
Nortel site. You may want to verify that with Nortel for accuracy, since
nine seems like such an arbitrary cutoff number.
    The issue that I encountered with pbind and urlslb was very
frustrating, especially since the Nortel WebLogic doc specifically
stated that both should be on. When you enable both at the same time, it
may look like pbind is persisting correctly for a time, then all of a
sudden it will act as if no persistent binding is happening, skipping
over cookies as if they don't exist. We were unaware at first that this
was occurring until we found a bug in our application code that caused a
session not to be replicated to the other servers. Since the WebLogic
cluster was replicating the sessions in all other cases, it didn't
matter functionally if the pbind wasn't working: the receiving server
just had to do a lookup on the session and take over the reins as the
primary server for that session. As part of that process, it sent a
Set-Cookie for the JSESSIONID and reset the primary server portion of
the cookie to its own JVM id. If you can capture the packets on the
client side, you can tell that something is wrong when, after the
session has been established, the primary JVM id portion of the cookie
is set and reset several times over one TCP session. Additionally, in
case you are following the WebLogic guide, it is also incorrect in
implying that when you turn on "look for cookie in URI" under the pbind
option it will still search for an actual cookie. You can either have
URI JSESSIONID inspection or cookie JSESSIONID inspection, but not both
at the same time (this also according to a knowledgeable contact at
Nortel support).
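
    As a rough illustration of the client-side check described above,
the Python sketch below walks the response headers seen on one TCP
connection and lists every change in the primary JVM id. It assumes the
WebLogic JSESSIONID value has the form sessionid!primary!secondary. The
function names, the max_changes threshold and the sample headers are
hypothetical, and pulling the header text out of a real capture is not
shown.

```python
import re

# Assumption: WebLogic-style cookie value "sessionid!primaryJvmId!secondaryJvmId".
SET_COOKIE_RE = re.compile(
    r"^Set-Cookie:\s*JSESSIONID=([^;\s]+)", re.IGNORECASE | re.MULTILINE
)


def primary_jvm_id(cookie_value):
    """Return the primary JVM id part of a JSESSIONID value, or None."""
    parts = cookie_value.split("!")
    return parts[1] if len(parts) > 1 else None


def primary_changes(response_headers):
    """Given raw response header blocks (in order) from one TCP connection,
    return the sequence of primary JVM ids with consecutive repeats removed."""
    seen = []
    for block in response_headers:
        for match in SET_COOKIE_RE.finditer(block):
            jvm = primary_jvm_id(match.group(1))
            if jvm is not None and (not seen or seen[-1] != jvm):
                seen.append(jvm)
    return seen


def looks_unpinned(response_headers, max_changes=1):
    """Flag a connection whose primary JVM id flips more than max_changes times.
    The threshold is an arbitrary choice for illustration."""
    return len(primary_changes(response_headers)) - 1 > max_changes


if __name__ == "__main__":
    # Hypothetical header blocks, as they might appear in a client-side capture.
    sample = [
        "HTTP/1.1 200 OK\r\nSet-Cookie: JSESSIONID=abc!111!222; path=/\r\n",
        "HTTP/1.1 200 OK\r\nSet-Cookie: JSESSIONID=abc!222!111; path=/\r\n",
        "HTTP/1.1 200 OK\r\nSet-Cookie: JSESSIONID=abc!111!222; path=/\r\n",
    ]
    print(primary_changes(sample), looks_unpinned(sample))
```

A single change of primary can be a legitimate failover, so the sketch
only flags connections where the primary keeps flipping back and forth.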
    The final one that I have encountered has been a deal breaker for us
so far. We have an open ticket on it, but I haven't heard back from
support about it and I'm beginning to doubt that the person I was
working with even exists anymore. The issue deals with POST requests
that span multiple packets. Naturally, the first packet of the request
contains the JSESSIONID cookie and is routed to the correct server.
What happens with the rest of the packets of the request is still
somewhat baffling. In 95+% of the cases, they make it to the correct
server and everything is fine. In the remaining cases, it's almost like
the Alteon loses its mind and routes the packets to another server. The
issue looks like it has to do with the sequence numbers of the TCP
sessions. Whenever I see the issue recorded in a packet capture, there
are two running TCP sessions (both carrying a multi-packet POST request
at the same time) on separate servers that converge on the same TCP
sequence number, at which point the Alteon routes the remaining packets
of both POST requests to one server, cutting off the other until that
server issues a read timeout error to the client. I am not entirely
sure that that is the exact issue that I'm dealing with, but it
certainly is at least a symptom. One server gets the rest of the
request and the other is left hanging. We have been able to reproduce
this regularly within 5 minutes under a load test of the system, and the
issue seems to grow non-linearly as load increases (potentially due to
the number of open TCP sessions, I suppose). I'd really appreciate any
input on this one, because it's causing us to basically use the Alteon
as a "dumb" load balancer until it can perform as reliably as the
Apache WebLogic plugin.
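
    For anyone trying to confirm the same symptom in their own dumps,
here is a minimal Python sketch of the two checks described above: which
client flows show up in more than one server's trace, and which TCP
sequence numbers are shared by two different flows. The record layout
(client_ip, client_port, seq, server) is an assumption about how the
per-server captures have been exported and merged, and the function
names and sample values are hypothetical.

```python
from collections import defaultdict


def find_split_flows(packets):
    """Return flows (client_ip, client_port) whose packets were seen on
    more than one real server, mapped to the sorted list of those servers."""
    servers_by_flow = defaultdict(set)
    for client_ip, client_port, _seq, server in packets:
        servers_by_flow[(client_ip, client_port)].add(server)
    return {
        flow: sorted(servers)
        for flow, servers in servers_by_flow.items()
        if len(servers) > 1
    }


def find_seq_collisions(packets):
    """Return TCP sequence numbers that appear in two or more distinct
    client flows, mapped to the sorted list of those flows."""
    flows_by_seq = defaultdict(set)
    for client_ip, client_port, seq, _server in packets:
        flows_by_seq[seq].add((client_ip, client_port))
    return {
        seq: sorted(flows)
        for seq, flows in flows_by_seq.items()
        if len(flows) > 1
    }


if __name__ == "__main__":
    # Hypothetical merged records from two server-side traces.
    sample = [
        ("10.0.0.1", 4001, 1000, "wl1"),
        ("10.0.0.1", 4001, 2460, "wl1"),
        ("10.0.0.2", 5002, 2460, "wl2"),
        ("10.0.0.2", 5002, 3920, "wl1"),
    ]
    print(find_split_flows(sample))
    print(find_seq_collisions(sample))
```

Client ports get reused over time, so on a long capture the flow key
would also need a time window to avoid false positives.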
    I apologize that I made a reference to 22.0.7 being the version that
fixes many of the cookie-related problems. 22.0.6 is the one that I was
referred to by support, and 22.0.7 was the version we decided to go
with. In either case, you can go through the release notes of the
versions leading up to 22.0.6 and see that they have indeed had a
disproportionate percentage of problems related to pbind cookie
inspection compared to other fixes.
 
Let me know if I answered your question thoroughly. I was pretty
involved with our Alteon in a test environment for some time, but it has
been a few weeks since I've done anything with the Alteon.

  _____

From: owner-lb-l@vegan.net [mailto:owner-lb-l@vegan.net] On Behalf Of
jon.hartman@verizon.com
Sent: Monday, May 16, 2005 11:51 PM
To: lb-l@vegan.net
Subject: RE: [load balancing] Alteon 2424 - problem with pbind on
passive cookie

Could you please expand upon your problems with passive cookies?
 
We've been evaluating the 22.0.2 code version and are considering it in
a production environment where we are also using the JSESSIONID cookie
inspection method of persistence. What sort of issues did you see in
versions prior to 22.0.7?

  _____

From: owner-lb-l@vegan.net [mailto:owner-lb-l@vegan.net] On Behalf Of
Mecklem, Timothy
Sent: Monday, May 16, 2005 1:02 PM
To: lb-l@vegan.net
Subject: RE: [load balancing] Alteon 2424 - problem with pbind on
passive cookie

    I have had the same sort of experience with the 2424 series. I set
up passive load balancing against a WebLogic cluster JSESSIONID cookie
and had initial problems because the Nortel doc specifies that URLSLB
should also be set. After several config dumps and packet dumps,
Nortel support determined that urlslb should not be used with the
pbind passive cookie inspection feature. Additionally, certain versions
prior to 22.0.7 seem to have problems with passive cookie
inspection/pinning.
    On a slightly different note but in the same passive cookie binding
thread, I am experiencing odd behavior whenever a POST request comes
through that spans multiple packets. From dumps it looks like TCP
sessions almost "collide" with similar sequence numbers across
other servers. The first packet (with the cookie) invariably gets
routed to the originating server, but subsequent packets end up in
another server's packet trace. This is happening somewhat randomly, but
the occurrence ratio increases disproportionately under load tests.
 
Tim

  _____

From: owner-lb-l@vegan.net [mailto:owner-lb-l@vegan.net] On Behalf Of
Hendershott, Steve
Sent: Monday, May 16, 2005 12:24 PM
To: lb-l@vegan.net
Subject: [load balancing] Alteon 2424 - problem with pbind on passive
cookie

We are using Software Version 22.0.2 on an Alteon 2424 and have
persistent binding on a "passive" cookie. The cookie is inserted by the
server and examined by the 2424. It works for a while and then breaks
down: the 2424 starts handing out sessions to the wrong server.

We switched over to a "re-write" cookie method where we let the 2424
over-write the cookie set by the server and it works better. The
"passive" cookie used to work in older versions of the Alteon software.

Has anyone else had this experience? Is anyone using a passive cookie
to bind sessions?

Thanks,

Steve

____________________
The Load Balancing Mailing List
Received on Tue May 17 10:38:07 2005

This archive was generated by hypermail 2.1.8 : Tue May 17 2005 - 11:01:38 EDT