RE: [load balancing] Alteon 2424 - problem with pbind on passive cookie

From: Henry Silva <hsilva1IZZATnortel.com>
Date: Wed May 18 2005 - 01:42:10 EDT

The POST issue that you describe will be addressed in 22.0.3.0.

 

Henry

 

  _____

From: owner-lb-l@vegan.net [mailto:owner-lb-l@vegan.net] On Behalf Of
Mecklem, Timothy
Sent: Tuesday, May 17, 2005 4:31 PM
To: lb-l@vegan.net
Subject: RE: [load balancing] Alteon 2424 - problem with pbind on passive
cookie

 

Haha, oh boy... umm... yes, I have access to super secret Alteon code.

 

Not really... Actually, I was referring to the 21 series of the OS in my
previous posts. I apologize for that misidentification. We tried the newer
OS 22 series as well and backed away from it because it was also causing us
problems, specifically with the POST issue. In the 22 series, the POST
problem is reproducible in our environment 100% of the time for every
request that spans more than one packet. We backed down to 21.0.7 and the
problem almost disappeared, although not completely, as I'm sure you read in
my previous message. Again, sorry for the confusion; I should have caught
that one.

 

Tim

 

  _____

From: owner-lb-l@vegan.net [mailto:owner-lb-l@vegan.net] On Behalf Of
Hendershott, Steve
Sent: Tuesday, May 17, 2005 2:32 PM
To: 'lb-l@vegan.net'
Subject: RE: [load balancing] Alteon 2424 - problem with pbind on passive
cookie

Can you confirm the OS version you are using? On the Alteon web site we can
only see OS versions 21.0.8 and 22.0.2. Are you using a special build beyond
22.0.2, or are you using version 21.0.7?

 

Your information was very helpful and gave us some things to think about.

 

Thanks,

 

Steve

 

  _____

From: owner-lb-l@vegan.net [mailto:owner-lb-l@vegan.net] On Behalf Of
Mecklem, Timothy
Sent: Tuesday, May 17, 2005 9:30 AM
To: lb-l@vegan.net
Subject: RE: [load balancing] Alteon 2424 - problem with pbind on passive
cookie

 

    In my prior communication with Nortel support, they raised several
issues that you might want to consider. Probably the most important point
if you are placing this in a production environment is that (according to
the support technician I was working with) versions prior to 22.0.6 had
problems doing pbind passive cookie inspection properly when the number of
load-balanced servers is greater than nine. We were already running a newer
version of the Alteon OS, so I did not pay a great deal of attention to that
information, although I couldn't find it in the release notes of any of the
OS versions on the Nortel site. You may want to verify that with Nortel,
since nine seems like such an arbitrary cutoff.

    The issue that I encountered with pbind and urlslb was very frustrating,
especially since the Nortel WebLogic doc specifically states that both
should be on. When you enable both at the same time, pbind may appear to
persist correctly for a time, then all of a sudden it acts as if no
persistent binding is configured, skipping over cookies as if they don't
exist. We were unaware at first that this was occurring, until we found a
bug in our application code that caused a session not to be replicated to
the other servers. Since the WebLogic cluster was replicating the sessions
in all other cases, it didn't matter functionally that pbind wasn't
working: the receiving server just had to do a lookup on the session and
take over as the primary server for that session. As part of that process,
it sent a Set-Cookie for JSESSIONID and reset the primary server portion of
the cookie to its own JVM id. If you can capture the packets on the client
side, you can tell that something is wrong when, after the session has been
established, the primary JVM id portion of the cookie is set and reset
several times over one TCP session. Additionally, in case you are following
the WebLogic guide, it is also incorrect in implying that when you turn on
"look for cookie in URI" under the pbind option, the switch will still
search for an actual cookie. You can have either URI JSESSIONID inspection
or cookie JSESSIONID inspection, but not both at the same time (this also
according to a knowledgeable contact at Nortel support).
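
    The client-side check described above can be roughly automated. What
follows is only a minimal sketch, not anything from Nortel or the WebLogic
documentation. It assumes the JSESSIONID value has the form
"sessionid!primaryJvmId!secondaryJvmId" and that the HTTP header lines have
already been extracted from a capture as plain text. The function names are
hypothetical.

```python
import re

# Assumption: the cookie value looks like "sessionid!primaryJvmId!secondaryJvmId".
COOKIE_RE = re.compile(r"JSESSIONID=([^;\s]+)")


def primary_jvm_id(header_line):
    """Return the primary JVM id found in a Cookie/Set-Cookie line, or None."""
    match = COOKIE_RE.search(header_line)
    if match is None:
        return None
    parts = match.group(1).split("!")
    # The second "!"-separated field is taken to be the primary JVM id.
    return parts[1] if len(parts) > 1 else None


def count_primary_changes(header_lines):
    """Count how often the primary JVM id changes across Set-Cookie lines."""
    changes = 0
    last = None
    for line in header_lines:
        # Only server responses (Set-Cookie) can reset the primary JVM id.
        if not line.lower().startswith("set-cookie:"):
            continue
        current = primary_jvm_id(line)
        if current is None:
            continue
        if last is not None and current != last:
            changes += 1
        last = current
    return changes
```

A count above zero for a single client session would suggest that requests
are landing on a server other than the current primary, which is the symptom
described above.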

    The final issue that I have encountered has been a deal breaker for us
so far. We have an open ticket on it, but I haven't heard back from support
about it and I'm beginning to doubt that the person I was working with even
exists anymore. The issue involves POST requests that span multiple
packets. Naturally, the first packet of the request contains the JSESSIONID
cookie and is routed to the correct server. What happens with the rest of
the packets of the request is still somewhat baffling. In 95+% of the
cases, they make it to the correct server and everything is fine. In the
remaining cases, it's almost as if the Alteon loses its mind and routes the
packets to another server. The issue appears to be related to the sequence
numbers of the TCP sessions. Whenever I see the issue recorded in a packet
capture, there are two running TCP sessions (both handling a multi-packet
POST request at the same time) on separate servers that converge on the same
TCP sequence number, at which point the Alteon routes the remaining packets
of both requests to one server, cutting off the other until that server
issues a read timeout error to the client. I am not entirely sure that this
is the exact issue that I'm dealing with, but it certainly is at least a
symptom. One server gets the rest of the request and the other is left
hanging. We have been able to reproduce this regularly within 5 minutes
under a load test of the system, and the failure rate seems to grow
non-linearly as load increases (potentially due to the number of open TCP
sessions, I suppose). I'd really appreciate any input on this one, because
it's causing us to basically use the Alteon as a "dumb" load balancer until
it can perform as reliably as the Apache WebLogic plugin.
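
    One rough way to spot the misrouted packets is to compare the
per-server packet traces. This is a sketch under stated assumptions, not the
method we actually used. It assumes each real server's trace has been
reduced to (client IP, client port, server name) tuples, and that the client
source address and port reach the servers unchanged. The function name and
the sample server names are hypothetical.

```python
from collections import defaultdict


def find_split_flows(packets):
    """Return the client flows whose packets were seen on more than one server.

    packets: iterable of (client_ip, client_port, server_name) tuples,
    merged from the packet traces taken on each real server.
    """
    servers_by_flow = defaultdict(set)
    for client_ip, client_port, server in packets:
        # A client IP/port pair identifies one TCP session to the virtual IP.
        servers_by_flow[(client_ip, client_port)].add(server)
    return {
        flow: sorted(servers)
        for flow, servers in servers_by_flow.items()
        if len(servers) > 1
    }


sample = [
    ("10.0.0.5", 40001, "wls1"),
    ("10.0.0.5", 40001, "wls2"),  # same TCP session, different server
    ("10.0.0.6", 40002, "wls2"),
]
print(find_split_flows(sample))
```

Any flow this reports received segments of one TCP session on two different
servers, which matches the behavior described above.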

    I apologize for referring to 22.0.7 as the version that fixes many of
the cookie-related problems. 22.0.6 is the one that support referred me to,
and 22.0.7 is the version we decided to go with. In either case, you can go
through the release notes of the versions leading up to 22.0.6 and see that
a disproportionate share of the fixes relate to pbind cookie inspection.

 

Let me know if I answered your question thoroughly. I was pretty involved
with our Alteon in a test environment for some time, but it has been a few
weeks since I've done anything with the Alteon.

 

  _____

From: owner-lb-l@vegan.net [mailto:owner-lb-l@vegan.net] On Behalf Of
jon.hartman@verizon.com
Sent: Monday, May 16, 2005 11:51 PM
To: lb-l@vegan.net
Subject: RE: [load balancing] Alteon 2424 - problem with pbind on passive
cookie

Could you please expand upon your problems with passive cookies?

 

We've been evaluating the 22.0.2 code version and are considering it for a
production environment where we also use the JSESSIONID cookie inspection
method of persistence. What sort of issues did you see in versions prior to
22.0.7?

 

  _____

From: owner-lb-l@vegan.net [mailto:owner-lb-l@vegan.net] On Behalf Of
Mecklem, Timothy
Sent: Monday, May 16, 2005 1:02 PM
To: lb-l@vegan.net
Subject: RE: [load balancing] Alteon 2424 - problem with pbind on passive
cookie

    I have had the same sort of experience with the 2424 series. I set up
passive load balancing against a WebLogic cluster JSESSIONID cookie and had
initial problems because the Nortel doc specifies that urlslb should also be
set. After several config dumps and packet dumps, Nortel support determined
that urlslb should not be used with the pbind passive cookie inspection
feature. Additionally, certain versions prior to 22.0.7 seem to have
problems with passive cookie inspection/pinning.

    On a slightly different note, but still on the subject of passive cookie
binding, I am experiencing odd behavior whenever a POST request comes
through that spans multiple packets. From the dumps, TCP sessions on
different servers seem to almost "collide" on similar sequence numbers. The
first packet (with the cookie) invariably gets routed to the originating
server, but subsequent packets end up in another server's packet trace.
This happens somewhat randomly, but the rate of occurrence increases
disproportionately under load tests.

 

Tim

 

  _____

From: owner-lb-l@vegan.net [mailto:owner-lb-l@vegan.net] On Behalf Of
Hendershott, Steve
Sent: Monday, May 16, 2005 12:24 PM
To: lb-l@vegan.net
Subject: [load balancing] Alteon 2424 - problem with pbind on passive cookie

We are using Software Version 22.0.2 on an Alteon 2424 and have persistent
binding on a "passive" cookie. The cookie is inserted by the server and
examined by the 2424. It works for a while and then breaks down. The 2424
starts handing out sessions to the wrong server.

 

We switched over to a "re-write" cookie method where we let the 2424
over-write the cookie set by the server and it works better. The "passive"
cookie used to work in older versions of the Alteon software.

 

Does anyone else have this experience? Is anyone using a passive cookie to
bind sessions?

 

Thanks,

 

Steve

 

**************************************************************************

This e-mail and any files transmitted with it may contain privileged or

confidential information. It is solely for use by the individual for whom

it is intended, even if addressed incorrectly. If you received this e-mail

in error, please notify the sender; do not disclose, copy, distribute, or

take any action in reliance on the contents of this information; and delete

it from your system. Any other use of this e-mail is prohibited. Thank you

for your compliance.

 


Received on Wed May 18 02:50:24 2005
