RE: [load balancing] ServerIronXL - Real servers failing L4 check for no apparent reason

From: Frank Yue (fyueIZZATfoundrynet.com)
Date: Thu Jul 15 2004 - 13:42:21 EDT

    It sounds like the server is passing the L4 healthcheck but failing the L7
    healthcheck. You can do several things. First, use the 'show server
    real http [name]' command to look at the specific state of the healthchecks.
    Second, turn on the command 'server no-fast-bringup' at the global
    level. This changes the Foundry methodology for bringing a server up:
    when this command is enabled, the server is brought up only if it passes
    both the L4 check AND the L7 check.

    You should verify that the HTTP check that you are doing is returning a
    valid status code (200-299). Otherwise, the site will be marked as down.
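
    To test the status code independently of the ServerIron, something like
    the Python sketch below may help. It is only an illustration of the
    200-299 rule above, not the ServerIron's own logic: the function name
    http_check is made up, and the use of a HEAD request for "/" is an
    assumption, so adjust both the method and the path to match whatever
    the configured L7 check actually requests.

```python
import http.client


def http_check(host, port=80, path="/", timeout=3.0):
    """Return (ok, status); ok is True only for a 200-299 status code.

    Hypothetical helper: the HEAD method and the default path are
    assumptions, not taken from any ServerIron configuration.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed response.
        return False, None
    finally:
        conn.close()
    return 200 <= status <= 299, status
```

    For example, http_check("10.1.1.100") would report whether site A on
    server A answers with a status in the accepted range.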

    -Frank Yue
    Consulting Engineer, Layer 4-7
    1004 W. Mandevilla Court
    Wilmington, NC 28409
    fyueIZZATfoundrynet.com
    www.foundrynet.com

    -----Original Message-----
    From: owner-lb-lIZZATvegan.net [mailto:owner-lb-lIZZATvegan.net] On Behalf Of Basil
    Hussain
    Sent: Thursday, July 15, 2004 11:15 AM
    To: lb-lIZZATvegan.net
    Subject: RE: [load balancing] ServerIronXL - Real servers failing L4 check
    for no apparent reason

    Hi,

    Thanks for the tip. I'd completely forgotten about tcpdump.

    I looked up in the ServerIron manual what the procedure for a level 4 TCP
    health check is, to see what it would be sending, and what it expects to
    receive. The procedure is documented as follows:

    1. The SI sends a TCP SYN packet to the port on the real server.
    2. The SI expects the real server to respond with a SYN ACK.
    3. If the SI receives the SYN ACK, the SI sends a TCP RESET, satisfied that
    the TCP port is alive.
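
    A rough approximation of the three steps above, as a Python sketch. It
    is not what the SI does internally: an ordinary socket completes the
    full handshake (SYN, SYN ACK, ACK) before the reset, whereas the
    documented procedure resets straight after the SYN ACK. The name
    l4_check is made up, and the "ii" layout of the SO_LINGER value
    assumes a Linux or other Unix-like host.

```python
import socket
import struct


def l4_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper. Unlike the documented SI procedure, this
    completes the three-way handshake before resetting the connection.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except OSError:
        # Refused, unreachable, or timed out: the port is not alive.
        sock.close()
        return False
    # SO_LINGER enabled with a zero timeout makes close() send an RST
    # instead of the usual FIN (struct layout assumed: two C ints).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))
    sock.close()
    return True
```

    Running l4_check("10.1.1.100", 80) from another host on the same
    subnet gives a second opinion on whether the port answers.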

    So, I did some packet capture on one of the web servers that is 'failing'. I
    made sure I captured an instance of a health check failing by monitoring the
    syslog our ServerIron generates. I then used Ethereal to filter the results
    down to just exchanges between our load-balancer (10.1.1.3) and the server.
    Here's an excerpt:

    No.  Time                        Source      Destination  Protocol  Info
    -------------------------------------------------------------------------
    518  2004-07-15 15:45:44.481041  10.1.1.3    10.1.1.100   TCP
         1737 > http [SYN] Seq=0 Ack=0 Win=16384 Len=0 MSS=1460

    519  2004-07-15 15:45:44.481061  10.1.1.100  10.1.1.3     TCP
         http > 1737 [SYN, ACK] Seq=0 Ack=1 Win=30660 Len=0 MSS=1460

    520  2004-07-15 15:45:44.481410  10.1.1.3    10.1.1.100   TCP
         1737 > http [RST] Seq=1 Ack=3451710180 Win=1 Len=0

    Well, look what we have here! It's a SYN->SYN/ACK->RST transaction with the
    load balancer! I'm no TCP expert, but it looks complete. After all, the docs
    state the SI doesn't send an RST unless it actually receives the SYN/ACK.
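
    The same comparison can be written down as a small Python sketch, which
    might be handy for checking a longer capture. The function name and the
    (source, destination, flags) tuple layout are made up for illustration;
    the three packets are the ones from the excerpt above.

```python
def is_complete_l4_probe(packets, lb_ip, server_ip):
    """Return True if packets are exactly SYN, SYN/ACK, RST in order.

    packets is a list of (source, destination, flags) tuples in capture
    order. Hypothetical helper for illustration only.
    """
    expected = [
        (lb_ip, server_ip, {"SYN"}),
        (server_ip, lb_ip, {"SYN", "ACK"}),
        (lb_ip, server_ip, {"RST"}),
    ]
    if len(packets) != len(expected):
        return False
    return all(
        (src, dst, set(flags)) == want
        for (src, dst, flags), want in zip(packets, expected)
    )


# Packets 518-520 from the capture excerpt.
capture = [
    ("10.1.1.3", "10.1.1.100", ["SYN"]),
    ("10.1.1.100", "10.1.1.3", ["SYN", "ACK"]),
    ("10.1.1.3", "10.1.1.100", ["RST"]),
]
probe_ok = is_complete_l4_probe(capture, "10.1.1.3", "10.1.1.100")
```

    This only checks the order and direction of the flags. It says nothing
    about timing or about how the SI itself judges the exchange.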

    So, I am now even more stumped!

    Regards,

    --
    Basil Hussain
    I.T. Systems Developer and Administrator, Kodak Weddings
    basil.hussainIZZATkodakweddings.com
    

    > -----Original Message-----
    > From: owner-lb-lIZZATvegan.net [mailto:owner-lb-lIZZATvegan.net] On Behalf Of
    > tony bourke
    > Sent: 14 July 2004 17:37
    > To: lb-lIZZATvegan.net
    > Subject: RE: [load balancing] ServerIronXL - Real servers failing L4
    > check for no apparent reason
    >
    > Hi Basil,
    >
    > I would try using some type of sniffer (such as tcpdump) and watch what
    > goes on between the ServerIron and the real server. Perhaps that can shed
    > some light onto the issue.
    >
    > Tony
    >
    > On Wed, 14 Jul 2004, Basil Hussain wrote:
    >
    > > No-one has any ideas on this?
    > >
    > > I really can't figure this out.
    > >
    > > --
    > > Basil Hussain
    > > I.T. Systems Developer and Administrator, Kodak Weddings
    > > basil.hussainIZZATkodakweddings.com
    > >
    > > > -----Original Message-----
    > > > From: owner-lb-lIZZATvegan.net [mailto:owner-lb-lIZZATvegan.net] On Behalf Of
    > > > Basil Hussain
    > > > Sent: 12 July 2004 11:19
    > > > To: Load Balancing List
    > > > Subject: [load balancing] ServerIronXL - Real servers failing L4 check
    > > > for no apparent reason
    > > >
    > > > Hi,
    > > >
    > > > I have a strange problem that I hope someone may be able to shed
    > > > some light on.
    > > >
    > > > First, let me explain my set-up. I have two physical servers being
    > > > load-balanced by a Foundry ServerIron XL. I am balancing four web
    > > > sites over both. Because of the need to do SSL on more than one web
    > > > site, I have to do IP-based virtual hosting. Therefore, I have set up
    > > > a virtual network interface (using Linux IP aliasing) for each site.
    > > > So, it looks like:
    > > >
    > > > server-a:
    > > >     site-a: 10.1.1.100
    > > >     site-b: 10.1.1.101
    > > >     site-c: 10.1.1.102
    > > >     site-d: 10.1.1.107
    > > > server-b:
    > > >     site-a: 10.1.1.110
    > > >     site-b: 10.1.1.111
    > > >     site-c: 10.1.1.112
    > > >     site-d: 10.1.1.117
    > > >
    > > > I have eight real servers set up on the LB (one for each server/site
    > > > combo), along with four virtual servers (one for each web site). Each
    > > > virtual server is therefore bound to two real servers.
    > > >
    > > > The problem I am having is that my ServerIron keeps logging errors
    > > > approx. every minute telling me that two of my real servers are
    > > > failing level 4 health checks:
    > > >
    > > > 11-07-2004 23:59:33 ServerIron, L4 server 10.1.1.110 server-b-site-a port 80 is down
    > > > 11-07-2004 23:59:35 ServerIron, L4 server 10.1.1.100 server-a-site-a port 80 is down
    > > > 11-07-2004 23:59:39 ServerIron, L4 server 10.1.1.100 server-a-site-a port 80 is up
    > > > 11-07-2004 23:59:39 ServerIron, L4 server 10.1.1.110 server-b-site-a port 80 is up
    > > >
    > > > But, strangely, it only gives errors for site A! No problems at all
    > > > with all other sites! Also, it doesn't complain about port 443.
    > > >
    > > > What possible reasons could there be for this behaviour?
    > > >
    > > > Regards,
    > > >
    > > > --
    > > > Basil Hussain
    > > > I.T. Systems Developer and Administrator, Kodak Weddings
    > > > basil.hussainIZZATkodakweddings.com
    > > >
    > > > P.S. In case anyone asks why I have eight real servers defined, one
    > > > for each server/site combination, it is because of the non-contiguous
    > > > IP addresses. Otherwise, I would use the 'host-range' command...

    ____________________
    The Load Balancing Mailing List
    Unsubscribe: mailto:majordomoIZZATvegan.net?body=unsubscribe%20lb-l
    Archive: http://vegan.net/lb/archive
    LBDigest: http://lbdigest.com
    MRTG with SLB: http://vegan.net/MRTG
    Hosted by: http://www.tokkisystems.com



    This archive was generated by hypermail 2.1.4 : Thu Jul 15 2004 - 13:52:25 EDT