Re: [load balancing] AD4 randomly drops entire client subnet

From: Ken Thurman <thudinga [izzat] yahoo.com>
Date: Thu Apr 06 2006 - 16:50:11 EDT

Well, first off, the AD4s have 2 processors per port
and 2 more for Layer 3 and management. What you
describe sounds more like a memory leak; if resetting
the box restores service, then it's probably not a bad
port. I know there were some code versions that had
problems under certain circumstances, like lots of
SNMP queries to the MP, and also some where, if it
couldn't log to the defined syslog server for some
reason, it would lose memory. I have also seen it
happen under an ARP flood. I would open a case with
Nortel support; they should be able to tell you if it's
a bad switch port or a code issue.
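
One way to check the memory-leak theory before calling support is to
record free memory on the box at regular intervals and look at the trend.
The sketch below is only an illustration: it assumes you already have
(timestamp, free bytes) readings from somewhere (CLI output, SNMP,
whatever you use), and the function names and the threshold are made up
for this example, not anything from Nortel. It fits a least-squares line
through the readings; a steadily negative slope is consistent with a leak.

```python
def free_memory_slope(samples):
    """Least-squares slope of free memory, in bytes per second.

    samples: list of (timestamp_seconds, free_bytes) pairs.
    How the readings are collected is outside this sketch.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("timestamps must not all be equal")
    return num / den


def looks_like_leak(samples, threshold=-0.5):
    """True if free memory falls faster than `threshold` bytes/second.

    The default threshold is an arbitrary placeholder; pick one that
    makes sense for the device being watched.
    """
    return free_memory_slope(samples) < threshold
```

If the slope stays flat while the subnet is dropping, that would point
away from a leak and back towards the port or the config.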

Regards,

Ken

--- "Jason J. W. Williams" <jasonjwwilliams@gmail.com> wrote:

> Hi Marc,
>
> The AD4s are designed with one ASIC per port...at least that's what I
> remember from training. I haven't tried switching back to the bad port,
> because it is a production box. I'll try swapping a non-critical subnet
> back to the port. Tried to track down a Fluke...but haven't been able to
> get my hands on a good one yet. Any chance this could be a config prob?
>
> Thanks in advance,
> J
>
> On 4/6/06, Marc.Massar@firstdatacorp.com <Marc.Massar@firstdatacorp.com> wrote:
> >
> > Could be bad ports. I don't know what the guts of the AD4 look like, but
> > my bet is that there is 1 chip that controls 2-4 ports on the front (could
> > be one chip for every port...depending on the chips used on the inside of
> > the AD4). I've seen that sort of behavior where the chip inside gets toasty
> > and a block of ports go bad. You could probably test this with a low level
> > ethernet test device. You'll want to see if you are getting the right
> > signals from the port to establish a link. Anritsu has at least one test
> > device that will do this, and probably more than 1. They're pricey for just
> > a simple diagnostic, but there might be other vendors out there too.
> > What happens if you switch the traffic back to a previously 'bad' port?
> > -Marc
> >
> >
> > "Jason J. W. Williams" <jasonjwwilliams@gmail.com>
> > Sent by: owner-lb-l@vegan.net
> > 04/05/2006 08:58 PM
> > Please respond to: lb-l@vegan.net
> > To: lb-l@vegan.net
> > Subject: [load balancing] AD4 randomly drops entire client subnet
> >
> > Hello,
> >
> > I've got an issue that I think may be an ailing AD4 but can't replicate
> > the problem. Used to have this config in an active-standby VRRP pair. My
> > clients are using Linux bonding (layer 2 failover not aggregation) to
> > connect to the switch (which stands between the servers and the Alteon
> > port). The behavior is that after a period of time the Alteon would start to
> > drop traffic to the entire subnet...to the point where the clients couldn't
> > even ping the Alteon (but could ping each other). Pinging the gateway from
> > another port on the AD4 worked fine. A very strange side effect is that this
> > dropping behavior starts slowly (subnet disappears for a short while and then
> > reappears) and then the outages increase in duration until the subnet drops
> > permanently. Taking bonding off doesn't resolve the issue. Removing VRRP and
> > only using a single Alteon doesn't resolve the issue. Once the condition has
> > occurred, rebooting the AD4 sometimes clears the condition and sometimes
> > doesn't (it tends towards not working the longer this goes on). All of the
> > IP interfaces and ports are using tagged VLANs to force their traffic
> > through particular ports. When this occurs it seems to be isolated to all
> > VLANs on a particular port.
> >
> > The only thing that reliably clears the condition is taking all the VLANs
> > from the affected port and reassigning them to a non-affected port.
> > Originally this happened on subnets hung off of port 3 of our AD4. Moving
> > them to port 7 fixed the issue (for the past 3 months). Then suddenly the
> > subnets hung off of port 6 started acting up yesterday...only this time the
> > period from intermittent outage to permanent outage occurred over a matter
> > of 2-3 hours instead of 2-3 days. At this point VRRP was involved, but the
> > second member of the pair was powered off and the clients were set to use
> > the powered on unit's IP not the VR IP as their gateway. Recycling the unit
> > did not resolve the issue. Moving the VLANs from port 6 to port 4 did
> > resolve the issue.
> >
> > This behavior is very strange and we're thinking that it may be bad ports.
> > However, we don't want to replace the pair of AD4s on a hunch if it somehow
> > is a config issue. Any help or advice is greatly appreciated. Thank you in
> > advance.
> >
> > Best Regards,
> > Jason
> >
> >
>
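
The pattern in the original post (short dropouts that get longer until
the subnet is gone for good) is easier to show to support if it is
measured. A minimal sketch, assuming you already log periodic
reachability probes of the Alteon interface as (timestamp, reachable)
pairs; the function names are hypothetical and the probing itself
(ping, ARP, etc.) is not shown:

```python
def outage_durations(probes):
    """Return the length in seconds of each outage in a probe log.

    probes: list of (timestamp_seconds, reachable) sorted by time.
    An outage runs from the first failed probe to the next successful
    one. An outage still open at the end of the log is measured up to
    the last probe.
    """
    durations = []
    start = None
    for ts, ok in probes:
        if not ok and start is None:
            start = ts          # outage begins
        elif ok and start is not None:
            durations.append(ts - start)
            start = None        # outage ended
    if start is not None and probes:
        durations.append(probes[-1][0] - start)
    return durations


def outages_growing(durations):
    """True if each outage is at least as long as the one before it."""
    return all(b >= a for a, b in zip(durations, durations[1:]))
```

Running this per port or per VLAN would show whether the lengthening
outages really are confined to one port, as described.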

Received on Thu Apr 6 16:50:17 2006

This archive was generated by hypermail 2.1.8 : Thu Apr 06 2006 - 17:12:55 EDT