Packet corruption, Resource errors, Identifying a data loop – Intel BLADE SERVER IXM5414E User Manual


Understanding and Troubleshooting the Spanning Tree Protocol

In this example, port 2 on bridge B can receive but not transmit packets. Port 2 on bridge C should
be in the discarding state, but since it can no longer receive BPDUs from port 2 on bridge B, it will
change to the forwarding state. If the failure exists at boot time, STP will not converge on a stable
topology and restarting the bridges will have no effect.

NOTE

In the previous example, restarting the bridges will provide a temporary resolution.

This type of failure is difficult to detect because the link-state LEDs for Ethernet links rely on the
transmit side of the cable to detect a link. If a unidirectional failure on a link is suspected, it is
usually necessary to go to the console or other management software and compare the packets
received and transmitted on the port. For example, a unidirectional port will show many packets
transmitted but none received, or vice versa.
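
The counter comparison described above can be sketched in a few lines of Python. This is only an illustration: the PortCounters type, the port names, the sample values, and the min_packets threshold are hypothetical and are not part of the switch's management interface. In practice the counters would be read from the console or from management software.

```python
from dataclasses import dataclass


@dataclass
class PortCounters:
    """Hypothetical snapshot of per-port packet counters."""
    name: str
    rx_packets: int
    tx_packets: int


def looks_unidirectional(port, min_packets=1000):
    """Return True when one direction is busy and the other is silent.

    min_packets is an assumed threshold that avoids flagging idle ports.
    """
    tx_only = port.tx_packets >= min_packets and port.rx_packets == 0
    rx_only = port.rx_packets >= min_packets and port.tx_packets == 0
    return tx_only or rx_only


# Made-up sample values for illustration only.
ports = [
    PortCounters("port-1", rx_packets=51872, tx_packets=50934),
    PortCounters("port-2", rx_packets=0, tx_packets=48210),
    PortCounters("port-3", rx_packets=0, tx_packets=12),
]
suspects = [p.name for p in ports if looks_unidirectional(p)]
```

A port flagged this way is only a candidate. The counters should be sampled again after a short interval to confirm that the silent direction really stays at zero.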

Packet corruption

Packet corruption can lead to the same type of failure. If a link is experiencing a high rate of physical
errors, a large number of consecutive BPDUs can be dropped, and a port in the discarding state
would then change to the forwarding state. At the default settings, the discarding port would have to
miss every BPDU for 50 consecutive seconds, because a single received BPDU resets the timer. If the
Max. Age is set too low, this time is reduced.
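
The 50-second figure is consistent with the standard IEEE 802.1D defaults of 20 seconds for Max. Age and 15 seconds for Forward Delay, which this section does not list and which are assumed here: the port waits for Max. Age to expire and then spends one Forward Delay in each of two intermediate states. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def seconds_until_forwarding(max_age=20, forward_delay=15):
    """Time a blocked port takes to start forwarding once BPDUs stop arriving.

    Assumes classic 802.1D behaviour: Max. Age must expire first, then the
    port passes through two intermediate states of forward_delay seconds each.
    """
    return max_age + 2 * forward_delay


default_window = seconds_until_forwarding()           # assumed defaults
reduced_window = seconds_until_forwarding(max_age=6)  # a low Max. Age
```

Under these assumptions, lowering Max. Age from 20 to 6 shortens the window in which a burst of lost BPDUs must fit from 50 to 36 seconds.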

Resource errors

The switch performs its switching and routing functions primarily in hardware, using specialized
application-specific integrated circuits (ASICs). STP is implemented in software and is thus reliant
upon the speed of the CPU and other factors to converge. If the CPU is overutilized, it is possible
that BPDUs might not be sent in a timely fashion. STP is generally not very CPU-intensive and is
given priority over other processes, so this type of error is rare.

Very low values for the Max. Age and the Forward Delay can result in an unstable spanning tree,
because the loss of BPDUs can lead to data loops. The diameter of the network can also cause
problems. The default values for STP give a maximum network diameter of about seven. This means
that no two bridges in the network can be more than seven hops apart. Part of this diameter
restriction comes from the BPDU age field. As BPDUs are propagated from the root bridge to the
leaves of the spanning tree, each bridge increments the age field. When this field exceeds the
maximum age, the packet is discarded. For large-diameter networks, STP convergence can be very slow.
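
One way to catch timer values that are too low is to test them against the relationship given in IEEE 802.1D: 2 x (Forward Delay - 1) >= Max. Age >= 2 x (Hello Time + 1). That relationship comes from the standard rather than from this manual, and whether this switch enforces it is not stated here. The sketch below uses a hypothetical function name.

```python
def check_stp_timers(hello_time, max_age, forward_delay):
    """Return a list of problems found in a set of STP timers (in seconds).

    Checks the IEEE 802.1D relationship
        2 * (forward_delay - 1) >= max_age >= 2 * (hello_time + 1).
    An empty list means the timers satisfy it.
    """
    problems = []
    if max_age < 2 * (hello_time + 1):
        problems.append("Max. Age is too low for the Hello Time")
    if 2 * (forward_delay - 1) < max_age:
        problems.append("Forward Delay is too low for the Max. Age")
    return problems


standard_defaults = check_stp_timers(hello_time=2, max_age=20, forward_delay=15)
low_max_age = check_stp_timers(hello_time=2, max_age=4, forward_delay=15)
low_forward_delay = check_stp_timers(hello_time=2, max_age=20, forward_delay=4)
```

Passing this check does not guarantee a stable topology. It only rules out combinations the standard itself disallows.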

Identifying a data loop

Broadcast storms have an effect on the network that is very similar to that of data loops, but
broadcast storm controls in modern bridges (along with subnetting and other network practices) have
been very effective in containing them. The best way to determine if a data loop exists is to capture
traffic on a saturated link and check whether similar packets are seen multiple times.
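
The check for repeated packets can be sketched as follows, assuming the captured frames have already been exported from a capture tool as raw byte strings. The function names and the two thresholds are hypothetical, and reading the capture file is outside the scope of this sketch.

```python
import hashlib
from collections import Counter


def repeated_frames(frames, min_repeats=3):
    """Map the digest of each frame seen at least min_repeats times to its count."""
    counts = Counter(hashlib.sha256(frame).hexdigest() for frame in frames)
    return {digest: n for digest, n in counts.items() if n >= min_repeats}


def loop_suspected(frames, min_repeats=3, min_fraction=0.5):
    """Return True when repeated frames make up a large share of the capture.

    Both thresholds are illustrative guesses, not values from the manual.
    """
    if not frames:
        return False
    repeated = sum(repeated_frames(frames, min_repeats).values())
    return repeated / len(frames) >= min_fraction


# Made-up capture: one frame circulating ten times plus two unrelated frames.
capture = [b"\xff\xff\xff\xff\xff\xff looped frame"] * 10 + [b"frame a", b"frame b"]
suspected = loop_suspected(capture)
```

Some identical frames are normal, for example periodic broadcasts, so the result is a hint to investigate further rather than proof of a loop.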

Generally, if all the users of a given domain are unable to connect to the network at the same time, a
data loop is a likely cause. In this case, the port utilization data will show unusually high values.

In most cases, the priority is to restore connectivity as soon as possible. The simplest remedy is to
manually disable all of the ports that provide redundant links. Disabling the ports one at a time and
then checking for the restoration of a user’s connectivity will identify the link that is causing the