A network administrator will face a variety of network issues. In this video, you’ll learn how to troubleshoot collisions, broadcast storms, duplicate addresses, asymmetric routes, and more.
Although it’s very unusual to find devices configured for half-duplex Ethernet these days, you may still run into some legacy devices that are connected to a hub or that are configured for half-duplex. One of the challenges you have with half-duplex communication is the fact that there may be collisions.
A collision is created when two devices want to communicate on the network at the same time and both start to send information simultaneously. Those signals collide with each other, and none of the information makes it through the network in one piece. When this happens, the devices recognize that a collision has occurred. They both wait a random amount of time, and then they try transmitting again.
If you’re on a half-duplex network, then collisions are perfectly normal. It’s how Ethernet operates at half-duplex. If you have a lot of people communicating at the same time, you will see an increasing number of collisions occurring on the network and that could cause performance problems for everyone. On today’s networks, though, you’ll find that most of your devices are connected at full-duplex, which means collisions should not be occurring at all.
So if you do have collisions increasing on your network, you may want to look at the configuration of your devices. This is sometimes as simple as finding a duplex mismatch, where one device is set for full-duplex and the device on the other end of the link is set for half-duplex. The half-duplex side will report collisions because the full-duplex side transmits whenever it has data to send.
But there can also be a hardware problem, and the hardware problem is causing the collision counter to increase as well. You’ll want to look at the network interface card or the drivers for that device and make sure that there’s no issue with either of those components.
If you were to look at a Cisco switch and do a show interfaces for a particular interface, you can see the number of runts, giants, input errors, CRCs, collisions, or late collisions, which might give you some insight into how well this interface is performing.
You can see on this interface that thousands of packets have been input. We can see broadcasts on the network. But there have been instances where collisions and interface resets have occurred, so we might want to look at the configuration of this switch and the device that’s connected to this interface to see if both are configured properly.
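As a rough sketch of how you might pull those counters out of saved command output, here is a small Python example. The sample text only imitates the style of a show interfaces listing, and every number in it is invented for illustration. The names `SAMPLE_OUTPUT`, `parse_counters`, and `nonzero_counters` are hypothetical.

```python
import re

# Illustrative excerpt in the style of "show interfaces" output.
# All values are invented for this sketch; they are not from the video.
SAMPLE_OUTPUT = """
     34215 packets input, 2873412 bytes, 0 no buffer
     Received 1250 broadcasts (0 multicasts)
     0 runts, 0 giants, 0 throttles
     3 input errors, 3 CRC, 0 frame, 0 overrun, 0 ignored
     51234 packets output, 4312345 bytes, 0 underruns
     0 output errors, 12 collisions, 2 interface resets
     0 babbles, 4 late collision, 0 deferred
"""

# Counters worth checking when an interface is misbehaving.
COUNTERS = ("runts", "giants", "input errors", "CRC",
            "collisions", "late collision", "interface resets")


def parse_counters(text):
    """Return {counter name: value} for every counter found in the text."""
    found = {}
    for name in COUNTERS:
        # Each counter appears as "<number> <name>" in the listing.
        match = re.search(r"(\d+) " + re.escape(name) + r"\b", text)
        if match:
            found[name] = int(match.group(1))
    return found


def nonzero_counters(counters):
    """Keep only the counters that have incremented."""
    return {name: value for name, value in counters.items() if value > 0}


counters = parse_counters(SAMPLE_OUTPUT)
print(nonzero_counters(counters))
```

On a full-duplex link, any non-zero collision or late collision value in a report like this is a reason to check the duplex settings on both ends.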
One necessary evil of IPv4 is that we have to use broadcasts for certain protocols to be able to communicate. This is a normal part of IPv4. And it’s important that you’re able to track and monitor how many broadcasts might be occurring on your network. This broadcast domain is, of course, limited to devices that are on the same VLAN, and these broadcast domains are separated by routers.
There will always be some broadcasts communicating on the network. Your problem occurs when there are a large number of broadcasts because every device on the network has to evaluate every packet that comes through that’s configured as a broadcast. If you have one or two broadcasts a second, that’s not much of a problem for any device. If there are a hundred or a thousand broadcasts in a single second, then you will certainly see performance problems on your network.
One of the best ways to identify how many broadcasts are occurring on the network, and where these broadcasts are coming from, is to take a packet capture. This is a packet capture that is showing a large number of broadcasts. These happen to be ARP, or Address Resolution Protocol, and you can see the time stamps between each of the broadcasts on the network. This will give us information about the type of broadcast, the device that’s sending the broadcast, and how many broadcasts might be occurring in any particular time.
If we look at our packet capture, we can see there’s not a huge number of packets in a single second. But this might tell us that we need to keep an eye on this and, perhaps, perform some other type of mitigation for broadcast storms. For example, you might want to separate the network into smaller subnets and, therefore, create smaller broadcast domains.
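If you export the frames from a capture, counting broadcasts per second and per sender takes only a few lines. This is a minimal sketch that assumes each frame has already been reduced to a timestamp, a source MAC address, and a destination MAC address. The sample frames and the function name `broadcasts_per_second` are made up for illustration.

```python
from collections import Counter

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def broadcasts_per_second(frames):
    """Count broadcast frames by whole second and by source MAC.

    frames: iterable of (timestamp_seconds, source_mac, destination_mac).
    """
    per_second = Counter()
    per_source = Counter()
    for timestamp, source, destination in frames:
        if destination.lower() == BROADCAST_MAC:
            per_second[int(timestamp)] += 1
            per_source[source.lower()] += 1
    return per_second, per_source


# Hypothetical frames, not taken from the capture in the video.
frames = [
    (0.10, "00:11:22:33:44:55", "ff:ff:ff:ff:ff:ff"),
    (0.35, "00:11:22:33:44:55", "ff:ff:ff:ff:ff:ff"),
    (0.60, "66:77:88:99:aa:bb", "00:11:22:33:44:55"),  # unicast, ignored
    (1.20, "66:77:88:99:aa:bb", "FF:FF:FF:FF:FF:FF"),
]

per_second, per_source = broadcasts_per_second(frames)
print(dict(per_second))
print(per_source.most_common(1))
```

The per-source count points you to the device sending the most broadcasts, and the per-second count tells you whether the rate is anywhere near a level that would affect performance.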
Something that should not occur very often is a duplicate MAC address. This is the Media Access Control address, or the burned-in address on our Ethernet cards. These MAC addresses are designed to be unique, and they are assigned by the manufacturer so that every interface card has a different MAC address. If you do see duplicate MAC addresses on your network, this could be related to an on-path attack.
If this isn’t something malicious, then this may be a simple mistake. There have been cases where manufacturers have created network interface cards that happen to have a duplicate MAC address. Or you may be locally administering your MAC addresses, where you’re assigning MAC addresses to a device, and you may accidentally duplicate that MAC address across two different devices.
One great way to see if this is a problem is to take a packet capture. You should be able to see the duplicate devices fighting over Address Resolution Protocol entries, where the MAC address keeps changing for a single IP address. You might also be able to confirm a device’s MAC address from your own machine by looking at your ARP cache. You can ping a device, look at your ARP cache, and see what MAC address is associated with that particular IP address.
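One way to spot that fight in a capture is to group the ARP replies by IP address and flag any address that has been claimed by more than one MAC address. Here is a minimal sketch, assuming the replies have already been extracted as pairs of IP address and MAC address. The addresses and the function name `find_arp_conflicts` are hypothetical.

```python
from collections import defaultdict


def find_arp_conflicts(arp_replies):
    """Return {ip: [macs]} for every IP claimed by more than one MAC.

    arp_replies: iterable of (ip_address, mac_address) pairs.
    """
    seen = defaultdict(set)
    for ip, mac in arp_replies:
        seen[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in seen.items() if len(macs) > 1}


# Hypothetical ARP replies pulled from a capture.
replies = [
    ("192.168.1.10", "00:11:22:33:44:55"),
    ("192.168.1.20", "66:77:88:99:aa:bb"),
    ("192.168.1.10", "de:ad:be:ef:00:01"),
    ("192.168.1.10", "00:11:22:33:44:55"),
]

conflicts = find_arp_conflicts(replies)
print(conflicts)
```

An address that shows up in the result deserves a closer look, whether the cause turns out to be an on-path attack or a simple configuration mistake.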
It would probably be more common to have a duplicate IP address than having a duplicate MAC address. This is certainly common if you are statically assigning devices because it’s very easy to accidentally assign the same IP address to two devices.
This might also be something associated with DHCP. If someone configures static IP addresses that fall inside the range handed out by a DHCP server, or multiple DHCP servers are set up inadvertently, then the same IP address could be assigned to two different devices.
You may find that two devices that are configured with the same IP address will have intermittent connectivity as they both try to communicate at the same time. But it’s more common these days for an operating system to detect that the IP address already exists on the network. When you start up a device, it will stop using that address because it doesn’t want to create a duplicate IP address.
If a duplicate IP address has occurred, then we want to perform some fundamental troubleshooting. First, we need to check the IP addresses for all of our devices and make sure all of them are unique. And before we configure a static address on a device, we may want to ping the network to make sure that address isn’t currently in use.
If we do ping an address and get a response, we can look at our ARP table and see what the MAC address is of that device. We can, then, go to our switch, have a look at our MAC address table, and see what interface on the switch that device might be connected to. If we’re concerned there may be duplicate DHCP servers, then we could take a packet capture, watch the DHCP process, and see if we’re getting responses from multiple DHCP servers.
If you’re running an application that takes advantage of multicast, or you’re doing some type of streaming video on your network that uses multicast, then you may run into cases of multicast flooding.
This is because the destination IP address for multicast is a multicast IP address. It’s not an actual device on the network. So your switch doesn’t know where to send multicast traffic. So by default, it will send traffic to every switch port on that switch. This is sometimes referred to as multicast flooding.
Of course, this means that every device that needs that multicast traffic will receive the multicast communication. But it also means that devices that are not involved in the multicast communication also receive that multicast traffic. This will consume unnecessary resources on devices that don’t need to receive that multicast traffic. And of course, it’s going to consume bandwidth and processing time on your switch.
To forward this multicast traffic more intelligently, we want to use IGMP snooping. IGMP is the Internet Group Management Protocol, and devices use it to signal which multicast groups they want to receive. A switch that snoops on those IGMP messages can decide where this multicast traffic should go.
For example, on your switch, you may want to enable IGMP snooping on different interfaces or different VLANs that need that functionality. This will allow the switch to monitor for multicast traffic for that VLAN and intelligently forward that traffic to only the devices that need to receive multicast traffic.
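To make the difference concrete, here is a small simulation of a switch forwarding multicast with and without IGMP snooping. It is a simplified sketch, not how any particular switch is implemented. For example, it ignores the ports that lead to multicast routers. The class name `Switch`, its methods, and the group address are assumptions made for illustration.

```python
class Switch:
    """Toy model of multicast forwarding on a single switch."""

    def __init__(self, ports, igmp_snooping=False):
        self.ports = set(ports)
        self.igmp_snooping = igmp_snooping
        self.members = {}  # multicast group -> ports that joined it

    def join(self, port, group):
        """Record an IGMP membership report seen on a port."""
        self.members.setdefault(group, set()).add(port)

    def forward_multicast(self, ingress_port, group):
        """Return the ports that receive a frame sent to the group."""
        if self.igmp_snooping:
            # Only ports that asked for this group receive the traffic.
            targets = self.members.get(group, set())
        else:
            # Without snooping, the frame is flooded to every port.
            targets = self.ports
        return sorted(targets - {ingress_port})


GROUP = "239.1.1.1"  # hypothetical multicast group address

flooding = Switch(ports=range(1, 9))
snooping = Switch(ports=range(1, 9), igmp_snooping=True)
for switch in (flooding, snooping):
    switch.join(3, GROUP)
    switch.join(5, GROUP)

print(flooding.forward_multicast(1, GROUP))
print(snooping.forward_multicast(1, GROUP))
```

With snooping disabled, the frame goes out every port except the one it arrived on. With snooping enabled, only the two ports that joined the group receive it.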
We often think of a network communication as being a single wire between devices. But the reality is that we create redundancies within our network, so that if we lose a router or lose a switch, our network continues to operate. Unfortunately, this also means that we might create an asymmetric route. This is traffic that would go one path on the way out of the network and a completely different path on the way inbound.
If this is engineered properly, then an asymmetric route isn’t going to be a problem. But if we accidentally create an asymmetric route, you may find problems with firewalls or other state-based devices not properly understanding the state of a flow and dropping that traffic because it was using an asymmetric route.
You can sometimes use traceroute to help identify where one of these asymmetric routes might be. For example, here is a traceroute I took from my machine to one of the Quad9 DNS servers. That’s 9.9.9.9. And you can perform the same traceroute on your computer and see if you get similar results. Of course, traceroute performs its check three times for each hop and shows you the router that responded on each of those three attempts.
You can see that hops one, two, three, and four each had a single router associated with them. But hop five had one IP address for the first two attempts and a different IP address for the third. Hop six shows a similar two-router issue.
I had one router on the first attempt, a different router on the second, and then back to the original router on the third. It’s certainly possible, based on this traceroute, that we have an asymmetric route, where information may go outbound using one router and inbound using a completely different router.
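If you collect the responding address for each of the three attempts at every hop, a short script can flag the hops where more than one router answered. This is only a sketch, and the addresses below are documentation placeholders rather than the real hops from this traceroute. Multiple responders at a hop show that more than one path exists, which is a hint worth investigating rather than proof of an asymmetric route.

```python
def hops_with_multiple_routers(hops):
    """Return {hop number: [addresses]} for hops with more than one responder.

    hops: dict mapping hop number to the list of addresses that answered
    each attempt. A "*" entry marks an attempt that timed out.
    """
    flagged = {}
    for hop, addresses in hops.items():
        responders = {addr for addr in addresses if addr != "*"}
        if len(responders) > 1:
            flagged[hop] = sorted(responders)
    return flagged


# Hypothetical results: three attempts per hop.
trace = {
    4: ["192.0.2.1", "192.0.2.1", "192.0.2.1"],
    5: ["198.51.100.7", "198.51.100.7", "198.51.100.9"],
    6: ["203.0.113.4", "203.0.113.8", "203.0.113.4"],
    7: ["203.0.113.20", "*", "203.0.113.20"],
}

flagged = hops_with_multiple_routers(trace)
print(flagged)
```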
If this is by design, then you’ll want to configure your firewalls and other state-based devices to take into account the asymmetric route. Or if this is a mistake, we’ll want to go back to the routers on the inbound and outbound side and make sure they’re forwarding traffic to the correct IP address.
If you’re not running spanning tree protocol on your network, then you are susceptible to a switching loop. A switching loop is when switches, which are forwarding traffic based on a destination MAC address, create a loop on the network that constantly sends traffic back and forth between two different switches. Everything gets sent between these devices, unicasts, broadcasts, multicasts, and anything else that may be on the network.
With IP packets, we have a time to live that eventually stops a packet caught in a loop. But Ethernet frames have no time to live, so there’s no mechanism at the MAC address level to stop a loop on the network. If you create a loop between switches, the traffic will circle around that loop until you remove the loop from the network.
On a network with two switches and a single link between those switches, we don’t have a loop. We can have one device send traffic to another device over that link. And that device would send that information back over the same connection.
But if we inadvertently connect another wire between those switches, we’ve, effectively, created a loop. And if we introduce traffic into this network, it will move to one switch, go down the loop, move to the next switch, and begin looping through over and over again until we remove that second link from the network.
If you’re managing a layer 3 network, or routed network, you also have to be aware of routing loops. A routing loop is when one router is configured with a second router as its best next hop, and that second router thinks the best next hop is the first router. They continue to send information back and forth between each other over and over again until the time to live equals zero, and the packet is discarded.
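Here is a small simulation of that behavior, with a packet bouncing between two routers until its time to live runs out. It is a sketch under simple assumptions: each router decrements the TTL by one and discards the packet when the value reaches zero. The router names and the function name `simulate_routing_loop` are hypothetical.

```python
def simulate_routing_loop(initial_ttl):
    """Return the list of routers that handled the packet before discard."""
    routers = ("router-a", "router-b")
    ttl = initial_ttl
    current = 0
    path = []
    while True:
        path.append(routers[current])
        ttl -= 1  # every router decrements the TTL
        if ttl <= 0:
            return path  # this router discards the packet
        current = 1 - current  # forward to the other router in the loop


path = simulate_routing_loop(64)
print(len(path), "routers handled the packet before it was discarded")
```

The packet never reaches its destination, but the time to live guarantees that it eventually leaves the network instead of circulating forever.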
If you’re managing a statically routed network, this can be very easy to misconfigure. And you may find making a single configuration change will set up a loop on your network. Fortunately, you can very easily see this if you perform a traceroute because you’ll have a route that goes to 10.1.10.1. Then the next hop is 10.2.10.2. Then the next hop is back to 10.1.10.1. And then the next hop is back to 10.2.10.2 and so on.
You can see, it’s very easy to recognize when a loop has occurred because this will continue to show in our list of routes until it hits the maximum for that traceroute. To resolve this problem, we’ll need to log into the routers, look at the configuration, and make sure that our routing tables are pointing to the correct next hop.
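If you have the list of hops from a traceroute, checking for a repeated address is enough to find this kind of loop. This is a minimal sketch. The first hop in the sample is a made-up address, and the function name `find_routing_loop` is hypothetical.

```python
def find_routing_loop(hops):
    """Return the repeating sequence of addresses, or None if no hop repeats."""
    first_seen = {}
    for index, address in enumerate(hops):
        if address in first_seen:
            # Everything between the first and second sighting is the loop.
            return hops[first_seen[address]:index]
        first_seen[address] = index
    return None


# Hop addresses in the order traceroute reported them.
hops = ["10.0.0.1", "10.1.10.1", "10.2.10.2", "10.1.10.1", "10.2.10.2"]

loop = find_routing_loop(hops)
print(loop)
```

The addresses returned are the routers whose routing tables you would check first.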
Not only can we misconfigure where the next hop might be, there may be times when we completely forget to add a next hop. In that case, we would have a missing route, where we’re sending traffic into a router and the router has no information on where that traffic might go. If there’s no destination, or next hop, in the routing table, the router will discard that packet.
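A routing table lookup can be sketched in a few lines with Python’s standard ipaddress module. The router picks the most specific matching route, and if nothing matches, there is no next hop and the packet is discarded. The table contents and the function name `lookup_next_hop` are assumptions made for illustration.

```python
import ipaddress


def lookup_next_hop(routing_table, destination):
    """Return the next hop for the longest matching prefix, or None.

    routing_table: list of (prefix, next_hop) pairs such as
    ("10.1.0.0/16", "10.1.10.1").
    """
    address = ipaddress.ip_address(destination)
    best_network = None
    best_next_hop = None
    for prefix, next_hop in routing_table:
        network = ipaddress.ip_network(prefix)
        if address in network:
            # Prefer the most specific (longest) matching prefix.
            if best_network is None or network.prefixlen > best_network.prefixlen:
                best_network = network
                best_next_hop = next_hop
    return best_next_hop


# Hypothetical static routes with no default route configured.
table = [
    ("10.1.0.0/16", "10.1.10.1"),
    ("10.2.0.0/16", "10.2.10.2"),
]

print(lookup_next_hop(table, "10.2.5.9"))
print(lookup_next_hop(table, "192.168.1.1"))
```

The second lookup returns None because no route covers that destination, which is the missing route case where the router drops the packet.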
You might get a notification about this kind of missing route because an ICMP host unreachable message might be sent back to the original station. For example, if you ping from one device to another, a missing or incorrect route along the way may cause a destination host unreachable message to be sent back to you.
As with most things associated with routing tables, the key to troubleshooting is to look at the routes on every single router along the way and make sure that you’ve configured the proper routes in both directions for both ingress and egress traffic.