How does IP get our data from one side of the network to the other? In this video, you’ll learn about the fundamentals of IP and how IP addresses and port numbers are used on our modern networks.
The data networks that we’ve created allow us to move massive amounts of information from one side of the globe to the other. But underneath the surface, there’s a very simple process that occurs every single time you want to move data from point A to point B. In the non-networked world, whenever we need to move our things from one place to the other, we use a shipping truck or some type of moving van to be able to move it from one location to another.
This is exactly the process that we use on a network, except instead of a moving van, we’re going to use IP. Of course, we need to have some type of road for this moving van to ride on, and our roads are effectively our Ethernet networks, our cable networks, our DSL networks, and other types of wide area and local area connections. The truck that we put on these electronic roads is IP, or the “Internet Protocol.” Inside of that IP truck, we put additional data. So just like inside of a moving van, we will put a box inside that stores our personal information.
And inside of the box is even more of our personal information. For example, in the kitchen, we might have glasses that we put into one box. We put that one box into our IP moving van, and we send that IP moving van to the other location. This is the process that occurs every time we want to be able to send information from one device to another over the network.
If we were to look at the way that IP operates, we would start with the very basics of an Ethernet frame. With an Ethernet frame (in this case, one that’s communicating between a client and a server) we have an Ethernet header and an Ethernet trailer. And between that header and trailer, we put the entire payload that will be sent across the Ethernet network. Of course, inside of that Ethernet payload, we could be sending anything. But the most common form of communication inside of that Ethernet payload is IP, or the Internet Protocol. That means we’re going to have an IP header, and following that IP header will be the IP payload.
The IP payload, of course, is more than just one large group of data. Inside of that IP payload could be TCP or UDP data. In this particular example it’s TCP data, which means there’ll be a TCP header and some TCP payload. And since we’re sending information between the client and, in this particular case, a web server, that TCP payload holds HTTP data. In our description of IP being the moving truck and the TCP or UDP data being the box that’s inside of the moving truck, we’re really referring to sending a single box inside of this single truck, even though the pictures that we’re showing here are showing multiple boxes.
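The nesting described above can be sketched as a set of containers, each one holding the next as its payload. This is only an illustration of the layering, not how a real network stack builds frames: the class names, the placeholder header and trailer strings, and the addresses and port numbers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TcpSegment:
    src_port: int
    dst_port: int
    payload: bytes  # application data, HTTP in this example


@dataclass
class IpPacket:
    src_ip: str
    dst_ip: str
    payload: TcpSegment  # the "box" inside the IP "truck"


@dataclass
class EthernetFrame:
    header: str   # placeholder standing in for the real Ethernet header fields
    payload: IpPacket
    trailer: str  # placeholder standing in for the real Ethernet trailer


def encapsulate(http_data: bytes) -> EthernetFrame:
    """Wrap HTTP data in TCP, then IP, then Ethernet (illustrative values)."""
    segment = TcpSegment(src_port=3000, dst_port=80, payload=http_data)
    packet = IpPacket(src_ip="10.0.0.1", dst_ip="10.0.0.2", payload=segment)
    return EthernetFrame(header="eth-header", payload=packet, trailer="eth-trailer")


frame = encapsulate(b"GET / HTTP/1.1\r\n\r\n")

# Unwrap one layer at a time, the way the receiving side would.
ip_packet = frame.payload
tcp_segment = ip_packet.payload
http_data = tcp_segment.payload
print(ip_packet.dst_ip, tcp_segment.dst_port, http_data)
```

Each layer only needs to understand its own header. The Ethernet layer does not care that its payload is IP, and IP does not care that its payload is TCP carrying HTTP.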
Inside of this truck, we have UDP boxes or TCP boxes. That means that, generally, there are two different ways to send this data from place to place. And this different type of communication will depend on the application that you’re using. We sometimes refer to this TCP or UDP protocol as an OSI Layer 4 protocol, which refers, of course, to the transport layer. By using TCP and UDP inside of this truck, we’re able to perform multiplexing, which means that we can communicate using multiple applications simultaneously over the same network by using these different types of protocols.
TCP stands for the Transmission Control Protocol. This is a type of protocol that is very formal. It is connection-oriented, which means there is a setup process before you send data. You would then send all of the data through the network, and every time you send data through the network, an acknowledgment is sent back to the origination station. You’ll sometimes hear TCP referred to as a “reliable delivery protocol.” That means that when TCP sends information, it expects to receive an acknowledgment back that that information was received correctly. This doesn’t mean that UDP is somehow less capable of sending this information across the network, but receiving this acknowledgment gives us information that verifies that the information was received on the other side.
We also add sequence numbers inside of these TCP headers, so if these TCP packets get out of order as they’re being sent across the network, we can put them in the right order on the other side. Or if we’re missing a particular packet, we can request just that packet to be retransmitted and put everything back in its proper order. This also means that the receiver can manage the traffic flow by sending back very specific acknowledgments that tell the source device to either send more information or to slow down.
UDP is a much more straightforward protocol. It simply sends information from one device to the other. There’s no process of setting up the connection prior to sending the data, and there are no acknowledgments that are sent back from the receiving device to confirm that data has been received. When you hear UDP referred to as an “unreliable delivery method,” it doesn’t mean that the data somehow doesn’t make it across the network. It’s because we don’t receive an acknowledgment when that data is sent across the network.
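Here is a minimal sketch of what sequence numbers make possible on the receiving side: putting data back in order and spotting what is missing. It is a simplification. Real TCP sequence numbers count bytes rather than whole packets, and the function name `reassemble` and the sample data are assumptions made for this example.

```python
def reassemble(segments, expected_count):
    """Reorder received segments and report which sequence numbers are missing.

    segments: list of (sequence_number, data) tuples in arrival order.
    expected_count: how many segments the sender transmitted (numbered from 0).
    """
    received = dict(segments)
    missing = [n for n in range(expected_count) if n not in received]
    ordered = [received[n] for n in sorted(received)]
    return ordered, missing


# Segments 0 through 3 were sent; they arrived out of order and segment 1 was lost.
arrived = [(2, "C"), (0, "A"), (3, "D")]
ordered, missing = reassemble(arrived, expected_count=4)
print(ordered)  # ['A', 'C', 'D']
print(missing)  # [1]

# After the sender retransmits only the missing segment, everything lines up.
arrived.append((1, "B"))
ordered, missing = reassemble(arrived, expected_count=4)
print(ordered)  # ['A', 'B', 'C', 'D']
```

Only the missing piece has to be sent again, which is the behavior the sequence numbers are there to support.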
This also means that we don’t have a way to perform any error recovery, or retransmit individual pieces of information that may have been lost and not received by the other side. Because there’s not a communication acknowledgment sent back to the original sender, we have no way to manage the flow of data. The sending device simply sends all of the data to the destination, and hopes that all of that information will be received properly. There’s no way for the source to be able to understand if it should go faster or slower, because there’s no acknowledgment to any of this data.
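The “send and hope” behavior can be seen with Python’s standard `socket` module. This is a small sketch that sends one UDP datagram to a receiver on the same machine; the loopback address, the message, and the two-second timeout are choices made for the example.

```python
import socket

# Receiver: bind a UDP socket to the loopback address.
# Port 0 asks the operating system to pick any free port.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # give up rather than wait forever if nothing arrives
receiver_address = receiver.getsockname()

# Sender: no connection setup, just send the datagram.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
message = b"hello over UDP"
bytes_sent = sender.sendto(message, receiver_address)

# bytes_sent is only the number of bytes handed to the operating system.
# Nothing comes back to the sender to confirm that the datagram arrived.
print("handed to the network:", bytes_sent, "bytes")

# The receiver reads whatever shows up; on loopback this normally succeeds.
data, sender_address = receiver.recvfrom(1024)
print("received:", data, "from", sender_address)

sender.close()
receiver.close()
```

If the datagram were dropped somewhere along the way, the sender’s code would look exactly the same. Any acknowledgment or retransmission would have to be built by the application itself.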
So now we’ve got our moving truck. There is a box inside of the moving truck that has an address on it. In the case of our networks, it’s an IP address. In the case of an actual moving truck, it’s a physical address of a particular house. In the case of our IP moving truck, the truck looks at the destination IP address, and is able to drive to that particular location. Once the truck arrives at that IP address, it has to determine where this box is going. In the case of a delivery truck, each box has a particular room name written on it. One box may go to the kitchen, another box may go to the living room. In the case of TCP or UDP, we use port numbers to designate where these boxes will be delivered.
The port number is a number that is referring to a location of a service on that device. So it might be port number 443, or port number 25, and that box is handed off to the service that manages those port numbers. Sometimes you’ll see these port numbers and IP addresses combined together into a single series of numbers, and we commonly refer to that as a socket. For example, the socket of data being sent to a server consists of the server IP address, the protocol– which is TCP or UDP– and a server application port number. There might also be a client socket– the client IP address, protocol, and client port number would be part of that socket.
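As a rough sketch, a socket in the sense described above can be modeled as a small record of IP address, protocol, and port number. The class name `Socket`, the helper `format_socket`, and the addresses and ports are hypothetical values chosen for illustration.

```python
from typing import NamedTuple


class Socket(NamedTuple):
    ip: str
    protocol: str  # "TCP" or "UDP"
    port: int


def format_socket(sock: Socket) -> str:
    """Render a socket in the familiar address:port form."""
    return f"{sock.protocol} {sock.ip}:{sock.port}"


# Server side: the service's address, protocol, and application port.
server_socket = Socket(ip="10.0.0.2", protocol="TCP", port=443)

# Client side: the client's address, protocol, and its own port number.
client_socket = Socket(ip="10.0.0.1", protocol="TCP", port=3000)

# One conversation is identified by the pair of sockets.
connection = (client_socket, server_socket)

print(format_socket(client_socket), "->", format_socket(server_socket))
```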
The port numbers that we use when we’re dropping these boxes off at these remote locations are often well-known numbers. They’re often a permanent value that is always associated with that service. We refer to these as non-ephemeral ports, or non-temporary ports. That means these ports are relatively permanent, and they’re usually something that’s very well known. For example, for a web service, the well-known port number used for HTTP is port 80, and for HTTPS it’s port 443. When you connect to a remote web server, it almost always will be using those port numbers, because those will be the ones that everyone else is expecting to connect to.
Usually these non-ephemeral ports are in a range between 0 and 1,023, and they’re commonly associated with the service. This doesn’t mean that all services will only use port numbers between 0 and 1,023, but that is one of the most common ranges you will find for these permanent or non-ephemeral port numbers. Ephemeral ports are ports that are used temporarily, very commonly on a client device, to be able to communicate with these services, and those port numbers commonly range from 1,024 through 65,535.
And again, just because this happens to be the very common range that we use for ephemeral ports doesn’t mean that your device is using this specific range. If you happen to see port numbers that are in one of these ranges, you have to confirm whether that port number is being used as a permanent or non-ephemeral port number on that device, or is something temporary or ephemeral on that device.
These port numbers are really used so that we know where to send data. They can be any number between 0 and 65,535. And the TCP port numbers are only used for TCP. The UDP port numbers are only used for UDP. If you’re working with a service on a particular device, then it’s probably using a non-ephemeral port number that’s in our normal list from 0 to 1,023. But this isn’t always the case, and you should confirm what port numbers are used by your service.
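The ranges above can be turned into a quick check. This sketch only reports which common range a port number falls into; as already noted, the range is a convention and not proof of how a particular device is using that port. The function name `classify_port` and the labels it returns are assumptions made for this example.

```python
def classify_port(port: int) -> str:
    """Report which commonly used range a TCP or UDP port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port numbers must be between 0 and 65535, got {port}")
    if port <= 1023:
        return "commonly non-ephemeral (0-1023)"
    return "commonly ephemeral (1024-65535)"


for port in (25, 80, 443, 3000, 65535):
    print(port, "->", classify_port(port))
```

A result of “commonly ephemeral” for a port number is only a hint. A service can be configured to listen permanently on a port number above 1,023, so the actual configuration still has to be confirmed.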
These port numbers are only used to be able to communicate between devices. They’re not intended to be used as some type of security function. You could change the well-known, expected port number for a service to be something that no one is expecting, but it’s relatively easy for someone to perform a port scan on a device and determine what port number may be in use. So changing that port number to something that’s unexpected is not a way to provide any additional security.
In most cases, we would like these port numbers to be well known. That means that everybody recognizes that web servers are going to communicate in the clear over port 80, and they’re going to communicate over an encrypted channel using port 443. If you change the port numbers of your web server, then people trying to connect to that device using those well-known port numbers will not find a service at that port, and won’t be able to communicate to your server. It’s also important to remember that TCP port numbers and UDP port numbers are different sets of numbers. If you’re communicating to a web server, then you’re communicating over TCP 80 or TCP 443. You cannot connect to that web server over UDP port 80 or UDP port 443, because that web server is only expecting TCP data.
We mentioned earlier that we’re able to perform multiplexing using these protocols, which means we can communicate to a device using multiple applications simultaneously. This slide gives us a good example of this, where we have a client device with an IP address of 10.0.0.1, and we have a server device with an IP address of 10.0.0.2. This server is configured to be a web server, a Voice-over-IP server, and an email server. And the server is therefore able to communicate using TCP port 80 for the web server, UDP port 5004 for the VoIP server, and TCP port 143 for the email server.
If we were to look at these frames, we would see they were communicating over an Ethernet network. So each of these frames would have an Ethernet header and an Ethernet trailer. And as is the case for most of our network communication, all of these are communicating via IP. Inside of these IP packets is TCP data, containing HTTP information; we have UDP, that is sending the Voice-over-IP data; and we have another TCP packet, that has email data inside. We’re sending frames that look very similar to each other, but all of these frames have a different type of data inside of them. So it’s important that the server is able to differentiate between HTTP data, Voice-over-IP data, and email data. We’re able to do that by using those port numbers.
For example, our source address is going to be 10.0.0.1. Our destination is going to be 10.0.0.2. And you can see those listed in the IP packets that we have on the screen. We then have our TCP and UDP headers, and you can see that this first bit of HTTP data has a TCP source port of 3000. This is an ephemeral port that has been randomly assigned by the client device, and this client device is specifying that it’s communicating to a TCP destination port of 80, which is the web service on this device.
We have another packet that has a UDP source port of 7100. Again, this is a random number assigned by the client, and it’s communicating to a well-known port number of 5004 on the server. And lastly, we have our email data that is communicating over TCP. The random ephemeral port used by the client device is TCP port 4407, and it’s communicating to a non-ephemeral, well-known port number on the server, port 143, to send the email data.
By using this combination of IP addresses and port numbers, we’re able to send different types of data, very often to the same device, and have all of that information arrive properly at the destination service.
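The three conversations in this example can be sketched as a lookup on the server, keyed by protocol and destination port. The addresses and port numbers come from the example above; the table name `SERVICES`, the function `deliver`, and the dictionary layout are assumptions made for illustration, not how an operating system actually stores this.

```python
# Services listening on the server at 10.0.0.2, keyed by (protocol, port).
# TCP and UDP port numbers are separate sets, so the protocol is part of the key.
SERVICES = {
    ("TCP", 80): "web server",
    ("UDP", 5004): "VoIP server",
    ("TCP", 143): "email server",
}


def deliver(protocol: str, dst_port: int) -> str:
    """Return the service that should receive data sent to this protocol and port."""
    return SERVICES.get((protocol, dst_port), "no service listening")


# The three packets from the example, all from 10.0.0.1 to 10.0.0.2.
packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "protocol": "TCP", "src_port": 3000, "dst_port": 80},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "protocol": "UDP", "src_port": 7100, "dst_port": 5004},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "protocol": "TCP", "src_port": 4407, "dst_port": 143},
]

for packet in packets:
    service = deliver(packet["protocol"], packet["dst_port"])
    print(f'{packet["protocol"]} {packet["src_port"]} -> {packet["dst_port"]}: {service}')

# The same port number under the other protocol reaches nothing here.
print(deliver("UDP", 80))  # no service listening
```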