Load balancers provide many options for keeping services available and efficient. In this video, you’ll learn about load balancing, scheduling, affinity, and active/passive load balancing.
If you’re watching this video on the Professor Messer website, then you are taking advantage of load-balancing technology. Load balancing is a way to distribute the load that is incoming across multiple devices, thereby making the resource available to more people than having a single server in place. This is implemented by having multiple web servers behind the scenes. And when you access professormesser.com, your query is distributed to one of those available servers.
This all happens without any knowledge from the end user. And it’s all done automatically behind the scenes. This type of load balancing could scale up into very large implementations. And some of the largest networks in the world are using load balancing for their web servers, their database servers, and other services that they provide on their infrastructure. One nice feature of load balancing is that, because there are so many servers in place, if one of those servers happens to fail, the load balancer recognizes the failure and simply continues to use the remaining servers.
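To make that failure handling a bit more concrete, here is a minimal Python sketch. It assumes the load balancer already knows which servers have failed (a real device would learn this from health checks), and the class and method names such as `LoadBalancer`, `mark_failed`, and `pick` are made up for this illustration.

```python
class LoadBalancer:
    """Toy balancer that rotates through servers and skips failed ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(servers)   # assumption: updated by some health check
        self._next = 0                # rotation pointer

    def mark_failed(self, server):
        self.healthy.discard(server)

    def mark_recovered(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Look at each server at most once, starting from the rotation pointer.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


lb = LoadBalancer(["A", "B", "C"])
lb.mark_failed("B")                      # server B goes down
picks = [lb.pick() for _ in range(4)]    # requests keep flowing to A and C
print(picks)
```

The callers of `pick()` never see server B again until it is marked as recovered, which is the same idea as users never noticing the failure.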
The end users who are accessing those devices have no idea that a server has failed. All they know is that the service remains available, and everything is working normally. Obviously, the primary function of a load balancer is to balance the load. And you can configure the load balancer to manage that load across multiple servers. You can also set up the load balancer so that some of the TCP overhead is offloaded onto the load balancer, rather than down to the individual server.
This keeps the communication between the load balancer and the servers very efficient and maintains the speed of the communication. This might also be used for SSL offloading. The encryption and decryption process used for SSL or TLS is one that uses additional CPU cycles on a device. So you may find that the load balancer is the one performing that SSL encryption and decryption in the hardware of this device. And it is instead sending “in the clear” information down to these individual servers that are all within the same data center.
This load balancer might also provide caching services. It will keep a copy of very common responses. And when you make a request to one of these servers, and the load balancer already has that response in the cache, it can reply back to you on the internet without ever accessing any of the local servers.
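The caching behavior can be sketched in a few lines of Python. This is only an illustration under simple assumptions: responses are keyed by the request path, nothing ever expires, and `fake_backend` is a hypothetical stand-in for a real server.

```python
class CachingBalancer:
    """Toy balancer that answers repeated requests from its own cache."""

    def __init__(self, fetch_from_server):
        self.fetch_from_server = fetch_from_server
        self.cache = {}          # path -> cached response
        self.backend_calls = 0   # how often a real server was contacted

    def handle(self, path):
        if path in self.cache:
            # Cache hit: reply without touching any server.
            return self.cache[path]
        self.backend_calls += 1
        response = self.fetch_from_server(path)
        self.cache[path] = response
        return response


def fake_backend(path):
    # Hypothetical server response, used only for this sketch.
    return f"content of {path}"


lb = CachingBalancer(fake_backend)
first = lb.handle("/index.html")    # goes to a server
second = lb.handle("/index.html")   # served from the cache
print(lb.backend_calls)
```

A production cache would also need rules for expiring and invalidating entries, which this sketch leaves out.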
Load balancers can also provide “quality of service” functionality so that certain applications would have a higher priority than other applications running on these same servers. You might also use the load balancer for content switching. This means that certain applications might be switched to individual servers. And other applications might be switched to other servers within that same load balancer.
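Content switching can be pictured as a lookup from the type of request to a group of servers. The sketch below assumes the decision is made on the URL path prefix; the pool names and prefixes are invented for the example, and a real load balancer might switch on other parts of the request instead.

```python
# Hypothetical mapping of URL prefixes to server pools.
POOLS = {
    "/video/": ["video-1", "video-2"],
    "/api/": ["api-1"],
}
DEFAULT_POOL = ["web-1", "web-2"]


def choose_pool(path):
    """Return the pool of servers that should handle this request path."""
    for prefix, pool in POOLS.items():
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL


print(choose_pool("/video/intro.mp4"))
print(choose_pool("/about"))
```

Once a pool is chosen, any of the scheduling methods could be used to pick a server inside it.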
There are many ways to configure the operation of a load balancer. One of those is in a round-robin form. The first user communicating through that load balancer would be distributed to the first server or server A. The second user who’s communicating through that load balancer would be distributed, round-robin style, to the next server on the list. And then the third person coming through the load balancer would be sent to the third server and so on. This round-robin process ensures that incoming connections are spread evenly across all of the servers, although the actual work created by each connection can still differ.
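Here is a small Python sketch of that round-robin rotation, assuming three servers named A, B, and C as in the example. The function name `next_server` is just for illustration.

```python
from collections import Counter
from itertools import cycle

servers = ["A", "B", "C"]
rotation = cycle(servers)   # endlessly repeats A, B, C, A, B, C, ...


def next_server():
    """Return the server that should receive the next new connection."""
    return next(rotation)


assignments = [next_server() for _ in range(6)]
counts = Counter(assignments)
print(assignments)
print(counts)
```

Every server ends up with the same number of connections, regardless of how busy each one actually is.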
There are also variants to this round-robin process. For example, a weighted round-robin might prioritize one server over another. So perhaps one of the servers would receive half of the available load. And the other servers would make up the rest of that load. With dynamic round-robin, the load balancer is keeping track of the load that is occurring across all of the servers. And when a request comes into the load balancer, it will send the next request to the server that has the lightest load.
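Both variants can be sketched in Python. This assumes weights are small whole numbers and that "load" means the number of active connections per server; the weights and connection counts below are invented for the example.

```python
from itertools import cycle


def weighted_rotation(weights):
    """Repeat each server name according to its weight, then rotate forever."""
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


# Server A has weight 2 out of a total of 4, so it receives half of the requests.
weights = {"A": 2, "B": 1, "C": 1}
rotation = weighted_rotation(weights)
weighted_picks = [next(rotation) for _ in range(8)]
print(weighted_picks)


def least_loaded(active_connections):
    """Dynamic selection: pick the server with the fewest active connections."""
    return min(active_connections, key=active_connections.get)


active = {"A": 12, "B": 3, "C": 7}   # assumed current connection counts
target = least_loaded(active)
active[target] += 1                  # the new request now counts toward that server
print(target)
```

This simple expansion sends server A its requests back to back; a production balancer would likely interleave them more smoothly, but the overall proportions are the same.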
And of course, this is a staple for active/active server load balancing, where all of these servers are active simultaneously. And if one of these servers happens to fail, all of the other servers can then pick up the load and continue to operate without anyone on the outside knowing that there’s a problem. Most of the time, the load balancer is going to distribute an incoming load across whatever happens to be available on the inside of the load balancer.
But there may be certain applications that require that any time a user is communicating, they’re always communicating to exactly the same server. In those instances, our load balancer needs to support affinity. Affinity is defined as being a kinship or a likeness. But in the world of load balancers, it means that a user communicating through that load balancer will always be distributed to the same server. This is usually tracked using a session ID or a combination of variables, such as an IP address and a set of port numbers.
If those same IP addresses and same port numbers are in use, then that communication will always go to one particular server. For example, our user here at the top will communicate to the load balancer. The load balancer will assign that session to server A. The second user communicating through that load balancer may be assigned to server B. If that first user then sends more information on that session through the load balancer, the load balancer will recognize that that is the same session from earlier and send that session down to server A.
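The session tracking behind affinity can be sketched as a simple table. This example assumes a session is identified by the client's IP address and port, and that new sessions are handed out round-robin; the class name `AffinityBalancer` and the addresses are made up for illustration.

```python
from itertools import cycle


class AffinityBalancer:
    """Toy balancer that keeps each session on the server it started with."""

    def __init__(self, servers):
        self._rotation = cycle(servers)   # used only for brand-new sessions
        self._sessions = {}               # (ip, port) -> assigned server

    def route(self, client_ip, client_port):
        key = (client_ip, client_port)
        if key not in self._sessions:
            # First time this session is seen: assign the next server in rotation.
            self._sessions[key] = next(self._rotation)
        # Known session: always return the same server as before.
        return self._sessions[key]


lb = AffinityBalancer(["A", "B", "C"])
first_user = lb.route("203.0.113.10", 50000)    # new session
second_user = lb.route("198.51.100.7", 50001)   # another new session
first_again = lb.route("203.0.113.10", 50000)   # same session as the first
print(first_user, second_user, first_again)
```

A load balancer that tracks a session ID instead would use the same kind of table, just with a different key.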
The same thing will occur if the second user sends information in. The load balancer will recognize that that is an active session and send that down to server B, which was the original server used by that second user.
Our load balancer might also be set up in an active/passive mode, where some of the servers are actively in use, and other servers are in a standby mode. This means if one of our active servers fails, we have other devices that could suddenly move into an active mode and begin providing services through that load balancer.
For example, we have a user communicating through the load balancer to server A. And as long as server A is communicating, we’re just fine. But there may be times when server A is no longer available. And if server A happens to have a failure, we might have servers on standby, like server C, that could suddenly turn themselves on and start providing services through that load balancer. That means the next time that user communicates through the load balancer, they’ll automatically be assigned to server C instead of server A, which is no longer available.
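That failover from server A to server C can be sketched in Python. To keep it short, this assumes just one active server and one standby server, and that something else (such as a health check) tells the pool when a server has failed; `ActivePassivePool` and its methods are hypothetical names.

```python
class ActivePassivePool:
    """Toy pool where a standby server is promoted when an active one fails."""

    def __init__(self, active, standby):
        self.active = list(active)
        self.standby = list(standby)

    def fail(self, server):
        # Remove the failed server and promote the first standby, if any.
        if server in self.active:
            self.active.remove(server)
            if self.standby:
                self.active.append(self.standby.pop(0))

    def route(self):
        if not self.active:
            raise RuntimeError("no servers available")
        return self.active[0]


pool = ActivePassivePool(active=["A"], standby=["C"])
before = pool.route()   # the user is sent to server A
pool.fail("A")          # server A becomes unavailable
after = pool.route()    # the same user is now sent to server C
print(before, after)
```

With more active servers, the routing step would combine this promotion logic with one of the scheduling methods instead of always returning the first server.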