If you are securing a server farm, then you probably manage a group of load balancers. In this video, you’ll learn about active/active and active/passive load balancing.
As the name implies, a load balancer is designed to take a load of traffic and distribute it across multiple resources. And it does this without the end-user even realizing that it’s occurring. We often see this on large-scale implementations. You can have many web servers configured and have everyone visit one single URL. But in reality, their load may be distributed across multiple internal web servers.
Another nice capability of a load balancer is that it’s able to provide fault tolerance. If you have multiple web servers behind a load balancer, the load balancer is always checking in to make sure that those web servers are available. If a web server happens to go down, the load balancer recognizes that it’s no longer available and begins moving the load between the remaining web servers behind the load balancer.
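As a rough sketch of that fault-tolerance idea, the load balancer only hands traffic to servers that passed their most recent health check. The server names and the check function here are hypothetical placeholders, not any particular product's API:

```python
# Sketch: keep a pool of back-end servers and filter out any server
# whose health check is currently failing. Names are placeholders.

def healthy_servers(servers, is_up):
    """Return only the servers whose health check passes."""
    return [s for s in servers if is_up(s)]

servers = ["web-a", "web-b", "web-c"]
# Pretend web-b just failed its health check
status = {"web-a": True, "web-b": False, "web-c": True}

available = healthy_servers(servers, lambda s: status[s])
print(available)  # ['web-a', 'web-c']
```

From that point on, new requests would be distributed only across the remaining servers until the failed one comes back.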
Here’s how a load balancer manages this process. You might have a number of users on the internet, and they’re all hitting the load balancer, which then decides how to distribute that load across servers. A load balancer can also consolidate the many separate TCP connections that would normally be set up to each of these servers, maintaining the sessions itself and making the network communication that much more efficient.
Some load balancers can also offload the encryption process. We know that SSL encryption adds an additional load to the web servers. So instead of having the servers handle the encryption and decryption, you can have all of that process occur on the load balancer.
These load balancers might also provide caching. So instead of requesting the same information from all of these servers, that information can be cached locally. And you won’t even have to make a request down to the web server.
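The caching idea can be sketched in a few lines: answer repeat requests from a local cache instead of asking a back-end server again. The `fetch` function here is a stand-in for a real back-end call, purely for illustration:

```python
# Sketch: serve repeated requests from a local cache so the back-end
# web server is only asked once per unique URL.
cache = {}
calls = []  # record of requests that actually reached the back end

def fetch(url):
    calls.append(url)          # pretend this hits a web server
    return f"content of {url}"

def cached_get(url):
    if url not in cache:
        cache[url] = fetch(url)  # first request: go to the server
    return cache[url]            # later requests: answer locally

cached_get("/index.html")
cached_get("/index.html")      # served from cache, no second fetch
print(len(calls))  # 1
```

A real load balancer would also expire cached entries after some time, but the principle is the same: the second request never reaches the web server.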
Some load balancers allow you to configure quality of service. So some applications may get priority over others even though they may all be going to the same web servers. And lastly, a load balancer may be configured for content switching, which means that the load balancer could take one application and load it a certain way across servers. A separate application may be using the same servers, but may be using a different number of servers to provide that load balancing.
You have a lot of different options on how traffic is scheduled to go to different servers that are behind a load balancer. One type of scheduling is called round-robin scheduling, where each server is selected in turn. If round-robin scheduling is configured, then the first bit of traffic will go to Server A. The second bit of traffic through the load balancer will be scheduled to Server B. The third bit of traffic through the load balancer will be scheduled to Server C. And because this load balancer is scheduling in a round-robin method, the last bit will go to Server D.
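That selection in turn can be sketched with a simple rotation. The server names are just the placeholders used in this video:

```python
import itertools

# Sketch of round-robin scheduling: each incoming request is assigned
# to the next server in the rotation, wrapping back to the start.
servers = ["Server A", "Server B", "Server C", "Server D"]
rotation = itertools.cycle(servers)

assignments = [next(rotation) for _ in range(4)]
print(assignments)  # ['Server A', 'Server B', 'Server C', 'Server D']
```

A fifth request would wrap around and go back to Server A.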
There are also other ways to provide this round-robin functionality. You could weight the round-robin scheduling so that perhaps one server might get twice as much traffic as any of the other servers. So we might send twice as much to Server A as we do to Server B.
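One simple way to sketch that weighting is to repeat a server in the rotation according to its weight. The names and weights here are invented for illustration:

```python
import itertools

# Weighted round-robin sketch: Server A carries twice the weight of
# Server B, so it appears twice in the rotation list.
weights = {"Server A": 2, "Server B": 1}
rotation = itertools.cycle(
    [name for name, w in weights.items() for _ in range(w)]
)

first_six = [next(rotation) for _ in range(6)]
print(first_six)
# ['Server A', 'Server A', 'Server B', 'Server A', 'Server A', 'Server B']
```

Over those six requests, Server A receives exactly twice the traffic of Server B.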
We can also have dynamic round-robin scheduling, which means the load balancer is going to monitor the load that’s occurring on these different servers. And if one server is more loaded than the others, it will use some of the other servers first. That way, you’re able to distribute the load as traffic is coming through the network.
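The dynamic version can be sketched as a least-loaded selection: instead of a fixed rotation, each request goes to whichever server is currently carrying the least load. The load numbers here are invented for illustration:

```python
# Dynamic scheduling sketch: pick the server with the smallest
# current load, so busy servers are skipped until they catch up.
def least_loaded(load_by_server):
    return min(load_by_server, key=load_by_server.get)

current_load = {"Server A": 12, "Server B": 3, "Server C": 7}
print(least_loaded(current_load))  # Server B
```

In a real load balancer, "load" might be measured as active connections, response time, or server-reported metrics; the selection idea is the same.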
We also refer to this type of scheduling as active/active scheduling. That means that all of these servers are active. And requests coming through the load balancer could use any of these active servers at any time.
One of the challenges we have with some web-based applications is that everything that occurs in that application should be occurring on the same web server. Many of these applications don’t have the concept of multiple servers being used simultaneously. And in this case, we use a characteristic of load balancers called affinity. In the non-technical world, we refer to affinity as a kinship or a likeness. And in the technical world, an affinity means that the load balancer will always use the same server for a particular user or a particular application instance.
An example of this would be a user that is communicating with this application through the load balancer. The load balancer assigns the affinity for this particular user to Server A. If another user wants to use the same application, the load balancer may assign that user to Server B. That means if that first user sends traffic back through the load balancer, it will recognize that particular session is in place from earlier and send that traffic back to Server A. The same thing will happen for the second user. The load balancer recognizes that it’s our second user and sends that traffic, again, back to Server B.
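That affinity behavior can be sketched as a sticky mapping: the first time a client shows up it's assigned a server in turn, and every later request from that client goes back to the same server. The client IDs and server list are hypothetical:

```python
import itertools

# Affinity sketch: pin each client to one server on first contact,
# then always route that client back to its pinned server.
servers = itertools.cycle(["Server A", "Server B"])
affinity = {}  # client id -> pinned server

def route(client):
    if client not in affinity:
        affinity[client] = next(servers)  # first visit: assign in turn
    return affinity[client]              # later visits: same server

print(route("user1"))  # Server A
print(route("user2"))  # Server B
print(route("user1"))  # Server A  (same session, same server)
```

Real load balancers track the session by something like a source IP, a cookie, or a session ID rather than a simple client name, but the pinning logic is the same.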
Some load balancers can be configured for active/passive load balancing. That means that some servers will be currently active and able to take requests, and other servers are on standby. We’ve marked these active servers with green, and the standby servers are red. If any one of these green active servers fails, the load balancer will identify the failure and begin using one of the standby servers in its place. That means we could have a user communicating through the load balancer and have that load balancer assign that particular session to Server A. And all of the traffic from that user will go through the load balancer and to Server A.
If Server A fails, the load balancer is going to recognize that failure, set that server to be offline, and turn Server C on to be available. That way, the user can continue to use that application through the load balancer. The load balancer will now have additional servers available to serve that particular load. And it will wait until that server’s back online before allowing traffic to go back to Server A.
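The failover step described above can be sketched as promoting the next standby server into the active pool when an active server goes down. Server names are the placeholders from this video:

```python
# Active/passive sketch: standby servers only take over when an
# active server fails.
active = ["Server A", "Server B"]
standby = ["Server C", "Server D"]

def fail_over(failed):
    """Remove a failed active server and promote the next standby."""
    active.remove(failed)
    if standby:
        active.append(standby.pop(0))
    return active

print(fail_over("Server A"))  # ['Server B', 'Server C']
```

A fuller sketch would also move Server A back to the standby list once its health checks pass again, matching the behavior described above.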