Having an app crash in the middle of shopping is frustrating.
There can be many reasons for the crash, including improper workload management. A cloud-based application with only a single server behind it quickly becomes overloaded, leading to service outages, downtime, and even crashes.
Too many users simultaneously accessing the same data or software within a company's IT infrastructure can cause a crash, especially if the application traffic isn't properly balanced. So how do companies balance that traffic? Enter load balancing software.
What is a load balancer?
A load balancer is a network device or software that facilitates load balancing – the process of methodically and efficiently distributing network traffic over multiple backend servers in a server pool. It reduces application management and maintenance overheads by acting as a reverse proxy. This makes applications more reliable and capable of handling concurrent users and network sessions.
To users, a load balancer is an invisible facilitator: they never see it, yet their requests are routed through it all the time. For example, a user shopping for Christmas will browse different products before choosing one or more. When the user clicks on a product, the client device sends an information request to the backend servers. A load balancer receives this incoming request and routes it to a server capable of handling it.
This load distribution is crucial for efficient data throughput and optimal use of application delivery resources. It also improves performance by preventing any single server from becoming overloaded. Furthermore, a load balancer checks each server's ability to handle requests, removes unfit servers from rotation, and can even create new virtualized application servers as needed (a minimal health-check sketch follows the function list below).
Functions of a load balancer:
- Adds or removes servers on demand
- Reduces user request response time
- Distributes client requests efficiently
- Identifies and blocks malicious content
- Automates disaster recovery on backup sites
- Makes the computing environment more resilient
- Auto-detects server failures and redirects client requests
- Allows server maintenance without impacting current operations
- Directs traffic to capable and online servers to ensure performance and reliability
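The failure-detection and traffic-direction functions above boil down to a health-check loop. Here's a minimal Python sketch, assuming a hypothetical `/health` endpoint and backend addresses; real load balancers run probes like this continuously on configurable intervals.

```python
import urllib.request

BACKENDS = ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]  # hypothetical pool

def healthy_backends(backends, timeout=2):
    """Return only the backends that answer a probe within the timeout."""
    live = []
    for url in backends:
        try:
            with urllib.request.urlopen(url + "/health", timeout=timeout) as resp:
                if resp.status == 200:  # server answered and is fit for traffic
                    live.append(url)
        except OSError:
            pass  # unreachable or failing servers drop out of rotation
    return live
```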
Today, load balancers are usually contained in application delivery controllers (ADCs) that optimize the data flow between two entities and improve application performance. ADCs are the result of the continued evolution of load balancers, which were originally developed to ensure predictable, scalable, and highly available application services.
Load balancing has come a long way since its inception. In the early days of the commercial internet, businesses relied on single PC-based servers to handle web traffic. But these servers couldn't handle the increasing demands of web applications.
Then came domain name system (DNS) round-robin technology. It distributed users sequentially across servers, using the domain name as a virtualization point. However, it couldn't provide high availability or dynamically remove malfunctioning servers.
This led to the birth of purpose-built load-balancing solutions. These solutions are built directly into the operating system (OS) or the application software and communicate using a cluster IP address instead of physical IP addresses. A service session would start only after a server responded to the connection request and redirected it to a physical IP address.
These proprietary load balancing solutions improved scalability, predictability, and high availability. The real challenge was the constant communication among clustered members. More servers meant more inter-server communication and processor utilization. This increased network traffic and eventually impacted end-user traffic.
Then came the era of network-based load balancing hardware. These were essentially application-neutral appliances located outside the application server. They used bi-directional network address translation (NAT) to forward users from virtual server addresses to the most capable real servers.
The network-based devices also made it possible to mask application server identity and take a server offline for maintenance without user disruption. ADCs you see today come from these network-based load balancers.
After starting as hardware appliances, ADCs quickly became available as both virtual machines and software. Software-based ADCs handled traffic spikes, ensured security, and offered more flexibility. ADCs are incomplete without the underlying load balancing technology crucial for handling application delivery.
Importance of load balancing
High-traffic websites receive millions of concurrent user requests for instant application data return. This massive workload, if left unchecked, can overwhelm a single server. An overloaded server may fail to deliver responses or become unavailable, resulting in business loss. This is where load balancers come in.
A load balancer coordinates with servers to distribute user requests efficiently. It ensures that no server is overloaded or left sitting idle. Moreover, load balancing devices can detect malfunctioning servers and redirect traffic to operational ones. Some load balancers with advanced algorithms can even predict whether a server is likely to become overloaded soon and route traffic accordingly.
Businesses receiving massive traffic have different stakeholders as well as customers visiting their apps or websites. The ability to support this increased demand is key to continued patronage from stakeholders and customers. This is why businesses add load balancers to their cloud infrastructure for improved availability, responsiveness, and scalability.
Types of load balancers
Load balancers are primarily responsible for server load balancing (SLB). Depending on their capabilities, they can also offer additional functions such as security, authentication, and geographic traffic distribution. These varied functions, along with different configurations, help categorize load balancers into seven types.
1. Network load balancer (NLB)
A network load balancer distributes traffic based on network variables: source IP, destination IP, source port, destination port, and IP protocol. It's also known as a layer 4 (L4) load balancer since it works at the fourth layer of the open systems interconnection (OSI) model.
A network load balancer doesn’t consider application-level parameters such as content type, cookie data, custom headers, user location, or application behavior. It’s context-less and directs traffic based on network-layer information contained in packets.
2. Application load balancer (ALB)
An application load balancer distributes incoming requests based on multiple application-level variables. It's also known as a layer 7 (L7) load balancer because it operates at the highest layer of the OSI model.
An application load balancer is content-aware. It makes load balancing decisions based on content payload elements such as the uniform resource locator (URL), hypertext transfer protocol (HTTP) headers, and secure sockets layer (SSL) data. This awareness allows an application load balancer to control server traffic according to application behavior and usage.
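To show what content-aware routing looks like, here's a hedged Python sketch of L7 path-based routing. The path prefixes and pool names below are illustrative assumptions, not any real product's configuration.

```python
# Route requests to different server pools based on the URL path.
def pick_pool(request_path: str) -> str:
    routes = {
        "/api/": "api-server-pool",      # dynamic application requests
        "/static/": "static-file-pool",  # images, CSS, JavaScript
    }
    for prefix, pool in routes.items():
        if request_path.startswith(prefix):
            return pool
    return "default-pool"

print(pick_pool("/api/orders/42"))  # -> api-server-pool
```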
3. Global server load balancer (GSLB)
A global server load balancer or multi-site load balancer uses dynamic DNS technology to distribute traffic across servers in different locations.
Global server load balancing acts as a DNS proxy and responds using real-time load balancing algorithms. It maximizes performance and minimizes latency by connecting users to a server geographically closer to them. In addition to load balancing, it monitors server health through configured checks and facilitates recovery if a server site fails.
4. Hardware load balancer device (HLD)
A hardware load balancer device is a physical on-premises appliance that routes web traffic to different servers. It usually operates at the transport layer (L4) and the application layer (L7) of the OSI model.
HLDs either randomize the traffic distribution or consider factors such as available server connections, resource utilization, and server processing power. These load balancers have no external dependencies, and you can install them in your data centers. Despite handling huge volumes of traffic, HLDs offer limited flexibility and are expensive.
5. Software load balancer (SLB)
A software load balancer uses one or more scheduling algorithms to distribute incoming requests across the servers. SLBs are flexible, and you can easily integrate them into virtualization orchestration solutions. Since SLBs run on commodity hardware, they are also less expensive.
6. Virtual load balancer (VLB)
A virtual load balancer runs the functionality of a hardware load balancer on a virtual machine. VLBs use virtualized application delivery controller software to distribute network traffic loads. Large organizations with constant traffic spikes often use these load balancers. Common VLB challenges include limited automation, lower scalability, and a lack of centralized management.
7. Gateway load balancer (GLB)
A gateway load balancer deploys, scales, and manages virtual appliances such as firewalls, deep packet inspection systems, and intrusion prevention systems. This load balancer distributes traffic through a network gateway with a single entry and exit point. A GLB operates on the third layer (network layer) of the OSI model.
How does a load balancer work?
A load balancer typically sits between a client and the hosts that provide client services. Suppose a load balancer presents a virtual server to clients and is connected to a cluster of service points, the hosts that process requests and return traffic to the client. Here's how this load balancer works during a transaction (a minimal code sketch follows the steps):
- Connection attempt: A client attempts to connect to services behind a load balancer.
- Connection acceptance: The load balancer accepts the incoming request, chooses the most suitable host, and routes the request only after changing the destination IP to that of the selected host.
- Host response: The host accepts the connection and responds to the client via the load balancer.
- Return packet intercept: The load balancer intercepts the return packet and changes the source IP to the virtual server IP before forwarding the packet to the client.
- Return packet receipt: The client receives the return packet and continues to make other requests.
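To make these steps concrete, here's a minimal sketch of the same flow as a TCP proxy in Python. The backend addresses are assumptions, and a real load balancer rewrites packet addresses rather than proxying sockets, but the shape of the transaction is the same.

```python
import socket
import threading

BACKENDS = [("10.0.0.1", 8080), ("10.0.0.2", 8080)]  # hypothetical hosts
counter = 0

def pick_backend():
    """Step 2: choose the most suitable host (here, simple rotation)."""
    global counter
    backend = BACKENDS[counter % len(BACKENDS)]
    counter += 1
    return backend

def relay(src, dst):
    """Copy bytes one way until the sender closes its end of the stream."""
    try:
        while data := src.recv(4096):
            dst.sendall(data)
        dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream along
    except OSError:
        pass

def serve(listen_addr=("0.0.0.0", 8000)):
    lb = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lb.bind(listen_addr)  # the "virtual server" address clients connect to
    lb.listen()
    while True:
        client, _ = lb.accept()  # steps 1-2: accept the incoming connection
        upstream = socket.create_connection(pick_backend())  # route to a host
        # Steps 3-5: shuttle the host's response back through the balancer.
        threading.Thread(target=relay, args=(client, upstream), daemon=True).start()
        threading.Thread(target=relay, args=(upstream, client), daemon=True).start()

# serve()  # uncomment to listen on port 8000 and balance across BACKENDS
```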
Load balancing algorithms
Load balancing efficiency stems from load balancing algorithms.
A load-balancing algorithm is the logic a load balancer uses to process incoming data packets and distribute the load among servers. Selecting the right algorithm is key to reliability, performance, and redundancy.
Load balancing algorithms analyze incoming traffic and use various parameters to distribute it. Based on these parameters, load balancing algorithms are divided into two types: static and dynamic.
Static load balancing algorithm
A static load balancing algorithm distributes all traffic evenly across servers. It's static because it doesn't consider the current state of the system when shifting load.
This algorithm routes traffic based on fixed knowledge of server resources. Moreover, the load balancing task runs solely on the device assigned to it; other devices can't perform it. A static algorithm is ideal for systems with low load variation.
Below are some of the most commonly used static load balancing algorithms.
Round-robin
Round-robin load balancing is one of the simplest and most widely used algorithms for distributing incoming requests to servers. It goes through a list of servers, forwarding each client request to the next server in the list; when it reaches the end of the list, it starts over from the top. The round-robin algorithm assumes that all servers are available and have the same capacity.
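Here's a minimal round-robin sketch in Python, assuming an in-memory server list; the server names are placeholders.

```python
from itertools import cycle

servers = cycle(["server-a", "server-b", "server-c"])

def route():
    return next(servers)  # each request goes to the next server in the list

print([route() for _ in range(5)])
# ['server-a', 'server-b', 'server-c', 'server-a', 'server-b']
```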
Two variants of the round-robin algorithm are:
1. Weighted round-robin: Weighted round-robin is an advanced configuration of round-robin. It assigns a weight to each server that usually denotes a server’s load handling capacity. The higher the weight, the more requests a server receives.
Suppose two servers, X and Y, have weights 2 and 1, respectively. A load balancer will distribute two requests to X for every request it sends to Y (sketched in code after this list).
2. Dynamic round-robin: A dynamic round-robin algorithm dynamically assigns a weight to each server. This weight varies depending on the current load and idle capacity of a server.
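A simple weighted round-robin sketch using the X/Y example above: X (weight 2) receives two requests for every one that Y (weight 1) gets. Expanding each server into the rotation by its weight is one naive approach; this is an illustration, not any vendor's implementation.

```python
import itertools

weights = {"X": 2, "Y": 1}
# Expand each server into the rotation as many times as its weight.
rotation = itertools.cycle([s for s, w in weights.items() for _ in range(w)])

print([next(rotation) for _ in range(6)])  # ['X', 'X', 'Y', 'X', 'X', 'Y']
```

Production implementations often interleave more smoothly (NGINX, for example, uses a smooth weighted round-robin) rather than sending a server's whole share back to back.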
Source IP hash
The source IP hash load balancing algorithm generates a hash key by combining the source and destination IP addresses. Each hash key maps to a specific server, so requests carrying that key go to that server only. Because the key can be regenerated, a client is directed back to the same server if a session breaks.
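A sketch of the idea in Python: the same client/server address pair always hashes to the same backend, which keeps a client's session on one server. The addresses and server names are placeholders.

```python
import hashlib

servers = ["server-a", "server-b", "server-c"]

def pick_server(source_ip: str, dest_ip: str) -> str:
    # The same address pair always produces the same key, hence the same server.
    key = hashlib.md5(f"{source_ip}:{dest_ip}".encode()).hexdigest()
    return servers[int(key, 16) % len(servers)]

print(pick_server("203.0.113.7", "198.51.100.1"))  # deterministic result
```

One caveat: if the server list changes, the modulo mapping shifts, which is why some balancers use consistent hashing instead.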
Randomized static
A randomized static load balancing algorithm assigns tasks to servers at random. It can even compute a random permutation in advance if it knows the number of tasks. Since processors then already know the tasks assigned to them, no distribution master is needed. This algorithm performs well when tasks are small.
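A minimal sketch, assuming the task list is known ahead of time: the random permutation is computed once, up front, so each server can read its own share without asking a central distributor.

```python
import random

servers = ["server-a", "server-b", "server-c"]
tasks = ["task-%d" % i for i in range(9)]  # the task list is known in advance

random.shuffle(tasks)  # compute a random permutation up front
assignment = {task: servers[i % len(servers)] for i, task in enumerate(tasks)}
print(assignment)
```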
Central manager
The central manager algorithm distributes the workload through a central node that acts as a coordinator, choosing the processor with the least load. Having load information for all servers helps the manager select the least loaded processor. This algorithm requires a lot of inter-process communication, which can quickly become a bottleneck.
Threshold
The threshold algorithm assigns incoming tasks to servers chosen locally, without exchanging remote messages. Each server keeps a copy of its processor load, categorized into three levels:
- Underloaded: load < t_under
- Medium: t_under ≤ load ≤ t_upper
- Overloaded: load > t_upper
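A sketch of the three-level classification above; the threshold values t_under and t_upper are illustrative assumptions.

```python
T_UNDER, T_UPPER = 0.4, 0.8  # hypothetical load thresholds

def classify(load: float) -> str:
    if load < T_UNDER:
        return "underloaded"
    if load <= T_UPPER:
        return "medium"
    return "overloaded"

print(classify(0.35), classify(0.6), classify(0.9))
# underloaded medium overloaded
```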
Dynamic load balancing algorithm
A dynamic load balancing algorithm makes load balancing decisions based on each node's current performance. It distributes the workload at runtime, shifting load toward the most lightly loaded servers in the network. Dynamic load balancing algorithms are usually more complex but offer better performance and fault tolerance.
The most commonly used dynamic load balancing algorithms are:
Least connection
This load balancing algorithm distributes each client request to the server with the fewest active connections at the time the request arrives. For longer-lived connections, the least connection algorithm considers the existing connection load when distributing requests.
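A minimal least-connection sketch, assuming an in-memory connection count per server; real balancers update these counts as connections open and close.

```python
active = {"server-a": 12, "server-b": 5, "server-c": 9}

def route():
    server = min(active, key=active.get)  # fewest active connections wins
    active[server] += 1                   # the new request adds a connection
    return server

print(route())  # -> server-b
```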
Weighted least connection
Weighted least connection, an advanced configuration of the least connection algorithm, uses both server weights and active connection counts to distribute client requests. An administrator assigns each server a weight based on its traffic-handling capacity, and the algorithm factors that weight into load distribution. It requires more computing time but ensures efficient traffic distribution.
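A sketch of one common formulation: servers are compared on active connections divided by weight, so a weight-3 server can carry three times the connections of a weight-1 server before losing a comparison. The counts and weights are placeholders.

```python
active = {"server-a": 12, "server-b": 5, "server-c": 9}
weight = {"server-a": 3, "server-b": 1, "server-c": 2}

def route():
    # Lowest connections-per-unit-of-capacity wins.
    return min(active, key=lambda s: active[s] / weight[s])

print(route())  # server-a: 12/3 = 4.0 beats server-b's 5.0 and server-c's 4.5
```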
Weighted response time
This load balancing algorithm uses application servers' response times to distribute load. The faster a server responds, the sooner it receives the next request. The algorithm derives each server's weight from its response time to a health check.
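One way to sketch this is with an exponentially weighted moving average (EWMA) of health-check response times; the smoothing factor and sample values are assumptions.

```python
ALPHA = 0.3  # how strongly the latest health-check sample moves the average
avg_ms = {"server-a": 120.0, "server-b": 85.0, "server-c": 210.0}

def record_health_check(server: str, response_ms: float):
    avg_ms[server] = ALPHA * response_ms + (1 - ALPHA) * avg_ms[server]

def route():
    return min(avg_ms, key=avg_ms.get)  # fastest recent responder goes first

record_health_check("server-c", 150)  # a faster sample pulls its average down
print(route())  # -> server-b (still the fastest recent responder)
```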
Resource-based (adaptive)
With this algorithm, the resources available on each server drive load distribution decisions. Each server runs an agent, a piece of specialized software that measures available CPU and memory. The agent reports these metrics to the load balancer for efficient request distribution. A variant, the resource-based (SDN adaptive) algorithm, uses software-defined networking (SDN) controllers and network-layer knowledge to make traffic distribution decisions.
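Here's a sketch of the agent side, using the third-party psutil package to read the CPU and memory metrics the paragraph mentions. How the report reaches the load balancer (the report_url endpoint) is an assumption.

```python
import json
import urllib.request

import psutil  # third-party: pip install psutil

def report_resources(report_url: str):
    payload = json.dumps({
        "cpu_percent": psutil.cpu_percent(interval=1),  # sampled over 1 second
        "memory_percent": psutil.virtual_memory().percent,
    }).encode()
    req = urllib.request.Request(report_url, data=payload,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)  # the balancer ranks servers on these numbers
```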
Dynamic load balancing algorithms may seem to outperform static ones, but they’re more complex since they account for the current system state. This complexity often leads to overhead and poor load-balancing decisions. This is why it’s crucial to get the full picture of a load balancing mechanism before choosing a load balancer.
| | Static load balancing | Dynamic load balancing |
|---|---|---|
| Load distribution | At compile time | At run time |
| Stability | More | Less |
| Complexity and cost | Less | More |
| Reliability and response time | Less | More |
| Predictability | Easy | Difficult |
| Processor thrashing and state woggling | None | Considerable |
| Resource utilization | Less | More |
| Communication delay | Lesser | More |
Load balancer benefits
Unresponsive applications or web resources aren’t only frustrating to users but also affect potential business. Load balancing helps you streamline resource usage, response time, and data delivery in high-traffic environments. Below are some of the major benefits of a load balancer.
High performance
A load balancer reduces load times and response times by routing incoming traffic to capable servers. Intelligent load balancing lightens the load on individual servers, optimizes the user experience, and improves the performance of your website or application. Here's how a load balancer improves server performance:
- SSL offload: Removes encryption overhead from servers and makes more resources available for your web application or resources
- Traffic compression: Compresses website traffic to optimize user experience
- Traffic caching: Delivers content quickly with retained copies of frequently accessed web elements
- HTTP/2 support: Speeds up communication with clients and website loading
Redundancy
A malfunctioning server disrupts services. Load balancers use built-in redundancy to handle hardware failures and mitigate their impact on website uptime. They automatically reroute client requests to working servers during a server failure. Here’s how load balancers further ensure redundancy:
- Continued service: Routes client requests to working servers when a server fails
- Workaround for busy servers: Identifies busy servers and redirects traffic to less occupied ones
- Highly available site: Deploys load balancers in pairs so that one can take over when the other fails
- Business continuity: Detects site outages and redirects visitors to a pre-set alternative website
Scalability
Traffic peaks can become a nightmare without a mechanism to manage the increasing load. A load balancer adds physical or virtual servers to the traffic distribution process to accommodate the demand. Scalability ensures uninterrupted service. Here's how it helps:
- Business continuity: Redirects traffic to alternate sites during site outages
- Website capacity: Handles spikes in traffic with additional servers
- Cloud auto-scaling: Meets varying demands of cloud-hosted websites
- Capacity addition: Avoids disruptive upgrades
Security
Load balancers also protect data with additional network security layers. A load balancer uses the following features to boost your website or application’s security:
- Web application firewall (WAF): Protects website or app from emerging threats and runs daily rule updates
- User access authentication: Protects web resources from unauthorized access
- Threat detection: Identifies and drops distributed denial-of-service (DDoS) traffic
Load balancer challenges
The most common load balancer challenges stem from unsuitable configuration options. These include:
- Closing silent connections: A default configuration in load balancers closes transmission control protocol (TCP) connections that are silent for a while. While this is apt for load balancers handling web server connections, auto-closure can cause a client’s reconnection attempts to fail.
- Connection pooling: Load balancers often come with a connection pooling feature that keeps the server-load balancer connection alive. Multiple clients use these connections. The problem with connection pooling is that once the multiplexed connection closes, all client-side connections also close.
- Expensive TCP connections: Creating a new TCP connection for every request is expensive and adds to request processing time. These new connections also significantly increase traffic between the load balancer and the server. It's best to configure a load balancer to reuse TCP connections for requests from the same client (see the sketch after this list).
- TCP retransmission timeout: Load balancers using failover can suffer from TCP retransmission timeouts, which cause long latencies for clients. This happens because unavailable servers keep existing client connections open and buffer data before closing them. You can avoid it by modifying the host server's TCP retransmission timeout or by configuring the load balancer to shut down unhealthy connections.
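The connection-reuse advice is easiest to see from the client side. Here's a small illustration using Python's third-party requests library, whose Session object keeps the underlying TCP connection alive across requests; the same principle applies between a load balancer and its backend servers.

```python
import requests  # third-party: pip install requests

session = requests.Session()  # pools and reuses connections per host
for _ in range(3):
    # All three requests can travel over one kept-alive TCP connection,
    # skipping the TCP (and TLS) handshake after the first request.
    session.get("https://example.com/")
```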
Reverse proxy vs. load balancer vs. CDN vs. clustering
A reverse proxy accepts a client request and forwards it to a server that can fulfill the request. On the other hand, a load balancer distributes client requests among a pool of servers. Both:
- Improve efficiency
- Act as dedicated, purpose-built devices
- Return the server’s response to the client
- Contribute to a client-server computing architecture
- Mediate the communication between client and server
The key difference between a reverse proxy and a load balancer is that the former typically fronts a single web or application server, while the latter distributes traffic across multiple servers. In addition to offering increased security, load balancers offer web acceleration that reduces response time.
A content delivery network (CDN) is a network of servers in multiple geographical locations. CDN delivers data to users from the location closest to them. CDNs and load balancers fulfill similar roles: efficient data distribution and uptime maximization.
The main difference between CDNs and load balancers is that CDNs distribute content over a large geographic area, while load balancers distribute client traffic across a network of servers.
Clustering, meanwhile, groups identical servers that work together as a single system. Here's how load balancing differs from clustering:
- Type of server: Load balancing can work with different types of servers, while clustering needs identical servers within a cluster.
- Management: A controller can manage clusters, but load balancing requires additional networking expertise.
- Operations: Load balancers don’t depend on destination servers. Clusters depend on node agents and managers for communication.
A growing organization has to identify ways to meet evolving server needs. The choice of technology varies with the problem you’re trying to solve. Regardless of the instrument you choose, it should be inexpensive and flexible to change.
To summarize, you should use:
- Reverse proxy to speed up traffic flow by caching commonly used content and compressing data (inbound and outbound)
- Load balancer to improve application performance by increasing application capability and reliability
- CDN to deliver content quickly to users with a network of servers located across geographies
- Clustering to ensure high availability by allowing other servers to take over in case of an outage
Load balancer use cases
A load balancer primarily distributes incoming traffic among backend servers, but it can also do a lot more. Some of the notable use cases of a load balancer are:
- Horizontal scaling: Scaling traffic is a common concern for growing organizations offering web services. There are two types of scaling: horizontal (distributing traffic among multiple servers) and vertical (moving applications to a powerful server to meet increasing demands). Load balancers facilitate horizontal scaling and make your website or application more reliable.
- High availability: High availability is key to reducing downtime and optimizing system reliability. Load balancers eliminate single points of failure and help you to achieve high availability. They automatically identify non-functional servers with health checks and remove them for better availability.
- Blue/green deployments: Suppose you want to test your software thoroughly on production infrastructure before sending it live traffic. The blue/green deployment technique lets you do exactly that with a load balancer: once the new (green) environment checks out, the balancer shifts traffic over from the old (blue) one. You can also switch back to the old version if the deployment fails.
- Canary deployments: Canary deployments test a new version of an application for a subset of users and update the rest of the server pool accordingly. If you see too many errors after adding a canary server to the pool, you can abort the deployment. When there are no errors, you can continue deploying updates to the rest of the pool.
- A/B deployments: A/B deployments help you to make informed marketing and development decisions. Load balancers allow you to add B servers to the existing A server pool. Once done, you can get meaningful insights from the monitoring and logging infrastructure.
Load balancing software
Efficient load balancing depends on choosing the right software. If you're looking to handle high-volume traffic by distributing client requests across backend servers, let load balancing software do the heavy lifting.
To be included in this category, a software product must:
- Monitor web traffic and distribute resources
- Scale infrastructure workloads to balance traffic
- Integrate with or provide failover and backup services
Below are five leading load balancing software solutions from G2's Winter 2021 Grid® Report. Some reviews may be edited for clarity.
1. F5 NGINX
F5 NGINX offers a cloud-native and easy-to-use reverse proxy and load balancer. It comes with varied capabilities such as security controls, DNS system recovery, advanced monitoring, session persistence, Kubernetes containers, and representational state transfer (REST) application programming interface (API).
What users like:
Simple, easy-to-use, and powerful HTTPS server. The load balancing is one of the best I've ever seen. I currently only use the free version, but it does everything I need it to do. It's lightweight and doesn't use many resources.
- F5 NGINX Review, Joseph S.
What users dislike:
Less community support and documentation than other web servers like Apache, but probably more than enough depending on your expertise and use case. Not as many modules or extensions as Apache. It could be difficult to understand how to configure it initially. But once you get the hang of it, it’s pretty simple to use.
- F5 NGINX Review, Amogh H.
2. Kemp LoadMaster
Kemp LoadMaster offers load balancing solutions for high performance and secure delivery of application workloads. It’s known for simplified deployments, flexible licensing, and top-rated technical support.
What users like:
The LoadMaster product is incredibly easy to use for all the basics. We have managed to easily set up load balancing rules for web and internal systems, but there's so much more it can do as a load balancing product. These features can be quite tricky to set up, but the support is second to none. So, any issues setting up a service, submit a ticket, and within an hour, someone comes back, and remote sessions get the problems sorted. No waiting for days for a reply. This is what really draws us to renew each year – you realize how important good support is. Setting the LoadMasters up from a VM image takes about 15 minutes for a HA setup.
- Kemp LoadMaster Review, Daniel S.
What users dislike:
Web UI. It's unpolished and not intuitive. For example, real servers are not configured under the "Real Servers" section. It's difficult to find the settings, and settings for related things are spread across several sections. Inconsistent placement of the buttons, inconsistent field names: sometimes all caps like "Connectivity Timeout,” sometimes mixed case like "Retry attempts" just below it. Lack of support for Duo 2FA. Much of the online documentation and tutorials are out of date.
- Kemp LoadMaster Review, Peter K.
3. Azure Traffic Manager
Azure Traffic Manager offers a cloud-based load balancing service designed to ensure high availability, increase responsiveness, provide usage-based insights, and combine on-premises and cloud systems.
What users like:
Traffic managers allow redirecting users to appropriate endpoints based on different settings. We are using it to route users to national versions of the site, depending on the users’ location. It allows users to see information in their native language and minimizes the response time of web pages because national sites are located in the nearest datacenters. The main feature is that the entire routing is performed in the background, and all users can use a single URL to access the site regardless of their location. The second great feature is the failover option, so the site remains available with the same URL regardless of its current location. We are using it for on-premise sites where Azure Site Recovery was configured for servers. URL remains live even after failover to Azure.
- Azure Traffic Manager Review, Arthur S.
What users dislike:
It increases the initial response time for the site because of routing activities. Even if users are redirected to the closest site, the initial request will go to the Azure region where the traffic manager is deployed.
- Azure Traffic Manager Review, Michael L.
4. AWS Elastic Load Balancing
AWS Elastic Load Balancing distributes incoming application traffic to improve application scalability. It's known to offer high availability, robust security, and automatic scaling features.
What users like:
ELB is highly available and allows traffic routing to other servers instantly even if the primary one is down. It isn't chargeable. ELB works well with AWS auto scaling (helps scale up/scale down at run time without any impact). We can integrate ELB with Route53 and cloudfront to serve the request on time from different location to avoid latency. It's secure and easy to install as well.
- AWS Elastic Load Balancing Review, Anshu K.
What users dislike:
Health checks take a little longer than expected at times and this can cause issues if you don't set up health check configuration carefully.
- AWS Elastic Load Balancing Review, Joey D.
5. Micro Focus Silk Performer
Micro Focus Silk Performer optimizes application performance with realistic and powerful load and stress testing. It offers end-to-end diagnostics and enables apps to tolerate increased usage.
What users like:
Application analysis and the ease of launching any required size peak load functionality testing. It's easy to build tests using Micro Focus Silk Performer for production quality improvement.
- Micro Focus Silk Performer Review, Debra M.
What users dislike:
Scripting was difficult initially. Many functions look similar but it's difficult to assess the right one.
- Micro Focus Silk Performer Review, Rajesh H.
Improve uptime and load distribution
Today, DevOps teams need to ensure that applications meet increasing traffic demands with minimal user disruption. They rely on load balancing capabilities such as network load distribution, uptime improvement, server failure detection, and backend server load minimization to keep servers functional. Load balancers are crucial for application scalability, availability, and security.
Looking to add load balancers to your network? Find out more about virtual private servers and whether you should choose them to store resources.

Sudipto Paul
Sudipto Paul is a Sr. Content Marketing Specialist at G2. With over five years of experience in SaaS content marketing, he creates helpful content that sparks conversations and drives actions. At G2, he writes in-depth IT infrastructure articles on topics like application server, data center management, hyperconverged infrastructure, and vector database. Sudipto received his MBA from Liverpool John Moores University. Connect with him on LinkedIn.