What is Load Balancing & How it Works (Complete Breakdown)

Load balancing ensures the optimal distribution of network traffic across backend servers. The core function of load balancing is to increase the performance and availability of web applications and APIs by routing requests in a manner that places the least strain on resources. Without a load balancer distributing requests across backend servers, some servers can become overloaded while others sit underutilized.

For example, if two web servers host a copy of the same website, then web traffic can be balanced between the two. Splitting the traffic between both servers ensures a single server does not bear the weight of all traffic and will not become overloaded. The two servers can support the website more efficiently, managing connections or workloads equally.

In this all-in-one introduction to load balancing, we unpack everything you need to know, including the benefits, the algorithms for distributing traffic, how load balancers work, the different types of load balancers, and more.

What is a load balancer?

A load balancer is a hardware or software appliance that manages incoming web traffic for servers that deliver web applications and services to clients. Using a distribution algorithm to sort incoming traffic, a load balancer distributes connections across a fleet of servers.

The main objective of a load balancer is to ensure that every connection request from a legitimate user can be served by a healthy, available, performant, and secure server. In meeting this objective, a load balancer carries out various functions, most notably reducing unnecessary strain so that one server does not get bogged down and face performance issues.

The strengths of load balancing lie in its ability to:

  • maintain balanced servers and prevent downtime,

  • handle fluctuations in traffic and scale easily by adding backend servers,

  • create responsive experiences for clients accessing web-based applications,

  • support seamless upgrading by redirecting traffic when servers undergo maintenance, and

  • act as a security layer with the load balancer sitting as a reverse proxy between the client and server, safeguarding your applications from the risks of direct connections.

While a load balancer typically handles web-based traffic, it can work for other networked applications, including FTP servers, databases, and cache servers.

What terms should you know?

  • Application. An application is a computer software package that performs a particular function directly for an end user or another application. It is also referred to as an application program or application software.

  • Server. A server is a physical machine, a virtual machine, or software that provides functionality for other programs or devices known as "clients."

  • Client. Refers to a user, device, or other endpoint accessing resources from the upstream application.

  • Reverse proxy. A reverse proxy, such as a load balancer, sits between the client and the upstream application and manages or acts upon incoming requests on behalf of the application. Some reverse proxies that are not load balancers include caching servers, web application firewalls, and network traffic capture and tampering tools.

  • High availability. Refers to the continuous uptime of applications and services. “Available” means being accessible to the client. Applications that are almost always accessible are highly available.

  • Five nines. Five-nines availability is an uptime of 99.999%. This is a common quantified measurement of high availability.

  • Web traffic. The data passing between the client and the application.

  • Traffic distribution. The process of directing client requests across a fleet of servers.

  • Algorithms. The logic or pattern used to distribute web traffic across servers. This can be based on dynamic metrics or static rules.

  • Dynamic algorithm. Actively accounts for the current load on each server and distributes web traffic accordingly. For example, the least connections algorithm sends traffic to servers based on which server has the fewest live connections.

  • Static algorithm. Distributes web traffic based on a sorting rule that is not affected by the load servers are managing. For example, the round-robin algorithm sends traffic to servers in a simple rotation.

  • Connections. Refers to the active sessions over which clients exchange data with servers. With TCP, the transport protocol that most web applications use, clients must first connect to a server before they can send or receive transmissions. A reverse proxy load balancer receives connections on behalf of upstream servers and then makes a separate connection from itself to the servers (or re-uses an already existing connection).

  • Redundancy. The duplication of components in network infrastructure to ensure application and API availability. If a component fails, the backup component can take its place.

  • Failover. Redirecting traffic to other healthy servers when one server fails.

  • Health checks. A load balancing feature that checks the overall health status of a server. A load balancer uses health checks to determine whether it should send connections to a server or take the server out of rotation.

  • Scalability. The ability of network infrastructure to handle increased workloads efficiently.

  • Sticky sessions. A feature that maintains session persistence by sending all requests from a single client to the same server, preserving the continuity of the client's session.
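To make the five-nines definition above more concrete, the short Python sketch below converts an availability percentage into the downtime it allows per year. The function name and the use of a 365-day year are assumptions made for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


for label, pct in [("three nines", 99.9), ("four nines", 99.99), ("five nines", 99.999)]:
    print(f"{label} ({pct}%): {allowed_downtime_minutes(pct):.2f} minutes per year")
```

Under these assumptions, five nines leaves a little over five minutes of downtime per year.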

What does a load balancer do?

An effective load balancer will:

  • Distribute web traffic across a fleet of servers using static or dynamic algorithms and sticky sessions for a seamless user experience.

  • Maintain high availability of web applications and APIs, with health checks and automatic failover when a server becomes unhealthy.

  • Scale to meet fluctuations in web traffic demands with high performance and low resource use.

  • Secure your traffic, for example with SSL termination, filtering, and rate limiting.

  • Accelerate application performance and improve end user experience, for example with caching, compression, and SSL offloading.

  • Run layer 4 and/or layer 7 load balancing.

  • Be customizable and configurable to meet the application's load balancing needs.

Read More: Why Your Load Balancer Should Be Fast & Flexible

The benefits of load balancing

Load balancing is essential in guaranteeing the high availability of your website or networked application. High availability means that your application is there when your customers need it. But the benefits don’t stop there:

  • Increased performance. With the routing capabilities of load balancing, each system is able to operate optimally, resulting in faster processing times and quicker responses.

  • Improved user experience. Users accessing your highly available web applications will have a responsive and consistent experience with your services.

  • Enhanced reliability. Should unforeseen circumstances arise, load balancing ensures your services are unaffected, leveraging redundant components for quick recovery and continuous application delivery.

  • Highly scalable. With load balancing, infrastructure is more capable of scaling out to meet increasing demands, enabling systems to function more effectively and adapt in real-time.

  • Increased capacity. The ability to scale out means load balancing can increase capacity and the ability to serve more customers simultaneously.

  • Fortified security. A load balancer brings security features that can be baked into your lines of defense.

  • Improved cost efficiency. HAProxy enables linear scalability, allowing for optimal resource utilization as you add more servers. Without linear scaling, adding servers might result in diminishing returns.

Read more about benefits of load balancing.

How does a load balancer impact business?

Whether a small or large-scale operation, investing in a load balancer can have a significant impact on the bottom line.

A load balancer helps you make the most of opportunities for growth and change.

Here’s how a load balancer can make a difference to how clients engage with your services and the experiences they walk away with.

  • Business growth. As your business grows and the demand for your services increases, a load balancer helps you to scale out to meet that demand, providing you with a high-performing, cost-efficient solution in serving applications to customers.

  • Business continuity. A load balancer adds resiliency to your infrastructure, ensuring your services are always available. This means no downtime affecting revenue or other critical operations.

  • Business transformation. When undergoing digital transformation, businesses often need to minimize complexity and disruption. A load balancer can help achieve this by providing continuity across different environments and architectures, and gracefully migrating traffic.

  • Business optimization. A load balancer reduces costs and increases performance, maximizing the return on investment in infrastructure.

  • Reputation protection. Providing a reliable and consistent experience protects your reputation and brand image. A load balancer future-proofs your business in the case of unexpected events.

What kind of traffic can load balancers handle?

The traffic handled by a load balancer can usually be sorted into two categories: transport layer traffic and application layer traffic (also known as layer 4 and layer 7). Within the transport and application layers, traffic uses various protocols. Some of the most recognizable protocols are HTTP (for web applications) and TCP/IP (for services like databases, message queues, mail servers, and IoT devices), but the list can be quite extensive. Read Your Comprehensive Guide to HAProxy Protocol Support to learn more about a broad range of protocols HAProxy’s products support.

  • TCP/IP (Transmission Control Protocol/Internet Protocol) is a protocol suite that defines how data should be broken down, exchanged, and routed between devices. Within it, TCP is the transport layer protocol, and IP handles addressing and routing at the network layer.

  • HTTP (Hypertext Transfer Protocol) is an application layer protocol that allows data transfer between web browsers and servers. There have been several HTTP versions, most notably HTTP/1.1, HTTP/2, and HTTP/3. Each version attempts to solve the challenges faced by previous iterations, with the latest version, HTTP/3, moving the handling of streams down into a newer transport protocol called QUIC (see below).

  • UDP (User Datagram Protocol) is a connectionless transport layer protocol that does not require a dedicated connection before transmitting data. It is lightweight and does not provide error correction or retransmission of lost packets.

  • QUIC (originally an acronym for Quick UDP Internet Connections) is a transport layer protocol built on top of UDP. QUIC combines strengths of TCP and UDP, providing reliable, ordered, and secure data transmission while also reducing latency and improving performance.

Other protocols include DNS (Domain Name System), SIP (Session Initiation Protocol), RTSP (Real-Time Streaming Protocol), RADIUS (Remote Authentication Dial-In User Service) and Diameter, and more.

Load balancing algorithms

Load balancing algorithms are logical methods of distributing web traffic across a fleet of servers. Algorithms can be separated into two main categories: dynamic and static.

Dynamic algorithms

Dynamic algorithms route traffic according to current conditions, such as the load each server is carrying and the servers' overall health. Some also consider the client request itself, including the contents of messages and the workloads required, to enable the most efficient processing. Algorithms that inspect message contents only work with layer 7 load balancing, where the load balancer can understand the application traffic passing through it. Algorithms that rely on connection counts alone, such as least connection, can also run at layer 4.

Least connection

With the least connection algorithm, the server managing the fewest connections will receive the next connection in line. This dynamic algorithm actively accounts for the server load being managed to ensure workloads are distributed to the resource with the most available capacity.
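The selection step can be sketched in a few lines of Python. This is an illustrative model only: the `Server` class, the server names, and the connection counts are hypothetical, and a real load balancer would also decrement the count when a connection closes.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    active_connections: int = 0


def pick_least_connections(servers: list[Server]) -> Server:
    """Return the server currently handling the fewest connections."""
    # min() returns the first minimum, so ties go to the earliest server listed.
    return min(servers, key=lambda s: s.active_connections)


servers = [Server("web1", 12), Server("web2", 3), Server("web3", 7)]

chosen = pick_least_connections(servers)
chosen.active_connections += 1  # the new connection now counts against that server
print(chosen.name)  # web2
```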

[Diagram: least connection load balancing algorithm]

Static algorithms

In contrast, static algorithms route traffic based on a fixed set of instructions. These sorting methods can be pattern-based and do not consider the contents of messages or the load servers are managing. They aim to distribute traffic fairly rather than adapt to cues in real-time.

Workloads that change over time (for example, websites with unpredictable session durations) may benefit from dynamic load balancing with its ability to adapt to changing conditions. More stable workloads can benefit from the simplicity static load balancing offers.

Round-robin

The round-robin algorithm distributes connections evenly across a group of servers by assigning each new connection to the next server in rotation. This static algorithm works in a cycle, forwarding traffic to each server one by one, so that equally weighted servers each receive the same number of requests. It is a simple but effective distribution method for predictable, consistent workloads.
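As a rough illustration, a rotation over equally weighted servers can be modeled with Python's `itertools.cycle`. The server names and the `next_server` helper are hypothetical.

```python
from itertools import cycle

servers = ["web1", "web2", "web3"]
rotation = cycle(servers)  # endless iterator: web1, web2, web3, web1, ...


def next_server() -> str:
    """Return the next server in the rotation, ignoring current load."""
    return next(rotation)


assignments = [next_server() for _ in range(6)]
print(assignments)  # ['web1', 'web2', 'web3', 'web1', 'web2', 'web3']
```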

[Diagram: round robin load balancing algorithm]

URI-hash

This algorithm ensures requests hit the servers with the corresponding cached results, optimizing server performance. It computes a hash of the requested URI and uses the result to select a server, so the same URL will always be directed to the same server as long as no server goes up or down. This algorithm is static and unaffected by changes in server weight.
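The idea can be sketched as follows. This is a simplified model, not HAProxy's implementation: the choice of SHA-256, the modulo mapping, the server names, and the `pick_by_uri` helper are all assumptions made for illustration.

```python
import hashlib

servers = ["cache1", "cache2", "cache3"]


def pick_by_uri(uri: str, pool: list[str]) -> str:
    """Map a URI to a server using a stable hash of the URI."""
    digest = hashlib.sha256(uri.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(pool)
    return pool[index]


first = pick_by_uri("/images/logo.png", servers)
second = pick_by_uri("/images/logo.png", servers)
print(first == second)  # True: the same URI and pool always give the same server
```

Because the index in this sketch depends on the size of the pool, adding or removing a server can send a URL to a different server, which is why the mapping only holds while the set of servers stays the same.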

[Diagram: URI hash load balancing algorithm]

Read more about choosing the right load balancing algorithm.

How do load balancers work?

Reverse proxies

HAProxy is a reverse proxy load balancer. A reverse proxy receives a request, then relays it to an upstream server.

[Diagram: basic load balancing]

For example, if a client makes a request to a website named example.com, a load balancer would receive the request first. Then, it would choose a web server from a list and pass the request along by opening a connection on the backend to that server (or re-using an already established connection). Meanwhile, the load balancer would keep its connection open to the client.

When the web server handling the request returns its response to the load balancer, the load balancer would relay it back to the client over the original connection.
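The request flow described above can be modeled in memory, without any real networking. In this sketch the backends are plain Python functions, the proxy picks them in simple rotation, and every name (`make_backend`, `ReverseProxy`, `handle_request`) is hypothetical.

```python
from itertools import cycle
from typing import Callable

# Each hypothetical backend is a function that turns a request path into a response.
Backend = Callable[[str], str]


def make_backend(name: str) -> Backend:
    def handle(path: str) -> str:
        return f"{name} served {path}"
    return handle


class ReverseProxy:
    """Receives requests on behalf of the backends and relays their responses."""

    def __init__(self, backends: list[Backend]) -> None:
        self._rotation = cycle(backends)

    def handle_request(self, path: str) -> str:
        backend = next(self._rotation)  # choose a server from the list
        response = backend(path)        # pass the request along to that server
        return response                 # relay the response back to the client


proxy = ReverseProxy([make_backend("web1"), make_backend("web2")])
print(proxy.handle_request("/index.html"))  # web1 served /index.html
print(proxy.handle_request("/about.html"))  # web2 served /about.html
```

The client only ever talks to the proxy object, which mirrors how a reverse proxy keeps its own connection to the client while using a separate one to the chosen server.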

Reverse proxy load balancers can operate in various forms (hardware, software, and virtual) and can, as mentioned, function at different layers, namely the transport layer and the application layer.

DNS Round-robin

Aside from using a reverse proxy-style load balancer, you can also use DNS Round-robin to load balance traffic. With this approach, a portion of traffic is sent to different servers by returning different IP addresses for a DNS lookup. For example, when accessing haproxy.com, the DNS server may return the address 209.126.35.1. With DNS Round-robin, the next time the same client looks up the website, it may get a different IP address back, such as 209.126.35.2. By rotating the addresses it returns for a DNS query, the DNS server sends users to different servers.
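The rotation can be illustrated with a toy resolver. This is a sketch of the idea, not of any particular DNS server: the `resolve` function is hypothetical, and the two addresses are simply the ones from the example above.

```python
from collections import deque

# Address records for one hostname, taken from the example in the text.
records = deque(["209.126.35.1", "209.126.35.2"])


def resolve() -> list[str]:
    """Return the current answer list, then rotate it for the next query."""
    answer = list(records)
    records.rotate(-1)  # move the first address to the end
    return answer


print(resolve()[0])  # 209.126.35.1
print(resolve()[0])  # 209.126.35.2
print(resolve()[0])  # 209.126.35.1
```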

Reverse proxies can detect changes to servers more quickly than a DNS server can, such as when servers are added or removed, or when a server stops responding. DNS records are often cached by clients or intermediary DNS servers, which means that DNS Round-robin is slower to adapt to changes. However, people often use it as a second tier of load balancing to distribute traffic to multiple data centers.

Hardware vs software load balancers

A hardware load balancer is a physical appliance that distributes web traffic across a fleet of servers. Functioning as a standalone piece of hardware, this type of load balancer has components and an operating system optimized and dedicated for balancing and routing traffic, and usually for SSL termination. 

These load balancers are traditionally deployed on-premises in data centers, often in pairs to maintain availability if a load balancer fails. Deploying these hardware load balancers across multiple data centers allows businesses to take advantage of Global Server Load Balancing (GSLB), minimizing the impact on users should an entire data center go offline.

A software load balancer provides the same function as a hardware load balancer, except without being tied to a physical appliance. Software load balancers can be installed onto hardware servers, virtual machines, containers, or in the cloud. Being more elastic and scalable, this type of load balancer can scale out in real-time to meet increasing web application demands, which hardware appliances may struggle to do.

Learn more: Software Load Balancers vs Appliances

Hardware load balancer pros & cons


Pros:

  • Dedicated, battle-tested hardware optimized for load balancing, ensuring high performance and reliable traffic management.

  • Turnkey, plug-and-play load balancer configured to manage web traffic.

Cons:

  • Costly due to the need for sufficient hardware appliances to support peak web traffic and high availability.

  • Scaling requires adding more hardware to the infrastructure.

  • Installation needs on-site operational staff and infrastructure.

Software load balancer pros & cons


Pros:

  • Versatile and capable of taking advantage of cloud and virtualization environments.

  • Scalable in real-time, making it easier to achieve high availability and accommodate fluctuating web traffic.

Cons:

  • Performance is dependent on the hardware where the load balancer is installed. Standard commodity hardware usually lacks the customized chipsets and network interfaces of dedicated hardware. However, HAProxy’s software load balancer can match the performance of hardware load balancers, especially on optimized servers.

  • Resource consumption might be high for some software load balancers, affecting scalability and cost-efficiency. But HAProxy is designed for efficient resource use.

L4 vs L7 load balancing

Layer 4 (L4) load balancing operates in the transport layer of the Open Systems Interconnection (OSI) model and handles the TCP and UDP protocols. It bases traffic management decisions solely on network information as opposed to the contents of messages.

Load balancing in this layer is quick and secure since the contents of messages do not need to be unpacked. However, this doesn’t allow the more intelligent load balancing that can be achieved in layer 7.

Layer 7 (L7) load balancing operates in the application layer of the OSI model and handles protocols such as HTTP and HTTPS. It bases traffic management decisions on the contents of the messages and health checks on backend servers.

Load balancing in this layer allows more efficient use of upstream server resources and theoretically better reliability. However, it is not as quick and requires more resources for layer 7 processing, unpacking messages, and modifying messages. The higher processing requirements of L7 load balancing can add latency, which can be mitigated by caching responses upstream. 
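The difference between the two layers can be sketched by looking at what each routing decision is allowed to see. In the hypothetical Python functions below, the layer 4 style router uses only the client's address and port, while the layer 7 style router inspects the request path. The CRC32 hash, the pool names, and the path prefixes are illustrative assumptions.

```python
import zlib


def route_l4(client_ip: str, client_port: int, servers: list[str]) -> str:
    """Layer 4 style: choose using only network information about the connection."""
    key = f"{client_ip}:{client_port}".encode("utf-8")
    return servers[zlib.crc32(key) % len(servers)]


def route_l7(path: str, pools: dict[str, list[str]]) -> list[str]:
    """Layer 7 style: choose a server pool by inspecting the request content."""
    for prefix, pool in pools.items():
        if prefix != "/" and path.startswith(prefix):
            return pool
    return pools["/"]  # default pool when no specific prefix matches


pools = {"/api/": ["api1", "api2"], "/static/": ["static1"], "/": ["web1", "web2"]}
print(route_l4("203.0.113.7", 51234, ["web1", "web2"]))
print(route_l7("/api/users", pools))   # ['api1', 'api2']
print(route_l7("/index.html", pools))  # ['web1', 'web2']
```

The layer 7 function has to read the request to decide, which is the extra processing the section above refers to. The layer 4 function never looks past the connection details.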

Health checks

Health checks provide a load balancer with information about the availability and performance of upstream servers. When a server is deemed unhealthy (for example, because it is unresponsive), the load balancer will move connections over to the healthy servers, ensuring clients still have access to your services. When an unhealthy server becomes healthy again, the load balancer will resume sending it connections.

HAProxy offers three types of health checks:

  • Active health checks. Connecting with backend servers at intervals, using failed responses to determine health status.

  • Passive health checks. Monitoring live traffic for connection errors for insight into server health.

  • Agent checks. Communicating with software on servers to monitor system performance and vitals more closely.
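As a minimal sketch of how active check results might feed a server's in-rotation status, the Python class below marks a server down after several consecutive failed probes and up again after several consecutive successes. Probe results are passed in as booleans rather than produced by real network checks, and the `HealthTracker` class and its `fall` and `rise` thresholds are illustrative assumptions.

```python
class HealthTracker:
    """Tracks one server's status from the results of periodic probes."""

    def __init__(self, fall: int = 3, rise: int = 2) -> None:
        self.fall = fall      # consecutive failures before marking the server down
        self.rise = rise      # consecutive successes before marking it up again
        self.healthy = True
        self._streak = 0      # consecutive results that contradict the current status

    def record(self, probe_succeeded: bool) -> bool:
        """Record one probe result and return the resulting health status."""
        if probe_succeeded == self.healthy:
            self._streak = 0  # the result agrees with the current status
        else:
            self._streak += 1
            threshold = self.rise if probe_succeeded else self.fall
            if self._streak >= threshold:
                self.healthy = probe_succeeded
                self._streak = 0
        return self.healthy


tracker = HealthTracker()
results = [tracker.record(ok) for ok in [True, False, False, False, True, True]]
print(results)  # [True, True, True, False, False, True]
```

Requiring several consecutive results before changing status is one way to avoid removing a server from rotation because of a single dropped probe.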

Read more about how to enable health checks with HAProxy.

Security

Because a load balancer sits at the edge of your infrastructure, it can provide effective layers of security that stop cyber attacks before they penetrate your network.

A load balancer can enhance the security of a network by encrypting network traffic. Load balancers such as HAProxy support SSL and TLS encryption, which means data traveling between the load balancer and the client/server is secure. It also offloads SSL/TLS termination so that the server can reserve resources for more meaningful work and better defend against cyber attacks.


Some load balancers, such as HAProxy Enterprise and HAProxy ALOHA, include additional security features to protect applications and APIs from bad actors, bots, attacks, abuse, and more.

Read our blog on website security threats.

Does HAProxy do load balancing?  

Yes! HAProxy is the world’s fastest and most widely used software load balancer. Organizations rapidly deploy HAProxy products to deliver websites and applications with the utmost performance, observability, and security at any scale and in any environment.

  1. HAProxy is open source and highly customizable, backed by a great community that helps push the platform forward.

  2. HAProxy Enterprise combines HAProxy with scalable and observable central control, enterprise-class security, integrations, and authoritative support.

  3. HAProxy works anywhere and in any form factor, regardless of architecture.

  4. HAProxy enables varied use cases by supporting a broad range of protocols, customizations, and scripting.

  5. HAProxy fits many different workflows, allowing configuration using multiple methods and providing REST APIs and rich integrations with common tools.

  6. HAProxy achieves industry-leading performance metrics with low resource requirements, saving costs in the cloud or on-premises data centers and saving energy.

  7. HAProxy is highly secure and observable, enabling application delivery teams to predict, prevent, and resolve issues quickly.

Load balancing: final word

Load balancing delivers highly available and responsive web applications and APIs to clients by distributing web traffic across a pool of servers to process the connections. The versatility of load balancing and the breadth of solutions it provides can be summarized with the following key takeaways:

  • Load balancing offers several benefits, including increased performance, improved user experience, high scalability, fortified security, improved cost efficiency, fault-tolerant infrastructure, and reputation protection.

  • Load balancers can work in one of two ways: as a reverse proxy, sitting between client and server and relaying information between the two while managing server load; or through DNS round-robin, managing traffic by returning different IP addresses for DNS lookups and sending users to different servers.

  • Hardware load balancers are optimized for managing traffic and can provide high performance and reliability, but they can be expensive and difficult to scale. Software load balancers are versatile and scalable, but not every option on the market can provide the same level of performance as dedicated hardware appliances.

  • L4 load balancing operates in the transport layer and makes traffic management decisions based on network information, while L7 load balancing operates in the application layer and makes decisions based on the contents of messages and telemetry from servers.

  • Load balancers can enhance the security of a network by encrypting network traffic and blocking attacks. 

  • Health checks can be used to identify unhealthy servers that should be removed from rotation.

  • HAProxy is the fastest and most widely used software load balancer that offers simple adoption, scalable performance, secure infrastructure, and observable operation. HAProxy Enterprise is strengthened by community testing and feedback and is backed by enterprise-level support.

Are you new to load balancing? You can learn a whole lot more from our experts. Watch our free on-demand webinar “Introduction to HAProxy” and take your first steps with load balancing.
