Not all load balancing algorithms are equal. The right choice depends on your traffic patterns, backend capabilities, and consistency requirements.

Round Robin

The default. Requests go to each server in turn.

Request 1 → Server A
Request 2 → Server B
Request 3 → Server C
Request 4 → Server A

Nginx:

upstream backend {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
    # Round-robin is default, no directive needed
}

HAProxy:

backend servers
    balance roundrobin
    server srv1 10.0.1.1:8080 check
    server srv2 10.0.1.2:8080 check
    server srv3 10.0.1.3:8080 check

When to use:

  • All backends have identical capacity
  • Requests have similar processing time
  • No session affinity needed

When to avoid:

  • Backends have different specs (CPU, RAM)
  • Some requests are much heavier than others
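The rotation itself is trivial to state in code. A minimal Python sketch (the server list mirrors the nginx example; a real balancer would also guard the rotation with a lock):

```python
from itertools import cycle

# Backend pool, mirroring the nginx example above.
servers = ["10.0.1.1:8080", "10.0.1.2:8080", "10.0.1.3:8080"]
rotation = cycle(servers)

def pick_server():
    """Hand the next request to the next backend in strict rotation."""
    return next(rotation)

# The fourth request wraps around to the first server again.
picks = [pick_server() for _ in range(4)]
```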

Weighted Round Robin

Like round-robin, but each server receives traffic in proportion to its assigned weight.

Weights: A=5, B=3, C=2
Distribution: A gets 50%, B gets 30%, C gets 20%

Nginx:

upstream backend {
    server 10.0.1.1:8080 weight=5;  # 50% of traffic
    server 10.0.1.2:8080 weight=3;  # 30% of traffic
    server 10.0.1.3:8080 weight=2;  # 20% of traffic
}

When to use:

  • Mixed server capacity (some servers more powerful)
  • Gradual traffic shifting (canary deployments)
  • Cost optimization (send more to cheaper instances)
# Canary deployment: 95% to stable, 5% to canary
upstream backend {
    server stable.internal:8080 weight=95;
    server canary.internal:8080 weight=5;
}
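To see where a 50/30/20 split comes from, here is a naive weighted rotation sketched in Python: each server simply appears `weight` times per cycle. (nginx's actual implementation interleaves picks more smoothly, but the per-cycle proportions are the same.)

```python
from itertools import cycle

# Weights from the example above: A=5, B=3, C=2 (total 10).
weights = {"A": 5, "B": 3, "C": 2}

# Naive schedule: repeat each server `weight` times per cycle of 10.
schedule = [name for name, w in weights.items() for _ in range(w)]
rotation = cycle(schedule)

# One full cycle of 10 picks reproduces the 50/30/20 split.
picks = [next(rotation) for _ in range(10)]
```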

Least Connections

Route each request to the server with the fewest active connections.

Server A: 10 connections
Server B: 3 connections   ← picks this one
Server C: 8 connections

Nginx:

upstream backend {
    least_conn;
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}

HAProxy:

backend servers
    balance leastconn
    server srv1 10.0.1.1:8080 check
    server srv2 10.0.1.2:8080 check

When to use:

  • Long-lived connections (WebSockets, database proxies)
  • Highly variable request duration
  • Some requests much heavier than others

When to avoid:

  • Very short requests (overhead of tracking connections)
  • All requests take similar time (round-robin is simpler)
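The selection rule is a one-liner over per-server connection counts. A sketch with illustrative counts:

```python
# Active connection counts per server (made-up values).
active = {"A": 10, "B": 3, "C": 8}

def pick_server(active):
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)

choice = pick_server(active)
active[choice] += 1  # the routed request becomes an active connection
```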

Weighted Least Connections

Combines weights with connection counting:

upstream backend {
    least_conn;
    server 10.0.1.1:8080 weight=5;
    server 10.0.1.2:8080 weight=3;
    server 10.0.1.3:8080 weight=2;
}

The algorithm considers connections / weight ratio, so a server with weight=5 and 10 connections (ratio=2) beats a server with weight=2 and 6 connections (ratio=3).
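The ratio comparison from that example, sketched in Python (the server names are hypothetical):

```python
# connections / weight: the lower ratio wins.
servers = {
    "big":   {"weight": 5, "connections": 10},  # ratio 2.0
    "small": {"weight": 2, "connections": 6},   # ratio 3.0
}

def pick_server(servers):
    """Pick the server with the lowest connections-per-weight ratio."""
    return min(servers, key=lambda s: servers[s]["connections"] / servers[s]["weight"])

picked = pick_server(servers)
```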

IP Hash (Source Affinity)

Same client IP always goes to same server:

Client 1.2.3.4 → always Server A
Client 5.6.7.8 → always Server B

Nginx:

upstream backend {
    ip_hash;
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}

HAProxy:

backend servers
    balance source
    hash-type consistent
    server srv1 10.0.1.1:8080 check
    server srv2 10.0.1.2:8080 check

When to use:

  • Session affinity without cookies
  • Caching layers (same content on same server)
  • Stateful applications that can’t share state

When to avoid:

  • Clients behind NAT (many users share one IP)
  • Need to drain servers gracefully (sticky users won’t move)
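The core idea is a stable hash of the client address reduced modulo the pool size. A generic Python sketch (not nginx's exact function — nginx hashes only the first three octets of an IPv4 address):

```python
import hashlib

servers = ["10.0.1.1:8080", "10.0.1.2:8080", "10.0.1.3:8080"]

def pick_server(client_ip):
    """Stable hash of the client IP, reduced modulo the pool size."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]

# The same IP always lands on the same backend, which is exactly why
# NAT hurts: every client sharing that IP shares one server too.
same_backend = pick_server("1.2.3.4") == pick_server("1.2.3.4")
```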

Consistent Hashing

Advanced hashing that minimizes disruption when servers change:

upstream backend {
    hash $request_uri consistent;
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080;
}

When a server is added or removed, only about 1/n of requests remap (where n is the number of servers), not all of them.
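A toy hash ring in Python makes that concrete (the vnode count and server names are arbitrary): adding a fourth server remaps only the keys that now fall to it, roughly a quarter, while everything else stays put.

```python
import hashlib
from bisect import bisect

def ring(servers, vnodes=100):
    """Place `vnodes` points per server on a sorted hash ring."""
    points = []
    for s in servers:
        for i in range(vnodes):
            h = int(hashlib.md5(f"{s}#{i}".encode()).hexdigest(), 16)
            points.append((h, s))
    return sorted(points)

def lookup(points, key):
    """A key belongs to the first ring point at or after its hash (wrapping)."""
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return points[bisect(points, (h, "")) % len(points)][1]

keys = [f"/item/{n}" for n in range(1000)]
before = ring(["cache1", "cache2", "cache3"])
after = ring(["cache1", "cache2", "cache3", "cache4"])

# Roughly 1/4 of keys move, and every moved key lands on the new server.
moved = sum(lookup(before, k) != lookup(after, k) for k in keys)
```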

When to use:

  • Caching proxies (maximize cache hits)
  • Sharded data stores
  • Any system where “same request → same server” matters
# Cache layer: same URL always hits same cache server
upstream cache_layer {
    hash $request_uri consistent;
    server cache1:11211;
    server cache2:11211;
    server cache3:11211;
}

Random with Two Choices

Pick two servers randomly, choose the one with fewer connections:

HAProxy:

backend servers
    balance random(2)
    server srv1 10.0.1.1:8080 check
    server srv2 10.0.1.2:8080 check
    server srv3 10.0.1.3:8080 check

Surprisingly effective: nearly as good as least-connections, with less tracking overhead. A good fit for very high-throughput scenarios.
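The whole trick fits in a few lines. A Python sketch (hypothetical server names): sample two distinct servers, keep the less loaded one. Even though no pick ever scans the full pool, load stays nearly even.

```python
import random

# Active connection counts per server (power-of-two-choices demo).
active = {"srv1": 0, "srv2": 0, "srv3": 0}

def pick_server(active, rng=random):
    """Sample two distinct servers and route to the less loaded one."""
    a, b = rng.sample(list(active), 2)
    return a if active[a] <= active[b] else b

# Simulate 3000 long-lived connections arriving.
for _ in range(3000):
    active[pick_server(active)] += 1
```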

Health Checks

All algorithms should include health checks:

Nginx (passive):

upstream backend {
    server 10.0.1.1:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.2:8080 max_fails=3 fail_timeout=30s;
}

Nginx Plus (active):

upstream backend {
    zone backend 64k;
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    
    health_check interval=5s fails=3 passes=2;
}

HAProxy:

backend servers
    option httpchk GET /health
    http-check expect status 200
    server srv1 10.0.1.1:8080 check inter 5s fall 3 rise 2
    server srv2 10.0.1.2:8080 check inter 5s fall 3 rise 2
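The `fall`/`rise` counters configured above amount to a small state machine: a server goes DOWN after `fall` consecutive failed probes and comes back UP only after `rise` consecutive successes. A sketch of that logic (parameter names borrowed from HAProxy; this is an illustration, not its source):

```python
def update_health(state, probe_ok, fall=3, rise=2):
    """Apply one probe result; `state` is (is_up, consecutive_streak)."""
    is_up, streak = state
    if probe_ok:
        if is_up:
            return (True, 0)                # healthy, streak of failures broken
        streak += 1                         # a success counts toward `rise`
        return (True, 0) if streak >= rise else (False, streak)
    if not is_up:
        return (False, 0)                   # already down; stays down
    streak += 1                             # a failure counts toward `fall`
    return (False, 0) if streak >= fall else (True, streak)

# Three consecutive failed probes mark the server down.
state = (True, 0)
for ok in (False, False, False):
    state = update_health(state, ok)
```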

Graceful Degradation

Mark servers as backup or slowly drain them:

upstream backend {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080;
    server 10.0.1.3:8080 backup;  # Only used when others down
}

Draining a server:

upstream backend {
    server 10.0.1.1:8080;
    server 10.0.1.2:8080 down;  # Stop sending new requests
    server 10.0.1.3:8080;
}

Existing connections to the down server continue until complete.

Layer 4 vs Layer 7

Layer 4 (TCP/UDP):

  • Faster (no HTTP parsing)
  • Works for any protocol
  • Limited routing options
stream {
    upstream backend {
        least_conn;
        server 10.0.1.1:5432;
        server 10.0.1.2:5432;
    }
    
    server {
        listen 5432;
        proxy_pass backend;
    }
}

Layer 7 (HTTP):

  • Content-based routing
  • Can modify requests/responses
  • More features (caching, compression)
http {
    upstream api {
        least_conn;
        server 10.0.1.1:8080;
    }
    
    upstream static {
        server 10.0.2.1:80;
    }
    
    server {
        location /api/ {
            proxy_pass http://api;
        }
        location /static/ {
            proxy_pass http://static;
        }
    }
}

Quick Reference

Algorithm        Best For                         Avoid When
Round Robin      Equal servers, uniform requests  Mixed capacity
Weighted         Mixed capacity, canary deploys   All servers equal
Least Conn       Long/variable requests           Short uniform requests
IP Hash          Session affinity                 NAT, server changes
Consistent Hash  Caching, sharding                No locality benefit
Random(2)        High throughput                  Need determinism

My Recommendations

  1. Start with round-robin — simplest, works for most cases
  2. Add weights when server capacity varies
  3. Switch to least-conn when request duration varies significantly
  4. Use consistent hash for caching layers
  5. Avoid IP hash unless you have no other option for session affinity

Always combine with proper health checks. A perfect algorithm sending traffic to a dead server is worse than a simple algorithm with good health checks.


The best load balancer is one you don’t have to think about. Choose the simplest algorithm that meets your needs.