Round robin is the default, but it’s rarely the best choice. Here’s when to use each algorithm and why.
## The Algorithms

### Round Robin
```nginx
upstream backend {
    server 192.168.1.1:8080;
    server 192.168.1.2:8080;
    server 192.168.1.3:8080;
}
```
Requests go 1→2→3→1→2→3. Simple, fair, ignores server load.
Use when: All servers are identical and requests are uniform.
Problem: A slow server gets the same traffic as a fast one.
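The rotation itself is trivial. A minimal Python sketch (addresses reused from the config above) makes the "ignores load" point concrete: the picker has no idea how busy any server is.

```python
from itertools import cycle

servers = ["192.168.1.1:8080", "192.168.1.2:8080", "192.168.1.3:8080"]
rotation = cycle(servers)

def pick():
    """Return the next server in strict rotation, ignoring load entirely."""
    return next(rotation)

# Six requests cycle 1→2→3→1→2→3.
picks = [pick() for _ in range(6)]
```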
### Weighted Round Robin
```nginx
upstream backend {
    server 192.168.1.1:8080 weight=5;
    server 192.168.1.2:8080 weight=3;
    server 192.168.1.3:8080 weight=2;
}
```
Server 1 gets 50%, server 2 gets 30%, server 3 gets 20%.
Use when: Servers have different capacities (mix of instance sizes).
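nginx uses a "smooth" weighted variant that interleaves picks rather than sending five requests to server 1 back to back. A Python sketch of that selection logic (server names `a`/`b`/`c` are placeholders):

```python
def smooth_wrr(weights, n):
    """Smooth weighted round robin: each round, every server gains its
    weight; the current leader is picked and pays back the total weight.
    Over any window, picks are proportional to weight but interleaved."""
    current = {s: 0 for s in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        for s, w in weights.items():
            current[s] += w
        best = max(current, key=current.get)
        current[best] -= total
        picks.append(best)
    return picks

# weight=5/3/2 over 10 picks → a gets 5, b gets 3, c gets 2
picks = smooth_wrr({"a": 5, "b": 3, "c": 2}, 10)
```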
### Least Connections
```nginx
upstream backend {
    least_conn;
    server 192.168.1.1:8080;
    server 192.168.1.2:8080;
}
```
New requests go to the server with fewest active connections.
Use when: Request processing time varies significantly (some fast, some slow).
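The selection rule is just a minimum over live connection counts. A sketch in Python (the counts here are hypothetical; a real balancer increments and decrements them as requests start and finish):

```python
# Hypothetical in-flight connection counts per backend.
active = {"192.168.1.1:8080": 12, "192.168.1.2:8080": 3}

def pick_least_conn(active):
    """Choose the backend with the fewest in-flight connections."""
    return min(active, key=active.get)

server = pick_least_conn(active)
active[server] += 1  # the new request now counts against that server
```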
### IP Hash
```nginx
upstream backend {
    ip_hash;
    server 192.168.1.1:8080;
    server 192.168.1.2:8080;
}
```
Same client IP always hits the same server.
Use when: You need sticky sessions without cookies (legacy apps).
Problem: Distribution skews badly when many clients share one IP (NAT, corporate proxies), because the whole group lands on a single server.
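The mechanism is deterministic hashing, sketched below with MD5 for stability across runs (simplified: nginx's `ip_hash` actually keys on the first three octets of an IPv4 address, so an entire /24 maps to one backend):

```python
import hashlib

servers = ["192.168.1.1:8080", "192.168.1.2:8080"]

def pick_by_ip(client_ip, servers):
    """Deterministically map a client IP to one server."""
    digest = hashlib.md5(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]

# The same IP always maps to the same server...
a = pick_by_ip("203.0.113.7", servers)
b = pick_by_ip("203.0.113.7", servers)
# ...which is exactly why many users behind one proxy IP
# all pile onto a single backend.
```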
### Least Time (NGINX Plus)
```nginx
upstream backend {
    least_time header;
    server 192.168.1.1:8080;
    server 192.168.1.2:8080;
}
```
Routes to fastest responding server.
Use when: Servers have variable performance and you can measure it.
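Conceptually this means keeping a smoothed response-time estimate per server and routing to the minimum. A hedged sketch: the exponential moving average and the 0.3 factor are illustrative assumptions, not NGINX Plus's exact formula (which also weighs in connection counts).

```python
class EwmaPicker:
    """Route to the server with the lowest smoothed response time."""

    def __init__(self, servers, alpha=0.3):
        self.avg = {s: 0.0 for s in servers}
        self.alpha = alpha  # illustrative smoothing factor

    def record(self, server, response_ms):
        """Fold one observed response time into the moving average."""
        prev = self.avg[server]
        self.avg[server] = self.alpha * response_ms + (1 - self.alpha) * prev

    def pick(self):
        return min(self.avg, key=self.avg.get)

p = EwmaPicker(["web1", "web2"])
for ms in (120, 150, 130):   # web1 is consistently slow
    p.record("web1", ms)
for ms in (40, 60, 45):      # web2 is consistently fast
    p.record("web2", ms)
```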
## HAProxy Configuration
```haproxy
backend webservers
    balance roundrobin
    option httpchk GET /health
    server web1 192.168.1.1:8080 check weight 5
    server web2 192.168.1.2:8080 check weight 3
    server web3 192.168.1.3:8080 check backup
```
Options:
- `roundrobin`: standard round robin
- `leastconn`: least connections
- `source`: IP hash on the client address
- `uri`: hash of the request URI (good for caching)
- `hdr(Host)`: hash of the Host header
## Health Checks
```nginx
# Passive health checks (open source nginx):
# after 3 failed requests, take the server out for 30 seconds
upstream backend {
    server 192.168.1.1:8080 max_fails=3 fail_timeout=30s;
    server 192.168.1.2:8080 max_fails=3 fail_timeout=30s;
}

# Active health checks (NGINX Plus only)
upstream backend {
    zone backend 64k;
    server 192.168.1.1:8080;
    health_check interval=5s fails=3 passes=2;
}
```
HAProxy active checks:
```haproxy
backend webservers
    option httpchk GET /health HTTP/1.1\r\nHost:\ localhost
    http-check expect status 200
    server web1 192.168.1.1:8080 check inter 3s fall 3 rise 2
```
## Draining Connections
For graceful deploys:
```nginx
upstream backend {
    server 192.168.1.1:8080;
    server 192.168.1.2:8080 down;  # marked down, receives no new connections
}
```
HAProxy:
```bash
# Runtime: disable server (stops new connections to web2)
echo "disable server webservers/web2" | socat stdio /var/run/haproxy.sock

# Wait for existing connections to drain
echo "show servers state" | socat stdio /var/run/haproxy.sock
```
## Session Persistence

### Cookie-Based (Recommended)
```haproxy
backend webservers
    balance roundrobin
    cookie SERVERID insert indirect nocache
    server web1 192.168.1.1:8080 check cookie web1
    server web2 192.168.1.2:8080 check cookie web2
```
HAProxy adds a `SERVERID=web1` cookie; subsequent requests from that client go to the same server.
### Sticky Table
```haproxy
backend webservers
    stick-table type ip size 200k expire 30m
    stick on src
    server web1 192.168.1.1:8080
    server web2 192.168.1.2:8080
```
Tracks source IP → server mapping.
## Layer 4 vs Layer 7
Layer 4 (TCP):
```haproxy
frontend tcp_front
    bind *:3306
    mode tcp
    default_backend mysql_servers

backend mysql_servers
    mode tcp
    balance roundrobin
    server db1 192.168.1.1:3306 check
    server db2 192.168.1.2:3306 check
```
Faster, since it forwards raw TCP streams, but it can't inspect HTTP.
Layer 7 (HTTP):
```haproxy
frontend http_front
    bind *:80
    mode http
    acl is_api path_beg /api
    use_backend api_servers if is_api
    default_backend web_servers
```
Can route based on URL, headers, cookies.
## Content-Based Routing
```haproxy
frontend http_front
    bind *:80

    # Route by path
    acl is_api path_beg /api
    acl is_static path_end .jpg .png .css .js

    use_backend api_servers if is_api
    use_backend cdn_servers if is_static
    default_backend web_servers
```
The nginx equivalent:

```nginx
upstream api {
    server 192.168.1.1:8080;
}

upstream web {
    server 192.168.1.2:8080;
}

server {
    location /api {
        proxy_pass http://api;
    }
    location / {
        proxy_pass http://web;
    }
}
```
## Rate Limiting
Nginx: 10 requests per second per client IP, with bursts of up to 20 passed through immediately:

```nginx
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

server {
    location /api {
        limit_req zone=api burst=20 nodelay;
        proxy_pass http://backend;
    }
}
```
HAProxy: track per-IP request rate and deny clients above 100 requests per 10 seconds:

```haproxy
frontend http_front
    stick-table type ip size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny if { sc_http_req_rate(0) gt 100 }
```
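Both configs express the same idea: a bucket that refills at a fixed rate and absorbs bursts up to a cap. A token-bucket sketch of the `rate=10r/s burst=20 nodelay` semantics (nginx's actual implementation is a leaky-bucket variant, so this is an approximation):

```python
import time

class TokenBucket:
    """Up to `burst` requests can arrive at once; tokens refill at
    `rate` per second. Requests without a token are rejected."""

    def __init__(self, rate=10.0, burst=20):
        self.rate, self.capacity = rate, burst
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket()
# 25 back-to-back requests: the first 20 ride the burst, the rest are denied.
results = [bucket.allow() for _ in range(25)]
```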
## Monitoring
Key metrics to watch:
- Active connections per backend
- Response time per backend
- Error rate per backend
- Queue depth (requests waiting)
HAProxy stats:
```haproxy
listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats auth admin:password
```
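The stats page also serves machine-readable output when you append `;csv` to the URI, which covers the metrics above (`scur` is current sessions). A parsing sketch; the sample rows below are made up, and real output has many more columns:

```python
import csv
import io

# Made-up sample of HAProxy's CSV stats output.
sample = """# pxname,svname,scur,status
webservers,web1,14,UP
webservers,web2,3,UP
webservers,web3,0,MAINT
"""

def backend_sessions(text):
    """Map server name → current session count for servers that are UP."""
    reader = csv.DictReader(io.StringIO(text.lstrip("# ")))
    return {r["svname"]: int(r["scur"]) for r in reader if r["status"] == "UP"}

stats = backend_sessions(sample)
```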
## My Algorithm Cheat Sheet
| Scenario | Algorithm |
|---|---|
| Uniform requests, identical servers | Round robin |
| Mixed server sizes | Weighted round robin |
| Variable request duration | Least connections |
| Need session affinity | IP hash or cookies |
| Caching layer | URI hash |
| Unknown/default | Least connections |
When in doubt, start with least connections. It adapts to reality better than round robin.