Every database connection costs something: TCP handshake, TLS negotiation, authentication, session state allocation. For PostgreSQL, that’s 1.3-2MB of memory per connection. For MySQL, 256KB-1MB. At scale, creating connections on demand kills both your database and your latency.

Connection pooling solves this by reusing connections across requests. But misconfigured pools are worse than no pools — you get connection starvation, deadlocks, and debugging nightmares.

Here’s how to do it right.

The Problem: Connection Overhead

Without pooling, a typical web request:

  • TCP handshake: 1-2ms
  • TLS negotiation: 5-10ms
  • Authentication: 5-20ms
  • Execute query: 10-50ms (your query)
  • Close connection: 1-2ms

Most of that time is connection setup and teardown, not your query.

With pooling:

  • Acquire connection: ~0.1ms (from pool)
  • Execute query: 10-50ms
  • Return connection to pool: ~0.1ms

The 90% reduction in overhead compounds across thousands of requests per second.
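A back-of-the-envelope calculation makes the compounding concrete (the 20ms overhead figure below is illustrative, not a benchmark):

```python
def overhead_ms_per_second(req_per_sec: int, per_request_overhead_ms: float) -> float:
    """Total connection-setup time paid per second of traffic."""
    return req_per_sec * per_request_overhead_ms

# At 1000 req/s, ~20ms of setup per request is 20 full seconds of
# connection work for every wall-clock second -- untenable without pooling.
print(overhead_ms_per_second(1000, 20.0))  # 20000.0 (ms of setup per second)
```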

Pattern 1: Application-Level Pooling

Most database drivers include built-in connection pools. Start here.

Python with SQLAlchemy

from sqlalchemy import create_engine, text
from sqlalchemy.pool import QueuePool

engine = create_engine(
    "postgresql://user:pass@localhost/db",
    poolclass=QueuePool,
    pool_size=10,           # Steady-state connections
    max_overflow=20,        # Additional connections under load
    pool_timeout=30,        # Seconds to wait for a connection
    pool_recycle=1800,      # Recycle connections after 30 min
    pool_pre_ping=True,     # Verify connection before use
)

# Usage (SQLAlchemy 2.0 requires text() for raw SQL)
with engine.connect() as conn:
    result = conn.execute(text("SELECT * FROM users"))

Node.js with pg-pool

const { Pool } = require('pg');

const pool = new Pool({
  host: 'localhost',
  database: 'mydb',
  user: 'user',
  password: 'pass',
  max: 20,                    // Max connections
  idleTimeoutMillis: 30000,   // Close idle connections after 30s
  connectionTimeoutMillis: 2000, // Fail fast if pool exhausted
});

// Usage
const client = await pool.connect();
try {
  const res = await client.query('SELECT * FROM users');
  return res.rows;
} finally {
  client.release();  // Return to pool, don't close
}

Go with database/sql

import (
    "database/sql"
    "time"
    _ "github.com/lib/pq"
)

func initDB() *sql.DB {
    db, err := sql.Open("postgres", "postgres://user:pass@localhost/db")
    if err != nil {
        panic(err)
    }
    
    db.SetMaxOpenConns(25)
    db.SetMaxIdleConns(10)
    db.SetConnMaxLifetime(5 * time.Minute)
    db.SetConnMaxIdleTime(1 * time.Minute)
    
    return db
}

// Usage - connections managed automatically
rows, err := db.Query("SELECT * FROM users")

Pool Sizing: The Formula

The optimal pool size depends on your workload, not your traffic volume.

For CPU-bound queries:

pool_size = number_of_cores

For I/O-bound queries (typical web apps):

pool_size = (number_of_cores * 2) + spinning_disk_count

PostgreSQL's recommendation:

connections = (core_count * 2) + effective_spindle_count

A 4-core server with SSDs: (4 * 2) + 1 = 9 connections per application instance.
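The sizing heuristic is easy to wrap in a helper (the function name is my own):

```python
def recommended_pool_size(core_count: int, effective_spindle_count: int = 1) -> int:
    """PostgreSQL's sizing heuristic: (core_count * 2) + effective_spindle_count."""
    return core_count * 2 + effective_spindle_count

print(recommended_pool_size(4))  # 4 cores, SSD -> (4 * 2) + 1 = 9
```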

The counterintuitive truth: Larger pools are often slower. More connections mean more lock contention, more context switching, and more memory pressure. A pool of 50 connections fighting for locks is slower than a pool of 10 connections with a queue.

Benchmark Before Scaling

import time
import concurrent.futures
from sqlalchemy import create_engine, text

def benchmark_pool_size(pool_size, num_requests=1000):
    engine = create_engine(
        "postgresql://user:pass@localhost/db",
        pool_size=pool_size,
        max_overflow=0,  # Strict pool size
    )
    
    def make_request():
        with engine.connect() as conn:
            conn.execute(text("SELECT pg_sleep(0.01)"))  # 10ms query
    
    start = time.time()
    with concurrent.futures.ThreadPoolExecutor(max_workers=100) as executor:
        futures = [executor.submit(make_request) for _ in range(num_requests)]
        concurrent.futures.wait(futures)
    
    duration = time.time() - start
    print(f"Pool size {pool_size}: {duration:.2f}s ({num_requests/duration:.0f} req/s)")

# Test different sizes
for size in [5, 10, 20, 50, 100]:
    benchmark_pool_size(size)

Pattern 2: External Connection Poolers

Application-level pools work within a single process. But modern deployments have many processes: multiple containers, serverless functions, autoscaling groups. Each maintains its own pool, multiplying connections to the database.
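The multiplication is worth doing explicitly. A sketch (the deployment numbers are illustrative):

```python
def worst_case_db_connections(instances: int, pool_size: int, max_overflow: int = 0) -> int:
    """Worst-case connections the database sees when every process runs its own pool."""
    return instances * (pool_size + max_overflow)

# 30 containers, each configured with pool_size=10, max_overflow=20:
print(worst_case_db_connections(30, 10, 20))  # 900 connections at the database
```

An external pooler collapses those 900 client-side connections into one shared, bounded server-side pool.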

External poolers sit between your applications and the database, maintaining a shared pool.

PgBouncer (PostgreSQL)

The standard for PostgreSQL pooling. Lightweight, battle-tested.

# pgbouncer.ini
[databases]
mydb = host=postgres port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

# Pool settings
pool_mode = transaction        # Most common
max_client_conn = 1000         # Client connections to PgBouncer
default_pool_size = 20         # Connections per user/database
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3

# Connection lifecycle
server_lifetime = 3600         # Recycle after 1 hour
server_idle_timeout = 600      # Close idle server connections

Pool modes:

Mode        | Description                                | Use Case
session     | Connection held for entire client session  | Legacy apps using session state
transaction | Connection released after each transaction | Most web applications
statement   | Connection released after each statement   | Simple queries, no transactions

The transaction mode gotcha: Prepared statements don’t work across transaction boundaries. Use DEALLOCATE ALL or disable prepared statements in your driver.

# SQLAlchemy with PgBouncer transaction mode
# (prepare_threshold is a psycopg 3 option, so use the psycopg driver)
engine = create_engine(
    "postgresql+psycopg://user:pass@pgbouncer:6432/db",
    connect_args={
        "prepare_threshold": None  # Disable prepared statements
    }
)

ProxySQL (MySQL)

Similar concept for MySQL, with query routing and caching.

-- Configure backend servers
INSERT INTO mysql_servers(hostgroup_id, hostname, port) 
VALUES (10, 'mysql-primary', 3306);
INSERT INTO mysql_servers(hostgroup_id, hostname, port) 
VALUES (20, 'mysql-replica', 3306);

-- Configure connection pool
UPDATE global_variables 
SET variable_value='2000' 
WHERE variable_name='mysql-max_connections';

UPDATE global_variables 
SET variable_value='true' 
WHERE variable_name='mysql-multiplexing';

-- Route reads to replicas
INSERT INTO mysql_query_rules(rule_id, match_pattern, destination_hostgroup)
VALUES (1, '^SELECT', 20);

-- Everything else to primary
INSERT INTO mysql_query_rules(rule_id, match_pattern, destination_hostgroup)
VALUES (2, '.*', 10);

LOAD MYSQL SERVERS TO RUNTIME;
LOAD MYSQL QUERY RULES TO RUNTIME;

Pattern 3: Serverless Connection Management

Serverless functions are pool-hostile: they scale rapidly, have short lifetimes, and cold starts create connection bursts.

AWS RDS Proxy

AWS’s managed connection pooler. Handles the Lambda-to-RDS problem.

resource "aws_db_proxy" "main" {
  name                   = "rds-proxy"
  engine_family          = "POSTGRESQL"
  role_arn               = aws_iam_role.rds_proxy.arn
  vpc_subnet_ids         = var.subnet_ids
  
  auth {
    auth_scheme = "SECRETS"
    secret_arn  = aws_secretsmanager_secret.db_creds.arn
  }
}

resource "aws_db_proxy_default_target_group" "main" {
  db_proxy_name = aws_db_proxy.main.name
  
  connection_pool_config {
    connection_borrow_timeout    = 120
    max_connections_percent      = 100
    max_idle_connections_percent = 50
  }
}

Connect Lambda to the proxy endpoint:

import psycopg2

def handler(event, context):
    # RDS Proxy handles connection pooling
    conn = psycopg2.connect(
        host="rds-proxy.proxy-xxx.us-east-1.rds.amazonaws.com",
        database="mydb",
        user="admin",
        password=get_secret()  # Or use IAM auth
    )
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT * FROM users")
            return cur.fetchall()
    finally:
        conn.close()  # Cheap: the proxy keeps the real connection warm

Connection Reuse in Lambda

Even without RDS Proxy, you can reuse connections across invocations:

import psycopg2

# A module-level global persists across warm invocations;
# the connection is created lazily on the first call
conn = None

def get_connection():
    global conn
    if conn is None or conn.closed:
        conn = psycopg2.connect(
            host="postgres",
            database="mydb",
            user="user",
            password="pass"
        )
    return conn

def handler(event, context):
    connection = get_connection()
    with connection.cursor() as cur:
        cur.execute("SELECT * FROM users")
        return cur.fetchall()

The limit: Lambda functions scale independently. 1000 concurrent invocations = 1000 connections, even with reuse. RDS Proxy caps this at the configured pool size.

Pattern 4: Health Checks and Recycling

Connections go stale. Firewalls drop idle connections. Databases restart. Your pool needs to detect and replace broken connections.

Connection Validation

# SQLAlchemy: ping before checkout
engine = create_engine(
    "postgresql://...",
    pool_pre_ping=True,  # Issues SELECT 1 before returning connection
)

# Custom validation query
from sqlalchemy import event, exc

@event.listens_for(engine, "checkout")
def checkout(dbapi_conn, connection_record, connection_proxy):
    cursor = dbapi_conn.cursor()
    try:
        cursor.execute("SELECT 1")
    except Exception:
        raise exc.DisconnectionError()
    finally:
        cursor.close()

Connection Recycling

Connections shouldn’t live forever. Memory leaks, statement cache growth, and server-side resource accumulation make old connections slower.

engine = create_engine(
    "postgresql://...",
    pool_recycle=1800,  # Recycle every 30 minutes
)

For MySQL with wait_timeout:

# MySQL closes connections after wait_timeout (default 8 hours)
# Recycle before that to avoid "MySQL server has gone away"
engine = create_engine(
    "mysql://...",
    pool_recycle=3600,  # Recycle every hour
)
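One way to keep client and server in sync is to derive pool_recycle from the server-side timeout. The helper and the 50% margin are my own convention:

```python
def safe_pool_recycle(server_idle_timeout_s: int, margin: float = 0.5) -> int:
    """Recycle client connections well before the server would drop them."""
    return int(server_idle_timeout_s * margin)

# MySQL's default wait_timeout is 28800s (8 hours):
print(safe_pool_recycle(28800))  # 14400 -> recycle every 4 hours
```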

Pattern 5: Multi-Tenant Connection Pools

SaaS applications often need separate pools per tenant, either for isolation or because each tenant has their own database.

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from functools import lru_cache

@lru_cache(maxsize=100)
def get_tenant_engine(tenant_id: str):
    """Create a cached engine per tenant"""
    connection_string = get_tenant_connection_string(tenant_id)
    return create_engine(
        connection_string,
        pool_size=5,
        max_overflow=10,
    )

def get_tenant_session(tenant_id: str):
    engine = get_tenant_engine(tenant_id)
    Session = sessionmaker(bind=engine)
    return Session()

# Usage
with get_tenant_session("tenant_123") as session:
    users = session.query(User).all()

Caveat: Unbounded tenant pools can exhaust database connections. Track total connections across all pools and implement eviction for inactive tenants.
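Note that lru_cache bounds the number of engines but never closes an evicted engine's connections. An explicit cache can dispose them on eviction. A sketch, with illustrative names, where engine_factory builds an engine for a tenant (e.g. via create_engine with that tenant's connection string):

```python
from collections import OrderedDict
from threading import Lock

class TenantEngineCache:
    """Bounded LRU cache of per-tenant engines.

    Evicted engines get .dispose() called so their pooled
    connections are actually closed, not just forgotten.
    """
    def __init__(self, engine_factory, max_engines=100):
        self._factory = engine_factory
        self._max = max_engines
        self._engines = OrderedDict()
        self._lock = Lock()

    def get(self, tenant_id):
        with self._lock:
            if tenant_id in self._engines:
                self._engines.move_to_end(tenant_id)  # Mark as recently used
                return self._engines[tenant_id]
            engine = self._factory(tenant_id)
            self._engines[tenant_id] = engine
            if len(self._engines) > self._max:
                _, evicted = self._engines.popitem(last=False)
                evicted.dispose()  # Close the evicted tenant's connections
            return engine
```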

Monitoring Your Pool

What to watch:

-- PostgreSQL: active connections
SELECT 
    datname,
    count(*) as connections,
    count(*) FILTER (WHERE state = 'active') as active,
    count(*) FILTER (WHERE state = 'idle') as idle,
    count(*) FILTER (WHERE wait_event_type = 'Lock') as waiting
FROM pg_stat_activity
WHERE datname = 'mydb'
GROUP BY datname;
# SQLAlchemy pool stats
from sqlalchemy import event

@event.listens_for(engine, "checkin")
def log_checkin(dbapi_conn, connection_record):
    pool = engine.pool
    print(f"Pool: {pool.checkedout()}/{pool.size()} checked out")

@event.listens_for(engine, "checkout")
def log_checkout(dbapi_conn, connection_record, connection_proxy):
    pool = engine.pool
    print(f"Pool: {pool.checkedout()}/{pool.size()} checked out")

Alerts to set:

  • Connection pool exhaustion (checkout timeouts)
  • High connection wait time (> 100ms)
  • Connections approaching max_connections
  • Connection errors/failures rate
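These alert conditions are simple enough to evaluate in code; a sketch with illustrative thresholds and metric names (wire the stats dict to whatever your metrics pipeline exposes):

```python
def evaluate_pool_alerts(stats: dict) -> list:
    """Check the alert conditions listed above.

    Expected keys: checked_out, capacity (pool_size + max_overflow),
    wait_ms, server_connections, max_connections, error_rate (errors/min).
    """
    alerts = []
    if stats["checked_out"] >= stats["capacity"]:
        alerts.append("pool exhausted")
    if stats["wait_ms"] > 100:
        alerts.append("connection wait > 100ms")
    if stats["server_connections"] > 0.9 * stats["max_connections"]:
        alerts.append("approaching max_connections")
    if stats["error_rate"] > 0:
        alerts.append("connection errors")
    return alerts
```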

The Configuration Checklist

  1. Size your pool based on benchmarks, not guesses
  2. Enable connection validation (pool_pre_ping or equivalent)
  3. Set connection recycling shorter than server timeout
  4. Configure timeouts — fail fast is better than hang forever
  5. Use external poolers for multi-process or serverless
  6. Monitor pool metrics — exhaustion sneaks up on you
  7. Test failover — what happens when the database restarts?

Connection pooling isn’t glamorous, but it’s the difference between an app that handles 100 requests/second and one that handles 10,000. Get it right once, and it quietly does its job forever. Get it wrong, and you’ll spend hours debugging mysterious timeouts at 2 AM.

The pool doesn’t care about your deadlines. Configure it properly.