Every database connection costs something: TCP handshake, TLS negotiation, authentication, session state allocation. For PostgreSQL, that’s 1.3-2MB of memory per connection. For MySQL, 256KB-1MB. At scale, creating connections on demand kills both your database and your latency.

Connection pooling solves this by reusing connections across requests. But misconfigured pools are worse than no pools — you get connection starvation, deadlocks, and debugging nightmares.

Here’s how to do it right.

The Problem: Connection Overhead

Without pooling, a typical web request:

  • TCP handshake: 1-2ms
  • TLS negotiation: 5-10ms
  • Authentication: 5-20ms
  • Execute query: 10-50ms (your query)
  • Close connection: 1-2ms

Most of that time is connection setup and teardown, not your query.

With pooling:

  • Acquire connection: ~0.1ms (from pool)
  • Execute query: 10-50ms
  • Return connection to pool: ~0.1ms

The 90% reduction in overhead compounds across thousands of requests per second.
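A back-of-the-envelope calculation makes the compounding concrete (the 20ms overhead figure below is illustrative, not a benchmark):

```python
def overhead_ms_per_second(req_per_sec: int, per_request_overhead_ms: float) -> float:
    """Total connection-setup time paid per second of traffic."""
    return req_per_sec * per_request_overhead_ms

# At 1000 req/s, ~20ms of setup per request is 20 full seconds of
# connection work for every wall-clock second -- untenable without pooling.
print(overhead_ms_per_second(1000, 20.0))  # 20000.0 (ms of setup per second)
```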

Pattern 1: Application-Level Pooling

Most database drivers include built-in connection pools. Start here.

Python with SQLAlchemy

from sqlalchemy import create_engine, text
from sqlalchemy.pool import QueuePool

engine = create_engine(
    "postgresql://user:pass@localhost/db",
    poolclass=QueuePool,
    pool_size=10,           # Steady-state connections
    max_overflow=20,        # Additional connections under load
    pool_timeout=30,        # Seconds to wait for a connection
    pool_recycle=1800,      # Recycle connections after 30 min
    pool_pre_ping=True,     # Verify connection before use
)

# Usage (SQLAlchemy 2.0 requires text() for raw SQL)
with engine.connect() as conn:
    result = conn.execute(text("SELECT * FROM users"))

Node.js with pg-pool

const { Pool } = require('pg');

const pool = new Pool({
  host: 'localhost',
  database: 'mydb',
  user: 'user',
  password: 'pass',
  max: 20,                    // Max connections
  idleTimeoutMillis: 30000,   // Close idle connections after 30s
  connectionTimeoutMillis: 2000, // Fail fast if pool exhausted
});

// Usage
const client = await pool.connect();
try {
  const res = await client.query('SELECT * FROM users');
  return res.rows;
} finally {
  client.release();  // Return to pool, don't close
}

Go with database/sql

import (
    "database/sql"
    "time"
    _ "github.com/lib/pq"
)

func initDB() *sql.DB {
    db, err := sql.Open("postgres", "postgres://user:pass@localhost/db")
    if err != nil {
        panic(err)
    }
    
    db.SetMaxOpenConns(25)
    db.SetMaxIdleConns(10)
    db.SetConnMaxLifetime(5 * time.Minute)
    db.SetConnMaxIdleTime(1 * time.Minute)
    
    return db
}

// Usage - connections managed automatically
rows, err := db.Query("SELECT * FROM users")

Pool Sizing: The Formula

The optimal pool size depends on your workload, not your traffic volume.

For CPU-bound queries:

pool_size = number_of_cores

For I/O-bound queries (typical web apps):

pool_size = (number_of_cores * 2) + spinning_disk_count

PostgreSQL's recommendation:

connections = (core_count * 2) + effective_spindle_count

A 4-core server with SSDs: (4 * 2) + 1 = 9 connections per application instance.
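The sizing heuristic is easy to wrap in a helper (the function name is my own):

```python
def recommended_pool_size(core_count: int, effective_spindle_count: int = 1) -> int:
    """PostgreSQL's sizing heuristic: (core_count * 2) + effective_spindle_count."""
    return core_count * 2 + effective_spindle_count

print(recommended_pool_size(4))  # 4 cores, SSD -> (4 * 2) + 1 = 9
```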

The counterintuitive truth: Larger pools are often slower. More connections mean more lock contention, more context switching, and more memory pressure. A pool of 50 connections fighting for locks is slower than a pool of 10 connections with a queue.

Benchmark Before Scaling

import time
import concurrent.futures
from sqlalchemy import create_engine, text

def benchmark_pool_size(pool_size, num_requests=1000):
    engine = create_engine(
        "postgresql://user:pass@localhost/db",
        pool_size=pool_size,
        max_overflow=0,  # Strict pool size
    )
    
    def make_request():
        with engine.connect() as conn:
            conn.execute(text("SELECT pg_sleep(0.01)"))  # 10ms query
    
    start = time.time()
    with concurrent.futures.ThreadPoolExecutor(max_workers=100) as executor:
        futures = [executor.submit(make_request) for _ in range(num_requests)]
        concurrent.futures.wait(futures)
    
    duration = time.time() - start
    print(f"Pool size {pool_size}: {duration:.2f}s ({num_requests/duration:.0f} req/s)")

# Test different sizes
for size in [5, 10, 20, 50, 100]:
    benchmark_pool_size(size)

Pattern 2: External Connection Poolers

Application-level pools work within a single process. But modern deployments have many processes: multiple containers, serverless functions, autoscaling groups. Each maintains its own pool, multiplying connections to the database.
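The multiplication is worth doing explicitly. A sketch (the deployment numbers are illustrative):

```python
def worst_case_db_connections(instances: int, pool_size: int, max_overflow: int = 0) -> int:
    """Worst-case connections the database sees when every process runs its own pool."""
    return instances * (pool_size + max_overflow)

# 30 containers, each configured with pool_size=10, max_overflow=20:
print(worst_case_db_connections(30, 10, 20))  # 900 connections at the database
```

An external pooler collapses those 900 client-side connections into one shared, bounded server-side pool.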

External poolers sit between your applications and the database, maintaining a shared pool.

PgBouncer (PostgreSQL)

The standard for PostgreSQL pooling. Lightweight, battle-tested.

# pgbouncer.ini
[databases]
mydb = host=postgres port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

# Pool settings
pool_mode = transaction        # Most common
max_client_conn = 1000         # Client connections to PgBouncer
default_pool_size = 20         # Connections per user/database
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3

# Connection lifecycle
server_lifetime = 3600         # Recycle after 1 hour
server_idle_timeout = 600      # Close idle server connections

Pool modes:

Mode        | Description                                | Use Case
session     | Connection held for entire client session  | Legacy apps using session state
transaction | Connection released after each transaction | Most web applications
statement   | Connection released after each statement   | Simple queries, no transactions

The transaction mode gotcha: Prepared statements don’t work across transaction boundaries. Use DEALLOCATE ALL or disable prepared statements in your driver.

# SQLAlchemy with PgBouncer transaction mode
# (prepare_threshold is a psycopg 3 option, so use the psycopg driver)
engine = create_engine(
    "postgresql+psycopg://user:pass@pgbouncer:6432/db",
    connect_args={
        "prepare_threshold": None  # Disable prepared statements
    }
)

ProxySQL (MySQL)

Similar concept for MySQL, with query routing and caching.

-- Configure backend servers
INSERT INTO mysql_servers(hostgroup_id, hostname, port) 
VALUES (10, 'mysql-primary', 3306);
INSERT INTO mysql_servers(hostgroup_id, hostname, port) 
VALUES (20, 'mysql-replica', 3306);

-- Configure connection pool
UPDATE global_variables 
SET variable_value='2000' 
WHERE variable_name='mysql-max_connections';

UPDATE global_variables 
SET variable_value='true' 
WHERE variable_name='mysql-multiplexing';

-- Route reads to replicas
INSERT INTO mysql_query_rules(rule_id, match_pattern, destination_hostgroup)
VALUES (1, '^SELECT', 20);

-- Everything else to primary
INSERT INTO mysql_query_rules(rule_id, match_pattern, destination_hostgroup)
VALUES (2, '.*', 10);

LOAD MYSQL SERVERS TO RUNTIME;
LOAD MYSQL QUERY RULES TO RUNTIME;

Pattern 3: Serverless Connection Management

Serverless functions are pool-hostile: they scale rapidly, have short lifetimes, and cold starts create connection bursts.

AWS RDS Proxy

AWS’s managed connection pooler. Handles the Lambda-to-RDS problem.

resource "aws_db_proxy" "main" {
  name                   = "rds-proxy"
  engine_family          = "POSTGRESQL"
  role_arn               = aws_iam_role.rds_proxy.arn
  vpc_subnet_ids         = var.subnet_ids
  
  auth {
    auth_scheme = "SECRETS"
    secret_arn  = aws_secretsmanager_secret.db_creds.arn
  }
}

resource "aws_db_proxy_default_target_group" "main" {
  db_proxy_name = aws_db_proxy.main.name
  
  connection_pool_config {
    connection_borrow_timeout    = 120
    max_connections_percent      = 100
    max_idle_connections_percent = 50
  }
}

Connect Lambda to the proxy endpoint:

import psycopg2

def handler(event, context):
    # RDS Proxy handles connection pooling
    conn = psycopg2.connect(
        host="rds-proxy.proxy-xxx.us-east-1.rds.amazonaws.com",
        database="mydb",
        user="admin",
        password=get_secret()  # Or use IAM auth
    )
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT * FROM users")
            return cur.fetchall()
    finally:
        conn.close()  # Cheap: the proxy keeps the real connection warm

Connection Reuse in Lambda

Even without RDS Proxy, you can reuse connections across invocations:

import psycopg2

# A module-level global persists across warm invocations;
# the connection is created lazily on the first call
conn = None

def get_connection():
    global conn
    if conn is None or conn.closed:
        conn = psycopg2.connect(
            host="postgres",
            database="mydb",
            user="user",
            password="pass"
        )
    return conn

def handler(event, context):
    connection = get_connection()
    with connection.cursor() as cur:
        cur.execute("SELECT * FROM users")
        return cur.fetchall()

The limit: Lambda functions scale independently. 1000 concurrent invocations = 1000 connections, even with reuse. RDS Proxy caps this at the configured pool size.

Pattern 4: Health Checks and Recycling

Connections go stale. Firewalls drop idle connections. Databases restart. Your pool needs to detect and replace broken connections.

Connection Validation

# SQLAlchemy: ping before checkout
engine = create_engine(
    "postgresql://...",
    pool_pre_ping=True,  # Issues SELECT 1 before returning connection
)

# Custom validation query
from sqlalchemy import event, exc

@event.listens_for(engine, "checkout")
def checkout(dbapi_conn, connection_record, connection_proxy):
    cursor = dbapi_conn.cursor()
    try:
        cursor.execute("SELECT 1")
    except Exception:
        raise exc.DisconnectionError()
    finally:
        cursor.close()

Connection Recycling

Connections shouldn’t live forever. Memory leaks, statement cache growth, and server-side resource accumulation make old connections slower.

engine = create_engine(
    "postgresql://...",
    pool_recycle=1800,  # Recycle every 30 minutes
)

For MySQL with wait_timeout:

# MySQL closes connections after wait_timeout (default 8 hours)
# Recycle before that to avoid "MySQL server has gone away"
engine = create_engine(
    "mysql://...",
    pool_recycle=3600,  # Recycle every hour
)
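One way to keep client and server in sync is to derive pool_recycle from the server-side timeout. The helper and the 50% margin are my own convention:

```python
def safe_pool_recycle(server_idle_timeout_s: int, margin: float = 0.5) -> int:
    """Recycle client connections well before the server would drop them."""
    return int(server_idle_timeout_s * margin)

# MySQL's default wait_timeout is 28800s (8 hours):
print(safe_pool_recycle(28800))  # 14400 -> recycle every 4 hours
```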

Pattern 5: Multi-Tenant Connection Pools

SaaS applications often need separate pools per tenant, either for isolation or because each tenant has their own database.

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from functools import lru_cache

@lru_cache(maxsize=100)
def get_tenant_engine(tenant_id: str):
    """Create a cached engine per tenant"""
    connection_string = get_tenant_connection_string(tenant_id)
    return create_engine(
        connection_string,
        pool_size=5,
        max_overflow=10,
    )

def get_tenant_session(tenant_id: str):
    engine = get_tenant_engine(tenant_id)
    Session = sessionmaker(bind=engine)
    return Session()

# Usage
with get_tenant_session("tenant_123") as session:
    users = session.query(User).all()

Caveat: Unbounded tenant pools can exhaust database connections. Track total connections across all pools and implement eviction for inactive tenants.
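Note that lru_cache bounds the number of engines but never closes an evicted engine's connections. An explicit cache can dispose them on eviction. A sketch, with illustrative names, where engine_factory builds an engine for a tenant (e.g. via create_engine with that tenant's connection string):

```python
from collections import OrderedDict
from threading import Lock

class TenantEngineCache:
    """Bounded LRU cache of per-tenant engines.

    Evicted engines get .dispose() called so their pooled
    connections are actually closed, not just forgotten.
    """
    def __init__(self, engine_factory, max_engines=100):
        self._factory = engine_factory
        self._max = max_engines
        self._engines = OrderedDict()
        self._lock = Lock()

    def get(self, tenant_id):
        with self._lock:
            if tenant_id in self._engines:
                self._engines.move_to_end(tenant_id)  # Mark as recently used
                return self._engines[tenant_id]
            engine = self._factory(tenant_id)
            self._engines[tenant_id] = engine
            if len(self._engines) > self._max:
                _, evicted = self._engines.popitem(last=False)
                evicted.dispose()  # Close the evicted tenant's connections
            return engine
```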

Monitoring Your Pool

What to watch:

-- PostgreSQL: active connections
SELECT 
    datname,
    count(*) as connections,
    count(*) FILTER (WHERE state = 'active') as active,
    count(*) FILTER (WHERE state = 'idle') as idle,
    count(*) FILTER (WHERE wait_event_type = 'Lock') as waiting
FROM pg_stat_activity
WHERE datname = 'mydb'
GROUP BY datname;
# SQLAlchemy pool stats
from sqlalchemy import event

@event.listens_for(engine, "checkin")
def log_checkin(dbapi_conn, connection_record):
    pool = engine.pool
    print(f"Pool: {pool.checkedout()}/{pool.size()} checked out")

@event.listens_for(engine, "checkout")
def log_checkout(dbapi_conn, connection_record, connection_proxy):
    pool = engine.pool
    print(f"Pool: {pool.checkedout()}/{pool.size()} checked out")

Alerts to set:

  • Connection pool exhaustion (checkout timeouts)
  • High connection wait time (> 100ms)
  • Connections approaching max_connections
  • Connection errors/failures rate
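These alert conditions are simple enough to evaluate in code; a sketch with illustrative thresholds and metric names (wire the stats dict to whatever your metrics pipeline exposes):

```python
def evaluate_pool_alerts(stats: dict) -> list:
    """Check the alert conditions listed above.

    Expected keys: checked_out, capacity (pool_size + max_overflow),
    wait_ms, server_connections, max_connections, error_rate (errors/min).
    """
    alerts = []
    if stats["checked_out"] >= stats["capacity"]:
        alerts.append("pool exhausted")
    if stats["wait_ms"] > 100:
        alerts.append("connection wait > 100ms")
    if stats["server_connections"] > 0.9 * stats["max_connections"]:
        alerts.append("approaching max_connections")
    if stats["error_rate"] > 0:
        alerts.append("connection errors")
    return alerts
```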

The Configuration Checklist

  1. Size your pool based on benchmarks, not guesses
  2. Enable connection validation (pool_pre_ping or equivalent)
  3. Set connection recycling shorter than server timeout
  4. Configure timeouts — fail fast is better than hang forever
  5. Use external poolers for multi-process or serverless
  6. Monitor pool metrics — exhaustion sneaks up on you
  7. Test failover — what happens when the database restarts?

Connection pooling isn’t glamorous, but it’s the difference between an app that handles 100 requests/second and one that handles 10,000. Get it right once, and it quietly does its job forever. Get it wrong, and you’ll spend hours debugging mysterious timeouts at 2 AM.

The pool doesn’t care about your deadlines. Configure it properly.