Opening a database connection is expensive. TCP handshake, SSL negotiation, authentication, session setup—it all adds up. Do that for every query and your application crawls. Connection pooling fixes this by reusing connections. Here’s how to do it right.

The Problem

Without pooling, every request opens a new connection:

# BAD: New connection per request
import psycopg2

def get_user(user_id):
    conn = psycopg2.connect(DATABASE_URL)  # ~50-100ms
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
    user = cursor.fetchone()
    conn.close()
    return user

At 100 requests per second, that’s 100 connections opening and closing per second. Your database server has a connection limit (typically 100-500). You’ll exhaust it fast.
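To put a number on that overhead, here's a back-of-envelope sketch. The per-connect cost is an assumption: just the midpoint of the ~50-100ms range above.

```python
# Assumed midpoint of the ~50-100ms connect cost mentioned above
CONNECT_OVERHEAD_MS = 75
REQUESTS_PER_SECOND = 100

# Total handshake work generated every wall-clock second
handshake_seconds = CONNECT_OVERHEAD_MS * REQUESTS_PER_SECOND / 1000
print(handshake_seconds)  # 7.5
```

That's seven and a half connection-seconds of pure setup work generated every second; work a pool pays for once at startup instead.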

The Solution: Connection Pools

A pool maintains a set of open connections and hands them out as needed:

from psycopg2 import pool

# Create pool once at startup
connection_pool = pool.ThreadedConnectionPool(
    minconn=5,      # Minimum connections to keep open
    maxconn=20,     # Maximum connections allowed
    dsn=DATABASE_URL
)

def get_user(user_id):
    conn = connection_pool.getconn()  # ~0ms (reuses existing)
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))
        return cursor.fetchone()
    finally:
        connection_pool.putconn(conn)  # Return to pool, don't close

Result: Connections are reused. Connection overhead drops from ~100ms to ~0ms per query.

Pool Sizing: The Math

Too few connections = requests wait. Too many = database overload.

Formula for CPU-bound databases:

connections = (core_count × 2) + effective_spindle_count

For SSDs, effective_spindle_count is typically 1. So an 8-core database server:

connections = (8 × 2) + 1 = 17

For your application: Consider how many app instances you have.

# If you have 4 app servers, each with a pool max of 20:
# To your database, total connections = 4 × 20 = 80

Start small and monitor. 10-20 connections per app instance is usually plenty. Add more only if you see connection wait times in your metrics.
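The two formulas above can be captured as small helpers. A sketch: the function names are mine, but the sizing rule is the one given above.

```python
def pool_size_for(core_count: int, effective_spindle_count: int = 1) -> int:
    """Sizing formula above: (core_count * 2) + effective_spindle_count."""
    return core_count * 2 + effective_spindle_count

def total_db_connections(app_instances: int, pool_max: int) -> int:
    """What the database actually sees across all app instances."""
    return app_instances * pool_max

print(pool_size_for(8))             # 17 for an 8-core server with SSDs
print(total_db_connections(4, 20))  # 80 connections from 4 instances
```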

SQLAlchemy Connection Pooling

SQLAlchemy includes built-in pooling:

from sqlalchemy import create_engine

engine = create_engine(
    DATABASE_URL,
    pool_size=10,           # Maintained connections
    max_overflow=20,        # Extra connections when busy
    pool_timeout=30,        # Seconds to wait for connection
    pool_recycle=1800,      # Recycle connections after 30 min
    pool_pre_ping=True      # Verify connection before use
)

Key settings:

  • pool_size: Base number of connections to maintain
  • max_overflow: How many extra connections during spikes
  • pool_timeout: How long to wait before raising an error
  • pool_recycle: Recreate connections periodically (handles server-side timeouts)
  • pool_pre_ping: Test connection before using (handles stale connections)
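To see how the first three settings interact, here's a toy, single-threaded pool. This is not SQLAlchemy's implementation, and the "connections" are just integers; it only reproduces the pool_size + max_overflow ceiling and the pool_timeout error.

```python
import queue

class ToyPool:
    """Toy illustration of pool_size / max_overflow / pool_timeout."""

    def __init__(self, pool_size, max_overflow, timeout):
        self.timeout = timeout
        self._idle = queue.Queue()
        self._capacity = pool_size + max_overflow  # hard ceiling
        self._created = 0

    def checkout(self):
        try:
            return self._idle.get_nowait()   # reuse an idle connection
        except queue.Empty:
            pass
        if self._created < self._capacity:
            self._created += 1               # "open" a new connection
            return self._created
        # Ceiling hit: wait up to pool_timeout for a return, then give up
        try:
            return self._idle.get(timeout=self.timeout)
        except queue.Empty:
            raise TimeoutError("pool exhausted")

    def checkin(self, conn):
        self._idle.put(conn)                 # return to pool, don't close
```

With `ToyPool(pool_size=2, max_overflow=1, timeout=0.01)`, the first three checkouts succeed, the fourth times out, and returning a connection makes it immediately reusable.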

PgBouncer: External Pooling

For high-scale deployments, use an external pooler like PgBouncer:

# pgbouncer.ini
[databases]
myapp = host=localhost port=5432 dbname=myapp

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt

# Pool settings
pool_mode = transaction
default_pool_size = 20
max_client_conn = 1000

Pool modes:

  • session: Connection held for entire client session (safest, least efficient)
  • transaction: Connection held only during transaction (good balance)
  • statement: Connection returned after each statement (most efficient, breaks some features)

Why external pooler?

  1. Absorbs connection spikes: 1000 app connections → 20 database connections
  2. Survives app restarts: Connections persist when app redeploys
  3. Works across languages: All apps share the same pool
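From the application's side, adopting PgBouncer is usually just a DSN change: point at the pooler's listen_port (6432 above) instead of Postgres directly. The hostnames and credentials below are placeholders.

```python
# Hypothetical DSNs: once PgBouncer sits in front, only the port differs
DIRECT_DSN = "postgresql://app_user:secret@db.internal:5432/myapp"
POOLED_DSN = "postgresql://app_user:secret@db.internal:6432/myapp"

# Everything else (psycopg2, SQLAlchemy, any client) stays unchanged
print(POOLED_DSN.replace(":6432/", ":5432/") == DIRECT_DSN)  # True
```

One caveat worth knowing: in transaction mode, avoid per-session state (SET commands, session-level prepared statements, advisory locks), since consecutive transactions may land on different server connections.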

Handling Connection Errors

Connections die. Networks blip. Handle it:

from sqlalchemy import text
from sqlalchemy.exc import OperationalError
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=1, max=10),
    retry=retry_if_exception_type(OperationalError)
)
def execute_query(query, params):
    with engine.connect() as conn:
        return conn.execute(text(query), params)

With pool_pre_ping:

# SQLAlchemy will automatically:
# 1. Get connection from pool
# 2. Send a ping (SELECT 1)
# 3. If ping fails, discard and get fresh connection
# 4. Execute your query

Connection Leaks

The silent killer. A leaked connection is checked out and never returned: it counts against the pool's limit while doing nothing:

# BAD: Connection leak if exception occurs
def bad_query():
    conn = pool.getconn()
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM users")  # Exception here = leaked conn
    pool.putconn(conn)

# GOOD: Context manager ensures return
def good_query():
    conn = pool.getconn()
    try:
        cursor = conn.cursor()
        cursor.execute("SELECT * FROM users")
        return cursor.fetchall()
    finally:
        pool.putconn(conn)

# BETTER: Use SQLAlchemy's context manager
from sqlalchemy import text

def better_query():
    with engine.connect() as conn:
        return conn.execute(text("SELECT * FROM users")).fetchall()
    # Connection automatically returned to pool

Detect leaks: Monitor pool.checkedout() or equivalent. If it keeps growing, you have a leak.
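The try/finally pattern above can be written once as a context manager so call sites can't forget putconn(). A sketch: it works with any pool object exposing getconn()/putconn(), such as psycopg2's pools.

```python
from contextlib import contextmanager

@contextmanager
def pooled(pool):
    """Check a connection out and guarantee its return, even on error."""
    conn = pool.getconn()
    try:
        yield conn
    finally:
        pool.putconn(conn)

# Usage:
# with pooled(connection_pool) as conn:
#     conn.cursor().execute("SELECT * FROM users")
```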

Monitoring Pool Health

Expose pool metrics:

from prometheus_client import Gauge

pool_size = Gauge('db_pool_size', 'Total connections in pool')
pool_checked_out = Gauge('db_pool_checked_out', 'Connections in use')
pool_overflow = Gauge('db_pool_overflow', 'Overflow connections in use')

def update_pool_metrics():
    pool_size.set(engine.pool.size())
    pool_checked_out.set(engine.pool.checkedout())
    pool_overflow.set(engine.pool.overflow())

Alerts to set:

  • Pool exhaustion (checked_out == size + max_overflow)
  • High wait times for connections
  • Connection errors increasing
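The first two alert conditions can be expressed directly against the metrics exported above. The wait-time threshold is an assumption, purely illustrative.

```python
def pool_alerts(size, checked_out, max_overflow, wait_p95_ms):
    """Evaluate the alert conditions above against current pool metrics."""
    alerts = []
    if checked_out >= size + max_overflow:
        alerts.append("pool exhausted")
    if wait_p95_ms > 100:  # hypothetical threshold: 100ms p95 wait
        alerts.append("high connection wait")
    return alerts

print(pool_alerts(size=10, checked_out=30, max_overflow=20, wait_p95_ms=5))
# ['pool exhausted']
```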

Quick Reference

Setting        Start With   Adjust When
-------        ----------   -----------
pool_size      5-10         Seeing connection waits
max_overflow   10-20        Traffic spikes cause errors
pool_timeout   30s          Requests timing out waiting for conn
pool_recycle   1800s        Stale connection errors

Remember:

  • Fewer connections is usually better
  • Monitor before scaling up
  • External pooler (PgBouncer) for high scale
  • Always use context managers to prevent leaks

Connection pooling is one of those things that’s invisible when working and catastrophic when broken. Set it up once, monitor it always.