Database connections are expensive. Each new connection requires a TCP handshake, authentication, session initialization, and memory allocation on both client and server. Do this for every web request and you’ve got a performance problem hiding in plain sight.
The Problem: Connection Overhead
A typical PostgreSQL connection takes 50-100ms to establish. For a web request that runs a 5ms query, you’re spending 10-20x more time on connection setup than actual work.
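The overhead arithmetic, as a quick sanity check (using the illustrative figures above, not a benchmark):

```python
# Illustrative numbers from the text: connection setup vs. actual query time.
CONNECT_MS = 75   # mid-range of the 50-100ms setup cost
QUERY_MS = 5      # the query itself

overhead_ratio = CONNECT_MS / QUERY_MS
print(f"Setup is {overhead_ratio:.0f}x the query time")  # 15x, within the 10-20x range

# Per-request connections: every request pays the full setup cost.
per_request_total = CONNECT_MS + QUERY_MS   # 80ms per request
# Pooled: setup is paid once at startup, then amortized across requests.
pooled_total = QUERY_MS                     # ~5ms per request after warmup
```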
Under load, this creates connection storms. Each concurrent request opens its own connection, and the database has connection limits. When you hit those limits, new connections queue, latency spikes, and everything falls apart.
Connection Pooling: Reuse Instead of Recreate
A connection pool maintains a set of open connections that requests can borrow and return.
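The borrow/return cycle can be sketched with a thread-safe queue, stdlib only; `make_conn` here is a stand-in for a real driver call such as `psycopg2.connect`:

```python
import queue

class ConnectionPool:
    """Keeps `size` connections open; borrow() and release() reuse them."""

    def __init__(self, make_conn, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(make_conn())  # pay the setup cost once, up front

    def borrow(self, timeout=5.0):
        # Nearly instant: the connection is already open and authenticated.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        self._idle.put(conn)

# Usage with a stand-in connection factory:
pool = ConnectionPool(make_conn=lambda: object(), size=4)
conn = pool.borrow()
# ... run queries on conn ...
pool.release(conn)
```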
Borrowing from a pool is nearly instant. The connection is already open, authenticated, and ready.
Pool Sizing: The Goldilocks Problem
Too few connections: requests queue waiting for a connection, adding latency. Too many connections: database overwhelmed, context switching overhead, memory pressure.
A reasonable starting formula (the one popularized by HikariCP’s pool-sizing guidance): connections = (core_count × 2) + effective_spindle_count.
For a 4-core server with SSDs (no spindles), that’s roughly 8-10 connections. For cloud databases, start with your vCPU count and tune from there.
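As a one-line sketch of that starting point (the function name is illustrative):

```python
def starting_pool_size(cores: int, spindles: int = 0) -> int:
    """HikariCP-style starting point: (cores * 2) + effective spindle count."""
    return cores * 2 + spindles

print(starting_pool_size(4))  # the 4-core SSD server from the text -> 8
```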
PostgreSQL’s max_connections default is 100. If you have 10 application servers each opening 20 connections, that’s 200 connections against a limit of 100, double what the database allows. Plan accordingly.
PgBouncer: External Pool Management
For larger deployments, put a dedicated pooler between your application and database.
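PgBouncer is configured through pgbouncer.ini; a minimal sketch, where the database name, auth file path, and sizes are placeholders to adapt:

```ini
[databases]
; clients connect to PgBouncer on port 6432; it forwards to the real database
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction        ; see the pool modes below
max_client_conn = 1000         ; many application connections...
default_pool_size = 20         ; ...funneled into few database connections
```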
PgBouncer supports three pool modes:
- Session: One backend connection per client session. Safest, least efficient.
- Transaction: Connection returned after each transaction. Good balance.
- Statement: Connection returned after each statement. Most aggressive; breaks anything that spans statements, including multi-statement transactions.
Transaction mode is usually what you want. It allows high client concurrency while keeping database connections low.
Connection Lifecycle Management
Connections can go stale. Network issues, database restarts, idle timeouts — a connection that worked five minutes ago might be dead now.
Good pools validate connections before handing them out.
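SQLAlchemy, for example, exposes this as `pool_pre_ping=True` on `create_engine`. The mechanics can be sketched in plain Python; `ValidatingPool`, `FakeConn`, and `ping` are illustrative stand-ins, not a real driver API:

```python
import queue

class ValidatingPool:
    """Pre-ping sketch: test each connection with a cheap check before lending it."""

    def __init__(self, make_conn, size):
        self._make_conn = make_conn
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(make_conn())

    def _is_alive(self, conn):
        try:
            conn.ping()               # a real driver would run e.g. "SELECT 1"
            return True
        except Exception:
            return False

    def borrow(self):
        conn = self._idle.get()
        if not self._is_alive(conn):
            conn = self._make_conn()  # quietly replace the dead connection
        return conn

    def release(self, conn):
        self._idle.put(conn)

class FakeConn:
    def __init__(self, alive=True):
        self.alive = alive
    def ping(self):
        if not self.alive:
            raise ConnectionError("connection is dead")

pool = ValidatingPool(make_conn=FakeConn, size=1)
conn = pool.borrow()
conn.alive = False              # simulate the server dropping it while idle
pool.release(conn)
replacement = pool.borrow()     # pre-ping catches the dead one, makes a fresh one
```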
pool_pre_ping adds a tiny overhead (one round-trip) but prevents the much larger cost of a request failing mid-query because the connection died.
Common Pitfalls
Leaking connections: Forgetting to return connections to the pool. Use context managers or try/finally to guarantee cleanup.
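A context-manager sketch (assuming a pool with borrow/release methods) that guarantees the connection goes back even when the query raises:

```python
from contextlib import contextmanager

@contextmanager
def connection(pool):
    """Borrow a connection and guarantee its return on success or failure."""
    conn = pool.borrow()
    try:
        yield conn
    finally:
        pool.release(conn)   # runs on success, exception, or early return

# Usage: the connection is returned even if run_query raises.
# with connection(pool) as conn:
#     run_query(conn, "SELECT ...")
```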
Long-running transactions: Holding a connection while waiting for user input or external APIs. The connection is unavailable to other requests the entire time.
Connection pool per request: Some frameworks accidentally create a new pool for each request instead of sharing one pool across the application. Check your initialization code.
Ignoring pool exhaustion: When all connections are in use, what happens? Good pools have configurable behavior — wait with timeout, raise immediately, or create temporary overflow connections.
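The wait-with-timeout behavior can be sketched on top of a stdlib queue; `PoolTimeout` is an illustrative exception name:

```python
import queue

class PoolTimeout(Exception):
    """Raised when no connection frees up within the allowed wait."""

def borrow_with_timeout(idle: "queue.Queue", timeout: float):
    try:
        return idle.get(timeout=timeout)   # block until a connection is released
    except queue.Empty:
        raise PoolTimeout(f"no free connection after {timeout}s")

# An exhausted pool: zero idle connections, so the borrow times out.
idle = queue.Queue(maxsize=2)
try:
    borrow_with_timeout(idle, timeout=0.05)
except PoolTimeout as exc:
    print(exc)
```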
Monitoring What Matters
Track these metrics:
- Pool utilization: Connections in use vs. available. Consistently high (>80%) means you need more capacity.
- Wait time: How long requests wait for a connection. Should be near zero.
- Connection errors: Failed checkouts, broken connections, authentication failures.
- Connection churn: How often new connections are created. Should be rare after warmup.
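A sketch of an alerting check over hypothetical pool counters; the 80% threshold mirrors the utilization guidance above, and the field names are assumptions:

```python
def pool_health(in_use: int, size: int, waiters: int) -> list[str]:
    """Flag the warning conditions from the metrics list (thresholds illustrative)."""
    warnings = []
    if size and in_use / size > 0.80:
        warnings.append("high utilization: consider more capacity")
    if waiters > 0:
        warnings.append("requests are waiting for connections")
    return warnings

print(pool_health(in_use=9, size=10, waiters=0))
# -> ['high utilization: consider more capacity']
```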
The Serverless Wrinkle
Serverless functions (Lambda, Cloud Functions) complicate pooling. Each function instance has its own process, potentially its own pool, and instances scale up/down unpredictably.
Solutions:
- External poolers like PgBouncer or RDS Proxy absorb the connection chaos
- Connection-per-invocation with very short-lived connections (accepts the overhead)
- Database-per-function for truly isolated workloads (usually overkill)
AWS RDS Proxy and similar managed services exist specifically for this problem.
Connection pooling is one of those optimizations that provides immediate, measurable improvement with minimal code changes. If you’re connecting to a database without pooling, you’re leaving performance on the table.
Start with your framework’s built-in pool. Measure. Tune pool size based on actual load. Graduate to external poolers when you outgrow application-level pools.