Performance

Caching Strategies: Making Slow Things Fast

The fastest database query is the one you don’t make. Caching is how you turn expensive operations into cheap lookups. Here’s how to do it without shooting yourself in the foot. Cache Placement Where to Cache B r o w F S s a m e ↑ s a r t l e l C s e a t s c t h e → F L a a C s r D ↑ t g N e → A p p l M M i e e c d d a i i t ↑ u u i m m o n C a c h e → D a S L t l a a o r b w g a ↑ e e s s s e t t C a c h e → D a t a b a s e Cache as close to the user as possible, but only what makes sense at each layer. ...

Caching Patterns That Actually Work

Caching seems simple. Add Redis, cache everything, go fast. Then you get stale data, cache stampedes, and bugs that only happen in production. Here’s how to cache correctly. When to Cache Good candidates: Expensive database queries (aggregations, joins) External API responses Computed values that don’t change often Static content (templates, configs) Session data Bad candidates: Data that changes frequently User-specific data with many variations Security-sensitive data Data where staleness causes real problems Rule of thumb: Cache when read frequency » write frequency. ...

Caching Strategies That Actually Scale

There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors. Caching is straightforward when your data never changes. Real systems aren’t that simple. Data changes, caches get stale, and suddenly your users see yesterday’s prices or last week’s profile pictures. Here’s how to build caching that scales without becoming a source of bugs and outages. Cache-Aside: The Default Pattern Most applications should start here: ...

API Rate Limiting: Protecting Your Service Without Annoying Your Users

Rate limiting is the immune system of your API. Without it, a single misbehaving client can take down your service for everyone. With poorly designed limits, you’ll frustrate legitimate users while sophisticated attackers route around you. The goal isn’t just protection—it’s fairness. Every user gets a reasonable share of your capacity. The Basic Algorithms Fixed Window The simplest approach: count requests per time window, reject when over limit. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 import time import redis def is_rate_limited(user_id: str, limit: int = 100, window: int = 60) -> bool: """Fixed window: 100 requests per minute.""" r = redis.Redis() # Window key based on current minute window_key = f"ratelimit:{user_id}:{int(time.time() // window)}" current = r.incr(window_key) if current == 1: r.expire(window_key, window) return current > limit Problem: Burst at window boundaries. A user can make 100 requests at 0:59 and 100 more at 1:00—200 requests in 2 seconds while technically staying under “100/minute.” ...

Database Connection Pooling: The Performance Win You're Probably Missing

Every database connection has a cost. TCP handshake, TLS negotiation, authentication, session setup—all before your first query runs. For PostgreSQL, that’s typically 20-50ms. For a single request, barely noticeable. For thousands of requests per second, catastrophic. Connection pooling solves this by maintaining a set of pre-established connections that your application reuses. Done right, it’s one of the highest-impact performance optimizations you can make. The Problem: Connection Overhead Without pooling, every request cycle looks like this: ...

Database Indexing: What Every Developer Should Know

Your query is slow. You add an index. It gets faster. Magic, right? Not quite. Indexes are powerful but misunderstood. Used well, they turn seconds into milliseconds. Used poorly, they slow down writes, waste storage, and sometimes make queries slower. Let’s demystify them. What Is an Index? An index is a separate data structure that helps the database find rows without scanning the entire table. Think of it like a book’s index—instead of reading every page to find “PostgreSQL,” you look it up in the index and jump directly to page 247. ...

Caching Strategies: When, Where, and How to Cache

Caching is one of the most powerful performance optimizations available. It’s also one of the easiest to get wrong. The classic joke—“there are only two hard things in computer science: cache invalidation and naming things”—exists for a reason. Let’s cut through the complexity. When to Cache Not everything should be cached. Before adding a cache, ask: Is this data read more than written? Caching write-heavy data creates invalidation nightmares. Is computing this expensive? Database queries, API calls, complex calculations—good candidates. Can I tolerate stale data? If not, caching gets complicated fast. Is this a hot path? Cache what’s accessed frequently, not everything. 1 2 3 4 5 6 7 8 9 # Good cache candidate: expensive query, rarely changes @cache(ttl=3600) def get_product_catalog(): return db.query("SELECT * FROM products WHERE active = true") # Bad cache candidate: changes every request @cache(ttl=60) # Don't do this def get_user_cart(user_id): return db.query("SELECT * FROM carts WHERE user_id = ?", user_id) Where to Cache Caching happens at multiple layers. Each has tradeoffs. ...

API Pagination Patterns: Offset, Cursor, and Keyset

Every API that returns lists needs pagination. Without it, a request for “all users” could return millions of rows, crushing your database and timing out the client. But pagination has tradeoffs—and choosing wrong can hurt performance or cause data inconsistencies. Offset Pagination The classic approach. Simple to implement, simple to understand: G G G E E E T T T / / / u u u s s s e e e r r r s s s ? ? ? l l l i i i m m m i i i t t t = = = 2 2 2 0 0 0 & & & o o o f f f f f f s s s e e e t t t = = = 0 2 4 0 0 # # # F S T i e h r c i s o r t n d d p p a p a g a g e g e e 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 @app.get("/users") def list_users(limit: int = 20, offset: int = 0): users = db.query( "SELECT * FROM users ORDER BY id LIMIT %s OFFSET %s", (limit, offset) ) total = db.query("SELECT COUNT(*) FROM users")[0][0] return { "data": users, "pagination": { "limit": limit, "offset": offset, "total": total } } Pros: ...

Database Connection Pooling: Stop Opening Connections for Every Query

Opening a database connection is expensive. TCP handshake, SSL negotiation, authentication, session setup—it all adds up. Do that for every query and your application crawls. Connection pooling fixes this by reusing connections. Here’s how to do it right. The Problem Without pooling, every request opens a new connection: 1 2 3 4 5 6 7 8 # BAD: New connection per request def get_user(user_id): conn = psycopg2.connect(DATABASE_URL) # ~50-100ms cursor = conn.cursor() cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,)) user = cursor.fetchone() conn.close() return user At 100 requests per second, that’s 100 connections opening and closing per second. Your database server has a connection limit (typically 100-500). You’ll exhaust it fast. ...

Caching Strategies: What to Cache and When to Invalidate

Cache invalidation is one of the two hard problems in computer science. Here’s how to make it less painful. The Caching Patterns Cache-Aside (Lazy Loading) 1 2 3 4 5 6 7 8 9 10 11 12 13 def get_user(user_id: str) -> dict: # Check cache first cached = redis.get(f"user:{user_id}") if cached: return json.loads(cached) # Cache miss: fetch from database user = db.query("SELECT * FROM users WHERE id = %s", user_id) # Store in cache for next time redis.setex(f"user:{user_id}", 3600, json.dumps(user)) return user Pros: Only caches what’s actually used Cons: First request always slow (cache miss) ...