The fastest database query is the one you don’t make. Caching is how you turn expensive operations into cheap lookups. Here’s how to do it without shooting yourself in the foot.

Cache Placement

Where to Cache

Browser Cache (fastest, smallest) → CDN Cache → Application Cache → Database (slowest, largest)

Cache as close to the user as possible, but only what makes sense at each layer.

Browser Cache

# Nginx: Cache static assets
location /static/ {
    expires 1y;
    add_header Cache-Control "public, immutable";
}

# Nginx: Cache with revalidation (let the upstream app emit a
# content-based ETag; a constant like $request_uri would never change,
# so revalidation would return 304 even after the content updates)
location /api/config {
    add_header Cache-Control "public, max-age=300, must-revalidate";
}

For assets with hashed filenames, cache forever. For dynamic content, use shorter TTLs with ETags.
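Revalidation works by comparing the client's If-None-Match header against the current ETag. A minimal, framework-agnostic sketch of the server-side check (the helper names are illustrative):

```python
import hashlib

def make_etag(body: bytes) -> str:
    # Strong ETag derived from the response body
    return '"%s"' % hashlib.sha256(body).hexdigest()[:16]

def respond(body: bytes, if_none_match=None):
    etag = make_etag(body)
    if if_none_match == etag:
        # Client's copy is current: empty 304, no body transferred
        return 304, b"", etag
    return 200, body, etag
```

On a match the server sends only headers and the browser reuses its cached body.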

CDN Cache

# Set cache headers for CDN
@app.route('/api/products/<id>')
def get_product(id):
    product = fetch_product(id)
    response = jsonify(product)
    
    # Cache publicly for 5 minutes
    response.headers['Cache-Control'] = 'public, max-age=300'
    
    # Surrogate key for targeted invalidation
    response.headers['Surrogate-Key'] = f'product-{id}'
    
    return response

CDNs (Cloudflare, Fastly, CloudFront) cache based on these headers.
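The Surrogate-Key header pays off at invalidation time: when a product changes, you purge just that key at the CDN instead of flushing everything. A sketch of building a Fastly-style purge request (the service ID and endpoint shape here are illustrative; check your CDN's API documentation):

```python
def purge_request(service_id: str, surrogate_key: str):
    # Fastly-style targeted purge: POST to the surrogate-key endpoint
    url = f"https://api.fastly.com/service/{service_id}/purge/{surrogate_key}"
    return "POST", url
```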

Application Cache

import json

import redis

# decode_responses=True makes the client return str instead of bytes
cache = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)

def get_user(user_id: str) -> dict:
    # Try cache first
    cached = cache.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)
    
    # Cache miss: fetch from database
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)
    
    # Store in cache with 1 hour TTL
    cache.setex(f"user:{user_id}", 3600, json.dumps(user))
    
    return user

Cache Strategies

Cache-Aside (Lazy Loading)

Application manages the cache:

def get_product(product_id):
    # 1. Check cache
    cached = cache.get(f"product:{product_id}")
    if cached:
        return json.loads(cached)
    
    # 2. Load from database
    product = db.get_product(product_id)
    
    # 3. Store in cache
    cache.setex(f"product:{product_id}", 3600, json.dumps(product))
    
    return product

Pros: only caches what's actually requested.
Cons: cache-miss penalty on first access, potential inconsistency.

Write-Through

Write to cache and database simultaneously:

def update_product(product_id, data):
    # Update database
    db.update_product(product_id, data)
    
    # Update cache immediately
    cache.setex(f"product:{product_id}", 3600, json.dumps(data))

Pros: cache stays consistent with the database.
Cons: every write pays the cache-write latency.

Write-Behind (Write-Back)

Write to cache, async write to database:

def update_product(product_id, data):
    # Update cache immediately
    cache.setex(f"product:{product_id}", 3600, json.dumps(data))
    
    # Queue database write
    queue.enqueue('write_product', product_id, data)

Pros: fast writes.
Cons: data loss if the cache fails before queued writes are flushed; added complexity.

Refresh-Ahead

Proactively refresh before expiration:

def get_product(product_id):
    cached = cache.get(f"product:{product_id}")
    ttl = cache.ttl(f"product:{product_id}")
    
    if cached and ttl > 300:  # More than 5 min left
        return json.loads(cached)
    
    # Refresh in background
    if cached:
        queue.enqueue('refresh_product', product_id)
        return json.loads(cached)
    
    # Cache miss
    return fetch_and_cache_product(product_id)

Cache Invalidation

The hard part. Two hard problems in CS: cache invalidation and naming things.

TTL-Based

# Simple but eventually consistent
cache.setex("product:123", 300, data)  # 5 minute TTL

Stale data for up to TTL seconds. Simple, often good enough.
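One refinement: if many keys are written at the same moment (during warming, say), a fixed TTL makes them all expire together and stampede the database at once. Adding jitter spreads the expirations. A small sketch; the ±10% spread is an arbitrary choice:

```python
import random

def ttl_with_jitter(base: int = 300, spread: float = 0.1) -> int:
    # Randomize the TTL by +/- spread so co-written keys
    # expire at slightly different times
    return int(base * random.uniform(1 - spread, 1 + spread))

# cache.setex("product:123", ttl_with_jitter(), data)
```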

Event-Based

# On product update
def update_product(product_id, data):
    db.update_product(product_id, data)
    cache.delete(f"product:{product_id}")
    
# Or publish event
def update_product(product_id, data):
    db.update_product(product_id, data)
    events.publish("product:updated", {"id": product_id})

# Subscriber invalidates cache
@events.subscribe("product:updated")
def invalidate_product_cache(event):
    cache.delete(f"product:{event['id']}")

Versioned Keys

def get_cache_version():
    # With decode_responses=True this is a str, safe to embed in key names
    return cache.get("cache:version") or "1"

def get_product(product_id):
    version = get_cache_version()
    return cache.get(f"product:{product_id}:v{version}")

def invalidate_all():
    cache.incr("cache:version")  # All old keys now orphaned

Nuclear option: increment version to invalidate everything.

Cache Patterns

Memoization

from functools import lru_cache

@lru_cache(maxsize=1000)
def expensive_calculation(x, y):
    # Stand-in for a complex, pure computation
    return x ** y

expensive_calculation.cache_info()   # hit/miss counters
expensive_calculation.cache_clear()  # manual invalidation

In-memory, per-process. Good for pure functions.

Request-Level Cache

from flask import g, session

def get_current_user():
    if not hasattr(g, 'current_user'):
        g.current_user = fetch_user(session['user_id'])
    return g.current_user

Deduplicate within a single request.

Batch Warming

def warm_cache():
    """Pre-populate cache with hot data"""
    popular_products = db.query("""
        SELECT id FROM products 
        ORDER BY views DESC 
        LIMIT 1000
    """)
    
    for (product_id,) in popular_products:  # rows come back as 1-tuples
        product = db.get_product(product_id)
        cache.setex(f"product:{product_id}", 3600, json.dumps(product))

Run before traffic spikes or after cache flush.

Redis Data Structures

Strings (Simple K/V)

cache.set("user:123", json.dumps(user))
cache.get("user:123")

Hashes (Structured Data)

cache.hset("user:123", mapping={
    "name": "Alice",
    "email": "alice@example.com",
    "role": "admin"
})

cache.hget("user:123", "name")  # Get single field
cache.hgetall("user:123")       # Get all fields

More memory efficient than JSON strings for partial reads.

Sets (Unique Collections)

# Track online users
cache.sadd("online_users", "user:123")
cache.srem("online_users", "user:123")
cache.smembers("online_users")
cache.scard("online_users")  # Count

Sorted Sets (Leaderboards, Rankings)

# Leaderboard
cache.zadd("leaderboard", {"user:123": 1500, "user:456": 1200})
cache.zrevrange("leaderboard", 0, 9, withscores=True)  # Top 10
cache.zrank("leaderboard", "user:123")  # User's rank

Cache Problems

Thundering Herd

Cache expires, 1000 requests hit database simultaneously.

def get_product_safe(product_id):
    cached = cache.get(f"product:{product_id}")
    if cached:
        return json.loads(cached)
    
    # Lock to prevent thundering herd
    lock = cache.lock(f"lock:product:{product_id}", timeout=10)
    if lock.acquire(blocking=True, blocking_timeout=5):
        try:
            # Double-check cache after acquiring lock
            cached = cache.get(f"product:{product_id}")
            if cached:
                return json.loads(cached)
            
            product = db.get_product(product_id)
            cache.setex(f"product:{product_id}", 3600, json.dumps(product))
            return product
        finally:
            lock.release()
    else:
        # Couldn't get lock, try cache again or return stale
        return json.loads(cache.get(f"product:{product_id}") or '{}')

Cache Penetration

Requests for non-existent keys always hit database.

def get_product_safe(product_id):
    cached = cache.get(f"product:{product_id}")
    
    if cached == "NULL":  # cached negative result (a str, given decode_responses=True)
        return None
    if cached:
        return json.loads(cached)
    
    product = db.get_product(product_id)
    
    if product is None:
        # Cache the miss
        cache.setex(f"product:{product_id}", 300, "NULL")
        return None
    
    cache.setex(f"product:{product_id}", 3600, json.dumps(product))
    return product

Hot Key

One key gets too much traffic.

Solutions:

  • Local in-memory cache in front of Redis
  • Replicate hot keys across multiple Redis nodes
  • Break key into shards: product:123:shard:1, product:123:shard:2
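The first option can be sketched in a few lines: a tiny in-process TTL cache absorbs repeated reads of a hot key before they reach Redis. Here `fetch_from_redis` stands in for the Redis lookup shown earlier:

```python
import time

class LocalCache:
    """Tiny in-process TTL cache, the first tier in front of Redis."""
    def __init__(self, ttl: float = 1.0):
        self.ttl = ttl
        self.store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        return None

    def set(self, key, value):
        self.store[key] = (time.monotonic() + self.ttl, value)

local = LocalCache(ttl=1.0)

def get_hot_product(product_id, fetch_from_redis):
    # Only fall through to Redis when the local copy has expired
    key = f"product:{product_id}"
    value = local.get(key)
    if value is None:
        value = fetch_from_redis(key)
        local.set(key, value)
    return value
```

Even a one-second local TTL can cut traffic to a hot key by orders of magnitude, at the cost of up to one second of staleness.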

Monitoring

# Track cache performance
def get_product(product_id):
    start = time.time()
    cached = cache.get(f"product:{product_id}")
    
    if cached:
        metrics.increment("cache.hit", tags=["type:product"])
        metrics.timing("cache.latency", time.time() - start)
        return json.loads(cached)
    
    metrics.increment("cache.miss", tags=["type:product"])
    # ... fetch from database

Key metrics:

  • Hit rate (target: >90%)
  • Miss rate
  • Latency (p50, p99)
  • Memory usage
  • Eviction rate
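Redis also tracks hits and misses itself: `INFO stats` exposes `keyspace_hits` and `keyspace_misses`, from which the hit rate falls out. A quick sketch (the `cache.info("stats")` call is redis-py):

```python
def hit_rate(hits: int, misses: int) -> float:
    # keyspace_hits / (keyspace_hits + keyspace_misses), guarding divide-by-zero
    total = hits + misses
    return hits / total if total else 0.0

# With a live client:
# stats = cache.info("stats")
# print(hit_rate(stats["keyspace_hits"], stats["keyspace_misses"]))
```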

Start Here

  1. Today: Add caching to your slowest endpoint
  2. This week: Monitor cache hit rate
  3. This month: Implement proper invalidation
  4. This quarter: Add multi-tier caching

The goal: users get fast responses, databases stay cool.


The best cache is invisible. Users just experience speed.