March 11, 2026 · 7 min · 1360 words · Rob Washington
Caching seems simple. Add Redis, cache everything, go fast. Then you get stale data, cache stampedes, and bugs that only happen in production. Here’s how to cache correctly.
```python
# Short TTL + event invalidation
# TTL catches missed invalidations
# Events provide freshness for known writes
def get_user(user_id: str):
    cached = redis.get(f"user:{user_id}")
    if cached:
        return User.parse(cached)
    user = db.query(User).get(user_id)
    redis.setex(f"user:{user_id}", 300, user.json())  # 5 min TTL
    return user

def update_user(user_id: str, data: dict):
    db.update(user_id, data)
    redis.delete(f"user:{user_id}")  # Immediate invalidation
```
```python
import time

def get_user_with_lock(user_id: str):
    cached = redis.get(f"user:{user_id}")
    if cached:
        return User.parse(cached)

    # Acquire the lock atomically (NX + expiry in one command, so a crash
    # between SETNX and EXPIRE can't leave a lock stuck forever)
    lock_key = f"lock:user:{user_id}"
    if redis.set(lock_key, "1", nx=True, ex=10):  # 10s lock timeout
        try:
            user = db.query(User).get(user_id)
            redis.setex(f"user:{user_id}", 3600, user.json())
            return user
        finally:
            redis.delete(lock_key)
    else:
        # Someone else is loading; wait and retry
        time.sleep(0.1)
        return get_user_with_lock(user_id)
```
```python
import random

def get_user_with_early_refresh(user_id: str):
    key = f"user:{user_id}"
    # Fetch the value and its remaining TTL in one round trip
    cached, ttl = redis.pipeline().get(key).ttl(key).execute()
    if cached:
        # Probabilistically refresh before expiry
        if ttl < 300 and random.random() < 0.1:  # 10% chance in last 5 min
            refresh_async(user_id)
        return User.parse(cached)
    # Cache miss
    return load_and_cache_user(user_id)
```
```python
# Never let the cache expire - refresh it before it does
@scheduler.task(run_every=minutes(5))
def refresh_hot_users():
    hot_user_ids = get_frequently_accessed_users()
    for user_id in hot_user_ids:
        user = db.query(User).get(user_id)
        redis.setex(f"user:{user_id}", 3600, user.json())
```
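The refresh job above leans on a `get_frequently_accessed_users()` helper that the post doesn't define. One possible sketch (an assumption, not the post's implementation) is to track access counts in a Redis sorted set with `ZINCRBY` and read the top entries back with `ZREVRANGE`; the `hot:users` key name is made up here, and `r` is any Redis client:

```python
def record_access(r, user_id: str):
    # Bump this user's access counter in the sorted set
    r.zincrby("hot:users", 1, user_id)

def get_frequently_accessed_users(r, top_n: int = 100):
    # Highest-scoring (most accessed) members first; stop index is inclusive
    return r.zrevrange("hot:users", 0, top_n - 1)
```

Call `record_access` on each cache read; the scheduled job then refreshes only the members returned by `get_frequently_accessed_users`. A production version would also decay or trim old counters so yesterday's hot keys don't stay hot forever.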
Good cache keys are:

- Predictable (you can construct them without the data)
- Namespaced (avoid collisions)
- Versioned (for schema changes)
```python
# Bad
key = f"{user_id}"  # Collides with other entities

# Better
key = f"user:{user_id}"

# Best
key = f"v2:user:{user_id}"  # Versioned for schema changes

# For queries
key = f"v1:users:active:page:{page}:limit:{limit}"
```
```python
# Hit rate (should be >90%)
hit_rate = cache_hits / (cache_hits + cache_misses)

# Latency
cache_latency_p99 = measure_percentile(cache_response_times, 99)

# Memory usage
memory_used = redis.info()['used_memory']

# Evictions (a sign of an undersized cache)
evictions = redis.info()['evicted_keys']
```
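You don't have to count hits and misses yourself: Redis tracks cumulative `keyspace_hits` and `keyspace_misses` in its `INFO stats` section. A minimal helper (the function name is mine) that turns an `INFO` dict into a hit rate:

```python
def hit_rate_from_info(info: dict) -> float:
    # keyspace_hits / keyspace_misses are cumulative since server start;
    # sample them periodically and diff the samples for a windowed rate.
    hits = info.get("keyspace_hits", 0)
    misses = info.get("keyspace_misses", 0)
    total = hits + misses
    return hits / total if total else 0.0
```

With a live client this would be called as `hit_rate_from_info(redis.info("stats"))`; because the counters are cumulative, a dashboard should graph the rate over deltas between scrapes rather than the all-time ratio.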
```python
# Bad: cache miss on every request for a non-existent user
def get_user(user_id):
    cached = redis.get(f"user:{user_id}")
    if cached:
        return User.parse(cached)
    user = db.query(User).get(user_id)
    if user:
        redis.setex(f"user:{user_id}", 3600, user.json())
    return user  # Never caches "not found"

# Good: cache negative results too
def get_user(user_id):
    cached = redis.get(f"user:{user_id}")
    if cached == "NULL":
        return None
    if cached:
        return User.parse(cached)
    user = db.query(User).get(user_id)
    if user:
        redis.setex(f"user:{user_id}", 3600, user.json())
    else:
        redis.setex(f"user:{user_id}", 300, "NULL")  # Shorter TTL for negatives
    return user
```
```python
# Bad: serialize/deserialize huge objects every time
redis.set("big_data", pickle.dumps(huge_object))
obj = pickle.loads(redis.get("big_data"))  # CPU expensive

# Better: store only what you need
redis.set("user:123:name", user.name)
redis.set("user:123:email", user.email)
```
```python
from redis.exceptions import RedisError

# Bad: cache failure = application failure
def get_user(user_id):
    return User.parse(redis.get(f"user:{user_id}"))  # Crashes if Redis is down

# Good: graceful degradation
def get_user(user_id):
    try:
        cached = redis.get(f"user:{user_id}")
        if cached:
            return User.parse(cached)
    except RedisError:
        pass  # Cache unavailable, fall through to DB
    return db.query(User).get(user_id)
```
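Writing that try/except by hand in every accessor gets repetitive. One way to centralize it (a sketch; `cached_fetch` is a made-up name, and a real version would catch `RedisError` specifically rather than `Exception`) is a generic wrapper that treats the cache as strictly best-effort:

```python
def cached_fetch(cache, key, loader, ttl=3600):
    # Read side: any cache error falls through to the loader
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached
    except Exception:
        pass
    value = loader()
    # Write side: failing to populate the cache must not fail the request
    try:
        cache.setex(key, ttl, value)
    except Exception:
        pass
    return value
```

Usage would look like `cached_fetch(redis, f"user:{user_id}", lambda: db.query(User).get(user_id))`; the application keeps working, just slower, whenever Redis is unreachable.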