Redis Patterns: Beyond Simple Caching

Redis gets introduced as a cache, but that undersells it. It’s an in-memory data structure server with atomic operations, pub/sub, streams, and more. These patterns show Redis’s real power.

Basic Caching (The Familiar One)

    import redis
    import json

    r = redis.Redis(host='localhost', port=6379, decode_responses=True)

    def get_user(user_id):
        # Check cache first
        cached = r.get(f"user:{user_id}")
        if cached:
            return json.loads(cached)

        # Miss: fetch from database
        user = db.query("SELECT * FROM users WHERE id = %s", user_id)

        # Cache with TTL
        r.setex(f"user:{user_id}", 3600, json.dumps(user))
        return user

Rate Limiting

Sliding window rate limiter with sorted sets: ...
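The excerpt stops before the post's own limiter code, so what follows is not the author's implementation. It is a minimal sketch of one common way to build a sliding window on a sorted set, assuming `r` is a redis-py client; the function name `allow_request`, the `ratelimit:` key prefix, and the `now` parameter are illustrative choices.

```python
import time
import uuid


def allow_request(r, client_id, limit, window_seconds, now=None):
    """Return True if the client is within `limit` requests per window.

    `r` is assumed to be a redis-py client (for example redis.Redis(...)).
    """
    now = time.time() if now is None else now
    key = f"ratelimit:{client_id}"           # hypothetical key naming scheme
    member = f"{now}:{uuid.uuid4().hex}"     # unique member for every request

    pipe = r.pipeline()                      # queued commands run together
    pipe.zremrangebyscore(key, 0, now - window_seconds)  # drop expired entries
    pipe.zadd(key, {member: now})            # record this request by timestamp
    pipe.zcard(key)                          # count requests left in the window
    pipe.expire(key, window_seconds)         # let idle keys clean themselves up
    _, _, count, _ = pipe.execute()

    # Note: rejected requests are still recorded, so a client that keeps
    # hammering the API stays limited until it backs off.
    return count <= limit
```

A call might look like `allow_request(r, "client-a", limit=100, window_seconds=60)`. Recording the request before counting keeps the sketch to a single round trip; a stricter variant would check the count first inside a Lua script.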

February 23, 2026 · 5 min · 1055 words · Rob Washington

Caching Strategies: When, Where, and How to Cache

The fastest request is one you don’t make. Caching trades storage for speed, serving precomputed results instead of recalculating them. But caching done wrong is worse than no caching: stale data, inconsistencies, and debugging nightmares.

When to Cache

Cache when:

- Data is read more often than written
- Computing the result is expensive
- Slight staleness is acceptable
- The same data is requested repeatedly

Don’t cache when:

- Data changes constantly
- Every request needs fresh data
- Storage cost exceeds compute savings
- Cache invalidation is harder than recomputation

Cache Placement

Client-Side Cache

Browser cache, mobile app cache, CDN edge cache: ...

February 16, 2026 · 7 min · 1313 words · Rob Washington

Rate Limiting: Protecting Your APIs from Abuse and Overload

Every public API needs rate limiting. Without it, one misbehaving client can take down your entire service, whether through malice, bugs, or just enthusiasm. Rate limiting protects your infrastructure, ensures fair usage, and creates predictable behavior for all clients.

The Core Algorithms

Fixed Window

Count requests in fixed time intervals (e.g., per minute):

    class FixedWindowLimiter {
      constructor(redis, limit, windowSeconds) {
        this.redis = redis;
        this.limit = limit;
        this.windowSeconds = windowSeconds;
      }

      async isAllowed(clientId) {
        const window = Math.floor(Date.now() / 1000 / this.windowSeconds);
        const key = `ratelimit:${clientId}:${window}`;

        const count = await this.redis.incr(key);
        if (count === 1) {
          await this.redis.expire(key, this.windowSeconds);
        }

        return count <= this.limit;
      }
    }

Pros: Simple, memory-efficient.

Cons: Burst at window boundaries. Client could hit 100 requests at 0:59 and 100 more at 1:00. ...
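The boundary burst is easy to reproduce without Redis. The sketch below is an in-memory stand-in for the fixed-window idea, not a port of the class above; `InMemoryFixedWindow` and the explicit `now` argument are assumptions made so the timing can be controlled.

```python
from collections import defaultdict


class InMemoryFixedWindow:
    """Fixed-window counter kept in a dict instead of Redis (illustrative only)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts = defaultdict(int)   # (client_id, window index) -> count

    def is_allowed(self, client_id, now):
        # Requests in the same window share one counter.
        window = int(now // self.window_seconds)
        self.counts[(client_id, window)] += 1
        return self.counts[(client_id, window)] <= self.limit


limiter = InMemoryFixedWindow(limit=100, window_seconds=60)

# 100 requests in the last second of one window, 100 in the first of the next.
late = sum(limiter.is_allowed("client-a", now=59) for _ in range(100))
early = sum(limiter.is_allowed("client-a", now=60) for _ in range(100))
burst_total = late + early

print(burst_total)  # 200 requests accepted within about one second
```

Each window enforces its own limit of 100 correctly, yet the client gets twice that through in roughly a second because the counter resets at the boundary.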

February 16, 2026 · 6 min · 1120 words · Rob Washington

Message Queues: Decoupling Services for Scale and Reliability

When Service A needs to tell Service B something happened, the simplest approach is a direct HTTP call. But what happens when Service B is slow? Or down? Or overwhelmed? Message queues decouple your services, letting them communicate reliably even when things go wrong.

Why Queues?

Without a queue:

    User Request → API → Payment Service → Email Service → Response
                         (if slow, user waits)
                         (if down, request fails)

With a queue: ...

February 11, 2026 · 8 min · 1508 words · Rob Washington

Rate Limiting Patterns: Protecting Your APIs Without Frustrating Users

Every API needs rate limiting. Without it, a single misbehaving client can overwhelm your service, intentional attacks become trivial, and cost management becomes impossible. But implement it poorly, and you’ll frustrate legitimate users while barely slowing down bad actors. Let’s explore rate limiting patterns that actually work.

The Fundamentals: Why Rate Limit?

Rate limiting serves multiple purposes:

- Protection: Prevent abuse, DDoS attacks, and runaway scripts
- Fairness: Ensure one client can’t monopolize resources
- Cost control: Limit expensive operations (API calls, LLM tokens, etc.)
- Stability: Maintain consistent performance under load

Algorithm 1: Token Bucket

The token bucket is the most flexible approach. Imagine a bucket that fills with tokens at a steady rate. Each request consumes a token. If the bucket is empty, the request is denied. ...
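The bucket description above can be sketched in a few lines. This is a single-process, in-memory illustration rather than the post's implementation; the class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions chosen to make the behavior easy to follow.

```python
import time


class TokenBucket:
    """In-memory token bucket: refills at a steady rate, one token per request."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity         # maximum tokens the bucket can hold
        self.refill_rate = refill_rate   # tokens added per second
        self.clock = clock               # injectable time source
        self.tokens = float(capacity)    # start with a full bucket
        self.last_refill = clock()

    def allow(self):
        # Top the bucket up for the time elapsed since the last call,
        # never exceeding its capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

        # Each request consumes one token; an empty bucket means denial.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `TokenBucket(capacity=10, refill_rate=2)`, a client can burst up to 10 requests at once and then sustain about 2 per second. Capacity sets the burst size and the refill rate sets the long-run average, which is where the flexibility comes from.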

February 11, 2026 · 6 min · 1201 words · Rob Washington

Caching Strategies: Make Your App Fast Without Breaking Everything

A practical guide to caching — when to cache, what to cache, and how to avoid the gotchas that make caching the second hardest problem in computer science.

February 11, 2026 · 7 min · 1371 words · Rob Washington