Testing in Production: Because Staging Never Tells the Whole Story

“We don’t test in production” sounds responsible until you realize: production is the only environment that’s actually production. Staging lies to you. Here’s how to test in production safely. Why Staging Fails Staging environments differ from production in ways that matter: Data: Sanitized, outdated, or synthetic Scale: 1% of production traffic Integrations: Sandbox APIs with different behavior Users: Developers clicking around, not real usage patterns Infrastructure: Smaller instances, shared resources That bug that only appears under real load with real data? Staging won’t catch it. ...

March 1, 2026 Â· 4 min Â· 836 words Â· Rob Washington

Terraform State Management: Keep Your Infrastructure Sane

Terraform state is the source of truth for your infrastructure. Mess it up and you’ll be manually reconciling resources at 2 AM. Here’s how to manage state properly from day one. What Is State? Terraform state maps your configuration to real resources: 1 2 3 4 5 # main.tf resource "aws_instance" "web" { ami = "ami-12345" instance_type = "t3.micro" } 1 2 3 4 5 6 7 8 9 10 11 12 13 14 // terraform.tfstate (simplified) { "resources": [{ "type": "aws_instance", "name": "web", "instances": [{ "attributes": { "id": "i-0abc123def456", "ami": "ami-12345", "instance_type": "t3.micro" } }] }] } Without state, Terraform doesn’t know aws_instance.web corresponds to i-0abc123def456. It would try to create a new instance every time. ...

March 1, 2026 Â· 7 min Â· 1336 words Â· Rob Washington

Git Hooks: Automate Quality Checks Before Code Leaves Your Machine

Git hooks are scripts that run automatically at specific points in your Git workflow. Use them to catch problems before they become PR comments. Here’s how to set them up effectively. Hook Basics Git hooks live in .git/hooks/. They’re executable scripts that run at specific events: 1 2 3 4 5 6 .git/hooks/ ├── pre-commit # Before commit is created ├── commit-msg # After commit message is entered ├── pre-push # Before push to remote ├── post-merge # After merge completes └── ... To enable a hook, create an executable script with the hook’s name: ...

March 1, 2026 Â· 5 min Â· 1056 words Â· Rob Washington

Docker Multi-Stage Builds: Smaller Images, Faster Deploys

Your Docker images are probably too big. Build tools, dev dependencies, source files—all shipping to production when only the compiled binary matters. Multi-stage builds fix this by separating build environment from runtime environment. The Problem A typical single-stage Dockerfile: 1 2 3 4 5 6 7 8 9 FROM python:3.11 WORKDIR /app COPY requirements.txt . RUN pip install -r requirements.txt COPY . . CMD ["python", "app.py"] This image includes: ...

March 1, 2026 Â· 4 min Â· 852 words Â· Rob Washington

Environment Variables Done Right

Environment variables are the standard way to configure applications across environments. They’re simple, universal, and supported everywhere. But like any tool, they can be misused. Here’s how to do them right. The Basics Environment variables are key-value pairs available to your process: 1 2 3 4 5 6 7 8 9 10 11 12 13 # Setting them export DATABASE_URL="postgres://localhost/myapp" export LOG_LEVEL="info" # Using them (Bash) echo $DATABASE_URL # Using them (Python) import os db_url = os.environ.get("DATABASE_URL") # Using them (Node.js) const dbUrl = process.env.DATABASE_URL; Naming Conventions Be consistent. A common pattern: ...

March 1, 2026 Â· 5 min Â· 996 words Â· Rob Washington

Health Check Endpoints: More Than Just 200 OK

Every modern service needs health check endpoints. Load balancers probe them. Kubernetes uses them. Monitoring systems scrape them. But a naive implementation—returning 200 OK if the process is running—tells you almost nothing useful. Here’s how to build health checks that actually help. Two Types of Health Liveness: Is the process alive and not deadlocked? Readiness: Can this instance handle requests right now? These are different questions with different answers: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 # Liveness: Am I alive? @app.get("/health/live") def liveness(): # If this returns, the process is alive return {"status": "alive"} # Readiness: Can I serve traffic? @app.get("/health/ready") def readiness(): checks = { "database": check_database(), "cache": check_cache(), "disk_space": check_disk_space(), } all_healthy = all(c["healthy"] for c in checks.values()) return JSONResponse( status_code=200 if all_healthy else 503, content={"status": "ready" if all_healthy else "not_ready", "checks": checks} ) Why separate them? ...

March 1, 2026 Â· 5 min Â· 920 words Â· Rob Washington

Configuration Management Patterns for Reliable Deployments

Configuration is where deployments go to die. A typo in an environment variable, a missing secret, a config file that works in staging but breaks in production. Here’s how to make configuration boring and reliable. The Hierarchy of Configuration Not all config is created equal. Layer it: 1 2 3 4 5 . . . . . D C E C F e o n o e f n v m a a f i m t u i r a u l g o n r t n d e s f m - i e l f ( l n i l i e t n a n s e g v s c ( a f o p r l ( d e i a r e r a g u ) b s n e l t n e i v s m i e r ) o n m e n t ) S O D a R n y f E u e n e n n - a s v t o m t i i f i r m f c f o e a n c l m v h l e v e a b n e r n a t r r g c - r i e k s i d s s p d e e e s c s i f i c Later layers override earlier ones. This lets you: ...

March 1, 2026 Â· 7 min Â· 1386 words Â· Rob Washington

Background Job Patterns That Actually Scale

Every production system eventually needs background jobs. Email notifications, report generation, data syncing, webhook processing—the work that can’t (or shouldn’t) happen during a user request. Here’s what I’ve learned about making them reliable. The Naive Approach (And Why It Breaks) Most developers start with something like this: 1 2 3 4 5 @app.route('/signup') def signup(): user = create_user(request.form) send_welcome_email(user) # Blocks the response return redirect('/dashboard') This works until it doesn’t. The email service has a 5-second timeout. Now your signup page feels broken. Or the email service is down, and signups fail entirely. ...

March 1, 2026 Â· 4 min Â· 831 words Â· Rob Washington

DNS Troubleshooting: When dig Is Your Best Friend

DNS issues are deceptively simple. “It’s always DNS” is a meme because it’s true. Here’s how to actually debug it. The Essential Commands dig: Your Primary Tool 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 # Basic lookup dig example.com # Short answer only dig +short example.com # Specific record type dig example.com MX dig example.com TXT dig example.com CNAME # Query specific nameserver dig @8.8.8.8 example.com # Trace the full resolution path dig +trace example.com Understanding dig Output ; ; ; ; e ; ; ; ; ; e ; x ; ; ; ; x a Q a A m Q S W M U m N p u E H S E p S l e R E G S l W e r V N D T e E . y E : S i I . R c R I G O c o t : S Z N o S m i a E 9 m E . m 1 t . S . C e 9 1 E T : 2 F r 8 C I . e c . T O 2 1 b v 1 I N 3 6 d O : 8 2 : N m . 8 : s 1 5 e . 2 6 c 1 0 # : e 5 3 x 3 3 0 a 6 ( : m 0 1 0 p 0 9 0 l 2 e . E . 1 S c 6 T o I I 8 m N N . 2 1 0 . 2 1 6 ) A A 9 3 . 1 8 4 . 2 1 6 . 3 4 Key fields: ...

February 28, 2026 Â· 5 min Â· 1057 words Â· Rob Washington

Database Migrations That Won't Ruin Your Weekend

Database migrations are where deploys go to die. Here’s how to make them safe. The Golden Rule Every migration must be backward compatible. Your old code will run alongside the new schema during deployment. If it can’t, you’ll have downtime. Safe Operations These are always safe: Adding a nullable column Adding a table Adding an index (with CONCURRENTLY) Adding a constraint (as NOT VALID, then validate later) Dangerous Operations These need careful handling: ...

February 28, 2026 Â· 4 min Â· 770 words Â· Rob Washington