Dotfiles Management: Your Dev Environment as Code

New machine? Reinstall? Your perfect dev environment should be one command away. Here’s how to manage dotfiles properly.

The Problem

You spend hours configuring:

- Shell (zsh, bash)
- Editor (vim, nvim, VS Code)
- Git config
- SSH config
- Tmux
- Aliases and functions

Then you get a new laptop and do it all again. Badly.

The Basic Solution

Put dotfiles in a Git repo, symlink them.

    # Create repo
    mkdir ~/dotfiles
    cd ~/dotfiles
    git init

    # Move configs
    mv ~/.zshrc ~/dotfiles/zshrc
    mv ~/.vimrc ~/dotfiles/vimrc
    mv ~/.gitconfig ~/dotfiles/gitconfig

    # Create symlinks
    ln -sf ~/dotfiles/zshrc ~/.zshrc
    ln -sf ~/dotfiles/vimrc ~/.vimrc
    ln -sf ~/dotfiles/gitconfig ~/.gitconfig

    # Commit first (a push with no commits fails), then push to GitHub
    git add -A
    git commit -m "Initial dotfiles"
    git branch -M main
    git remote add origin git@github.com:username/dotfiles.git
    git push -u origin main

Stow: Symlink Manager

GNU Stow makes symlinks manageable: ...

March 11, 2026 · 7 min · 1469 words · Rob Washington

Service Mesh Basics: What It Is and When You Need It

Service mesh is either the solution to all your microservices problems or unnecessary complexity you don’t need. Here’s how to tell which.

What a Service Mesh Does

A service mesh handles cross-cutting concerns for service-to-service communication:

- Traffic management: load balancing, routing, retries
- Security: mTLS, authorization policies
- Observability: metrics, tracing, logging
- Resilience: circuit breakers, timeouts, fault injection

Instead of implementing these in every service, the mesh handles them at the infrastructure layer. ...

March 11, 2026 · 5 min · 987 words · Rob Washington

Service Discovery: Finding Services Without Hardcoding

Hardcoded IPs are a maintenance nightmare. Here’s how to let services find each other dynamically.

The Problem

    # Bad: Hardcoded
    api_url = "http://192.168.1.50:8080"

    # What happens when:
    # - IP changes?
    # - Service moves to new host?
    # - You add a second instance?

Service discovery solves this: services register themselves, and clients look them up by name.

DNS-Based Discovery

The simplest approach: use DNS. ...
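
To make the register-and-look-up idea concrete, here is a minimal in-memory sketch. It is not the article's code and not a real discovery backend: `ServiceRegistry` and its methods are hypothetical names used only for illustration, and a real system would add health checks, expiry, and a networked store.

```python
from collections import defaultdict


class ServiceRegistry:
    """Hypothetical in-memory registry: service name -> list of (host, port)."""

    def __init__(self):
        self._instances = defaultdict(list)

    def register(self, name, host, port):
        """Called by a service instance when it starts."""
        address = (host, port)
        if address not in self._instances[name]:
            self._instances[name].append(address)

    def deregister(self, name, host, port):
        """Called when an instance shuts down; unknown addresses are ignored."""
        address = (host, port)
        if address in self._instances.get(name, []):
            self._instances[name].remove(address)

    def lookup(self, name):
        """Return every known address for a service, or raise if there are none."""
        instances = self._instances.get(name)
        if not instances:
            raise LookupError(f"no instances registered for {name!r}")
        return list(instances)


# Two instances of the same service register under one name.
registry = ServiceRegistry()
registry.register("api", "192.168.1.50", 8080)
registry.register("api", "192.168.1.51", 8080)

# The client asks for "api" instead of hardcoding an IP.
host, port = registry.lookup("api")[0]
api_url = f"http://{host}:{port}"
```

The client code no longer changes when an instance moves or a second one is added; only the registry contents do.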

March 11, 2026 · 5 min · 867 words · Rob Washington

SSL Certificates: Automation That Doesn't Expire

Certificate expiration is the outage you always see coming and somehow never prevent. Here’s how to automate SSL so it stops being a problem.

The Problem

SSL certificates expire. When they do:

- Users see scary browser warnings
- APIs reject connections
- Mobile apps fail silently
- Trust is broken

And it’s always on a Friday night.

Let’s Encrypt + Certbot

Free, automated, trusted certificates.

Basic Setup

    # Install certbot
    apt install certbot python3-certbot-nginx

    # Get certificate (nginx plugin handles everything)
    certbot --nginx -d example.com -d www.example.com

    # Test renewal
    certbot renew --dry-run

Certbot packages typically set up automatic renewal for you (a cron job or a systemd timer, depending on how it was installed). ...
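
Automated renewal still deserves a safety net. The sketch below is an illustration rather than anything from the article: it uses Python's standard `ssl` module to compute how many days a certificate has left. The function names and the 14-day threshold are assumptions chosen for the example.

```python
import socket
import ssl
import time

SECONDS_PER_DAY = 86400


def fetch_not_after(hostname, port=443, timeout=5.0):
    """Return the 'notAfter' string of the certificate served by hostname.

    Note: with the default context, an already-expired certificate makes the
    handshake itself fail with a verification error.
    """
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Days remaining, given a string such as 'Jan  5 09:34:43 2018 GMT'."""
    if now is None:
        now = time.time()
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now) / SECONDS_PER_DAY


def needs_attention(not_after, threshold_days=14, now=None):
    """True when fewer than threshold_days remain (14 is an assumed default)."""
    return days_until_expiry(not_after, now) < threshold_days
```

A scheduled job could call `needs_attention(fetch_not_after("example.com"))` and alert when it returns `True`, which catches the case where renewal is configured but silently failing.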

March 11, 2026 · 3 min · 464 words · Rob Washington

Cloud Cost Optimization: Stop Burning Money

Your cloud bill is too high. It always is. Here’s how to actually reduce it without breaking things.

Quick Wins

1. Find Unused Resources

    # AWS: Find unattached EBS volumes
    aws ec2 describe-volumes \
      --filters Name=status,Values=available \
      --query 'Volumes[*].[VolumeId,Size,CreateTime]' \
      --output table

    # Find old snapshots (>90 days)
    aws ec2 describe-snapshots --owner-ids self \
      --query 'Snapshots[?StartTime<=`2026-01-01`].[SnapshotId,VolumeSize,StartTime]' \
      --output table

    # Find unattached Elastic IPs
    aws ec2 describe-addresses \
      --query 'Addresses[?AssociationId==`null`].[PublicIp,AllocationId]' \
      --output table

Delete them. Unattached EBS volumes cost money. Unused EIPs cost $3.65/month each. ...
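
The snapshot query above carries a fixed date, which goes stale as time passes. As a small sketch (the helper names are hypothetical, and the price is simply the $3.65/month figure quoted above), the cutoff and the matching `--query` string can be computed instead, along with the monthly cost of idle Elastic IPs:

```python
from datetime import date, timedelta


def snapshot_cutoff(today, max_age_days=90):
    """ISO date that is max_age_days before today."""
    return (today - timedelta(days=max_age_days)).isoformat()


def snapshot_query(cutoff):
    """Build the --query expression used above for a given cutoff date."""
    return (
        "Snapshots[?StartTime<=`" + cutoff + "`]"
        ".[SnapshotId,VolumeSize,StartTime]"
    )


def monthly_eip_cost(unused_count, price_per_month=3.65):
    """Monthly cost of unused Elastic IPs at the per-address price quoted above."""
    return round(unused_count * price_per_month, 2)


cutoff = snapshot_cutoff(date.today())
print(snapshot_query(cutoff))
print(monthly_eip_cost(4))
```

The printed expression can be passed to `aws ec2 describe-snapshots --owner-ids self --query ...` in place of the hardcoded one.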

March 11, 2026 · 6 min · 1095 words · Rob Washington

Ansible Patterns That Scale

Ansible is easy to start, hard to scale. Here’s how to structure playbooks that don’t become unmaintainable nightmares.

Directory Structure

Start organized, stay organized:

    ansible/
    ├── ansible.cfg
    ├── inventory/
    │   ├── production/
    │   │   ├── hosts.yml
    │   │   └── group_vars/
    │   │       ├── all.yml
    │   │       ├── webservers.yml
    │   │       └── databases.yml
    │   └── staging/
    │       ├── hosts.yml
    │       └── group_vars/
    ├── playbooks/
    │   ├── site.yml
    │   ├── webservers.yml
    │   └── databases.yml
    ├── roles/
    │   ├── common/
    │   ├── nginx/
    │   └── postgresql/
    └── collections/

Inventory Patterns

Static YAML Inventory

    # inventory/production/hosts.yml
    all:
      children:
        webservers:
          hosts:
            web1.example.com:
            web2.example.com:
          vars:
            http_port: 80
        databases:
          hosts:
            db1.example.com:
              postgresql_port: 5432
            db2.example.com:
              postgresql_port: 5433

Dynamic Inventory

For cloud infrastructure, use dynamic inventory: ...

March 11, 2026 · 9 min · 1795 words · Rob Washington

GitOps for Kubernetes: Deployments as Code

Push to Git, watch your cluster update. That’s the GitOps promise. Here’s how to actually implement it.

What GitOps Is

GitOps means:

- Git is the source of truth for infrastructure and application state
- Changes happen through Git (PRs, not kubectl apply)
- A controller watches Git and reconciles cluster state
- Drift is automatically corrected

The cluster converges to match what’s in Git, continuously.

Why GitOps

Over kubectl apply

    # Bad: Who ran this? When? From where?
    kubectl apply -f deployment.yaml

    # Good: PR reviewed, approved, merged, tracked forever
    git commit -m "Scale API to 5 replicas"
    git push

Over CI-Push

Traditional CI/CD pushes to the cluster: ...
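
The reconciliation idea in the "What GitOps Is" list (a controller compares what Git declares with what is running and corrects the difference) can be sketched in a few lines. This is a toy model with plain dictionaries, not how any particular GitOps controller is implemented; `plan_reconcile` and `reconcile` are hypothetical names.

```python
def plan_reconcile(desired, actual):
    """List the actions needed to make actual match desired.

    Both arguments map a resource name to its spec (any comparable value).
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


def reconcile(desired, actual):
    """Return the state after applying the plan to a copy of actual."""
    new_state = dict(actual)
    for action, name in plan_reconcile(desired, actual):
        if action == "delete":
            del new_state[name]
        else:
            new_state[name] = desired[name]
    return new_state


# Desired state as it would be read from Git.
desired = {"api": {"replicas": 5}, "web": {"replicas": 2}}
# Actual state: api has drifted, web is missing, debug-pod was added by hand.
actual = {"api": {"replicas": 3}, "debug-pod": {"replicas": 1}}

print(plan_reconcile(desired, actual))
print(reconcile(desired, actual) == desired)
```

Running the same comparison on a loop is what makes drift self-correcting: a manual change shows up as a difference on the next pass and is reverted.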

March 11, 2026 · 7 min · 1380 words · Rob Washington

Backup Strategies That Actually Save You

Everyone knows backups are important. Few actually test them. Here’s how to build backup systems that work when you need them.

The 3-2-1 Rule

The classic foundation:

- 3 copies of your data
- 2 different storage types
- 1 offsite copy

Example implementation:

    Primary: Production database
    Copy 1:  Local snapshot (same server, different disk)
    Copy 2:  Remote replica (different data center)
    Copy 3:  Cloud backup (S3, different region)

What to Back Up

Always Back Up

- Databases: this is your business
- Configuration: harder to recreate than you think
- Secrets: encrypted, but backed up
- User uploads: can’t regenerate these

Maybe Back Up

- Application code: if not in Git, back it up
- Logs: for compliance, ship to log aggregator instead
- Build artifacts: rebuild from source is often better

Don’t Back Up

- Ephemeral data: caches, temp files, sessions
- Derived data: can regenerate from source
- Large static assets: use CDN/object storage with its own durability

Database Backups

PostgreSQL

    # Logical backup (SQL dump)
    pg_dump -Fc mydb > backup.dump

    # Restore
    pg_restore -d mydb backup.dump

    # All databases
    pg_dumpall > all_databases.sql

For larger databases, use physical backups: ...
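
Returning to the 3-2-1 rule from the top of this summary: the rule is simple enough to check mechanically. The sketch below is illustrative only. `BackupCopy` and `check_3_2_1` are hypothetical names, and the storage-type labels and offsite flags assigned to the three example copies are assumptions, not something the article specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    storage_type: str  # assumed label, e.g. "disk" or "object-storage"
    offsite: bool


def check_3_2_1(copies):
    """Return a list of rule violations; an empty list means the plan passes."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({copy.storage_type for copy in copies}) < 2:
        problems.append("fewer than 2 storage types")
    if not any(copy.offsite for copy in copies):
        problems.append("no offsite copy")
    return problems


# The example implementation above, with assumed labels.
plan = [
    BackupCopy("Local snapshot", "disk", offsite=False),
    BackupCopy("Remote replica", "replica-server", offsite=True),
    BackupCopy("Cloud backup", "object-storage", offsite=True),
]

print(check_3_2_1(plan))
print(check_3_2_1(plan[:1]))
```

A check like this only validates the plan on paper; it says nothing about whether a restore actually works, which still has to be tested.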

March 11, 2026 · 7 min · 1321 words · Rob Washington

Kubernetes Debugging: A Practical Field Guide

Your pod won’t start. The service isn’t routing. Something’s wrong but kubectl isn’t telling you what. Here’s how to actually debug Kubernetes problems.

The Debugging Hierarchy

Work from the outside in:

1. Cluster level: is the cluster healthy?
2. Node level: are nodes ready?
3. Pod level: is the pod running?
4. Container level: is the container healthy?
5. Application level: is the app working?

Most problems are at levels 3-5. Start there. ...

March 11, 2026 · 6 min · 1271 words · Rob Washington

Load Balancing: Beyond Round Robin

Round robin is the default. It’s also often wrong. Here’s how to choose load balancing strategies that actually match your workload.

The Strategies

Round Robin

Each request goes to the next server in rotation.

    upstream backend {
        server 10.0.0.1;
        server 10.0.0.2;
        server 10.0.0.3;
    }

Good for: Stateless services, similar server capacity
Bad for: Long-running connections, mixed server specs, sticky sessions

Weighted Round Robin

Same rotation, but some servers get more traffic. ...
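
To show what "rotation" and "weighted rotation" mean in practice, here is a minimal Python sketch. It is an illustration, not how nginx implements either strategy: the weighted version simply repeats each server according to its weight, which sends consecutive requests to the same server, whereas production balancers usually interleave them. The function names and the example weights are assumptions.

```python
import itertools
from collections import Counter


def round_robin(servers):
    """Endless iterator that yields servers in order, wrapping around."""
    if not servers:
        raise ValueError("at least one server is required")
    return itertools.cycle(list(servers))


def weighted_round_robin(weights):
    """Endless iterator where each server appears `weight` times per cycle.

    weights maps server -> positive integer weight.
    """
    expanded = [
        server
        for server, weight in weights.items()
        for _ in range(weight)
    ]
    if not expanded:
        raise ValueError("at least one server with a positive weight is required")
    return itertools.cycle(expanded)


# Plain rotation over the three servers from the upstream block above.
plain = round_robin(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
first_four = [next(plain) for _ in range(4)]

# Assumed weights: the first server takes three requests for every one
# the second server takes.
weighted = weighted_round_robin({"10.0.0.1": 3, "10.0.0.2": 1})
distribution = Counter(next(weighted) for _ in range(8))

print(first_four)
print(distribution)
```

Over eight requests the weighted iterator sends six to the first server and two to the second, which is the 3:1 split the weights ask for.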

March 11, 2026 · 5 min · 1037 words · Rob Washington