Zero-Downtime Deployments: Strategies That Actually Work
Practical deployment strategies for keeping your services running while you ship new code.
February 22, 2026 · 7 min · 1329 words · Rob Washington
“We’re deploying, please hold” is not an acceptable user experience. Whether you’re running a startup or enterprise infrastructure, users expect services to just work. Here’s how to ship code without the maintenance windows.
A zero-downtime deployment means users never notice you’re deploying. No error pages, no dropped connections, no “please refresh” messages. The old version serves traffic until the new version is proven healthy.
Rolling deployments are the simplest approach: replace instances one at a time.
```yaml
# Kubernetes rolling update config
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # Add 1 extra pod during update
      maxUnavailable: 0   # Never reduce below desired count
```
How it works:
Start one new pod with new version
Wait for it to pass health checks
Remove one old pod
Repeat until all pods are updated
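The loop above can be sketched as a toy Python simulation (pods are just version strings here, and `is_healthy` stands in for a readiness probe — this is not a Kubernetes API):

```python
# Toy simulation of a rolling update with maxSurge=1, maxUnavailable=0.
def rolling_update(pods, new_version, is_healthy):
    """Replace pods one at a time, never dropping below the desired count."""
    desired = len(pods)
    for i in range(desired):
        candidate = new_version            # 1. start one new pod (surge to desired+1)
        if not is_healthy(candidate):      # 2. wait for it to pass health checks
            raise RuntimeError("new pod unhealthy, rollout halted")
        pods[i] = candidate                # 3. only then remove one old pod
    return pods                            # 4. repeat until all pods run new_version

pods = ["v1", "v1", "v1", "v1"]
rolling_update(pods, "v2", is_healthy=lambda v: v == "v2")
```

The key property the simulation preserves: an unhealthy new version halts the rollout with the old pods still serving, which is exactly why `maxUnavailable: 0` matters.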
Pros:
Simple to implement
Built into Kubernetes, ECS, most orchestrators
Minimal extra resource usage
Cons:
Slow for large deployments
Both versions run simultaneously (must be compatible)
```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30   # Give app time to start
  periodSeconds: 10
  failureThreshold: 3       # 3 failures before restart
readinessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 1       # Remove from LB immediately
```
Key insight: Readiness checks should be stricter than liveness checks. A pod that can’t serve traffic should be removed from the load balancer immediately, but killing and restarting it might just make things worse.
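One way to make that distinction concrete, as a framework-free sketch (the dependency flags are illustrative, not any particular client library):

```python
def liveness() -> bool:
    # Liveness: is the process itself wedged? Fail only for unrecoverable
    # states (deadlock, corrupted internals) where a restart would help.
    return True

def readiness(db_ok: bool, cache_ok: bool) -> bool:
    # Readiness: can this instance serve traffic *right now*? Failing this
    # removes the pod from the load balancer without restarting it, so it
    # is safe to fail whenever a dependency is down.
    return db_ok and cache_ok

# A pod whose database connection dropped should stop receiving traffic...
assert readiness(db_ok=False, cache_ok=True) is False
# ...but restarting it won't bring the database back, so it stays alive.
assert liveness() is True
```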
When a pod is terminated, it needs time to finish in-flight requests:
```python
import signal
import sys

def graceful_shutdown(signum, frame):
    # Stop accepting new requests
    server.stop_accepting()
    # Wait for in-flight requests (with timeout)
    server.wait_for_completion(timeout=25)
    # Clean up connections
    db.close()
    redis.close()
    sys.exit(0)

signal.signal(signal.SIGTERM, graceful_shutdown)
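To see the handler mechanism itself in action, here is a self-contained sketch where the process sends itself a SIGTERM (`os.kill` on the current PID stands in for the kubelet; the drain logic is reduced to a marker):

```python
import os
import signal

shutdown_started = []

def graceful_shutdown(signum, frame):
    # In a real service: stop accepting, drain in-flight requests, close pools.
    shutdown_started.append(signum)

signal.signal(signal.SIGTERM, graceful_shutdown)
os.kill(os.getpid(), signal.SIGTERM)   # Kubernetes sends SIGTERM on pod termination

assert shutdown_started == [signal.SIGTERM]
```

The handler runs on the main thread as soon as the signal is delivered, which is why registering it early at startup matters: a SIGTERM that arrives before registration kills the process outright.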
Kubernetes config:
```yaml
spec:
  terminationGracePeriodSeconds: 30
  containers:
    - lifecycle:
        preStop:
          exec:
            command: ["/bin/sh", "-c", "sleep 5"]  # Let LB drain
```
The preStop hook runs before SIGTERM is sent. Use it to give load balancers time to remove the pod from rotation. Mind the budget: the 5-second preStop sleep plus the 25-second drain timeout must fit inside terminationGracePeriodSeconds (30 here), or Kubernetes will SIGKILL the pod mid-drain.
Start with rolling deployments — it’s good enough for most apps
Add proper health checks — readiness and liveness, checking real dependencies
Implement graceful shutdown — finish in-flight requests before dying
Use expand-contract migrations — never break backward compatibility
Graduate to canary when scale demands it — with automated analysis
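Expand-contract deserves a concrete example. A minimal sketch using sqlite3 (the table and column names are made up): the expand step adds the new column alongside the old one so both code versions keep working, and the contract step only ships once no old code remains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('ada')")

# EXPAND: add the new column; old code never sees it and is unaffected.
conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")
# Backfill so new code can rely on the new column being populated.
conn.execute("UPDATE users SET full_name = name WHERE full_name IS NULL")

# Both versions now work against the same schema during the rollout:
old_read = conn.execute("SELECT name FROM users").fetchone()[0]
new_read = conn.execute("SELECT full_name FROM users").fetchone()[0]
assert old_read == new_read == "ada"

# CONTRACT (a later deploy, once no old code remains):
#   ALTER TABLE users DROP COLUMN name
```

Dropping the old column in a separate, later deployment is the whole point: at no moment during the rollout does either version hit a schema it doesn't understand.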
Zero-downtime deployments aren’t about fancy tools. They’re about respecting the contract between old code and new code, giving systems time to transition, and having automated guardrails to catch problems before users do.
Ship fast. Ship often. Ship invisibly.