Zero-Downtime Deployments: Blue-Green and Canary Strategies
Implement blue-green and canary deployment patterns to release with confidence, roll back instantly, and catch issues before they impact all users.
February 11, 2026 · 8 min · 1669 words · Rob Washington
Deploying new code shouldn’t mean crossing your fingers. Blue-green and canary deployments let you release changes with confidence, validate them with real traffic, and roll back in seconds if something goes wrong.
Blue-green maintains two identical production environments. One serves traffic (blue), while the other stands ready (green). To deploy, you push to green, test it, then switch traffic over.
#!/bin/bash
set -e

NEW_VERSION=$1
CURRENT=$(kubectl get svc myapp -o jsonpath='{.spec.selector.version}')

if [ "$CURRENT" == "blue" ]; then
    TARGET="green"
else
    TARGET="blue"
fi

echo "Current: $CURRENT, Deploying to: $TARGET"

# Update the standby deployment
kubectl set image deployment/app-$TARGET app=myapp:$NEW_VERSION

# Wait for rollout
kubectl rollout status deployment/app-$TARGET --timeout=300s

# Run smoke tests against standby
kubectl run smoke-test --rm -it --image=curlimages/curl \
    --restart=Never -- curl -f http://app-$TARGET:8080/health

# Switch traffic
kubectl patch svc myapp -p "{\"spec\":{\"selector\":{\"version\":\"$TARGET\"}}}"

echo "Switched to $TARGET (v$NEW_VERSION)"
#!/bin/bash
# rollback.sh - switch back to previous version
CURRENT=$(kubectl get svc myapp -o jsonpath='{.spec.selector.version}')

if [ "$CURRENT" == "blue" ]; then
    PREVIOUS="green"
else
    PREVIOUS="blue"
fi

kubectl patch svc myapp -p "{\"spec\":{\"selector\":{\"version\":\"$PREVIOUS\"}}}"

echo "Rolled back to $PREVIOUS"
# Using nginx ingress annotations for traffic splitting
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-canary
  annotations:
    nginx.ingress.kubernetes.io/canary: "true"
    nginx.ingress.kubernetes.io/canary-weight: "10"  # 10% to canary
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: myapp-canary
            port:
              number: 80
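The manifest above pins the canary at a single weight. A gradual rollout usually raises `canary-weight` in stages, with a pause between steps to watch metrics. The sketch below only builds the `kubectl annotate` command for each stage and does not run anything. The helper name `canary_weight_commands` and the stage values are illustrative assumptions; the ingress name `myapp-canary` and the annotation key come from the manifest.

```python
from typing import List

# Annotation key taken from the ingress manifest above.
ANNOTATION = "nginx.ingress.kubernetes.io/canary-weight"


def canary_weight_commands(ingress: str, stages: List[int]) -> List[List[str]]:
    """Build one `kubectl annotate` command per rollout stage.

    Stages are percentages in [0, 100] and must be strictly increasing.
    """
    previous = -1
    commands = []
    for weight in stages:
        if not 0 <= weight <= 100:
            raise ValueError(f"weight out of range: {weight}")
        if weight <= previous:
            raise ValueError("stages must be strictly increasing")
        previous = weight
        commands.append([
            "kubectl", "annotate", "ingress", ingress,
            f"{ANNOTATION}={weight}", "--overwrite",
        ])
    return commands


if __name__ == "__main__":
    # Hypothetical schedule: 10% -> 25% -> 50% -> 100%
    for cmd in canary_weight_commands("myapp-canary", [10, 25, 50, 100]):
        print(" ".join(cmd))
```

Each command would be executed (for example with `subprocess.run`) only after the previous stage has looked healthy for long enough; how long that is depends on your traffic volume.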
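import hashlib
import random
from typing import Optional

from flask import Flask, request
import requests

app = Flask(__name__)

# Configuration
CANARY_PERCENTAGE = 10
STABLE_BACKEND = "http://stable-service:8080"
CANARY_BACKEND = "http://canary-service:8080"

# Methods the proxy accepts; Flask routes only handle GET by default
PROXY_METHODS = ['GET', 'POST', 'PUT', 'PATCH', 'DELETE']


def get_backend(user_id: Optional[str] = None) -> str:
    """
    Determine which backend to route to.
    Hash user_id into a stable bucket so the same user
    always sees the same version during rollout.
    """
    if user_id:
        # Consistent routing based on user. A SHA-256 digest is used because
        # the built-in hash() is randomized per process for strings, so it
        # would not be stable across restarts or replicas.
        digest = hashlib.sha256(user_id.encode('utf-8')).hexdigest()
        hash_val = int(digest, 16) % 100
        use_canary = hash_val < CANARY_PERCENTAGE
    else:
        # Random for anonymous users
        use_canary = random.randint(1, 100) <= CANARY_PERCENTAGE
    return CANARY_BACKEND if use_canary else STABLE_BACKEND


@app.route('/', defaults={'path': ''}, methods=PROXY_METHODS)
@app.route('/<path:path>', methods=PROXY_METHODS)
def proxy(path):
    user_id = request.headers.get('X-User-ID')
    backend = get_backend(user_id)

    # Forward request
    resp = requests.request(
        method=request.method,
        url=f"{backend}/{path}",
        headers={k: v for k, v in request.headers if k != 'Host'},
        data=request.get_data(),
        allow_redirects=False,
    )

    # Preserve the upstream status code and add a header
    # indicating which backend served the request
    response = app.make_response((resp.content, resp.status_code))
    response.headers['X-Served-By'] = 'canary' if backend == CANARY_BACKEND else 'stable'
    return response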
import os
import time

from flask import request
from prometheus_client import Counter, Histogram

# Track by version
requests_total = Counter(
    'http_requests_total',
    'Total requests',
    ['version', 'status_code', 'endpoint']
)
request_duration = Histogram(
    'http_request_duration_seconds',
    'Request duration',
    ['version', 'endpoint']
)


# In your app (assumes an existing Flask `app` object)
@app.before_request
def track_request():
    request.start_time = time.time()


@app.after_request
def record_metrics(response):
    version = os.environ.get('APP_VERSION', 'unknown')
    duration = time.time() - request.start_time
    requests_total.labels(
        version=version,
        status_code=response.status_code,
        endpoint=request.endpoint
    ).inc()
    request_duration.labels(
        version=version,
        endpoint=request.endpoint
    ).observe(duration)
    return response
Always have a rollback plan — test it before you need it
Monitor the right metrics — error rates, latency, business KPIs
Use consistent routing — same user should see same version
Automate promotion criteria — remove human error from the loop
Keep both versions compatible — especially for database schemas
Test the deployment process — not just the code
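To make "automate promotion criteria" concrete, here is a minimal sketch of a promotion gate that compares canary metrics against the stable baseline. The `VersionStats` type, the `should_promote` function, and every threshold are hypothetical choices for illustration, not values from a real rollout; in practice the inputs would come from your metrics system.

```python
from dataclasses import dataclass


@dataclass
class VersionStats:
    requests: int          # total requests observed in the window
    errors: int            # requests counted as failures (e.g. 5xx)
    p95_latency_s: float   # 95th percentile latency in seconds

    @property
    def error_rate(self) -> float:
        # Avoid division by zero when no traffic was observed.
        return self.errors / self.requests if self.requests else 0.0


def should_promote(
    stable: VersionStats,
    canary: VersionStats,
    min_requests: int = 500,          # assumed minimum sample size
    max_error_delta: float = 0.01,    # canary may be at most 1 point worse
    max_latency_ratio: float = 1.2,   # canary p95 may be at most 20% slower
) -> bool:
    """Return True only if the canary passes every check."""
    if canary.requests < min_requests:
        return False  # not enough traffic to judge
    if canary.error_rate - stable.error_rate > max_error_delta:
        return False  # error rate regressed too far
    if canary.p95_latency_s > stable.p95_latency_s * max_latency_ratio:
        return False  # latency regressed too far
    return True


if __name__ == "__main__":
    stable = VersionStats(requests=10_000, errors=20, p95_latency_s=0.200)
    canary = VersionStats(requests=1_000, errors=3, p95_latency_s=0.210)
    print("promote" if should_promote(stable, canary) else "hold or roll back")
```

Requiring a minimum request count keeps a quiet canary from being promoted on too little evidence. Business KPIs could be added as further checks in the same shape.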
Zero-downtime deployments aren’t just about uptime — they’re about deploying with confidence. When rollback is instant and painless, you ship faster because the cost of mistakes is lower.