Graceful Shutdown: The Art of Dying Well in Production
How to implement graceful shutdown in containers and services to prevent data loss, connection errors, and unhappy users during deployments.
February 12, 2026 · 6 min · 1203 words · Rob Washington
Your container is about to die. It has 30 seconds to live.
What happens next determines whether your users see a clean transition or a wall of 502 errors. Graceful shutdown is one of those things that seems obvious until you realize most applications do it wrong.
When Kubernetes (or Docker, or systemd) decides to stop your application, it sends a SIGTERM signal. Your application then has a grace period to finish what it's doing and exit cleanly: 30 seconds by default in Kubernetes, 10 seconds for docker stop, and 90 seconds for systemd. After that, it gets SIGKILL. No negotiation.
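To make the receiving side concrete, here is a minimal sketch in Python. It assumes a simple worker loop rather than a web server, and the names `stop_requested`, `handle_sigterm`, and `run_worker` are illustrative, not taken from any framework. The handler only sets a flag; the loop checks that flag between units of work, so whatever is in progress always finishes.

```python
import os
import signal
import threading

# Set by the signal handler; checked by the worker loop between units of work.
stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; the real work happens outside the handler."""
    stop_requested.set()


def run_worker(items, process):
    """Process items until a stop is requested; return the ones completed."""
    completed = []
    for item in items:
        if stop_requested.is_set():
            break  # take no new work; everything in `completed` finished fully
        process(item)
        completed.append(item)
    return completed


if __name__ == "__main__":
    signal.signal(signal.SIGTERM, handle_sigterm)

    def process(item):
        if item == 2:
            # Simulate the orchestrator sending SIGTERM part-way through (POSIX).
            os.kill(os.getpid(), signal.SIGTERM)

    print("Completed:", run_worker(range(1000), process))
```

Keeping the handler this small is a deliberate choice: it can interrupt the main thread at almost any point, so doing cleanup inside it invites surprises.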
When a Flask app running on the built-in development server receives SIGTERM, the server just… stops. In-flight requests get dropped. Database transactions hang. Connections break mid-stream.
    import signal
    import sys
    import threading

    from flask import Flask
    from waitress import serve

    app = Flask(__name__)

    # Track whether we're shutting down
    shutting_down = threading.Event()


    @app.route("/")
    def hello():
        return "Hello, World!"


    @app.route("/health")
    def health():
        if shutting_down.is_set():
            return "Shutting down", 503
        return "OK", 200


    def graceful_shutdown(signum, frame):
        print(f"Received signal {signum}, initiating graceful shutdown...")
        shutting_down.set()
        # Give load balancer time to remove us from rotation
        # In production, this would trigger server.close()
        sys.exit(0)


    if __name__ == "__main__":
        signal.signal(signal.SIGTERM, graceful_shutdown)
        signal.signal(signal.SIGINT, graceful_shutdown)
        print("Starting server...")
        serve(app, host="0.0.0.0", port=8080)
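The comment about `server.close()` skips over the draining step. As a rough sketch of what "finish in-flight work, then exit" can look like, the hypothetical `InFlightTracker` below counts active requests and lets a shutdown routine wait for the count to reach zero, up to a deadline. It is not part of Flask or waitress; how `track()` gets wrapped around request handling is an assumption left to your framework.

```python
import threading
import time
from contextlib import contextmanager


class InFlightTracker:
    """Counts requests in progress so shutdown can wait for them to finish."""

    def __init__(self):
        self._count = 0
        self._cond = threading.Condition()

    @contextmanager
    def track(self):
        """Wrap one request: increment on entry, decrement on exit."""
        with self._cond:
            self._count += 1
        try:
            yield
        finally:
            with self._cond:
                self._count -= 1
                if self._count == 0:
                    self._cond.notify_all()

    def wait_for_drain(self, timeout):
        """Block until no requests are active; False if the timeout expired."""
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout=timeout)


if __name__ == "__main__":
    tracker = InFlightTracker()

    def slow_request():
        with tracker.track():
            time.sleep(0.2)  # stand-in for real request handling

    worker = threading.Thread(target=slow_request)
    worker.start()
    time.sleep(0.05)  # let the request start before we begin draining
    print("drained:", tracker.wait_for_drain(timeout=5))
    worker.join()
```

A shutdown routine would flip the health check to 503 first, then call `wait_for_drain()` with a timeout comfortably shorter than the grace period, and only then exit.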
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: my-app
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: my-app
      template:
        metadata:
          labels:
            app: my-app
        spec:
          terminationGracePeriodSeconds: 45
          containers:
            - name: my-app
              image: my-app:latest
              ports:
                - containerPort: 8080
              livenessProbe:
                httpGet:
                  path: /health
                  port: 8080
                initialDelaySeconds: 5
                periodSeconds: 10
              readinessProbe:
                httpGet:
                  path: /health
                  port: 8080
                initialDelaySeconds: 5
                periodSeconds: 5
                failureThreshold: 1  # Remove from service immediately
              lifecycle:
                preStop:
                  exec:
                    command: ["/bin/sh", "-c", "sleep 5"]
Key settings:
terminationGracePeriodSeconds: 45 — Give the app more time than the default 30s
readinessProbe.failureThreshold: 1 — Remove from service after first failed check
preStop sleep — Extra buffer to ensure load balancer updates propagate
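These numbers interact: Kubernetes starts the grace-period clock when termination begins, so time spent in the preStop hook counts against `terminationGracePeriodSeconds`. Below is a small sketch of that arithmetic using the values from the manifest. The 5-second safety margin and both function names are my own assumptions, not Kubernetes settings.

```python
def drain_budget(grace_period_s, pre_stop_s, safety_margin_s=5):
    """Seconds the app may spend draining after it receives SIGTERM.

    safety_margin_s is an assumed buffer so the app exits before SIGKILL.
    """
    budget = grace_period_s - pre_stop_s - safety_margin_s
    if budget <= 0:
        raise ValueError("preStop and margin consume the whole grace period")
    return budget


def readiness_detection_s(period_s, failure_threshold):
    """Rough upper bound on how long failing readiness checks take to register."""
    return period_s * failure_threshold


if __name__ == "__main__":
    # Values from the Deployment: 45s grace period, 5s preStop sleep.
    print(drain_budget(grace_period_s=45, pre_stop_s=5))  # 35
    # Readiness probe: checked every 5s, one failure is enough.
    print(readiness_detection_s(period_s=5, failure_threshold=1))  # 5
```

Any in-app force-exit timer should fit inside the drain budget; otherwise SIGKILL arrives before your own timeout does.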
    const express = require('express');

    const app = express();
    let isShuttingDown = false;

    app.get('/', (req, res) => {
      res.send('Hello, World!');
    });

    app.get('/health', (req, res) => {
      if (isShuttingDown) {
        return res.status(503).send('Shutting down');
      }
      res.send('OK');
    });

    const server = app.listen(8080, () => {
      console.log('Server running on port 8080');
    });

    // Track active connections
    const connections = new Set();
    server.on('connection', (conn) => {
      connections.add(conn);
      conn.on('close', () => connections.delete(conn));
    });

    function gracefulShutdown(signal) {
      console.log(`Received ${signal}, starting graceful shutdown...`);
      isShuttingDown = true;

      // Stop accepting new connections
      server.close(() => {
        console.log('HTTP server closed');
        process.exit(0);
      });

      // Close idle keep-alive connections now; busy ones finish their response first.
      // Calling conn.end() on every socket here would cut off responses still in progress.
      // closeIdleConnections() requires Node.js 18.2 or later.
      server.closeIdleConnections();

      // Force exit after timeout
      setTimeout(() => {
        console.error(`Forceful shutdown after timeout (${connections.size} connections still open)`);
        for (const conn of connections) {
          conn.destroy();
        }
        process.exit(1);
      }, 25000);
    }

    process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
    process.on('SIGINT', () => gracefulShutdown('SIGINT'));
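    import atexit
    import sys

    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    engine = create_engine('postgresql://...')
    Session = sessionmaker(bind=engine)


    def cleanup_db():
        print("Closing database connections...")
        engine.dispose()
        print("Database connections closed")


    atexit.register(cleanup_db)


    # Also handle in signal handler
    # (shutting_down is the threading.Event from the Flask example)
    def graceful_shutdown(signum, frame):
        shutting_down.set()
        cleanup_db()
        # sys.exit() also triggers the atexit hook, so cleanup_db() runs a
        # second time; calling engine.dispose() twice is harmless.
        sys.exit(0)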
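Because the cleanup is reachable from both the signal handler and the `atexit` hook, it can run twice. That is harmless for `engine.dispose()`, but not for every resource. One way to guard against it, sketched under the assumption that the cleanup is a plain callable, is a small run-once wrapper. `OnceCleanup` and `FakeEngine` are hypothetical names for illustration, not SQLAlchemy APIs.

```python
import atexit
import threading


class OnceCleanup:
    """Runs a cleanup callable at most once, however many times it is triggered."""

    def __init__(self, cleanup):
        self._cleanup = cleanup
        # This lock is never released, so it works as an atomic "claimed" flag.
        self._claimed = threading.Lock()

    def __call__(self):
        # A non-blocking acquire succeeds exactly once and never waits, so a
        # signal handler that re-enters this method cannot deadlock.
        if not self._claimed.acquire(blocking=False):
            return False
        self._cleanup()
        return True


if __name__ == "__main__":
    class FakeEngine:
        """Stand-in for a real engine so the sketch runs without a database."""

        def dispose(self):
            print("Database connections closed")

    cleanup_db = OnceCleanup(FakeEngine().dispose)
    atexit.register(cleanup_db)  # runs at interpreter exit, but as a no-op
    cleanup_db()                 # e.g. called from the signal handler
```

The wrapper returns `True` only for the call that actually performed the cleanup, which is handy for logging which path triggered it.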
    # Start your container
    docker run -d --name test-app my-app:latest

    # Send SIGTERM
    docker kill --signal=SIGTERM test-app

    # Watch logs
    docker logs -f test-app

    # Verify exit code
    docker inspect test-app --format='{{.State.ExitCode}}'
    # Should be 0
In Kubernetes:
    # Watch pod termination
    kubectl delete pod my-app-xxxxx & kubectl logs -f my-app-xxxxx

    # Check for connection errors in other pods during rolling update
    kubectl rollout restart deployment/my-app
    kubectl logs -f deployment/my-other-service | grep -i error
Graceful shutdown is the difference between a rolling deployment and a rolling disaster. Your application will die—that’s inevitable in container orchestration. What matters is whether it dies well.
The 30-second grace period is a gift. Use it to:
Tell the load balancer you’re leaving
Finish what you started
Clean up after yourself
Exit with dignity
Your users won’t notice a graceful shutdown. They’ll definitely notice a graceless one.