How to implement graceful shutdown so deployments don't interrupt in-flight requests — signals, drain periods, and health check coordination.
February 19, 2026 · 10 min · 2111 words · Rob Washington
Your deploy shouldn’t kill requests mid-flight. Every dropped connection is a failed payment, a lost form submission, or a frustrated user. Graceful shutdown ensures your application finishes what it started before dying.
Here's the core pattern in Python: trap SIGTERM and SIGINT, stop accepting new work, then wait for in-flight requests to drain before exiting.

```python
import signal
import sys
import threading
import time
from http.server import HTTPServer

# Track in-flight requests
active_requests = threading.Semaphore(100)  # Max concurrent
shutdown_event = threading.Event()

class GracefulServer:
    def __init__(self, server):
        self.server = server
        self.setup_signal_handlers()

    def setup_signal_handlers(self):
        signal.signal(signal.SIGTERM, self.handle_shutdown)
        signal.signal(signal.SIGINT, self.handle_shutdown)

    def handle_shutdown(self, signum, frame):
        print(f"Received signal {signum}, starting graceful shutdown...")
        shutdown_event.set()
        # Stop accepting new connections. Note: shutdown() must run in a
        # different thread than serve_forever(), or it deadlocks.
        self.server.shutdown()
        # Wait for in-flight requests (max 30 seconds).
        # _value is a CPython implementation detail; a plain counter
        # behind a lock is safer in production code.
        deadline = time.time() + 30
        while active_requests._value < 100 and time.time() < deadline:
            time.sleep(0.1)
        remaining = 100 - active_requests._value
        if remaining > 0:
            print(f"Warning: {remaining} requests still in flight at shutdown")
        sys.exit(0)

# Decorator for request handlers
def tracked_request(func):
    def wrapper(*args, **kwargs):
        if shutdown_event.is_set():
            return {"error": "Server shutting down"}, 503
        acquired = active_requests.acquire(blocking=False)
        if not acquired:
            return {"error": "Too many requests"}, 429
        try:
            return func(*args, **kwargs)
        finally:
            active_requests.release()
    return wrapper
```
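To see the decorator's behavior in isolation, here is a condensed, self-contained restatement with a made-up handler (`handle_ping` is illustrative, not part of any API): once the shutdown event is set, wrapped handlers return 503 without ever touching the semaphore.

```python
import threading

shutdown_event = threading.Event()
active_requests = threading.Semaphore(100)

def tracked_request(func):
    def wrapper(*args, **kwargs):
        if shutdown_event.is_set():
            return {"error": "Server shutting down"}, 503
        if not active_requests.acquire(blocking=False):
            return {"error": "Too many requests"}, 429
        try:
            return func(*args, **kwargs)
        finally:
            active_requests.release()
    return wrapper

@tracked_request
def handle_ping(name):
    return {"pong": name}, 200

print(handle_ping("a"))  # ({'pong': 'a'}, 200)
shutdown_event.set()
print(handle_ping("b"))  # ({'error': 'Server shutting down'}, 503)
```

New requests are rejected fast and cheaply; only requests already holding a semaphore slot count as in flight.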
The same pattern in Node.js: track responses in a Set, flip a flag, and poll until the set empties.

```javascript
const http = require('http');

let isShuttingDown = false;
let activeConnections = new Set();

const server = http.createServer((req, res) => {
  if (isShuttingDown) {
    res.writeHead(503);
    res.end('Server is shutting down');
    return;
  }
  // Track connection
  activeConnections.add(res);
  res.on('finish', () => activeConnections.delete(res));
  // Handle request (handleRequest is your app's router)
  handleRequest(req, res);
});

function gracefulShutdown(signal) {
  console.log(`Received ${signal}, starting graceful shutdown...`);
  isShuttingDown = true;
  // Stop accepting new connections
  server.close(() => {
    console.log('Server closed, no new connections');
  });
  // Wait for existing connections to finish
  const shutdownTimeout = setTimeout(() => {
    console.log(`Forcing shutdown with ${activeConnections.size} connections remaining`);
    process.exit(1);
  }, 30000);
  // Check periodically if all connections are done
  const checkInterval = setInterval(() => {
    if (activeConnections.size === 0) {
      clearInterval(checkInterval);
      clearTimeout(shutdownTimeout);
      console.log('All connections closed, exiting cleanly');
      process.exit(0);
    }
  }, 100);
}

process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT', () => gracefulShutdown('SIGINT'));

server.listen(8080);
```
Kubernetes sends SIGTERM, waits up to terminationGracePeriodSeconds (default 30 seconds), then sends SIGKILL. But there's a race condition: removing the pod from Service endpoints propagates asynchronously, so the pod can still receive traffic for a short window after SIGTERM arrives.
Add a delay before shutdown to let endpoints update:
```yaml
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: app
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]
            # Or use an HTTP endpoint
            # preStop:
            #   httpGet:
            #     path: /prestop
            #     port: 8080
```
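The commented-out httpGet variant means your app has to serve that endpoint itself. Here's a minimal sketch of what such a handler could look like (the /prestop and /health paths and the draining flag are assumptions of this sketch, not anything Kubernetes mandates):

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

draining = threading.Event()

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/prestop":
            # Kubernetes calls this before sending SIGTERM. Flip the
            # flag, then hold the hook open while endpoints update.
            draining.set()
            time.sleep(10)
            self.send_response(200)
            self.end_headers()
        elif self.path == "/health":
            # Fail health checks once draining so readiness probes
            # and load balancers stop routing here.
            self.send_response(503 if draining.is_set() else 200)
            self.end_headers()
        else:
            self.send_response(200)
            self.end_headers()

    def log_message(self, *args):
        pass  # keep demo output quiet

if __name__ == "__main__":
    ThreadingHTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```

A failing /health like this serves double duty: it satisfies Kubernetes readiness probes and load balancer checks such as HAProxy's `option httpchk GET /health`.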
The same coordination applies outside Kubernetes. With HAProxy, active health checks pull a draining server out of rotation: once /health starts failing, no new connections are routed to it. Note that default-server must come before the server lines it configures:

```
backend app
    option httpchk GET /health
    # Mark a server down after 3 failed checks, 3s apart (~9s to drain)
    default-server inter 3s fall 3 rise 2
    server app1 10.0.0.1:8080 check
    server app2 10.0.0.2:8080 check
```
A real application has more to wind down than HTTP: background workers, connection pools, buffered logs. Order matters, so stop intake first, drain, then close dependencies:

```python
import logging
import signal
import sys

# create_db_pool, create_redis_client, and BackgroundWorker are
# app-specific helpers, sketched here for the shutdown sequence.

class Application:
    def __init__(self):
        self.db_pool = create_db_pool()
        self.redis = create_redis_client()
        self.background_workers = []

    def start_background_workers(self):
        for i in range(4):
            worker = BackgroundWorker()
            worker.start()
            self.background_workers.append(worker)

    def shutdown(self, signum, frame):
        print("Shutdown initiated...")
        # 1. Stop accepting new work
        self.http_server.stop_accepting()
        # 2. Stop background workers (let them finish current job)
        for worker in self.background_workers:
            worker.stop_gracefully()
        # 3. Wait for HTTP requests to drain
        self.http_server.wait_for_drain(timeout=15)
        # 4. Wait for background workers
        for worker in self.background_workers:
            worker.join(timeout=10)
        # 5. Close database connections
        self.db_pool.close()
        # 6. Close Redis
        self.redis.close()
        # 7. Flush any buffered logs/metrics
        logging.shutdown()
        print("Clean shutdown complete")
        sys.exit(0)

app = Application()
signal.signal(signal.SIGTERM, app.shutdown)
```
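BackgroundWorker and its stop_gracefully() are pseudocode above. One way to honor that contract, sketched under the assumption that workers pull jobs from a shared queue (the class and method names simply mirror the sketch):

```python
import queue
import threading

class BackgroundWorker(threading.Thread):
    """Worker that finishes its current job before exiting."""

    def __init__(self, jobs: queue.Queue):
        super().__init__(daemon=True)
        self.jobs = jobs
        self._stop_requested = threading.Event()
        self.completed = 0

    def run(self):
        while not self._stop_requested.is_set():
            try:
                job = self.jobs.get(timeout=0.1)
            except queue.Empty:
                continue  # nothing to do; re-check the stop flag
            job()                # a picked-up job always runs to completion
            self.completed += 1

    def stop_gracefully(self):
        # Signal the loop to exit after the current job; never interrupt it.
        self._stop_requested.set()
```

Pair stop_gracefully() with join(timeout=...) as in the shutdown sequence above. Jobs still sitting in the queue at stop time are dropped, so anything that must survive a deploy belongs in a durable queue, not in memory.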
- **Stop accepting new connections** — Return 503 immediately
- **Wait for in-flight requests** — With a reasonable timeout
- **Coordinate with load balancers** — PreStop hooks, health checks
- **Clean up resources** — Database connections, file handles, workers
- **Set appropriate timeouts** — terminationGracePeriodSeconds > your drain time
- **Test it** — Actually verify requests complete during deploys
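That last item is automatable. Below is one possible smoke test, a self-contained sketch: a toy ThreadingHTTPServer stands in for your app, the test fires requests, triggers shutdown while they're mid-flight, and checks that every request still completes. It relies on the fact that ThreadingHTTPServer.shutdown() stops the accept loop while letting in-flight handler threads finish.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

started = 0                      # requests that reached a handler
started_lock = threading.Lock()

class SlowHandler(BaseHTTPRequestHandler):
    """Stand-in for the app: every request takes 200ms of 'work'."""

    def do_GET(self):
        global started
        with started_lock:
            started += 1
        time.sleep(0.2)  # simulate work so shutdown races the requests
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet

server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

N = 10
statuses, failures = [], []

def client():
    try:
        with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as r:
            statuses.append(r.status)
    except Exception as exc:
        failures.append(exc)

threads = [threading.Thread(target=client) for _ in range(N)]
for t in threads:
    t.start()

# Wait until every request is in flight, then shut down mid-request.
deadline = time.time() + 5
while started < N and time.time() < deadline:
    time.sleep(0.01)
server.shutdown()   # stop accepting; in-flight handlers keep running

for t in threads:
    t.join()
server.server_close()
print(f"completed={len(statuses)} failed={len(failures)}")
```

Run the same idea against a staging deploy: a loop of real requests during a rollout, with any non-2xx counted as a dropped request.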
Graceful shutdown is invisible when it works. Users never see the deploys. That’s the goal: infrastructure that serves reliability, not the other way around.
Deploy with confidence. Shut down with grace.