Your container is about to die. It has 30 seconds to live.

What happens next determines whether your users see a clean transition or a wall of 502 errors. Graceful shutdown is one of those things that seems obvious until you realize most applications do it wrong.

The Problem

When Kubernetes (or Docker, or systemd) decides to stop your application, it sends a SIGTERM signal. Your application has a grace period—usually 30 seconds—to finish what it’s doing and exit cleanly. After that, it gets SIGKILL. No negotiation.

The naive approach:

# Bad: ignores shutdown signals entirely
from flask import Flask
app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello, World!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

When this receives SIGTERM, Flask’s development server just… stops. In-flight requests get dropped. Database transactions hang. Connections break mid-stream.
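
You can watch the default behavior without a web server at all. This sketch (assuming a POSIX system and a local Python 3 interpreter) sends SIGTERM to a child process that installs no handler:

```python
import subprocess
import sys

# A process with no SIGTERM handler is killed immediately: no cleanup
# code runs, and the exit status records the signal, not a clean 0.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
proc.terminate()          # delivers SIGTERM
proc.wait()
print(proc.returncode)    # -15 on POSIX: terminated by signal 15 (SIGTERM)
```

That negative return code is exactly what a supervisor sees when an app dies gracelessly, and it is what we want to turn into a clean 0.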

The Solution: Signal Handling

A properly graceful application:

  1. Catches SIGTERM
  2. Stops accepting new connections
  3. Finishes processing in-flight requests
  4. Closes database connections cleanly
  5. Exits with code 0

Here’s how to do it in Python with a production WSGI server:

import signal
import sys
import threading
import time
from flask import Flask
from waitress import serve

app = Flask(__name__)

# Track whether we're shutting down
shutting_down = threading.Event()

@app.route("/")
def hello():
    return "Hello, World!"

@app.route("/health")
def health():
    if shutting_down.is_set():
        return "Shutting down", 503
    return "OK", 200

def graceful_shutdown(signum, frame):
    print(f"Received signal {signum}, initiating graceful shutdown...")
    shutting_down.set()
    # Linger before exiting so the load balancer has time to fail its
    # health checks and take us out of rotation; worker threads finish
    # requests already in flight during this window.
    time.sleep(10)
    sys.exit(0)

if __name__ == "__main__":
    signal.signal(signal.SIGTERM, graceful_shutdown)
    signal.signal(signal.SIGINT, graceful_shutdown)
    
    print("Starting server...")
    serve(app, host="0.0.0.0", port=8080)

The Health Check Dance

Notice the health endpoint returns 503 when shutting down. This is crucial. Here’s why:

  T+0s:  SIGTERM arrives; the handler sets shutting_down
  T+0s:  /health starts returning 503
  T+3s:  the load balancer fails its next health check and stops sending new requests
  T+5s:  the app is out of rotation; only in-flight requests remain

If your health check keeps returning 200 during shutdown, the load balancer keeps sending new requests to a dying container. Those requests fail.
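
The readiness flip itself is easy to unit-test in isolation. Here is a minimal sketch of the pattern with the Flask plumbing stripped away (shutting_down mirrors the event from the earlier example):

```python
import threading

shutting_down = threading.Event()

def health():
    # Same logic as the /health route: advertise 503 once shutdown begins
    if shutting_down.is_set():
        return "Shutting down", 503
    return "OK", 200

print(health())        # ('OK', 200)
shutting_down.set()    # what the SIGTERM handler does
print(health())        # ('Shutting down', 503)
```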

Kubernetes Configuration

Your Kubernetes deployment needs to cooperate:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      terminationGracePeriodSeconds: 45
      containers:
        - name: my-app
          image: my-app:latest
          ports:
            - containerPort: 8080
          
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
            
          readinessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 1  # Remove from service immediately
            
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 5"]

Key settings:

  • terminationGracePeriodSeconds: 45 — Give the app more time than the default 30s
  • readinessProbe.failureThreshold: 1 — Remove from service after first failed check
  • preStop sleep — Extra buffer to ensure load balancer updates propagate
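
It's worth sanity-checking the arithmetic behind these numbers. A back-of-the-envelope sketch using the probe settings above (real detection time also depends on how quickly endpoint updates propagate, so treat this as a rough lower bound on the budget):

```python
# Worst case for the readiness probe to notice the 503:
# one full probe period, times the failure threshold.
period_seconds = 5
failure_threshold = 1
detection = period_seconds * failure_threshold   # up to 5s

pre_stop_sleep = 5          # extra buffer from the preStop hook
grace_period = 45           # terminationGracePeriodSeconds

# What's left of the grace period for in-flight work and cleanup
drain_budget = grace_period - detection - pre_stop_sleep
print(drain_budget)         # 35 seconds
```

If your slowest request can run longer than that budget, raise terminationGracePeriodSeconds accordingly.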

Node.js Example

const express = require('express');
const app = express();

let isShuttingDown = false;

app.get('/', (req, res) => {
  res.send('Hello, World!');
});

app.get('/health', (req, res) => {
  if (isShuttingDown) {
    return res.status(503).send('Shutting down');
  }
  res.send('OK');
});

const server = app.listen(8080, () => {
  console.log('Server running on port 8080');
});

// Track active connections
let connections = new Set();

server.on('connection', (conn) => {
  connections.add(conn);
  conn.on('close', () => connections.delete(conn));
});

function gracefulShutdown(signal) {
  console.log(`Received ${signal}, starting graceful shutdown...`);
  isShuttingDown = true;
  
  // Stop accepting new connections
  server.close(() => {
    console.log('HTTP server closed');
    process.exit(0);
  });
  
  // End open keep-alive connections; end() flushes any in-flight
  // response data before closing, so idle sockets don't hold the
  // server open indefinitely
  for (const conn of connections) {
    conn.end();
  }
  
  // Force exit after timeout
  setTimeout(() => {
    console.error('Forceful shutdown after timeout');
    process.exit(1);
  }, 25000);
}

process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT', () => gracefulShutdown('SIGINT'));

Database Connection Cleanup

Don’t forget your database connections:

import atexit
import sys
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

engine = create_engine('postgresql://...')
Session = sessionmaker(bind=engine)

def cleanup_db():
    print("Closing database connections...")
    engine.dispose()
    print("Database connections closed")

atexit.register(cleanup_db)

# Also call cleanup from the signal handler (shutting_down is the
# threading.Event from the earlier example); disposing twice is safe
def graceful_shutdown(signum, frame):
    shutting_down.set()
    cleanup_db()
    sys.exit(0)

The Complete Checklist

Before your next deployment, verify:

  • Application catches SIGTERM
  • Health endpoint returns 503 during shutdown
  • In-flight requests are allowed to complete
  • Database connections are closed cleanly
  • Background workers finish current jobs
  • terminationGracePeriodSeconds is sufficient
  • preStop hook gives load balancer time to update
  • Application exits with code 0 on clean shutdown
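
The background-worker item deserves its own pattern: workers should check a stop flag between jobs, never mid-job. A minimal sketch in Python (the queue and job shape are illustrative):

```python
import queue
import threading

stop = threading.Event()
jobs = queue.Queue()

def worker():
    while not stop.is_set():
        try:
            job = jobs.get(timeout=1)   # wake up periodically to re-check
        except queue.Empty:
            continue
        job()                           # finish the current job completely
        jobs.task_done()

t = threading.Thread(target=worker)
t.start()

done = []
jobs.put(lambda: done.append("job-1"))
jobs.join()        # wait for in-flight work to finish
stop.set()         # then signal the worker to exit its loop
t.join()
print(done)        # ['job-1']
```

The SIGTERM handler only needs to call stop.set() and then join the worker threads before exiting.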

Testing Graceful Shutdown

Don’t wait for production to find out:

# Start your container
docker run -d --name test-app my-app:latest

# Send SIGTERM
docker kill --signal=SIGTERM test-app

# Watch logs
docker logs -f test-app

# Verify exit code
docker inspect test-app --format='{{.State.ExitCode}}'
# Should be 0
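
The same check can be automated without Docker. This sketch (the inline child script is illustrative) spawns a process that handles SIGTERM, terminates it, and reads back the exit code:

```python
import subprocess
import sys
import time

# Child: installs a SIGTERM handler that exits 0, like a graceful app
child_src = """
import signal, sys, time
signal.signal(signal.SIGTERM, lambda s, f: sys.exit(0))
while True:
    time.sleep(0.1)
"""

proc = subprocess.Popen([sys.executable, "-c", child_src])
time.sleep(1.0)          # let the child install its handler
proc.terminate()         # SIGTERM, as Docker/Kubernetes would send
proc.wait(timeout=5)
print(proc.returncode)   # 0: graceful; -15 would mean the default killed it
```

Drop a test like this into CI and a regression in your shutdown path fails the build instead of a deploy.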

In Kubernetes:

# Watch pod termination
kubectl logs -f my-app-xxxxx &
kubectl delete pod my-app-xxxxx

# Check for connection errors in other pods during rolling update
kubectl rollout restart deployment/my-app
kubectl logs -f deployment/my-other-service | grep -i error

Conclusion

Graceful shutdown is the difference between a rolling deployment and a rolling disaster. Your application will die—that’s inevitable in container orchestration. What matters is whether it dies well.

The 30-second grace period is a gift. Use it to:

  1. Tell the load balancer you’re leaving
  2. Finish what you started
  3. Clean up after yourself
  4. Exit with dignity

Your users won’t notice a graceful shutdown. They’ll definitely notice a graceless one.