Your application is running in production. You need to restart it for a config change. Do you:
A) kill -9 and hope for the best
B) Send a signal it can handle gracefully
If you picked A, you’ve probably lost data. Let’s fix that.
The Essential Signals#
| Signal | Number | Default Action | Use Case |
|---|---|---|---|
| SIGTERM | 15 | Terminate | Graceful shutdown request |
| SIGINT | 2 | Terminate | Ctrl+C, interactive stop |
| SIGHUP | 1 | Terminate | Config reload (by convention) |
| SIGKILL | 9 | Terminate | Force kill (cannot be caught) |
| SIGUSR1/2 | 10/12 | Terminate | Application-defined |
| SIGCHLD | 17 | Ignore | Child process state change |

Signal numbers shown are for x86-64 Linux; run `kill -l` to list them on your system.
SIGTERM is the polite ask. “Please shut down when convenient.”
SIGKILL is the eviction notice. No cleanup, no saving state, immediate death.
Always try SIGTERM first. Reserve SIGKILL for processes that refuse to die.
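A minimal sketch of that escalation in Python, using `subprocess`: the child deliberately ignores SIGTERM so the parent is forced to fall back to SIGKILL after a grace period.

```python
import subprocess
import time

# Spawn a child that ignores SIGTERM, to force the escalation path
child = subprocess.Popen(
    ["python3", "-c",
     "import signal, time; signal.signal(signal.SIGTERM, signal.SIG_IGN); time.sleep(60)"]
)
time.sleep(1)  # give the child time to install its SIG_IGN handler

child.terminate()            # polite: send SIGTERM first
try:
    child.wait(timeout=2)    # grace period
except subprocess.TimeoutExpired:
    child.kill()             # eviction notice: SIGKILL
    child.wait()

print(child.returncode)      # -9: killed by signal 9 (SIGKILL)
```

A negative return code from `Popen.wait` means the process died from a signal, which makes escalation logic easy to verify.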
Handling Signals in Your Code#
Bash#
```bash
#!/usr/bin/env bash

cleanup() {
    echo "Caught signal, cleaning up..."
    rm -f "$LOCK_FILE"
    trap '' SIGTERM    # don't re-enter this handler when we signal ourselves
    kill -- -$$        # kill the whole process group
    exit 0
}

trap cleanup SIGTERM SIGINT SIGHUP

# Your long-running work
while true; do
    do_work
    sleep 60
done
```
The trap command registers a handler. When the signal arrives, cleanup runs.
Python#
```python
import signal
import sys

def graceful_shutdown(signum, frame):
    print(f"Received signal {signum}, shutting down...")
    # Close database connections
    # Flush buffers
    # Save state
    sys.exit(0)

signal.signal(signal.SIGTERM, graceful_shutdown)
signal.signal(signal.SIGINT, graceful_shutdown)

# Your application loop
while True:
    process_work()
```
Node.js#
```javascript
process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down gracefully...');
  server.close(() => {
    console.log('HTTP server closed');
    db.end(() => {
      console.log('Database connections closed');
      process.exit(0);
    });
  });
});

process.on('SIGINT', () => {
  // Same handling for Ctrl+C
});
```
Go#

```go
package main

import (
	"log"
	"os"
	"os/signal"
	"syscall"
)

func main() {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM, syscall.SIGINT)

	go func() {
		sig := <-sigs
		log.Printf("Received %v, shutting down...", sig)
		// Cleanup
		os.Exit(0)
	}()

	// Your application
	runServer()
}
```
SIGHUP for Config Reload#
By convention, SIGHUP triggers a config reload without restarting:
```bash
# Reload nginx config
kill -HUP $(cat /var/run/nginx.pid)

# Or using systemctl
systemctl reload nginx
```
Implementing in your application:
```python
import signal

config = load_config()

def reload_config(signum, frame):
    global config
    print("Reloading configuration...")
    config = load_config()
    print("Configuration reloaded")

signal.signal(signal.SIGHUP, reload_config)
```
This is why kill -HUP reloads configs for nginx, Apache, and most daemons.
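You can exercise this pattern without a second terminal by having the process signal itself. A minimal, self-contained sketch (the counter stands in for an actual config reload):

```python
import os
import signal

reload_count = 0

def reload_config(signum, frame):
    # In a real daemon this would re-read the config file
    global reload_count
    reload_count += 1

signal.signal(signal.SIGHUP, reload_config)

# Simulate `kill -HUP <pid>` by signalling our own process
os.kill(os.getpid(), signal.SIGHUP)

print(reload_count)  # 1: the handler ran instead of the default (terminate)
```

Without the registered handler, that same `os.kill` would terminate the process, since SIGHUP's default action is Terminate.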
Process Groups and Job Control#
When you start a process, it gets a Process ID (PID). Related processes share a Process Group ID (PGID).
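The PID/PGID relationship is easy to observe from Python. A sketch using `start_new_session=True` (which calls `setsid`, making the child the leader of a fresh group, the way a shell does for jobs):

```python
import os
import signal
import subprocess

# Start a child in its own session and process group
child = subprocess.Popen(["sleep", "60"], start_new_session=True)

pgid = os.getpgid(child.pid)
print(pgid == child.pid)   # True: a group leader's PGID equals its PID

# Equivalent of `kill -- -PGID`: signal the entire group at once
os.killpg(pgid, signal.SIGTERM)
child.wait()
print(child.returncode)    # -15: terminated by SIGTERM
```

`os.killpg(pgid, sig)` is the programmatic counterpart of `kill -- -PGID` on the command line.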
```bash
# Kill a process
kill 12345

# Kill a process group (note the negative)
kill -- -12345

# Kill all processes matching a full command line
pkill -f "python myapp.py"

# Kill processes owned by a user
pkill -u deploy
```
In scripts that spawn child processes, killing the parent doesn’t automatically kill children. Handle this:
```bash
cleanup() {
    # Kill entire process group
    kill -- -$$
}
trap cleanup EXIT
```
Graceful Shutdown Pattern#
A robust service handles shutdown in phases:
```python
import signal
import threading
from queue import Empty

shutdown_event = threading.Event()

def handle_shutdown(signum, frame):
    print("Shutdown initiated...")
    shutdown_event.set()

signal.signal(signal.SIGTERM, handle_shutdown)

def worker():
    while not shutdown_event.is_set():
        try:
            task = queue.get(timeout=1)
        except Empty:
            continue  # nothing to do; re-check the shutdown flag
        process(task)
    print("Worker finished current task and exiting")

# Main loop checks shutdown_event
while not shutdown_event.is_set():
    accept_new_work()

# Drain remaining work
wait_for_workers_to_finish(timeout=30)
print("Graceful shutdown complete")
```
Key points:
- Stop accepting new work
- Finish in-progress work (with timeout)
- Close connections and flush buffers
- Exit cleanly
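The phases above can be demonstrated end to end with a worker thread and a real queue. A minimal sketch: the worker keeps draining after the shutdown flag is set, so no queued work is lost.

```python
import queue
import threading

shutdown = threading.Event()
q = queue.Queue()
done = []

def worker():
    # Keep running until shutdown is requested AND the queue is drained
    while not shutdown.is_set() or not q.empty():
        try:
            task = q.get(timeout=0.1)
        except queue.Empty:
            continue
        done.append(task)  # stands in for real task processing

t = threading.Thread(target=worker)
t.start()

for i in range(5):
    q.put(i)           # accept work

shutdown.set()         # stop accepting new work
t.join(timeout=5)      # finish in-progress work (with timeout)

print(done)            # [0, 1, 2, 3, 4]: everything queued was processed
```

The `or not q.empty()` clause is what turns "stop immediately" into "stop accepting, then drain".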
Docker and Kubernetes Considerations#
Containers receive SIGTERM when stopped. Kubernetes sends SIGTERM, waits terminationGracePeriodSeconds (default 30s), then SIGKILL.
```dockerfile
# Your app must handle SIGTERM - exec form runs it directly as PID 1
CMD ["python", "app.py"]

# NOT this - the shell form doesn't forward signals
CMD python app.py
```
The second form runs through /bin/sh -c, which doesn’t forward signals to your app. Use exec form or handle it explicitly:
```dockerfile
CMD ["sh", "-c", "exec python app.py"]
```
In Kubernetes, add a preStop hook for additional cleanup time:
```yaml
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "sleep 5"]
```
Debugging Signal Issues#
```bash
# See which signals a process blocks, ignores, and catches
grep Sig /proc/12345/status

# Trace signals sent to a process
strace -e signal -p 12345

# Send signal and watch behavior
kill -TERM 12345 && tail -f /var/log/myapp.log
```
Common Mistakes#
Not handling SIGTERM at all:
```bash
# Bad: App ignores SIGTERM, gets SIGKILL after timeout
docker stop mycontainer  # Waits 10s, then kills
```
Catching SIGKILL:
```python
# This fails - SIGKILL cannot be caught
signal.signal(signal.SIGKILL, handler)  # raises OSError: Invalid argument
```
Not propagating to children:
```bash
# Parent dies, children are orphaned (reparented to PID 1) and keep running
kill $PARENT_PID  # Children keep running
```
Blocking in signal handler:
```python
def handler(sig, frame):
    time.sleep(30)  # Bad! Blocks signal handling
    save_state()    # May never run
```
Keep signal handlers fast. Set a flag, let the main loop do cleanup.
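The flag pattern in its simplest runnable form, signalling our own process to stand in for an external `kill`:

```python
import os
import signal

shutting_down = False

def handler(signum, frame):
    # Fast: just record that shutdown was requested
    global shutting_down
    shutting_down = True

signal.signal(signal.SIGTERM, handler)

os.kill(os.getpid(), signal.SIGTERM)  # simulate `kill <pid>`

# The main loop notices the flag and does the slow cleanup itself
if shutting_down:
    print("cleaning up in main loop")
```

All the blocking work (closing connections, flushing buffers) happens in normal control flow, not inside the handler.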
Quick Reference#
```bash
# Graceful stop
kill $PID          # Sends SIGTERM (15)
kill -TERM $PID    # Explicit SIGTERM

# Force stop
kill -9 $PID       # SIGKILL - last resort
kill -KILL $PID    # Same thing

# Reload config
kill -HUP $PID     # SIGHUP

# Pause/resume
kill -STOP $PID    # Pause process
kill -CONT $PID    # Resume process

# Custom signals
kill -USR1 $PID    # Application-defined
kill -USR2 $PID    # Application-defined
```
Graceful shutdown isn’t optional — it’s the difference between “deployment” and “controlled chaos.” Handle your signals.
Computing Arts covers the fundamentals that make systems reliable. More at computingarts.com.