Most systemd services run with full system access by default. That’s fine until one gets compromised. Systemd provides powerful sandboxing options that most people never use.
The Basics: User and Group
Never run services as root if they don’t need it:
[Service]
User = myapp
Group = myapp
Create a dedicated user:
sudo useradd --system --no-create-home --shell /usr/sbin/nologin myapp
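If you would rather not manage a system account by hand, systemd can allocate one for the lifetime of the service. A minimal sketch, assuming your systemd version supports DynamicUser and that the service does not depend on a fixed UID:

```ini
[Service]
# systemd allocates a transient user and group when the service starts
# and releases them when it stops; replaces User=/Group= and useradd
DynamicUser = yes
```

DynamicUser also switches on several of the sandboxing options covered in the rest of this article, so test the service carefully after enabling it.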
Filesystem Restrictions
Read-Only Root
Make the entire filesystem read-only:
[Service]
ProtectSystem = strict
Options:
- true: /usr and /boot are read-only
- full: /etc is also read-only
- strict: the entire filesystem is read-only except explicit paths
Writable Directories
Allow writes only where needed:
[Service]
ProtectSystem = strict
ReadWritePaths = /var/lib/myapp /var/log/myapp
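When the writable paths are the conventional ones under /var/lib and /var/log, systemd can also create and own them for you. A minimal sketch, assuming the service only needs /var/lib/myapp and /var/log/myapp (the same paths as above):

```ini
[Service]
User = myapp
Group = myapp
ProtectSystem = strict
# Creates /var/lib/myapp at start, owned by User=/Group=, and keeps it writable
StateDirectory = myapp
# Same for /var/log/myapp
LogsDirectory = myapp
```

The values are directory names relative to /var/lib and /var/log, not absolute paths. ReadWritePaths remains the right tool for anything outside those standard locations.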
Temporary Directory Isolation
Give the service its own /tmp:
[Service]
PrivateTmp = true
The service gets a private, initially empty /tmp and /var/tmp that other processes cannot see.
Hide Home Directories
[Service]
ProtectHome = true
Options:
- true: /home, /root, and /run/user appear empty
- read-only: visible but read-only
- tmpfs: an empty tmpfs is mounted over them
Network Restrictions
Disable Network Access
For services that don’t need network:
[Service]
PrivateNetwork = true
The service only sees a loopback interface.
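PrivateNetwork is all-or-nothing. For a service that must still reach peers on the same host, IP address filtering may be a middle ground. A sketch, assuming a systemd version and kernel that support IPAddressAllow and IPAddressDeny (they rely on cgroup-based BPF filtering):

```ini
[Service]
# Deny all IP traffic by default...
IPAddressDeny = any
# ...then allow loopback (127.0.0.0/8 and ::1) only
IPAddressAllow = localhost
```

Unlike PrivateNetwork, the service stays in the host's network namespace here, so it can still connect to other local services over 127.0.0.1.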
Restrict Socket Types
[Service]
RestrictAddressFamilies = AF_INET AF_INET6 AF_UNIX
Only allow specific socket families. Common values:
- AF_INET: IPv4
- AF_INET6: IPv6
- AF_UNIX: Unix sockets
- AF_NETLINK: kernel communication
Filter System Calls
Block dangerous syscalls:
[Service]
SystemCallFilter = @system-service
SystemCallErrorNumber = EPERM
Predefined groups:
- @system-service: typical service syscalls
- @network-io: network operations
- @file-system: file operations
- @privileged: privileged operations
Block specific syscalls by prefixing the list with ~:
SystemCallFilter = ~@mount @reboot @swap
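The two styles can be combined: start from an allow-list and then subtract groups from it. A sketch of that pattern; whether your service still works without @privileged and @resources is an assumption to verify by testing:

```ini
[Service]
# Start from the allow-list of typical service syscalls
SystemCallFilter = @system-service
# Then remove groups from it (the ~ prefix inverts the list)
SystemCallFilter = ~@privileged @resources
# Return EPERM to the caller instead of killing the process on a blocked call
SystemCallErrorNumber = EPERM
```

Order matters: the first SystemCallFilter line decides whether the filter is an allow-list or a deny-list, and later lines add to or subtract from it.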
Capability Restrictions
Drop All Capabilities
[Service]
CapabilityBoundingSet =
NoNewPrivileges = true
Allow Only What’s Needed
[Service]
CapabilityBoundingSet = CAP_NET_BIND_SERVICE
AmbientCapabilities = CAP_NET_BIND_SERVICE
Common capabilities:
- CAP_NET_BIND_SERVICE: bind ports < 1024
- CAP_NET_RAW: raw sockets (ping)
- CAP_CHOWN: change file ownership
- CAP_SETUID: change UID
Device Access
Hide Device Nodes
[Service]
PrivateDevices = true
The service only sees pseudo-devices (/dev/null, /dev/zero, etc.), not physical devices.
Restrict Device Access
[Service]
DevicePolicy = closed
DeviceAllow = /dev/sda rw
Policies:
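- strict: no device access except what DeviceAllow lists
- closed: only pseudo-devices, plus what DeviceAllow lists
- auto: default behavior; all devices are allowed unless DeviceAllow is set
Kernel and System Protection
Protect Kernel Variables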
[Service]
ProtectKernelTunables = true
ProtectKernelModules = true
ProtectKernelLogs = true
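Newer systemd releases add a few directives in the same family. A sketch, assuming your version supports them (older versions log a warning and ignore directives they do not know):

```ini
[Service]
ProtectKernelTunables = true
ProtectKernelModules = true
ProtectKernelLogs = true
# Deny changes to the system clock
ProtectClock = true
# Deny changes to the hostname
ProtectHostname = true
```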
Protect Control Groups
[Service]
ProtectControlGroups = true
Restrict Realtime and Memory Locking
[Service]
RestrictRealtime = true
MemoryDenyWriteExecute = true
LockPersonality = true
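The complete example that follows also sets two directives not introduced so far. As a brief sketch of what they do:

```ini
[Service]
# Refuse to set the SUID/SGID bits on files and directories
RestrictSUIDSGID = true
# Only allow syscalls for the native CPU architecture
# (for example, blocks 32-bit syscalls on a 64-bit system)
SystemCallArchitectures = native
```

SystemCallArchitectures pairs naturally with SystemCallFilter, since it closes off a second syscall ABI that could otherwise be used to sidestep the filter.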
Complete Hardened Example
[Unit]
Description = My Secure Application
After = network.target
[Service]
Type = simple
User = myapp
Group = myapp
ExecStart = /usr/local/bin/myapp
# Filesystem
ProtectSystem = strict
ProtectHome = true
PrivateTmp = true
ReadWritePaths = /var/lib/myapp
# Network (if needed)
RestrictAddressFamilies = AF_INET AF_INET6 AF_UNIX
# Capabilities
CapabilityBoundingSet =
NoNewPrivileges = true
# Kernel
ProtectKernelTunables = true
ProtectKernelModules = true
ProtectKernelLogs = true
ProtectControlGroups = true
# Devices
PrivateDevices = true
# Other
RestrictRealtime = true
MemoryDenyWriteExecute = true
LockPersonality = true
RestrictSUIDSGID = true
# System calls
SystemCallFilter = @system-service
SystemCallErrorNumber = EPERM
SystemCallArchitectures = native
[Install]
WantedBy = multi-user.target
Testing Security Settings
Analyze a Service
systemd-analyze security myapp.service
Output shows a security score and lists what’s exposed:
  NAME                           DESCRIPTION                      EXPOSURE
✓ PrivateNetwork=                Service has access to the h...        0.5
✗ User=/DynamicUser=             Service runs as root                  0.4
✓ CapabilityBoundingSet=~CAP...  Service cannot create SUID...         0.0
→ Overall exposure level for myapp.service: 4.2 MEDIUM
The exposure level runs from 0 to 10, and lower is better. Aim for a low score (under 3 is good).
Test Incrementally
Add restrictions one at a time and test:
# Edit service
sudo systemctl edit myapp.service
# Add the restriction in the editor that opens (it is saved as a drop-in override)
[Service]
ProtectSystem = strict
# Reload and restart
sudo systemctl daemon-reload
sudo systemctl restart myapp
# Check if it still works
sudo systemctl status myapp
journalctl -u myapp -f
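For reference, systemctl edit stores what you type in a drop-in file rather than modifying the original unit. A sketch of the result, assuming the default location for a system service named myapp.service:

```ini
# /etc/systemd/system/myapp.service.d/override.conf
[Service]
ProtectSystem = strict
```

Removing that line from the file (or running systemctl revert myapp.service to discard all overrides) and restarting undoes the change, which makes it easy to back out a restriction that breaks the service.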
Debug Permission Errors
If the service fails after hardening:
# Check the audit log (AVC = SELinux denials, SECCOMP = syscall filter violations)
sudo ausearch -m AVC,SECCOMP -ts recent
# Or watch journal for permission errors
journalctl -u myapp | grep -i "permission\|denied\|access"
Common Service Patterns
Web Application
[Service]
User = webapp
ProtectSystem = strict
ProtectHome = true
PrivateTmp = true
ReadWritePaths = /var/lib/webapp /var/log/webapp
PrivateDevices = true
RestrictAddressFamilies = AF_INET AF_INET6 AF_UNIX
CapabilityBoundingSet = CAP_NET_BIND_SERVICE
AmbientCapabilities = CAP_NET_BIND_SERVICE
NoNewPrivileges = true
Database
[Service]
User = postgres
ProtectSystem = full
ProtectHome = true
ReadWritePaths = /var/lib/postgresql /var/run/postgresql
PrivateDevices = true
PrivateTmp = true
# Keep CAP_IPC_LOCK so the server can lock (shared) memory
CapabilityBoundingSet = CAP_IPC_LOCK
Background Worker (No Network)
[Service]
User = worker
ProtectSystem = strict
ProtectHome = true
PrivateNetwork = true
PrivateTmp = true
PrivateDevices = true
ReadWritePaths = /var/spool/worker
CapabilityBoundingSet =
NoNewPrivileges = true
Quick Reference
Directive                            Purpose
User=                                Run as non-root user
ProtectSystem=strict                 Read-only filesystem
ProtectHome=true                     Hide home directories
PrivateTmp=true                      Isolated /tmp
PrivateNetwork=true                  No network access
PrivateDevices=true                  No device access
NoNewPrivileges=true                 Can’t gain privileges
CapabilityBoundingSet=               Drop all capabilities
SystemCallFilter=@system-service     Allow only safe syscalls
Security is about reducing attack surface. Every restriction you add is one less thing an attacker can exploit if your service is compromised. Start with the basics (non-root user, ProtectSystem), then add more restrictions until the service breaks, then back off one step.