Most systemd services run with full system access by default. That’s fine until one gets compromised. Systemd provides powerful sandboxing options that most people never use.

The Basics: User and Group

Never run services as root if they don’t need it:

[Service]
User=myapp
Group=myapp

Create a dedicated user:

sudo useradd --system --no-create-home --shell /usr/sbin/nologin myapp

Filesystem Restrictions

Read-Only Root

Make the entire filesystem read-only:

[Service]
ProtectSystem=strict

Options:

  • true — /usr and /boot read-only
  • full — /etc also read-only
  • strict — Entire filesystem read-only except explicit paths

Writable Directories

Allow writes only where needed:

[Service]
ProtectSystem=strict
ReadWritePaths=/var/lib/myapp /var/log/myapp
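
ReadWritePaths= expects the listed directories to exist already. As a hedged alternative (a sketch, not the setup used elsewhere in this post), systemd can create the directories itself through StateDirectory= and LogsDirectory=, which take names relative to /var/lib and /var/log. The name myapp is carried over from the earlier examples.

```ini
[Service]
User=myapp
ProtectSystem=strict
# Creates /var/lib/myapp on start, owned by the service user, and keeps it writable
StateDirectory=myapp
# Same for /var/log/myapp
LogsDirectory=myapp
```

With this variant there is no need to create the directories or set their ownership by hand before the first start.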

Temporary Directory Isolation

Give the service its own /tmp:

[Service]
PrivateTmp=true

The service gets its own empty /tmp and /var/tmp, invisible to other processes.

Hide Home Directories

[Service]
ProtectHome=true

Options:

  • true — /home, /root, /run/user empty
  • read-only — Visible but read-only
  • tmpfs — Empty tmpfs mounted over them
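
The tmpfs mode is mostly useful when the service still needs one specific directory under /home. A minimal sketch, assuming a hypothetical configuration directory /home/deploy/myapp-config that should stay visible while everything else is hidden:

```ini
[Service]
# Mount empty, read-only tmpfs instances over /home, /root and /run/user
ProtectHome=tmpfs
# Re-expose a single directory, read-only (hypothetical path)
BindReadOnlyPaths=/home/deploy/myapp-config
```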

Network Restrictions

Disable Network Access

For services that don’t need network:

[Service]
PrivateNetwork=true

The service only sees a loopback interface.
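
PrivateNetwork=true is all-or-nothing. For a service that still has to reach peers on the same host, systemd's IP address filtering may be a workable middle ground. This is a sketch rather than part of the setup above, and it assumes the host supports cgroup v2 with eBPF filtering:

```ini
[Service]
# Drop all IP traffic by default
IPAddressDeny=any
# Then allow only the loopback ranges (127.0.0.0/8 and ::1/128)
IPAddressAllow=localhost
```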

Restrict Socket Types

[Service]
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX

Only allow specific socket families. Common values:

  • AF_INET — IPv4
  • AF_INET6 — IPv6
  • AF_UNIX — Unix sockets
  • AF_NETLINK — Kernel communication
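
If the service never opens IP sockets (for example, it only talks to a local socket and logs through /dev/log), the list can shrink to AF_UNIX alone. A minimal sketch under that assumption:

```ini
[Service]
# Only Unix domain sockets; creating a socket of any other family fails
RestrictAddressFamilies=AF_UNIX
```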

Filter System Calls

Block dangerous syscalls:

[Service]
SystemCallFilter=@system-service
SystemCallErrorNumber=EPERM

Predefined groups:

  • @system-service — Typical service syscalls
  • @network-io — Network operations
  • @file-system — File operations
  • @privileged — Privileged operations

Block specific syscalls:

SystemCallFilter=~@mount @reboot @swap
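
The allow-list and deny-list forms can be combined. When the first SystemCallFilter= line is an allow-list, later lines starting with ~ remove calls from it. A sketch of that pattern; which groups to subtract depends on the service, so treat @privileged and @resources as an illustrative choice:

```ini
[Service]
# Start from the calls a typical service needs
SystemCallFilter=@system-service
# Then subtract groups this service should never use
SystemCallFilter=~@privileged @resources
# Blocked calls fail with EPERM instead of killing the process
SystemCallErrorNumber=EPERM
```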

Capability Restrictions

Drop All Capabilities

[Service]
CapabilityBoundingSet=
NoNewPrivileges=true

Allow Only What’s Needed

[Service]
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE

Common capabilities:

  • CAP_NET_BIND_SERVICE — Bind ports < 1024
  • CAP_NET_RAW — Raw sockets (ping)
  • CAP_CHOWN — Change file ownership
  • CAP_SETUID — Change UID

Device Access

Hide Device Nodes

[Service]
PrivateDevices=true

The service only sees pseudo-devices (/dev/null, /dev/zero, etc.); physical devices are hidden.

Restrict Device Access

[Service]
DevicePolicy=closed
DeviceAllow=/dev/sda rw

Policies:

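  • strict — Only devices listed in DeviceAllow=
  • closed — Pseudo-devices (/dev/null, /dev/zero, etc.) plus devices listed in DeviceAllow=
  • auto — Default; same as closed when DeviceAllow= is set, otherwise all devices are allowed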

Kernel and System Protection

Protect Kernel Variables

[Service]
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectKernelLogs=true

Protect Control Groups

[Service]
ProtectControlGroups=true

Restrict Realtime Scheduling and Executable Memory

[Service]
RestrictRealtime=true
MemoryDenyWriteExecute=true
LockPersonality=true
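
Since these three directives are less self-explanatory than the others, here is the same block again with a comment on each. The descriptions paraphrase the systemd.exec documentation.

```ini
[Service]
# Refuse realtime scheduling policies, which could be used to monopolize the CPU
RestrictRealtime=true
# Refuse memory mappings that are writable and executable at the same time
MemoryDenyWriteExecute=true
# Lock the kernel execution domain (personality) to the default
LockPersonality=true
```

Runtimes that generate machine code at run time (JIT compilers) generally cannot work with MemoryDenyWriteExecute=true. For such a service, leave that one directive out and keep the other two.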

Complete Hardened Example

[Unit]
Description=My Secure Application
After=network.target

[Service]
Type=simple
User=myapp
Group=myapp
ExecStart=/usr/local/bin/myapp

# Filesystem
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
ReadWritePaths=/var/lib/myapp

# Network (if needed)
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX

# Capabilities
CapabilityBoundingSet=
NoNewPrivileges=true

# Kernel
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectKernelLogs=true
ProtectControlGroups=true

# Devices
PrivateDevices=true

# Other
RestrictRealtime=true
MemoryDenyWriteExecute=true
LockPersonality=true
RestrictSUIDSGID=true

# System calls
SystemCallFilter=@system-service
SystemCallErrorNumber=EPERM
SystemCallArchitectures=native

[Install]
WantedBy=multi-user.target

Testing Security Settings

Analyze a Service

systemd-analyze security myapp.service

Output shows a security score and lists what’s exposed:

NAME                           DESCRIPTION                       EXPOSURE
PrivateNetwork=                Service has access to the h...         0.5
User=/DynamicUser=             Service runs as root                   0.4
CapabilityBoundingSet=~CAP...  Service cannot create SUID...          0.0

Overall exposure level for myapp.service: 4.2 MEDIUM

Aim for a low score (under 3 is good).

Test Incrementally

Add restrictions one at a time and test:

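# Open a drop-in override for the service in your editor
sudo systemctl edit myapp.service

# In the editor, add one restriction, then save and exit
[Service]
ProtectSystem=strict

# Reload and restart
# (systemctl edit already reloads unit files; daemon-reload is only
# required after editing unit files by hand, but it is harmless here)
sudo systemctl daemon-reload
sudo systemctl restart myapp

# Check if it still works
sudo systemctl status myapp
journalctl -u myapp -f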
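
For reference, systemctl edit stores the override as a drop-in file next to the unit. After a couple of rounds it might look like the sketch below; the exact directives are illustrative and depend on what your service tolerated so far.

```ini
# /etc/systemd/system/myapp.service.d/override.conf
[Service]
# Round 1: read-only filesystem with one writable directory
ProtectSystem=strict
ReadWritePaths=/var/lib/myapp
# Round 2: private /tmp
PrivateTmp=true
```

If a round breaks the service, delete the lines added in that round and restart. Deleting the whole file (or running systemctl revert myapp.service) returns the unit to its original settings.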

Debug Permission Errors

If the service fails after hardening:

# Check the audit log for SELinux (AVC) denials; requires auditd
sudo ausearch -m AVC -ts recent

# Or watch journal for permission errors
journalctl -u myapp | grep -i "permission\|denied\|access"

Common Service Patterns

Web Application

[Service]
User=webapp
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
ReadWritePaths=/var/lib/webapp /var/log/webapp
PrivateDevices=true
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
CapabilityBoundingSet=CAP_NET_BIND_SERVICE
AmbientCapabilities=CAP_NET_BIND_SERVICE
NoNewPrivileges=true

Database

[Service]
User=postgres
ProtectSystem=full
ProtectHome=true
ReadWritePaths=/var/lib/postgresql /var/run/postgresql
PrivateDevices=true
PrivateTmp=true
# Keep CAP_IPC_LOCK so memory (including shared memory) can be locked
CapabilityBoundingSet=CAP_IPC_LOCK

Background Worker (No Network)

[Service]
User=worker
ProtectSystem=strict
ProtectHome=true
PrivateNetwork=true
PrivateTmp=true
PrivateDevices=true
ReadWritePaths=/var/spool/worker
CapabilityBoundingSet=
NoNewPrivileges=true

Quick Reference

Directive                          Purpose
User=                              Run as non-root user
ProtectSystem=strict               Read-only filesystem
ProtectHome=true                   Hide home directories
PrivateTmp=true                    Isolated /tmp
PrivateNetwork=true                No network access
PrivateDevices=true                No device access
NoNewPrivileges=true               Can’t gain privileges
CapabilityBoundingSet=             Drop all capabilities
SystemCallFilter=@system-service   Allow only safe syscalls

Security is about reducing attack surface. Every restriction you add is one less thing an attacker can exploit if your service is compromised. Start with the basics (non-root user, ProtectSystem), then add more restrictions until the service breaks, then back off one step.