Systemd Hardening

Evaluate Security of Services

Security score:

$ systemd-analyze security
prometheus-node-exporter.service           8.6 EXPOSED   🙁
prometheus.service                         4.3 OK        🙂
rescue.service                             9.5 UNSAFE    😨
rtkit-daemon.service                       7.2 MEDIUM    😐
...

List of All Options

Service Options

Options that can be used in the [Service] section of a systemd service unit.

Capabilities

On Linux, super-user privileges are divided into capabilities. They are documented in capabilities(7); systemd-analyze capability lists all capabilities known to systemd.

AmbientCapabilities

By default, all capabilities are dropped when running a service as a non-root user. In order to grant a non-root user limited super-user capabilities, this directive can be used.

Grant user backup-daemon capability CAP_DAC_READ_SEARCH:

User=backup-daemon
AmbientCapabilities=CAP_DAC_READ_SEARCH

This is generally preferred to running a service as root and dropping capabilities.
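A minimal sketch of how this might be combined with CapabilityBoundingSet and NoNewPrivileges, so that the service holds the one capability it needs and cannot acquire others. It assumes the backup-daemon service needs no capability other than CAP_DAC_READ_SEARCH; extend the set if it does.

```ini
[Service]
User=backup-daemon
# Grant exactly one capability to the unprivileged process.
AmbientCapabilities=CAP_DAC_READ_SEARCH
# Assumption: nothing else is needed, so limit the bounding set to the same capability.
CapabilityBoundingSet=CAP_DAC_READ_SEARCH
# Prevent gaining privileges through setuid/setgid binaries.
NoNewPrivileges=yes
```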

See systemd.exec(5) → AmbientCapabilities=

NoNewPrivileges

Deny access to escalating privileges:

NoNewPrivileges=yes

In particular, the service process and all its children will ignore the setuid and setgid bits that programs such as su and sudo rely on to gain privileges.

See prctl(2)#PR_SET_NO_NEW_PRIVS

Devices

DeviceAllow

Allow access to /dev/loop-control and the loop block devices (/dev/loop[0-9]):

DeviceAllow=/dev/loop-control
DeviceAllow=block-loop

Allow read-only access to /dev/sda:

DeviceAllow=/dev/sda r

Use PrivateDevices when only the default set of pseudo-devices like /dev/null, /dev/zero, and /dev/urandom is needed.

By default, access to common pseudo-devices like /dev/null or /dev/urandom is always granted.

See systemd.resource-control(5) → DeviceAllow=

PrivateDevices

Only provide a minimal set of devices like /dev/null, /dev/zero or /dev/urandom to the service. Systemd will also take other measures to prevent device creation and access.

Enable private devices:

PrivateDevices=yes

See systemd.exec(5) → PrivateDevices=

PrivateIPC

Create a private IPC namespace for the service:

PrivateIPC=yes

Multiple services can be made to share their IPC namespace using JoinsNamespaceOf.

See systemd.exec(5) → PrivateIPC=

PrivatePIDs

Create a private PID namespace:

PrivatePIDs=true

This isolates the service's PID space from the rest of the system.

Caveat: You probably want to use ProtectProc instead.

RemoveIPC

Remove IPC objects when service is stopped:

User=exampled
RemoveIPC=yes

Remove all IPC objects owned by the user running the service.

See systemd.exec(5) → RemoveIPC=

/proc and /sys Filesystem

ProcSubset

Only allow access to PID information in /proc (e.g. /proc/<pid>):

ProcSubset=pid

See systemd.exec(5) → ProcSubset=

ProtectProc

Control access to processes in /proc.

Deny access to other users' processes:

ProtectProc=noaccess

Hide other users' processes:

ProtectProc=invisible

Hide non-ptraceable processes:

ProtectProc=ptraceable

You should usually prefer invisible over noaccess as many services do not handle being denied access well.

These directives correspond to the hidepid= mount option of proc. See proc(5)#Mount_options

See systemd.exec(5) → ProtectProc=

ProtectKernelTunables

Protect kernel variables accessible in /proc, /sys or via sysctl(8)/sysctl.conf(5):

ProtectKernelTunables=yes

See systemd.exec(5) → ProtectKernelTunables=

ProtectClock

Prevent service from changing the time:

ProtectClock=yes

See systemd.exec(5) → ProtectClock=

ProtectControlGroups

Prevent modifications to cgroup hierarchies:

ProtectControlGroups=yes

See systemd.exec(5) → ProtectControlGroups=

Filesystem Access

NoExecPaths

Only allow execution of /usr/bin/serviced:

NoExecPaths=/
ExecPaths=/usr/bin/serviced

This, in combination with MemoryDenyWriteExecute, may be used to make arbitrary code execution harder.
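In practice, a dynamically linked binary also needs its loader and shared libraries to remain executable. The following is a sketch under that assumption; the library directories /usr/lib and /usr/lib64 are typical but vary between distributions, so check where the libraries of serviced actually live.

```ini
[Service]
ExecStart=/usr/bin/serviced
# Deny execution everywhere by default.
NoExecPaths=/
# Assumption: serviced is dynamically linked, so the library directories must stay executable too.
ExecPaths=/usr/bin/serviced /usr/lib /usr/lib64
# Forbid memory mappings that are both writable and executable.
MemoryDenyWriteExecute=yes
```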

See systemd.exec(5) → NoExecPaths=

PrivateTmp

Create private, empty /tmp and /var/tmp for the service:

PrivateTmp=yes

Explicitly create a disconnected tmpfs(5) for /var/tmp and /tmp:

PrivateTmp=disconnect

Use disconnect to avoid temporary data hitting the disk when /tmp and /var/tmp are on unencrypted, persistent storage.

Multiple services can be made to share their /tmp and /var/tmp using JoinsNamespaceOf. This cannot be combined with disconnect.
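As a rough illustration of sharing, the block below shows two unit files in one listing. The unit names a.service and b.service and their binaries are hypothetical. Both units enable PrivateTmp, and one of them names the other in JoinsNamespaceOf, which belongs in the [Unit] section.

```ini
# a.service (hypothetical)
[Service]
ExecStart=/usr/bin/a-daemon
PrivateTmp=yes

# b.service (hypothetical)
[Unit]
# Join the namespaces of a.service instead of creating separate ones.
JoinsNamespaceOf=a.service

[Service]
ExecStart=/usr/bin/b-daemon
PrivateTmp=yes
```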

Temporary files are cleaned when the service is stopped.

See systemd.exec(5) → PrivateTmp=

ProtectHome

Restrict access to /home, /root and /run/user for a service.

Make /home inaccessible:

ProtectHome=yes

Make /home read-only:

ProtectHome=read-only

You can use ReadWritePaths to lift read-only restriction on subdirectories.

Replace /home with an empty, read-only directory:

ProtectHome=tmpfs

See systemd.exec(5) → ProtectHome=

InaccessiblePaths

Make directories at /etc/hidden, /hidden and /home inaccessible:

InaccessiblePaths=/etc/hidden /hidden
InaccessiblePaths=/home

See systemd.exec(5) → InaccessiblePaths=

ReadOnlyPaths

Make files and directories read-only at /etc/read-only, /read-only and /home:

ReadOnlyPaths=/etc/read-only /read-only
ReadOnlyPaths=/home

See systemd.exec(5) → ReadOnlyPaths=

ReadWritePaths

Make files and directories read-write at /etc/read-write, /read-write and /home:

ReadWritePaths=/etc/read-write /read-write
ReadWritePaths=/home/

Directories otherwise read-only or inaccessible due to the use of ProtectHome or ProtectSystem may be made readable/writable.

Subdirectories or files specified in ReadOnlyPaths may be made writable. However, this does not extend to InaccessiblePaths.

See systemd.exec(5) → ReadWritePaths=

RestrictFileSystems

Only allow opening files on an ext4 or tmpfs filesystem:

RestrictFileSystems=ext4 tmpfs

Only deny access to network filesystems:

RestrictFileSystems=~@network

Obtain a list of all known filesystems and groups:

$ systemd-analyze filesystems

See systemd.exec(5) → RestrictFileSystems=

ProtectSystem

Mount /usr/, /boot/ and /efi/ read-only:

ProtectSystem=yes

Additionally mount /etc/ read-only:

ProtectSystem=full

Mount everything read-only except /dev/, /proc/ and /sys/:

ProtectSystem=strict

Use ReadWritePaths to allow write access to specific files or directories.
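A minimal sketch of this combination, assuming a hypothetical service that only needs to write to /var/lib/serviced and to temporary directories:

```ini
[Service]
# Mount the whole file system hierarchy read-only for this service.
ProtectSystem=strict
# Hypothetical data directory; it must already exist when the service starts.
ReadWritePaths=/var/lib/serviced
# Give the service its own writable /tmp and /var/tmp.
PrivateTmp=yes
```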

See systemd.exec(5) → ProtectSystem=

ProtectHostname

Prevent service from manipulating hostname (UTS namespace):

ProtectHostname=yes

See systemd.exec(5) → ProtectHostname=

ProtectKernelLogs

Deny service access to kernel logs (e.g. via dmesg(1)):

ProtectKernelLogs=yes

See systemd.exec(5) → ProtectKernelLogs=

ProtectKernelModules

Prevent loading of kernel modules by service:

ProtectKernelModules=yes

See systemd.exec(5) → ProtectKernelModules=

TemporaryFileSystem

Place an empty tmpfs filesystem at /path/directory:

TemporaryFileSystem=/path/directory

The same but make the directory read-only:

TemporaryFileSystem=/path/directory:ro

This is often useful when a service can’t deal with a directory being read-only or inaccessible but is fine with it being empty.

See systemd.exec(5) → TemporaryFileSystem=

Networking

PrivateNetwork

Create a private network namespace with only a private loopback interface:

PrivateNetwork=yes

Multiple services can be made to share their network namespace using JoinsNamespaceOf. Restricting access to the (global) loopback interface, or any other interface, can be done using RestrictNetworkInterfaces.

See systemd.exec(5) → PrivateNetwork=

RestrictAddressFamilies

Restrict socket access to the IPv4, IPv6 and Unix socket families:

RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX

Allow no address family:

RestrictAddressFamilies=none

The special value of none is only supported starting with systemd 249.

Often required:

Family      Reason
AF_NETLINK  Enumerating network interfaces, for instance, to be able to bind to specific interfaces.
AF_UNIX     Logging via syslog(3).

See systemd.exec(5) → RestrictAddressFamilies=

RestrictNetworkInterfaces

Only allow access to the loopback (lo) interface:

RestrictNetworkInterfaces=lo

Deny access to interface eth0 only:

RestrictNetworkInterfaces=~eth0

When no network access is needed use PrivateNetwork.

See systemd.resource-control(5) → RestrictNetworkInterfaces=

SocketBindAllow

Important

Without also specifying SocketBindDeny=any, the service may bind to all ports.

Allow service to bind to TCP ports 80 and 443 only:

SocketBindAllow=tcp:80
SocketBindAllow=tcp:443
SocketBindDeny=any

Omit protocol to allow TCP and UDP:

SocketBindAllow=80
SocketBindAllow=443
SocketBindDeny=any

Port ranges, like 1200-1300, are accepted too.

Allow an unprivileged service to bind to TCP ports 80 and 443 only:

AmbientCapabilities=CAP_NET_BIND_SERVICE
SocketBindAllow=tcp:80
SocketBindAllow=tcp:443
SocketBindDeny=any
User=www-data

Capability CAP_NET_BIND_SERVICE is required to bind to any port lower than 1024. SocketBindAllow can be used to restrict this privilege to certain ports.

See systemd.resource-control(5) → SocketBindAllow=

IPAddressAllow

Important

Without also specifying IPAddressDeny=any, the service will be allowed to connect to any address.

Only allow connecting to CIDR networks 10.0.0.0/8 and fc00::/7:

IPAddressAllow=10.0.0.0/8 fc00::/7
IPAddressDeny=any

The value localhost can be used to restrict access to 127.0.0.0/8 and ::1. If you wish to restrict access to localhost only, consider using RestrictNetworkInterfaces=lo in addition.
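A localhost-only service could be sketched as follows, combining the address filter with the interface restriction:

```ini
[Service]
# Permit traffic to and from loopback addresses only.
IPAddressAllow=localhost
IPAddressDeny=any
# Additionally hide every interface except the loopback interface.
RestrictNetworkInterfaces=lo
```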

See systemd.resource-control(5) → IPAddressAllow=

IPAddressDeny

Deny access to CIDR networks 10.0.0.0/8 and fc00::/7:

IPAddressDeny=10.0.0.0/8 fc00::/7

See systemd.resource-control(5) → IPAddressDeny=

RestrictNamespaces

Deny any namespace change:

RestrictNamespaces=yes

Only allow access to the ipc and net namespaces:

RestrictNamespaces=ipc net

Only deny access to the ipc and net namespaces:

RestrictNamespaces=~ipc net

See systemd.exec(5) → RestrictNamespaces=

RestrictRealtime

Deny access to any realtime scheduling functionality:

RestrictRealtime=yes

See systemd.exec(5) → RestrictRealtime=

RestrictSUIDSGID

Prevent the service from setting the SUID and SGID bits on files and directories:

RestrictSUIDSGID=yes

See systemd.exec(5) → RestrictSUIDSGID=

UMask

Create files and directories that are only accessible by the user/owner if permissions are not explicitly set during creation:

UMask=0077

Allow user and group only:

UMask=0007

See systemd.exec(5) → UMask=

User / Group

DynamicUser

Dynamically allocate a Unix user that the service runs as:

DynamicUser=yes

This is not well suited to services that write persistent data to arbitrary locations on disk or have to read private data, because the UID/GID is unpredictable and may be reused by other services (though not at the same time).

Read systemd.exec(5) → DynamicUser= before use.
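When such a service does need persistent state, one common approach is to let systemd manage the directory via StateDirectory, which keeps ownership in sync with the dynamically allocated UID. A minimal sketch, assuming a hypothetical service exampled:

```ini
[Service]
ExecStart=/usr/bin/exampled
DynamicUser=yes
# systemd creates /var/lib/exampled (stored under /var/lib/private/ when DynamicUser is on)
# and adjusts its ownership to the dynamic user on each start.
StateDirectory=exampled
```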

See also ExecStart (run ExecStart=, ExecStartPre=, etc. with full privileges)

PrivateUsers

Run service in a private user namespace:

PrivateUsers=yes

See systemd.exec(5) → PrivateUsers=

User

Run process as user serviced:

User=serviced

Unless set via Group, the primary group is taken from the passwd database; supplementary groups are taken from the group database.

See systemd.exec(5) → User=

Group

Set the service's group to serviced:

Group=serviced

See systemd.exec(5) → Group=

SupplementaryGroups

On Unix, any process belongs to a user (UID) and group (GID) but it may also belong to additional/supplementary groups. Such supplementary groups are shown in groups= by id:

$ id user
uid=1000(user) gid=1000(user) groups=1000(user),999(qubes),126(docker)

Add service to supplementary group inet:

SupplementaryGroups=inet

Groups from the system’s group database are left untouched and SupplementaryGroups are appended.

See systemd.exec(5) → SupplementaryGroups=

Exec{Start,Stop}{,Pre,Post}

The + prefix executes a command with full privileges: neither the credential settings (User/Group/etc.) nor the filesystem access restrictions (ProtectHome/ReadOnlyPaths/etc.) are applied to it.

Call mkdir /etc/directory/ as root and with /etc/ being writable:

ExecStartPre=+mkdir /etc/directory
ExecStart=serviced --foreground
ReadOnlyPaths=/etc/
User=serviced

Use ! to only revert the effects of User, Group and SupplementaryGroups.

These prefixes can be used with ExecStartPre, ExecStart, ExecStartPost, ExecStop and ExecStopPost. Note that there is no ExecStopPre directive.

See systemd.service(5) → ExecStart=
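Putting several of the directives above together, a hardened unit might look roughly like the sketch below. The service name serviced, its binary path and its data directory are hypothetical, and not every option suits every service, so enable them step by step and re-check the result with systemd-analyze security.

```ini
[Unit]
Description=Example hardened service (hypothetical)

[Service]
ExecStart=/usr/bin/serviced --foreground
User=serviced
Group=serviced
UMask=0077

# Privileges and devices
NoNewPrivileges=yes
PrivateDevices=yes
RestrictSUIDSGID=yes
RestrictRealtime=yes
RestrictNamespaces=yes

# /proc, /sys and kernel interfaces
ProtectProc=invisible
ProcSubset=pid
ProtectKernelTunables=yes
ProtectKernelModules=yes
ProtectKernelLogs=yes
ProtectControlGroups=yes
ProtectClock=yes
ProtectHostname=yes

# Filesystem: read-only everywhere except the (assumed) data directory
ProtectSystem=strict
ProtectHome=yes
PrivateTmp=yes
ReadWritePaths=/var/lib/serviced

# Networking: assumption is that only IP and Unix sockets are needed
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
```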