Diffstat (limited to 'man/systemd.exec.xml')
-rw-r--r-- | man/systemd.exec.xml | 3678 |
1 files changed, 3678 insertions, 0 deletions
diff --git a/man/systemd.exec.xml b/man/systemd.exec.xml new file mode 100644 index 0000000..a9d863b --- /dev/null +++ b/man/systemd.exec.xml @@ -0,0 +1,3678 @@ +<?xml version='1.0'?> +<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" + "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"> +<!-- SPDX-License-Identifier: LGPL-2.1-or-later --> + +<refentry id="systemd.exec" xmlns:xi="http://www.w3.org/2001/XInclude"> + <refentryinfo> + <title>systemd.exec</title> + <productname>systemd</productname> + </refentryinfo> + + <refmeta> + <refentrytitle>systemd.exec</refentrytitle> + <manvolnum>5</manvolnum> + </refmeta> + + <refnamediv> + <refname>systemd.exec</refname> + <refpurpose>Execution environment configuration</refpurpose> + </refnamediv> + + <refsynopsisdiv> + <para><filename><replaceable>service</replaceable>.service</filename>, + <filename><replaceable>socket</replaceable>.socket</filename>, + <filename><replaceable>mount</replaceable>.mount</filename>, + <filename><replaceable>swap</replaceable>.swap</filename></para> + </refsynopsisdiv> + + <refsect1> + <title>Description</title> + + <para>Unit configuration files for services, sockets, mount points, and swap devices share a subset of + configuration options which define the execution environment of spawned processes.</para> + + <para>This man page lists the configuration options shared by these four unit types. 
See + <citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry> for the common + options of all unit configuration files, and + <citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.socket</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.swap</refentrytitle><manvolnum>5</manvolnum></citerefentry>, and + <citerefentry><refentrytitle>systemd.mount</refentrytitle><manvolnum>5</manvolnum></citerefentry> for more + information on the specific unit configuration files. The execution specific configuration options are configured + in the [Service], [Socket], [Mount], or [Swap] sections, depending on the unit type.</para> + + <para>In addition, options which control resources through Linux Control Groups (cgroups) are listed in + <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry>. + Those options complement options listed here.</para> + </refsect1> + + <refsect1> + <title>Implicit Dependencies</title> + + <para>A few execution parameters result in additional, automatic dependencies to be added:</para> + + <itemizedlist> + <listitem><para>Units with <varname>WorkingDirectory=</varname>, <varname>RootDirectory=</varname>, + <varname>RootImage=</varname>, <varname>RuntimeDirectory=</varname>, <varname>StateDirectory=</varname>, + <varname>CacheDirectory=</varname>, <varname>LogsDirectory=</varname> or + <varname>ConfigurationDirectory=</varname> set automatically gain dependencies of type + <varname>Requires=</varname> and <varname>After=</varname> on all mount units required to access the specified + paths. 
This is equivalent to having them listed explicitly in + <varname>RequiresMountsFor=</varname>.</para></listitem> + + <listitem><para>Similarly, units with <varname>PrivateTmp=</varname> enabled automatically get mount + unit dependencies for all mounts required to access <filename>/tmp/</filename> and + <filename>/var/tmp/</filename>. They will also gain an automatic <varname>After=</varname> dependency + on + <citerefentry><refentrytitle>systemd-tmpfiles-setup.service</refentrytitle><manvolnum>8</manvolnum></citerefentry>. + </para></listitem> + + <listitem><para>Units whose standard output or error output is connected to <option>journal</option> or + <option>kmsg</option> (or their combinations with console output, see below) automatically acquire + dependencies of type <varname>After=</varname> on + <filename>systemd-journald.socket</filename>.</para></listitem> + + <listitem><para>Units using <varname>LogNamespace=</varname> will automatically gain ordering and + requirement dependencies on the two socket units associated with + <filename>systemd-journald@.service</filename> instances.</para></listitem> + </itemizedlist> + </refsect1> + + <!-- We don't have any default dependency here. --> + + <refsect1> + <title>Paths</title> + + <para>The following settings may be used to change a service's view of the filesystem. Please note that the paths + must be absolute and must not contain a <literal>..</literal> path component.</para> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>WorkingDirectory=</varname></term> + + <listitem><para>Takes a directory path relative to the service's root directory specified by + <varname>RootDirectory=</varname>, or the special value <literal>~</literal>. Sets the working directory for + executed processes. If set to <literal>~</literal>, the home directory of the user specified in + <varname>User=</varname> is used. 
If not set, defaults to the root directory when systemd is running as a + system instance and the respective user's home directory if run as user. If the setting is prefixed with the + <literal>-</literal> character, a missing working directory is not considered fatal. If + <varname>RootDirectory=</varname>/<varname>RootImage=</varname> is not set, then + <varname>WorkingDirectory=</varname> is relative to the root of the system running the service manager. Note + that setting this parameter might result in additional dependencies to be added to the unit (see + above).</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RootDirectory=</varname></term> + + <listitem><para>Takes a directory path relative to the host's root directory (i.e. the root of the system + running the service manager). Sets the root directory for executed processes, with the <citerefentry + project='man-pages'><refentrytitle>chroot</refentrytitle><manvolnum>2</manvolnum></citerefentry> system + call. If this is used, it must be ensured that the process binary and all its auxiliary files are available in + the <function>chroot()</function> jail. Note that setting this parameter might result in additional + dependencies to be added to the unit (see above).</para> + + <para>The <varname>MountAPIVFS=</varname> and <varname>PrivateUsers=</varname> settings are particularly useful + in conjunction with <varname>RootDirectory=</varname>. For details, see below.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RootImage=</varname></term> + + <listitem><para>Takes a path to a block device node or regular file as argument. This call is similar + to <varname>RootDirectory=</varname> however mounts a file system hierarchy from a block device node + or loopback file instead of a directory. 
The device node or file system image file needs to contain a + file system without a partition table, or a file system within an MBR/MS-DOS or GPT partition table + with only a single Linux-compatible partition, or a set of file systems within a GPT partition table + that follows the <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions + Specification</ulink>.</para> + + <para>When <varname>DevicePolicy=</varname> is set to <literal>closed</literal> or + <literal>strict</literal>, or set to <literal>auto</literal> and <varname>DeviceAllow=</varname> is + set, then this setting adds <filename>/dev/loop-control</filename> with <constant>rw</constant> mode, + <literal>block-loop</literal> and <literal>block-blkext</literal> with <constant>rwm</constant> mode + to <varname>DeviceAllow=</varname>. See + <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry> + for the details about <varname>DevicePolicy=</varname> or <varname>DeviceAllow=</varname>. Also, see + <varname>PrivateDevices=</varname> below, as it may change the setting of + <varname>DevicePolicy=</varname>.</para> + + <para>Units making use of <varname>RootImage=</varname> automatically gain an + <varname>After=</varname> dependency on <filename>systemd-udevd.service</filename>.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RootImageOptions=</varname></term> + + <listitem><para>Takes a comma-separated list of mount options that will be used on disk images specified by + <varname>RootImage=</varname>. Optionally a partition name can be prefixed, followed by colon, in + case the image has multiple partitions, otherwise partition name <literal>root</literal> is implied. + Options for multiple partitions can be specified in a single line with space separators. Assigning an empty + string removes previous assignments. Duplicated options are ignored. 
For a list of valid mount options, please + refer to + <citerefentry project='man-pages'><refentrytitle>mount</refentrytitle><manvolnum>8</manvolnum></citerefentry>. + </para> + + <para>Valid partition names follow the <ulink url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable + Partitions Specification</ulink>.</para> + + <table> + <title>Accepted partition names</title> + + <tgroup cols='1'> + <colspec colname='partition' /> + <thead> + <row> + <entry>Partition Name</entry> + </row> + </thead> + <tbody> + <row> + <entry>root</entry> + </row> + <row> + <entry>root-secondary</entry> + </row> + <row> + <entry>home</entry> + </row> + <row> + <entry>srv</entry> + </row> + <row> + <entry>esp</entry> + </row> + <row> + <entry>xbootldr</entry> + </row> + <row> + <entry>tmp</entry> + </row> + <row> + <entry>var</entry> + </row> + <row> + <entry>usr</entry> + </row> + </tbody> + </tgroup> + </table> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RootHash=</varname></term> + + <listitem><para>Takes a data integrity (dm-verity) root hash specified in hexadecimal, or the path to a file + containing a root hash in ASCII hexadecimal format. This option enables data integrity checks using dm-verity, + if the used image contains the appropriate integrity data (see above) or if <varname>RootVerity=</varname> is used. + The specified hash must match the root hash of integrity data, and is usually at least 256 bits (and hence 64 + formatted hexadecimal characters) long (in case of SHA256 for example). If this option is not specified, but + the image file carries the <literal>user.verity.roothash</literal> extended file attribute (see <citerefentry + project='man-pages'><refentrytitle>xattr</refentrytitle><manvolnum>7</manvolnum></citerefentry>), then the root + hash is read from it, also as formatted hexadecimal characters. 
If the extended file attribute is not found (or + is not supported by the underlying file system), but a file with the <filename>.roothash</filename> suffix is + found next to the image file, bearing otherwise the same name (except if the image has the + <filename>.raw</filename> suffix, in which case the root hash file must not have it in its name), the root hash + is read from it and automatically used, also as formatted hexadecimal characters.</para> + + <para>If the disk image contains a separate <filename>/usr/</filename> partition it may also be + Verity protected, in which case the root hash may be configured via an extended attribute + <literal>user.verity.usrhash</literal> or a <filename>.usrhash</filename> file adjacent to the disk + image. There's currently no option to configure the root hash for the <filename>/usr/</filename> file + system via the unit file directly.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RootHashSignature=</varname></term> + + <listitem><para>Takes a PKCS7 signature of the <varname>RootHash=</varname> option as a path to a + DER-encoded signature file, or as an ASCII base64 string encoding of a DER-encoded signature prefixed + by <literal>base64:</literal>. The dm-verity volume will only be opened if the signature of the root + hash is valid and signed by a public key present in the kernel keyring. 
If this option is not + specified, but a file with the <filename>.roothash.p7s</filename> suffix is found next to the image + file, bearing otherwise the same name (except if the image has the <filename>.raw</filename> suffix, + in which case the signature file must not have it in its name), the signature is read from it and + automatically used.</para> + + <para>If the disk image contains a separate <filename>/usr/</filename> partition it may also be + Verity protected, in which case the signature for the root hash may be configured via a + <filename>.usrhash.p7s</filename> file adjacent to the disk image. There's currently no option to + configure the root hash signature for the <filename>/usr/</filename> file system via the unit file + directly.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RootVerity=</varname></term> + + <listitem><para>Takes the path to a data integrity (dm-verity) file. This option enables data integrity checks + using dm-verity, if <varname>RootImage=</varname> is used and a root hash is passed and if the used image itself + does not contain the integrity data. The integrity data must be matched by the root hash. If this option is not + specified, but a file with the <filename>.verity</filename> suffix is found next to the image file, bearing otherwise + the same name (except if the image has the <filename>.raw</filename> suffix, in which case the verity data file must + not have it in its name), the verity data is read from it and automatically used.</para> + + <para>This option is supported only for disk images that contain a single file system, without an + enveloping partition table. 
Images that contain a GPT partition table should instead include both + root file system and matching Verity data in the same image, implementing the <ulink + url="https://systemd.io/DISCOVERABLE_PARTITIONS">Discoverable Partitions Specification</ulink>.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>MountAPIVFS=</varname></term> + + <listitem><para>Takes a boolean argument. If on, a private mount namespace for the unit's processes is created + and the API file systems <filename>/proc/</filename>, <filename>/sys/</filename>, and <filename>/dev/</filename> + are mounted inside of it, unless they are already mounted. Note that this option has no effect unless used in + conjunction with <varname>RootDirectory=</varname>/<varname>RootImage=</varname> as these three mounts are + generally mounted in the host anyway, and unless the root directory is changed, the private mount namespace + will be a 1:1 copy of the host's, and include these three mounts. Note that the <filename>/dev/</filename> file + system of the host is bind mounted if this option is used without <varname>PrivateDevices=</varname>. To run + the service with a private, minimal version of <filename>/dev/</filename>, combine this option with + <varname>PrivateDevices=</varname>.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>ProtectProc=</varname></term> + + <listitem><para>Takes one of <literal>noaccess</literal>, <literal>invisible</literal>, + <literal>ptraceable</literal> or <literal>default</literal> (which it defaults to). 
When set, this + controls the <literal>hidepid=</literal> mount option of the <literal>procfs</literal> instance for + the unit that controls which directories with process metainformation + (<filename>/proc/<replaceable>PID</replaceable></filename>) are visible and accessible: when set to + <literal>noaccess</literal> the ability to access most of other users' process metadata in + <filename>/proc/</filename> is taken away for processes of the service. When set to + <literal>invisible</literal> processes owned by other users are hidden from + <filename>/proc/</filename>. If <literal>ptraceable</literal> all processes that cannot be + <function>ptrace()</function>'ed by a process are hidden from it. If <literal>default</literal> no + restrictions on <filename>/proc/</filename> access or visibility are made. For further details see + <ulink url="https://www.kernel.org/doc/html/latest/filesystems/proc.html#mount-options">The /proc + Filesystem</ulink>. It is generally recommended to run most system services with this option set to + <literal>invisible</literal>. This option is implemented via file system namespacing, and thus cannot + be used with services that shall be able to install mount points in the host file system + hierarchy. It also cannot be used for services that need to access metainformation about other users' + processes. This option implies <varname>MountAPIVFS=</varname>.</para> + + <para>If the kernel doesn't support per-mount point <option>hidepid=</option> mount options this + setting remains without effect, and the unit's processes will be able to access and see other processes + as if the option was not used.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>ProcSubset=</varname></term> + + <listitem><para>Takes one of <literal>all</literal> (the default) and <literal>pid</literal>. 
If + the latter all files and directories not directly associated with process management and introspection + are made invisible in the <filename>/proc/</filename> file system configured for the unit's + processes. This controls the <literal>subset=</literal> mount option of the <literal>procfs</literal> + instance for the unit. For further details see <ulink + url="https://www.kernel.org/doc/html/latest/filesystems/proc.html#mount-options">The /proc + Filesystem</ulink>. Note that Linux exposes various kernel APIs via <filename>/proc/</filename>, + which are made unavailable with this setting. Since these APIs are used frequently this option is + useful only in a few, specific cases, and is not suitable for most non-trivial programs.</para> + + <para>Much like <varname>ProtectProc=</varname> above, this is implemented via file system mount + namespacing, and hence the same restrictions apply: it is only available to system services, it + disables mount propagation to the host mount table, and it implies + <varname>MountAPIVFS=</varname>. Also, like <varname>ProtectProc=</varname> this setting is gracefully + disabled if the used kernel does not support the <literal>subset=</literal> mount option of + <literal>procfs</literal>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>BindPaths=</varname></term> + <term><varname>BindReadOnlyPaths=</varname></term> + + <listitem><para>Configures unit-specific bind mounts. A bind mount makes a particular file or directory + available at an additional place in the unit's view of the file system. Any bind mounts created with this + option are specific to the unit, and are not visible in the host's mount table. This option expects a + whitespace separated list of bind mount definitions. Each definition consists of a colon-separated triple of + source path, destination path and option string, where the latter two are optional. 
If only a source path is + specified the source and destination is taken to be the same. The option string may be either + <literal>rbind</literal> or <literal>norbind</literal> for configuring a recursive or non-recursive bind + mount. If the destination path is omitted, the option string must be omitted too. + Each bind mount definition may be prefixed with <literal>-</literal>, in which case it will be ignored + when its source path does not exist.</para> + + <para><varname>BindPaths=</varname> creates regular writable bind mounts (unless the source file system mount + is already marked read-only), while <varname>BindReadOnlyPaths=</varname> creates read-only bind mounts. These + settings may be used more than once, each usage appends to the unit's list of bind mounts. If the empty string + is assigned to either of these two options the entire list of bind mounts defined prior to this is reset. Note + that in this case both read-only and regular bind mounts are reset, regardless which of the two settings is + used.</para> + + <para>This option is particularly useful when <varname>RootDirectory=</varname>/<varname>RootImage=</varname> + is used. In this case the source path refers to a path on the host file system, while the destination path + refers to a path below the root directory of the unit.</para> + + <para>Note that the destination directory must exist or systemd must be able to create it. Thus, it + is not possible to use those options for mount points nested underneath paths specified in + <varname>InaccessiblePaths=</varname>, or under <filename>/home/</filename> and other protected + directories if <varname>ProtectHome=yes</varname> is + specified. 
<varname>TemporaryFileSystem=</varname> with <literal>:ro</literal> or + <varname>ProtectHome=tmpfs</varname> should be used instead.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>MountImages=</varname></term> + + <listitem><para>This setting is similar to <varname>RootImage=</varname> in that it mounts a file + system hierarchy from a block device node or loopback file, but the destination directory can be + specified as well as mount options. This option expects a whitespace separated list of mount + definitions. Each definition consists of a colon-separated tuple of source path and destination + definitions, optionally followed by another colon and a list of mount options.</para> + + <para>Mount options may be defined as a single comma-separated list of options, in which case they + will be implicitly applied to the root partition on the image, or a series of colon-separated tuples + of partition name and mount options. Valid partition names and mount options are the same as for + <varname>RootImageOptions=</varname> setting described above.</para> + + <para>Each mount definition may be prefixed with <literal>-</literal>, in which case it will be + ignored when its source path does not exist. The source argument is a path to a block device node or + regular file. If source or destination contain a <literal>:</literal>, it needs to be escaped as + <literal>\:</literal>. The device node or file system image file needs to follow the same rules as + specified for <varname>RootImage=</varname>. Any mounts created with this option are specific to the + unit, and are not visible in the host's mount table.</para> + + <para>These settings may be used more than once, each usage appends to the unit's list of mount + paths. 
If the empty string is assigned, the entire list of mount paths defined prior to this is + reset.</para> + + <para>Note that the destination directory must exist or systemd must be able to create it. Thus, it + is not possible to use those options for mount points nested underneath paths specified in + <varname>InaccessiblePaths=</varname>, or under <filename>/home/</filename> and other protected + directories if <varname>ProtectHome=yes</varname> is specified.</para> + + <para>When <varname>DevicePolicy=</varname> is set to <literal>closed</literal> or + <literal>strict</literal>, or set to <literal>auto</literal> and <varname>DeviceAllow=</varname> is + set, then this setting adds <filename>/dev/loop-control</filename> with <constant>rw</constant> mode, + <literal>block-loop</literal> and <literal>block-blkext</literal> with <constant>rwm</constant> mode + to <varname>DeviceAllow=</varname>. See + <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry> + for the details about <varname>DevicePolicy=</varname> or <varname>DeviceAllow=</varname>. Also, see + <varname>PrivateDevices=</varname> below, as it may change the setting of + <varname>DevicePolicy=</varname>.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + </variablelist> + </refsect1> + + <refsect1> + <title>Credentials</title> + + <xi:include href="system-only.xml" xpointer="plural"/> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>User=</varname></term> + <term><varname>Group=</varname></term> + + <listitem><para>Set the UNIX user or group that the processes are executed as, respectively. Takes a single + user or group name, or a numeric ID as argument. For system services (services run by the system service + manager, i.e. 
managed by PID 1) and for user services of the root user (services managed by root's instance of + <command>systemd --user</command>), the default is <literal>root</literal>, but <varname>User=</varname> may be + used to specify a different user. For user services of any other user, switching user identity is not + permitted, hence the only valid setting is the same user the user's service manager is running as. If no group + is set, the default group of the user is used. This setting does not affect commands whose command line is + prefixed with <literal>+</literal>.</para> + + <para>Note that this enforces only weak restrictions on the user/group name syntax, but will generate + warnings in many cases where user/group names do not adhere to the following rules: the specified + name should consist only of the characters a-z, A-Z, 0-9, <literal>_</literal> and + <literal>-</literal>, except for the first character which must be one of a-z, A-Z and + <literal>_</literal> (i.e. digits and <literal>-</literal> are not permitted as first character). The + user/group name must have at least one character, and at most 31. These restrictions are made in + order to avoid ambiguities and to ensure user/group names and unit files remain portable among Linux + systems. For further details on the names accepted and the names warned about see <ulink + url="https://systemd.io/USER_NAMES">User/Group Name Syntax</ulink>.</para> + + <para>When used in conjunction with <varname>DynamicUser=</varname> the user/group name specified is + dynamically allocated at the time the service is started, and released at the time the service is + stopped — unless it is already allocated statically (see below). 
If <varname>DynamicUser=</varname> + is not used the specified user and group must have been created statically in the user database no + later than the moment the service is started, for example using the + <citerefentry><refentrytitle>sysusers.d</refentrytitle><manvolnum>5</manvolnum></citerefentry> + facility, which is applied at boot or package install time. If the user does not exist by then + program invocation will fail.</para> + + <para>If the <varname>User=</varname> setting is used the supplementary group list is initialized + from the specified user's default group list, as defined in the system's user and group + database. Additional groups may be configured through the <varname>SupplementaryGroups=</varname> + setting (see below).</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>DynamicUser=</varname></term> + + <listitem><para>Takes a boolean parameter. If set, a UNIX user and group pair is allocated + dynamically when the unit is started, and released as soon as it is stopped. The user and group will + not be added to <filename>/etc/passwd</filename> or <filename>/etc/group</filename>, but are managed + transiently during runtime. The + <citerefentry><refentrytitle>nss-systemd</refentrytitle><manvolnum>8</manvolnum></citerefentry> glibc + NSS module provides integration of these dynamic users/groups into the system's user and group + databases. The user and group name to use may be configured via <varname>User=</varname> and + <varname>Group=</varname> (see above). If these options are not used and dynamic user/group + allocation is enabled for a unit, the name of the dynamic user/group is implicitly derived from the + unit name. If the unit name without the type suffix qualifies as valid user name it is used directly, + otherwise a name incorporating a hash of it is used. If a statically allocated user or group of the + configured name already exists, it is used and no dynamic user/group is allocated. 
Note that if + <varname>User=</varname> is specified and the static group with the name exists, then it is required + that the static user with the name already exists. Similarly, if <varname>Group=</varname> is + specified and the static user with the name exists, then it is required that the static group with + the name already exists. Dynamic users/groups are allocated from the UID/GID range 61184…65519. It is + recommended to avoid this range for regular system or login users. At any point in time each UID/GID + from this range is only assigned to zero or one dynamically allocated users/groups in use. However, + UID/GIDs are recycled after a unit is terminated. Care should be taken that any processes running as + part of a unit for which dynamic users/groups are enabled do not leave files or directories owned by + these users/groups around, as a different unit might get the same UID/GID assigned later on, and thus + gain access to these files or directories. If <varname>DynamicUser=</varname> is enabled, + <varname>RemoveIPC=</varname> and <varname>PrivateTmp=</varname> are implied (and cannot be turned + off). This ensures that the lifetime of IPC objects and temporary files created by the executed + processes is bound to the runtime of the service, and hence the lifetime of the dynamic + user/group. Since <filename>/tmp/</filename> and <filename>/var/tmp/</filename> are usually the only + world-writable directories on a system this ensures that a unit making use of dynamic user/group + allocation cannot leave files around after unit termination. Furthermore + <varname>NoNewPrivileges=</varname> and <varname>RestrictSUIDSGID=</varname> are implicitly enabled + (and cannot be disabled), to ensure that processes invoked cannot take benefit or create SUID/SGID + files or directories. 
Moreover <varname>ProtectSystem=strict</varname> and + <varname>ProtectHome=read-only</varname> are implied, thus prohibiting the service to write to + arbitrary file system locations. In order to allow the service to write to certain directories, they + have to be allow-listed using <varname>ReadWritePaths=</varname>, but care must be taken so that + UID/GID recycling doesn't create security issues involving files created by the service. Use + <varname>RuntimeDirectory=</varname> (see below) in order to assign a writable runtime directory to a + service, owned by the dynamic user/group and removed automatically when the unit is terminated. Use + <varname>StateDirectory=</varname>, <varname>CacheDirectory=</varname> and + <varname>LogsDirectory=</varname> in order to assign a set of writable directories for specific + purposes to the service in a way that they are protected from vulnerabilities due to UID reuse (see + below). If this option is enabled, care should be taken that the unit's processes do not get access + to directories outside of these explicitly configured and managed ones. Specifically, do not use + <varname>BindPaths=</varname> and be careful with <constant>AF_UNIX</constant> file descriptor + passing for directory file descriptors, as this would permit processes to create files or directories + owned by the dynamic user/group that are not subject to the lifecycle and access guarantees of the + service. Defaults to off.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>SupplementaryGroups=</varname></term> + + <listitem><para>Sets the supplementary Unix groups the processes are executed as. This takes a space-separated + list of group names or IDs. This option may be specified more than once, in which case all listed groups are + set as supplementary groups. When the empty string is assigned, the list of supplementary groups is reset, and + all assignments prior to this one will have no effect. 
In any case, this option does not override, but extends + the list of supplementary groups configured in the system group database for the user. This does not affect + commands prefixed with <literal>+</literal>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>PAMName=</varname></term> + + <listitem><para>Sets the PAM service name to set up a session as. If set, the executed process will be + registered as a PAM session under the specified service name. This is only useful in conjunction with the + <varname>User=</varname> setting, and is otherwise ignored. If not set, no PAM session will be opened for the + executed processes. See <citerefentry + project='man-pages'><refentrytitle>pam</refentrytitle><manvolnum>8</manvolnum></citerefentry> for + details.</para> + + <para>Note that for each unit making use of this option a PAM session handler process will be maintained as + part of the unit and stays around as long as the unit is active, to ensure that appropriate actions can be + taken when the unit and hence the PAM session terminates. This process is named <literal>(sd-pam)</literal> and + is an immediate child process of the unit's main process.</para> + + <para>Note that when this option is used for a unit it is very likely (depending on PAM configuration) that the + main unit process will be migrated to its own session scope unit when it is activated. This process will hence + be associated with two units: the unit it was originally started from (and for which + <varname>PAMName=</varname> was configured), and the session scope unit. Any child processes of that process + will however be associated with the session scope unit only. This has implications when used in combination + with <varname>NotifyAccess=</varname><option>all</option>, as these child processes will not be able to effect + changes in the original unit through notification messages. 
These messages will be considered belonging to the + session scope unit and not the original unit. It is hence not recommended to use <varname>PAMName=</varname> in + combination with <varname>NotifyAccess=</varname><option>all</option>.</para> + </listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>Capabilities</title> + + <xi:include href="system-only.xml" xpointer="plural"/> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>CapabilityBoundingSet=</varname></term> + + <listitem><para>Controls which capabilities to include in the capability bounding set for the + executed process. See <citerefentry + project='man-pages'><refentrytitle>capabilities</refentrytitle><manvolnum>7</manvolnum></citerefentry> + for details. Takes a whitespace-separated list of capability names, + e.g. <constant>CAP_SYS_ADMIN</constant>, <constant>CAP_DAC_OVERRIDE</constant>, + <constant>CAP_SYS_PTRACE</constant>. Capabilities listed will be included in the bounding set, all + others are removed. If the list of capabilities is prefixed with <literal>~</literal>, all but the + listed capabilities will be included, the effect of the assignment inverted. Note that this option + also affects the respective capabilities in the effective, permitted and inheritable capability + sets. If this option is not used, the capability bounding set is not modified on process execution, + hence no limits on the capabilities of the process are enforced. This option may appear more than + once, in which case the bounding sets are merged by <constant>OR</constant>, or by + <constant>AND</constant> if the lines are prefixed with <literal>~</literal> (see below). If the + empty string is assigned to this option, the bounding set is reset to the empty capability set, and + all prior settings have no effect. 
If set to <literal>~</literal> (without any further argument), + the bounding set is reset to the full set of available capabilities, also undoing any previous + settings. This does not affect commands prefixed with <literal>+</literal>.</para> + + <para>Use + <citerefentry><refentrytitle>systemd-analyze</refentrytitle><manvolnum>1</manvolnum></citerefentry>'s + <command>capability</command> command to retrieve a list of capabilities defined on the local + system.</para> + + <para>Example: if a unit has the following, + <programlisting>CapabilityBoundingSet=CAP_A CAP_B +CapabilityBoundingSet=CAP_B CAP_C</programlisting> + then <constant index='false'>CAP_A</constant>, <constant index='false'>CAP_B</constant>, and + <constant index='false'>CAP_C</constant> are set. If the second line is prefixed with + <literal>~</literal>, e.g., + <programlisting>CapabilityBoundingSet=CAP_A CAP_B +CapabilityBoundingSet=~CAP_B CAP_C</programlisting> + then, only <constant index='false'>CAP_A</constant> is set.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>AmbientCapabilities=</varname></term> + + <listitem><para>Controls which capabilities to include in the ambient capability set for the executed + process. Takes a whitespace-separated list of capability names, e.g. <constant>CAP_SYS_ADMIN</constant>, + <constant>CAP_DAC_OVERRIDE</constant>, <constant>CAP_SYS_PTRACE</constant>. This option may appear more than + once in which case the ambient capability sets are merged (see the above examples in + <varname>CapabilityBoundingSet=</varname>). If the list of capabilities is prefixed with <literal>~</literal>, + all but the listed capabilities will be included, the effect of the assignment inverted. If the empty string is + assigned to this option, the ambient capability set is reset to the empty capability set, and all prior + settings have no effect. 
If set to <literal>~</literal> (without any further argument), the ambient capability + set is reset to the full set of available capabilities, also undoing any previous settings. Note that adding + capabilities to the ambient capability set adds them to the process's inherited capability set. </para><para> + Ambient capability sets are useful if you want to execute a process as a non-privileged user but still want to + give it some capabilities. Note that in this case the option <constant>keep-caps</constant> is automatically added + to <varname>SecureBits=</varname> to retain the capabilities over the user + change. <varname>AmbientCapabilities=</varname> does not affect commands prefixed with + <literal>+</literal>.</para></listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>Security</title> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>NoNewPrivileges=</varname></term> + + <listitem><para>Takes a boolean argument. If true, ensures that the service process and all its + children can never gain new privileges through <function>execve()</function> (e.g. via setuid or + setgid bits, or filesystem capabilities). This is the simplest and most effective way to ensure that + a process and its children can never elevate privileges again. Defaults to false, but certain + settings override this and ignore the value of this setting. 
This is the case when + <varname>SystemCallFilter=</varname>, <varname>SystemCallArchitectures=</varname>, + <varname>RestrictAddressFamilies=</varname>, <varname>RestrictNamespaces=</varname>, + <varname>PrivateDevices=</varname>, <varname>ProtectKernelTunables=</varname>, + <varname>ProtectKernelModules=</varname>, <varname>ProtectKernelLogs=</varname>, + <varname>ProtectClock=</varname>, <varname>MemoryDenyWriteExecute=</varname>, + <varname>RestrictRealtime=</varname>, <varname>RestrictSUIDSGID=</varname>, <varname>DynamicUser=</varname> + or <varname>LockPersonality=</varname> are specified. Note that even if this setting is overridden by them, + <command>systemctl show</command> shows the original value of this setting. + Also see <ulink url="https://www.kernel.org/doc/html/latest/userspace-api/no_new_privs.html">No New Privileges + Flag</ulink>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>SecureBits=</varname></term> + + <listitem><para>Controls the secure bits set for the executed process. Takes a space-separated combination of + options from the following list: <option>keep-caps</option>, <option>keep-caps-locked</option>, + <option>no-setuid-fixup</option>, <option>no-setuid-fixup-locked</option>, <option>noroot</option>, and + <option>noroot-locked</option>. This option may appear more than once, in which case the secure bits are + ORed. If the empty string is assigned to this option, the bits are reset to 0. This does not affect commands + prefixed with <literal>+</literal>. 
See <citerefentry + project='man-pages'><refentrytitle>capabilities</refentrytitle><manvolnum>7</manvolnum></citerefentry> for + details.</para></listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>Mandatory Access Control</title> + + <xi:include href="system-only.xml" xpointer="plural"/> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>SELinuxContext=</varname></term> + + <listitem><para>Set the SELinux security context of the executed process. If set, this will override the + automated domain transition. However, the policy still needs to authorize the transition. This directive is + ignored if SELinux is disabled. If prefixed by <literal>-</literal>, all errors will be ignored. This does not + affect commands prefixed with <literal>+</literal>. See <citerefentry + project='die-net'><refentrytitle>setexeccon</refentrytitle><manvolnum>3</manvolnum></citerefentry> for + details.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>AppArmorProfile=</varname></term> + + <listitem><para>Takes a profile name as argument. The process executed by the unit will switch to + this profile when started. Profiles must already be loaded in the kernel, or the unit will fail. If + prefixed by <literal>-</literal>, all errors will be ignored. This setting has no effect if AppArmor + is not enabled. This setting does not affect commands prefixed with <literal>+</literal>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>SmackProcessLabel=</varname></term> + + <listitem><para>Takes a <option>SMACK64</option> security label as argument. The process executed by the unit + will be started under this label and SMACK will decide whether the process is allowed to run or not, based on + it. 
The process will continue to run under the label specified here unless the executable has its own + <option>SMACK64EXEC</option> label, in which case the process will transition to run under that label. When not + specified, the label that systemd is running under is used. This directive is ignored if SMACK is + disabled.</para> + + <para>The value may be prefixed by <literal>-</literal>, in which case all errors will be ignored. An empty + value may be specified to unset previous assignments. This does not affect commands prefixed with + <literal>+</literal>.</para></listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>Process Properties</title> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>LimitCPU=</varname></term> + <term><varname>LimitFSIZE=</varname></term> + <term><varname>LimitDATA=</varname></term> + <term><varname>LimitSTACK=</varname></term> + <term><varname>LimitCORE=</varname></term> + <term><varname>LimitRSS=</varname></term> + <term><varname>LimitNOFILE=</varname></term> + <term><varname>LimitAS=</varname></term> + <term><varname>LimitNPROC=</varname></term> + <term><varname>LimitMEMLOCK=</varname></term> + <term><varname>LimitLOCKS=</varname></term> + <term><varname>LimitSIGPENDING=</varname></term> + <term><varname>LimitMSGQUEUE=</varname></term> + <term><varname>LimitNICE=</varname></term> + <term><varname>LimitRTPRIO=</varname></term> + <term><varname>LimitRTTIME=</varname></term> + + <listitem><para>Set soft and hard limits on various resources for executed processes. See + <citerefentry><refentrytitle>setrlimit</refentrytitle><manvolnum>2</manvolnum></citerefentry> for + details on the resource limit concept. Resource limits may be specified in two formats: either as + single value to set a specific soft and hard limit to the same value, or as colon-separated pair + <option>soft:hard</option> to set both limits individually (e.g. <literal>LimitAS=4G:16G</literal>). 
+ Use the string <option>infinity</option> to configure no limit on a specific resource. The + multiplicative suffixes K, M, G, T, P and E (to the base 1024) may be used for resource limits + measured in bytes (e.g. <literal>LimitAS=16G</literal>). For the limits referring to time values, the + usual time units ms, s, min, h and so on may be used (see + <citerefentry><refentrytitle>systemd.time</refentrytitle><manvolnum>7</manvolnum></citerefentry> for + details). Note that if no time unit is specified for <varname>LimitCPU=</varname> the default unit of + seconds is implied, while for <varname>LimitRTTIME=</varname> the default unit of microseconds is + implied. Also, note that the effective granularity of the limits might influence their + enforcement. For example, time limits specified for <varname>LimitCPU=</varname> will be rounded up + implicitly to multiples of 1s. For <varname>LimitNICE=</varname> the value may be specified in two + syntaxes: if prefixed with <literal>+</literal> or <literal>-</literal>, the value is understood as + regular Linux nice value in the range -20..19. If not prefixed like this the value is understood as + raw resource limit parameter in the range 0..40 (with 0 being equivalent to 1).</para> + + <para>Note that most process resource limits configured with these options are per-process, and + processes may fork in order to acquire a new set of resources that are accounted independently of the + original process, and may thus escape limits set. Also note that <varname>LimitRSS=</varname> is not + implemented on Linux, and setting it has no effect. Often it is advisable to prefer the resource + controls listed in + <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry> + over these per-process limits, as they apply to services as a whole, may be altered dynamically at + runtime, and are generally more expressive. 
For example, <varname>MemoryMax=</varname> is a more + powerful (and working) replacement for <varname>LimitRSS=</varname>.</para> + + <para>Resource limits not configured explicitly for a unit default to the value configured in the various + <varname>DefaultLimitCPU=</varname>, <varname>DefaultLimitFSIZE=</varname>, … options available in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>, and – + if not configured there – the kernel or per-user defaults, as defined by the OS (the latter only for user + services, see below).</para> + + <para>For system units these resource limits may be chosen freely. When these settings are configured + in a user service (i.e. a service run by the per-user instance of the service manager) they cannot be + used to raise the limits above those set for the user manager itself when it was first invoked, as + the user's service manager generally lacks the privileges to do so. In user context these + configuration options are hence only useful to lower the limits passed in or to raise the soft limit + to the maximum of the hard limit as configured for the user. To raise the user's limits further, the + available configuration mechanisms differ between operating systems, but typically require + privileges. In most cases it is possible to configure higher per-user resource limits via PAM or by + setting limits on the system service encapsulating the user's service manager, i.e. the user's + instance of <filename>user@.service</filename>. 
After making such changes, make sure to restart the + user's service manager.</para> + + <table> + <title>Resource limit directives, their equivalent <command>ulimit</command> shell commands and the unit used</title> + + <tgroup cols='3'> + <colspec colname='directive' /> + <colspec colname='equivalent' /> + <colspec colname='unit' /> + <thead> + <row> + <entry>Directive</entry> + <entry><command>ulimit</command> equivalent</entry> + <entry>Unit</entry> + </row> + </thead> + <tbody> + <row> + <entry>LimitCPU=</entry> + <entry>ulimit -t</entry> + <entry>Seconds</entry> + </row> + <row> + <entry>LimitFSIZE=</entry> + <entry>ulimit -f</entry> + <entry>Bytes</entry> + </row> + <row> + <entry>LimitDATA=</entry> + <entry>ulimit -d</entry> + <entry>Bytes</entry> + </row> + <row> + <entry>LimitSTACK=</entry> + <entry>ulimit -s</entry> + <entry>Bytes</entry> + </row> + <row> + <entry>LimitCORE=</entry> + <entry>ulimit -c</entry> + <entry>Bytes</entry> + </row> + <row> + <entry>LimitRSS=</entry> + <entry>ulimit -m</entry> + <entry>Bytes</entry> + </row> + <row> + <entry>LimitNOFILE=</entry> + <entry>ulimit -n</entry> + <entry>Number of File Descriptors</entry> + </row> + <row> + <entry>LimitAS=</entry> + <entry>ulimit -v</entry> + <entry>Bytes</entry> + </row> + <row> + <entry>LimitNPROC=</entry> + <entry>ulimit -u</entry> + <entry>Number of Processes</entry> + </row> + <row> + <entry>LimitMEMLOCK=</entry> + <entry>ulimit -l</entry> + <entry>Bytes</entry> + </row> + <row> + <entry>LimitLOCKS=</entry> + <entry>ulimit -x</entry> + <entry>Number of Locks</entry> + </row> + <row> + <entry>LimitSIGPENDING=</entry> + <entry>ulimit -i</entry> + <entry>Number of Queued Signals</entry> + </row> + <row> + <entry>LimitMSGQUEUE=</entry> + <entry>ulimit -q</entry> + <entry>Bytes</entry> + </row> + <row> + <entry>LimitNICE=</entry> + <entry>ulimit -e</entry> + <entry>Nice Level</entry> + </row> + <row> + <entry>LimitRTPRIO=</entry> + <entry>ulimit -r</entry> + <entry>Realtime 
Priority</entry> + </row> + <row> + <entry>LimitRTTIME=</entry> + <entry>No equivalent</entry> + <entry>Microseconds</entry> + </row> + </tbody> + </tgroup> + </table></listitem> + </varlistentry> + + <varlistentry> + <term><varname>UMask=</varname></term> + + <listitem><para>Controls the file mode creation mask. Takes an access mode in octal notation. See + <citerefentry><refentrytitle>umask</refentrytitle><manvolnum>2</manvolnum></citerefentry> for + details. Defaults to 0022 for system units. For user units the default value is inherited from the + per-user service manager (whose default is in turn inherited from the system service manager, and + thus typically also is 0022 — unless overridden by a PAM module). In order to change the per-user mask + for all user services, consider setting the <varname>UMask=</varname> setting of the user's + <filename>user@.service</filename> system service instance. The per-user umask may also be set via + the <varname>umask</varname> field of a user's <ulink url="https://systemd.io/USER_RECORD">JSON User + Record</ulink> (for users managed by + <citerefentry><refentrytitle>systemd-homed.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> + this field may be controlled via <command>homectl --umask=</command>). It may also be set via a PAM + module, such as <citerefentry + project='man-pages'><refentrytitle>pam_umask</refentrytitle><manvolnum>8</manvolnum></citerefentry>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>CoredumpFilter=</varname></term> + + <listitem><para>Controls which types of memory mappings will be saved if the process dumps core + (using the <filename>/proc/<replaceable>pid</replaceable>/coredump_filter</filename> file). Takes a + whitespace-separated combination of mapping type names or numbers (with the default base 16). 
Mapping + type names are <constant>private-anonymous</constant>, <constant>shared-anonymous</constant>, + <constant>private-file-backed</constant>, <constant>shared-file-backed</constant>, + <constant>elf-headers</constant>, <constant>private-huge</constant>, + <constant>shared-huge</constant>, <constant>private-dax</constant>, <constant>shared-dax</constant>, + and the special values <constant>all</constant> (all types) and <constant>default</constant> (the + kernel default of <literal><constant>private-anonymous</constant> + <constant>shared-anonymous</constant> <constant>elf-headers</constant> + <constant>private-huge</constant></literal>). See + <citerefentry project='man-pages'><refentrytitle>core</refentrytitle><manvolnum>5</manvolnum></citerefentry> + for the meaning of the mapping types. When specified multiple times, all specified masks are + ORed. When not set, or if the empty value is assigned, the inherited value is not changed.</para> + + <example> + <title>Add DAX pages to the dump filter</title> + + <programlisting>CoredumpFilter=default private-dax shared-dax</programlisting> + </example> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>KeyringMode=</varname></term> + + <listitem><para>Controls how the kernel session keyring is set up for the service (see <citerefentry + project='man-pages'><refentrytitle>session-keyring</refentrytitle><manvolnum>7</manvolnum></citerefentry> for + details on the session keyring). Takes one of <option>inherit</option>, <option>private</option>, + <option>shared</option>. If set to <option>inherit</option> no special keyring setup is done, and the kernel's + default behaviour is applied. If <option>private</option> is used a new session keyring is allocated when a + service process is invoked, and it is not linked up with any user keyring. 
This is the recommended setting for + system services, as this ensures that multiple services running under the same system user ID (in particular + the root user) do not share their key material among each other. If <option>shared</option> is used a new + session keyring is allocated as for <option>private</option>, but the user keyring of the user configured with + <varname>User=</varname> is linked into it, so that keys assigned to the user may be requested by the unit's + processes. In this mode multiple units running processes under the same user ID may share key material. Unless + <option>inherit</option> is selected the unique invocation ID for the unit (see below) is added as a protected + key by the name <literal>invocation_id</literal> to the newly created session keyring. Defaults to + <option>private</option> for services of the system service manager and to <option>inherit</option> for + non-service units and for services of the user service manager.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>OOMScoreAdjust=</varname></term> + + <listitem><para>Sets the adjustment value for the Linux kernel's Out-Of-Memory (OOM) killer score for + executed processes. Takes an integer between -1000 (to disable OOM killing of processes of this unit) + and 1000 (to make killing of processes of this unit under memory pressure very likely). See <ulink + url="https://www.kernel.org/doc/Documentation/filesystems/proc.txt">proc.txt</ulink> for details. If + not specified defaults to the OOM score adjustment level of the service manager itself, which is + normally at 0.</para> + + <para>Use the <varname>OOMPolicy=</varname> setting of service units to configure how the service + manager shall react to the kernel OOM killer terminating a process of the service. 
See + <citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry> + for details.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>TimerSlackNSec=</varname></term> + <listitem><para>Sets the timer slack in nanoseconds for the executed processes. The timer slack controls the + accuracy of wake-ups triggered by timers. See + <citerefentry><refentrytitle>prctl</refentrytitle><manvolnum>2</manvolnum></citerefentry> for more + information. Note that in contrast to most other time span definitions this parameter takes an integer value in + nano-seconds if no unit is specified. The usual time units are understood too.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>Personality=</varname></term> + + <listitem><para>Controls which kernel architecture <citerefentry + project='man-pages'><refentrytitle>uname</refentrytitle><manvolnum>2</manvolnum></citerefentry> shall report, + when invoked by unit processes. Takes one of the architecture identifiers <constant>x86</constant>, + <constant>x86-64</constant>, <constant>ppc</constant>, <constant>ppc-le</constant>, <constant>ppc64</constant>, + <constant>ppc64-le</constant>, <constant>s390</constant> or <constant>s390x</constant>. Which personality + architectures are supported depends on the system architecture. Usually the 64bit versions of the various + system architectures support their immediate 32bit personality architecture counterpart, but no others. For + example, <constant>x86-64</constant> systems support the <constant>x86-64</constant> and + <constant>x86</constant> personalities but no others. The personality feature is useful when running 32-bit + services on a 64-bit host system. 
If not specified, the personality is left unmodified and thus reflects the + personality of the host system's kernel.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>IgnoreSIGPIPE=</varname></term> + + <listitem><para>Takes a boolean argument. If true, causes <constant>SIGPIPE</constant> to be ignored in the + executed process. Defaults to true because <constant>SIGPIPE</constant> generally is useful only in shell + pipelines.</para></listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>Scheduling</title> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>Nice=</varname></term> + + <listitem><para>Sets the default nice level (scheduling priority) for executed processes. Takes an integer + between -20 (highest priority) and 19 (lowest priority). See + <citerefentry><refentrytitle>setpriority</refentrytitle><manvolnum>2</manvolnum></citerefentry> for + details.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>CPUSchedulingPolicy=</varname></term> + + <listitem><para>Sets the CPU scheduling policy for executed processes. Takes one of <option>other</option>, + <option>batch</option>, <option>idle</option>, <option>fifo</option> or <option>rr</option>. See + <citerefentry project='man-pages'><refentrytitle>sched_setscheduler</refentrytitle><manvolnum>2</manvolnum></citerefentry> for + details.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>CPUSchedulingPriority=</varname></term> + + <listitem><para>Sets the CPU scheduling priority for executed processes. The available priority range depends + on the selected CPU scheduling policy (see above). For real-time scheduling policies an integer between 1 + (lowest priority) and 99 (highest priority) can be used. See + <citerefentry project='man-pages'><refentrytitle>sched_setscheduler</refentrytitle><manvolnum>2</manvolnum></citerefentry> for + details. 
</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>CPUSchedulingResetOnFork=</varname></term> + + <listitem><para>Takes a boolean argument. If true, elevated CPU scheduling priorities and policies + will be reset when the executed processes call + <citerefentry project='man-pages'><refentrytitle>fork</refentrytitle><manvolnum>2</manvolnum></citerefentry>, + and can hence not leak into child processes. See + <citerefentry project='man-pages'><refentrytitle>sched_setscheduler</refentrytitle><manvolnum>2</manvolnum></citerefentry> + for details. Defaults to false.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>CPUAffinity=</varname></term> + + <listitem><para>Controls the CPU affinity of the executed processes. Takes a list of CPU indices or ranges + separated by either whitespace or commas. Alternatively, takes a special "numa" value in which case systemd + automatically derives allowed CPU range based on the value of <varname>NUMAMask=</varname> option. CPU ranges + are specified by the lower and upper CPU indices separated by a dash. This option may be specified more than + once, in which case the specified CPU affinity masks are merged. If the empty string is assigned, the mask + is reset, all assignments prior to this will have no effect. See + <citerefentry project='man-pages'><refentrytitle>sched_setaffinity</refentrytitle><manvolnum>2</manvolnum></citerefentry> for + details.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>NUMAPolicy=</varname></term> + + <listitem><para>Controls the NUMA memory policy of the executed processes. Takes a policy type, one of: + <option>default</option>, <option>preferred</option>, <option>bind</option>, <option>interleave</option> and + <option>local</option>. A list of NUMA nodes that should be associated with the policy must be specified + in <varname>NUMAMask=</varname>. 
For more details on each policy, see + <citerefentry><refentrytitle>set_mempolicy</refentrytitle><manvolnum>2</manvolnum></citerefentry>. For an + overview of NUMA support in Linux, see + <citerefentry project='man-pages'><refentrytitle>numa</refentrytitle><manvolnum>7</manvolnum></citerefentry>. + </para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>NUMAMask=</varname></term> + + <listitem><para>Controls the NUMA node list which will be applied together with the selected NUMA policy. + Takes a list of NUMA nodes, using the same syntax as the CPU list of the <varname>CPUAffinity=</varname> + option, or the special value "all", which includes all available NUMA nodes in the mask. Note that the list + of NUMA nodes is not required for the <option>default</option> and <option>local</option> + policies, and that the <option>preferred</option> policy expects a single NUMA node.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>IOSchedulingClass=</varname></term> + + <listitem><para>Sets the I/O scheduling class for executed processes. Takes an integer between 0 and 3 or one + of the strings <option>none</option>, <option>realtime</option>, <option>best-effort</option> or + <option>idle</option>. If the empty string is assigned to this option, all prior assignments to both + <varname>IOSchedulingClass=</varname> and <varname>IOSchedulingPriority=</varname> have no effect. See + <citerefentry><refentrytitle>ioprio_set</refentrytitle><manvolnum>2</manvolnum></citerefentry> for + details.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>IOSchedulingPriority=</varname></term> + + <listitem><para>Sets the I/O scheduling priority for executed processes. Takes an integer between 0 (highest + priority) and 7 (lowest priority). The available priorities depend on the selected I/O scheduling class (see + above). 
If the empty string is assigned to this option, all prior assignments to both + <varname>IOSchedulingClass=</varname> and <varname>IOSchedulingPriority=</varname> have no effect. + See <citerefentry><refentrytitle>ioprio_set</refentrytitle><manvolnum>2</manvolnum></citerefentry> for + details.</para></listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>Sandboxing</title> + + <para>The following sandboxing options are an effective way to limit the exposure of the system towards the unit's + processes. It is recommended to turn on as many of these options for each unit as is possible without negatively + affecting the process' ability to operate. Note that many of these sandboxing features are gracefully turned off on + systems where the underlying security mechanism is not available. For example, <varname>ProtectSystem=</varname> + has no effect if the kernel is built without file system namespacing or if the service manager runs in a container + manager that makes file system namespacing unavailable to its payload. Similarly, + <varname>RestrictRealtime=</varname> has no effect on systems that lack support for SECCOMP system call filtering, + or in containers where support for this is turned off.</para> + + <para>Also note that some sandboxing functionality is generally not available in user services (i.e. services run + by the per-user service manager). Specifically, the various settings requiring file system namespacing support + (such as <varname>ProtectSystem=</varname>) are not available, as the underlying kernel functionality is only + accessible to privileged processes. 
However, most namespacing settings, that will not work on their own in user + services, will work when used in conjunction with <varname>PrivateUsers=</varname><option>true</option>.</para> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>ProtectSystem=</varname></term> + + <listitem><para>Takes a boolean argument or the special values <literal>full</literal> or + <literal>strict</literal>. If true, mounts the <filename>/usr/</filename> and the boot loader + directories (<filename>/boot</filename> and <filename>/efi</filename>) read-only for processes + invoked by this unit. If set to <literal>full</literal>, the <filename>/etc/</filename> directory is + mounted read-only, too. If set to <literal>strict</literal> the entire file system hierarchy is + mounted read-only, except for the API file system subtrees <filename>/dev/</filename>, + <filename>/proc/</filename> and <filename>/sys/</filename> (protect these directories using + <varname>PrivateDevices=</varname>, <varname>ProtectKernelTunables=</varname>, + <varname>ProtectControlGroups=</varname>). This setting ensures that any modification of the vendor-supplied + operating system (and optionally its configuration, and local mounts) is prohibited for the service. It is + recommended to enable this setting for all long-running services, unless they are involved with system updates + or need to modify the operating system in other ways. If this option is used, + <varname>ReadWritePaths=</varname> may be used to exclude specific directories from being made read-only. This + setting is implied if <varname>DynamicUser=</varname> is set. This setting cannot ensure protection in all + cases. In general it has the same limitations as <varname>ReadOnlyPaths=</varname>, see below. 
Defaults to + off.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>ProtectHome=</varname></term> + + <listitem><para>Takes a boolean argument or the special values <literal>read-only</literal> or + <literal>tmpfs</literal>. If true, the directories <filename>/home/</filename>, + <filename>/root</filename>, and <filename>/run/user</filename> are made inaccessible and empty for + processes invoked by this unit. If set to <literal>read-only</literal>, the three directories are + made read-only instead. If set to <literal>tmpfs</literal>, temporary file systems are mounted on the + three directories in read-only mode. The value <literal>tmpfs</literal> is useful to hide home + directories not relevant to the processes invoked by the unit, while still allowing necessary + directories to be made visible when listed in <varname>BindPaths=</varname> or + <varname>BindReadOnlyPaths=</varname>.</para> + + <para>Setting this to <literal>yes</literal> is mostly equivalent to set the three directories in + <varname>InaccessiblePaths=</varname>. Similarly, <literal>read-only</literal> is mostly equivalent to + <varname>ReadOnlyPaths=</varname>, and <literal>tmpfs</literal> is mostly equivalent to + <varname>TemporaryFileSystem=</varname> with <literal>:ro</literal>.</para> + + <para>It is recommended to enable this setting for all long-running services (in particular + network-facing ones), to ensure they cannot get access to private user data, unless the services + actually require access to the user's private data. This setting is implied if + <varname>DynamicUser=</varname> is set. This setting cannot ensure protection in all cases. 
In + general it has the same limitations as <varname>ReadOnlyPaths=</varname>, see below.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RuntimeDirectory=</varname></term> + <term><varname>StateDirectory=</varname></term> + <term><varname>CacheDirectory=</varname></term> + <term><varname>LogsDirectory=</varname></term> + <term><varname>ConfigurationDirectory=</varname></term> + + <listitem><para>These options take a whitespace-separated list of directory names. The specified + directory names must be relative, and may not include <literal>..</literal>. If set, when the unit is + started, one or more directories by the specified names will be created (including their parents) + below the locations defined in the following table. Also, the corresponding environment variable will + be defined with the full paths of the directories. If multiple directories are set, then in the + environment variable the paths are concatenated with colon (<literal>:</literal>).</para> + <table> + <title>Automatic directory creation and environment variables</title> + <tgroup cols='4'> + <thead> + <row> + <entry>Directory</entry> + <entry>Below path for system units</entry> + <entry>Below path for user units</entry> + <entry>Environment variable set</entry> + </row> + </thead> + <tbody> + <row> + <entry><varname>RuntimeDirectory=</varname></entry> + <entry><filename>/run/</filename></entry> + <entry><varname>$XDG_RUNTIME_DIR</varname></entry> + <entry><varname>$RUNTIME_DIRECTORY</varname></entry> + </row> + <row> + <entry><varname>StateDirectory=</varname></entry> + <entry><filename>/var/lib/</filename></entry> + <entry><varname>$XDG_CONFIG_HOME</varname></entry> + <entry><varname>$STATE_DIRECTORY</varname></entry> + </row> + <row> + <entry><varname>CacheDirectory=</varname></entry> + <entry><filename>/var/cache/</filename></entry> + <entry><varname>$XDG_CACHE_HOME</varname></entry> + 
<entry><varname>$CACHE_DIRECTORY</varname></entry> + </row> + <row> + <entry><varname>LogsDirectory=</varname></entry> + <entry><filename>/var/log/</filename></entry> + <entry><varname>$XDG_CONFIG_HOME</varname><filename>/log/</filename></entry> + <entry><varname>$LOGS_DIRECTORY</varname></entry> + </row> + <row> + <entry><varname>ConfigurationDirectory=</varname></entry> + <entry><filename>/etc/</filename></entry> + <entry><varname>$XDG_CONFIG_HOME</varname></entry> + <entry><varname>$CONFIGURATION_DIRECTORY</varname></entry> + </row> + </tbody> + </tgroup> + </table> + + <para>In case of <varname>RuntimeDirectory=</varname> the innermost subdirectories are removed when + the unit is stopped. It is possible to preserve the specified directories in this case if + <varname>RuntimeDirectoryPreserve=</varname> is configured to <option>restart</option> or + <option>yes</option> (see below). The directories specified with <varname>StateDirectory=</varname>, + <varname>CacheDirectory=</varname>, <varname>LogsDirectory=</varname>, + <varname>ConfigurationDirectory=</varname> are not removed when the unit is stopped.</para> + + <para>Except in case of <varname>ConfigurationDirectory=</varname>, the innermost specified directories will be + owned by the user and group specified in <varname>User=</varname> and <varname>Group=</varname>. If the + specified directories already exist and their owning user or group does not match the configured ones, all files + and directories below the specified directories as well as the directories themselves will have their file + ownership recursively changed to match what is configured. As an optimization, if the specified directories are + already owned by the right user and group, files and directories below them are left as-is, even if they do + not match what is requested. 
The innermost specified directories will have their access mode adjusted to + what is specified in <varname>RuntimeDirectoryMode=</varname>, <varname>StateDirectoryMode=</varname>, + <varname>CacheDirectoryMode=</varname>, <varname>LogsDirectoryMode=</varname> and + <varname>ConfigurationDirectoryMode=</varname>.</para> + + <para>These options imply <varname>BindPaths=</varname> for the specified paths. When combined with + <varname>RootDirectory=</varname> or <varname>RootImage=</varname> these paths always reside on the host and + are mounted from there into the unit's file system namespace.</para> + + <para>If <varname>DynamicUser=</varname> is used in conjunction with + <varname>StateDirectory=</varname>, the logic for <varname>CacheDirectory=</varname> and + <varname>LogsDirectory=</varname> is slightly altered: the directories are created below + <filename>/var/lib/private</filename>, <filename>/var/cache/private</filename> and + <filename>/var/log/private</filename>, respectively, which are host directories made inaccessible to + unprivileged users, which ensures that access to these directories cannot be gained through dynamic + user ID recycling. Symbolic links are created to hide this difference in behaviour. Both from the + perspective of the host and from inside the unit, the relevant directories hence always appear + directly below <filename>/var/lib</filename>, <filename>/var/cache</filename> and + <filename>/var/log</filename>.</para> + + <para>Use <varname>RuntimeDirectory=</varname> to manage one or more runtime directories for the unit and bind + their lifetime to the daemon runtime. This is particularly useful for unprivileged daemons that cannot create + runtime directories in <filename>/run/</filename> due to lack of privileges, and to make sure the runtime + directory is cleaned up automatically after use. 
For runtime directories that require more complex or different + configuration or lifetime guarantees, please consider using + <citerefentry><refentrytitle>tmpfiles.d</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para> + + <para>The directories defined by these options are always created under the standard paths used by systemd + (<filename>/var/</filename>, <filename>/run/</filename>, <filename>/etc/</filename>, …). If the service needs + directories in a different location, a different mechanism has to be used to create them.</para> + + <para><citerefentry><refentrytitle>tmpfiles.d</refentrytitle><manvolnum>5</manvolnum></citerefentry> provides + functionality that overlaps with these options. Using these options is recommended, because the lifetime of + the directories is tied directly to the lifetime of the unit, and it is not necessary to ensure that the + <filename>tmpfiles.d</filename> configuration is executed before the unit is started.</para> + + <para>To remove any of the directories created by these settings, use the <command>systemctl clean + …</command> command on the relevant units, see + <citerefentry><refentrytitle>systemctl</refentrytitle><manvolnum>1</manvolnum></citerefentry> for + details.</para> + + <para>Example: if a system service unit has the following, + <programlisting>RuntimeDirectory=foo/bar baz</programlisting> + the service manager creates <filename index='false'>/run/foo</filename> (if it does not exist), + + <filename index='false'>/run/foo/bar</filename>, and <filename index='false'>/run/baz</filename>. 
The + directories <filename index='false'>/run/foo/bar</filename> and + <filename index='false'>/run/baz</filename>, but not <filename index='false'>/run/foo</filename>, are + owned by the user and group specified in <varname>User=</varname> and <varname>Group=</varname>, and are removed + when the service is stopped.</para> + + <para>Example: if a system service unit has the following, + <programlisting>RuntimeDirectory=foo/bar +StateDirectory=aaa/bbb ccc</programlisting> + then the environment variable <literal>RUNTIME_DIRECTORY</literal> is set to <literal>/run/foo/bar</literal>, and + <literal>STATE_DIRECTORY</literal> is set to <literal>/var/lib/aaa/bbb:/var/lib/ccc</literal>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RuntimeDirectoryMode=</varname></term> + <term><varname>StateDirectoryMode=</varname></term> + <term><varname>CacheDirectoryMode=</varname></term> + <term><varname>LogsDirectoryMode=</varname></term> + <term><varname>ConfigurationDirectoryMode=</varname></term> + + <listitem><para>Specifies the access mode of the directories specified in <varname>RuntimeDirectory=</varname>, + <varname>StateDirectory=</varname>, <varname>CacheDirectory=</varname>, <varname>LogsDirectory=</varname>, or + <varname>ConfigurationDirectory=</varname>, respectively, as an octal number. Defaults to + <constant>0755</constant>. See "Permissions" in <citerefentry + project='man-pages'><refentrytitle>path_resolution</refentrytitle><manvolnum>7</manvolnum></citerefentry> for a + discussion of the meaning of permission bits.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RuntimeDirectoryPreserve=</varname></term> + + <listitem><para>Takes a boolean argument or <option>restart</option>. If set to <option>no</option> (the + default), the directories specified in <varname>RuntimeDirectory=</varname> are always removed when the service + stops. 
If set to <option>restart</option> the directories are preserved when the service is both automatically + and manually restarted. Here, the automatic restart means the operation specified in + <varname>Restart=</varname>, and manual restart means the one triggered by <command>systemctl restart + foo.service</command>. If set to <option>yes</option>, then the directories are not removed when the service is + stopped. Note that since the runtime directory <filename>/run/</filename> is a mount point of + <literal>tmpfs</literal>, then for system services the directories specified in + <varname>RuntimeDirectory=</varname> are removed when the system is rebooted.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>TimeoutCleanSec=</varname></term> + <listitem><para>Configures a timeout on the clean-up operation requested through <command>systemctl + clean …</command>, see + <citerefentry><refentrytitle>systemctl</refentrytitle><manvolnum>1</manvolnum></citerefentry> for + details. Takes the usual time values and defaults to <constant>infinity</constant>, i.e. by default + no timeout is applied. If a timeout is configured the clean operation will be aborted forcibly when + the timeout is reached, potentially leaving resources on disk.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>ReadWritePaths=</varname></term> + <term><varname>ReadOnlyPaths=</varname></term> + <term><varname>InaccessiblePaths=</varname></term> + + <listitem><para>Sets up a new file system namespace for executed processes. These options may be used + to limit access a process has to the file system. Each setting takes a space-separated list of paths + relative to the host's root directory (i.e. the system running the service manager). 
Note that if + paths contain symlinks, they are resolved relative to the root directory set with + <varname>RootDirectory=</varname>/<varname>RootImage=</varname>.</para> + + <para>Paths listed in <varname>ReadWritePaths=</varname> are accessible from within the namespace + with the same access modes as from outside of it. Paths listed in <varname>ReadOnlyPaths=</varname> + are accessible for reading only, writing will be refused even if the usual file access controls would + permit this. Nest <varname>ReadWritePaths=</varname> inside of <varname>ReadOnlyPaths=</varname> in + order to provide writable subdirectories within read-only directories. Use + <varname>ReadWritePaths=</varname> in order to allow-list specific paths for write access if + <varname>ProtectSystem=strict</varname> is used.</para> + + <para>Paths listed in <varname>InaccessiblePaths=</varname> will be made inaccessible for processes inside + the namespace along with everything below them in the file system hierarchy. This may be more restrictive than + desired, because it is not possible to nest <varname>ReadWritePaths=</varname>, <varname>ReadOnlyPaths=</varname>, + <varname>BindPaths=</varname>, or <varname>BindReadOnlyPaths=</varname> inside it. For a more flexible option, + see <varname>TemporaryFileSystem=</varname>.</para> + + <para>Non-directory paths may be specified as well. These options may be specified more than once, + in which case all paths listed will have limited access from within the namespace. If the empty string is + assigned to this option, the specific list is reset, and all prior assignments have no effect.</para> + + <para>Paths in <varname>ReadWritePaths=</varname>, <varname>ReadOnlyPaths=</varname> and + <varname>InaccessiblePaths=</varname> may be prefixed with <literal>-</literal>, in which case they will be + ignored when they do not exist. 
If prefixed with <literal>+</literal>, the paths are taken relative to the root + directory of the unit, as configured with <varname>RootDirectory=</varname>/<varname>RootImage=</varname>, + instead of relative to the root directory of the host (see above). When combining <literal>-</literal> and + <literal>+</literal> on the same path, make sure to specify <literal>-</literal> first, and <literal>+</literal> + second.</para> + + <para>Note that these settings will disconnect propagation of mounts from the unit's processes to the + host. This means that these settings may not be used for services that need to install mount points in + the main mount namespace. For <varname>ReadWritePaths=</varname> and <varname>ReadOnlyPaths=</varname> + propagation in the other direction is not affected, i.e. mounts created on the host generally appear in the + unit processes' namespace, and mounts removed on the host also disappear there. In particular, note that + mount propagation from host to unit will result in unmodified mounts being created in the unit's namespace, + i.e. writable mounts appearing on the host will be writable in the unit's namespace too, even when propagated + below a path marked with <varname>ReadOnlyPaths=</varname>! Restricting access with these options hence does + not extend to submounts of a directory that are created later on. This means the lock-down offered by these + settings is not complete, and does not offer full protection.</para> + + <para>Note that the effect of these settings may be undone by privileged processes. 
In order to set up an + effective sandboxed environment for a unit it is thus recommended to combine these settings with either + <varname>CapabilityBoundingSet=~CAP_SYS_ADMIN</varname> or + <varname>SystemCallFilter=~@mount</varname>.</para> + + <xi:include href="system-only.xml" xpointer="plural"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>TemporaryFileSystem=</varname></term> + + <listitem><para>Takes a space-separated list of mount points for temporary file systems (tmpfs). If set, a new file + system namespace is set up for executed processes, and a temporary file system is mounted on each mount point. + This option may be specified more than once, in which case temporary file systems are mounted on all listed mount + points. If the empty string is assigned to this option, the list is reset, and all prior assignments have no effect. + Each mount point may optionally be suffixed with a colon (<literal>:</literal>) and mount options such as + <literal>size=10%</literal> or <literal>ro</literal>. By default, each temporary file system is mounted + with <literal>nodev,strictatime,mode=0755</literal>. 
These can be disabled by explicitly specifying the corresponding + mount options, e.g., <literal>dev</literal> or <literal>nostrictatime</literal>.</para> + + <para>This is useful to hide files or directories not relevant to the processes invoked by the unit, while necessary + files or directories can still be accessed by combining with <varname>BindPaths=</varname> or + <varname>BindReadOnlyPaths=</varname>.</para> + + <para>Example: if a unit has the following, + <programlisting>TemporaryFileSystem=/var:ro +BindReadOnlyPaths=/var/lib/systemd</programlisting> + then the processes invoked by the unit cannot see any files or directories under <filename>/var/</filename> except for + <filename>/var/lib/systemd</filename> and its contents.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>PrivateTmp=</varname></term> + + <listitem><para>Takes a boolean argument. If true, sets up a new file system namespace for the + executed processes and mounts private <filename>/tmp/</filename> and <filename>/var/tmp/</filename> + directories inside it that are not shared by processes outside of the namespace. This is useful to + secure access to temporary files of the process, but makes sharing between processes via + <filename>/tmp/</filename> or <filename>/var/tmp/</filename> impossible. If this is enabled, all + temporary files created by a service in these directories will be removed after the service is + stopped. Defaults to false. It is possible to run two or more units within the same private + <filename>/tmp/</filename> and <filename>/var/tmp/</filename> namespace by using the + <varname>JoinsNamespaceOf=</varname> directive, see + <citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry> for + details. This setting is implied if <varname>DynamicUser=</varname> is set. 
For this setting the same + restrictions regarding mount propagation and privileges apply as for + <varname>ReadOnlyPaths=</varname> and related calls, see above. Enabling this setting has the side + effect of adding <varname>Requires=</varname> and <varname>After=</varname> dependencies on all mount + units necessary to access <filename>/tmp/</filename> and <filename>/var/tmp/</filename>. Moreover, an + implicit <varname>After=</varname> ordering on + <citerefentry><refentrytitle>systemd-tmpfiles-setup.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> + is added.</para> + + <para>Note that the implementation of this setting might be impossible (for example if mount namespaces are not + available), and the unit should be written in a way that does not solely rely on this setting for + security.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>PrivateDevices=</varname></term> + + <listitem><para>Takes a boolean argument. If true, sets up a new <filename>/dev/</filename> mount for the + executed processes and only adds API pseudo devices such as <filename>/dev/null</filename>, + <filename>/dev/zero</filename> or <filename>/dev/random</filename> (as well as the pseudo TTY subsystem) to it, + but no physical devices such as <filename>/dev/sda</filename>, system memory <filename>/dev/mem</filename>, + system ports <filename>/dev/port</filename> and others. This is useful to securely turn off physical device + access by the executed process. Defaults to false. 
Enabling this option will install a system call filter to + block low-level I/O system calls that are grouped in the <varname>@raw-io</varname> set, will also remove + <constant>CAP_MKNOD</constant> and <constant>CAP_SYS_RAWIO</constant> from the capability bounding set for the + unit (see above), and set <varname>DevicePolicy=closed</varname> (see + <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry> + for details). Note that using this setting will disconnect propagation of mounts from the service to the host + (propagation in the opposite direction continues to work). This means that this setting may not be used for + services which shall be able to install mount points in the main mount namespace. The new + <filename>/dev/</filename> will be mounted read-only and 'noexec'. The latter may break old programs which try + to set up executable memory by using + <citerefentry><refentrytitle>mmap</refentrytitle><manvolnum>2</manvolnum></citerefentry> of + <filename>/dev/zero</filename> instead of using <constant>MAP_ANON</constant>. For this setting the same + restrictions regarding mount propagation and privileges apply as for <varname>ReadOnlyPaths=</varname> and + related calls, see above. If turned on and if running in user mode, or in system mode, but without the + <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting <varname>User=</varname>), + <varname>NoNewPrivileges=yes</varname> is implied.</para> + + <para>Note that the implementation of this setting might be impossible (for example if mount namespaces are not + available), and the unit should be written in a way that does not solely rely on this setting for + security.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>PrivateNetwork=</varname></term> + + <listitem><para>Takes a boolean argument. 
If true, sets up a new network namespace for the executed processes + and configures only the loopback network device <literal>lo</literal> inside it. No other network devices will + be available to the executed process. This is useful to turn off network access by the executed process. + Defaults to false. It is possible to run two or more units within the same private network namespace by using + the <varname>JoinsNamespaceOf=</varname> directive, see + <citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry> for + details. Note that this option will disconnect all socket families from the host, including + <constant>AF_NETLINK</constant> and <constant>AF_UNIX</constant>. Effectively, for + <constant>AF_NETLINK</constant> this means that device configuration events received from + <citerefentry><refentrytitle>systemd-udevd.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> are + not delivered to the unit's processes. And for <constant>AF_UNIX</constant> this has the effect that + <constant>AF_UNIX</constant> sockets in the abstract socket namespace of the host will become unavailable to + the unit's processes (however, those located in the file system will continue to be accessible).</para> + + <para>Note that the implementation of this setting might be impossible (for example if network namespaces are + not available), and the unit should be written in a way that does not solely rely on this setting for + security.</para> + + <para>When this option is used on a socket unit any sockets bound on behalf of this unit will be + bound within a private network namespace. 
This may be combined with + <varname>JoinsNamespaceOf=</varname> to listen on sockets inside of network namespaces of other + services.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>NetworkNamespacePath=</varname></term> + + <listitem><para>Takes an absolute file system path referring to a Linux network namespace + pseudo-file (i.e. a file like <filename>/proc/$PID/ns/net</filename> or a bind mount or symlink to + one). When set, the invoked processes are added to the network namespace referenced by that path. The + path has to point to a valid namespace file at the moment the processes are forked off. If this + option is used, <varname>PrivateNetwork=</varname> has no effect. If this option is used together with + <varname>JoinsNamespaceOf=</varname> then it only has an effect if this unit is started before any of + the listed units that have <varname>PrivateNetwork=</varname> or + <varname>NetworkNamespacePath=</varname> configured, as otherwise the network namespace of those + units is reused.</para> + + <para>When this option is used on a socket unit any sockets bound on behalf of this unit will be + bound within the specified network namespace.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>PrivateUsers=</varname></term> + + <listitem><para>Takes a boolean argument. If true, sets up a new user namespace for the executed processes and + configures a minimal user and group mapping, that maps the <literal>root</literal> user and group as well as + the unit's own user and group to themselves and everything else to the <literal>nobody</literal> user and + group. This is useful to securely detach the user and group databases used by the unit from the rest of the + system, and thus to create an effective sandbox environment. 
All files, directories, processes, IPC objects and + other resources owned by users/groups not equaling <literal>root</literal> or the unit's own will stay visible + from within the unit but appear owned by the <literal>nobody</literal> user and group. If this mode is enabled, + all unit processes are run without privileges in the host user namespace (regardless if the unit's own + user/group is <literal>root</literal> or not). Specifically this means that the process will have zero process + capabilities on the host's user namespace, but full capabilities within the service's user namespace. Settings + such as <varname>CapabilityBoundingSet=</varname> will affect only the latter, and there's no way to acquire + additional capabilities in the host's user namespace. Defaults to off.</para> + + <para>When this setting is set up by a per-user instance of the service manager, the mapping of the + <literal>root</literal> user and group to itself is omitted (unless the user manager is root). + Additionally, in the per-user instance manager case, the + user namespace will be set up before most other namespaces. 
This means that combining + <varname>PrivateUsers=</varname><option>true</option> with other namespaces will enable use of features not + normally supported by the per-user instances of the service manager.</para> + + <para>This setting is particularly useful in conjunction with + <varname>RootDirectory=</varname>/<varname>RootImage=</varname>, as the need to synchronize the user and group + databases in the root directory and on the host is reduced, as the only users and groups who need to be matched + are <literal>root</literal>, <literal>nobody</literal> and the unit's own user and group.</para> + + <para>Note that the implementation of this setting might be impossible (for example if user namespaces are not + available), and the unit should be written in a way that does not solely rely on this setting for + security.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>ProtectHostname=</varname></term> + + <listitem><para>Takes a boolean argument. When set, sets up a new UTS namespace for the executed + processes. In addition, changing hostname or domainname is prevented. Defaults to off.</para> + + <para>Note that the implementation of this setting might be impossible (for example if UTS namespaces + are not available), and the unit should be written in a way that does not solely rely on this setting + for security.</para> + + <para>Note that when this option is enabled for a service, hostname changes no longer propagate from + the system into the service. It is hence not suitable for services that need to take notice of system + hostname changes dynamically.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>ProtectClock=</varname></term> + + <listitem><para>Takes a boolean argument. If set, writes to the hardware clock or system clock will be denied. + It is recommended to turn this on for most services that do not need to modify the clock. 
Defaults to off. 
Enabling + this option removes <constant>CAP_SYS_TIME</constant> and <constant>CAP_WAKE_ALARM</constant> from the + capability bounding set for this unit, installs a system call filter to block calls that can set the + clock, and <varname>DeviceAllow=char-rtc r</varname> is implied. This ensures <filename>/dev/rtc0</filename>, + <filename>/dev/rtc1</filename>, etc. are made read-only to the service. See + <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry> + for the details about <varname>DeviceAllow=</varname>.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>ProtectKernelTunables=</varname></term> + + <listitem><para>Takes a boolean argument. If true, kernel variables accessible through + <filename>/proc/sys/</filename>, <filename>/sys/</filename>, <filename>/proc/sysrq-trigger</filename>, + <filename>/proc/latency_stats</filename>, <filename>/proc/acpi</filename>, + <filename>/proc/timer_stats</filename>, <filename>/proc/fs</filename> and <filename>/proc/irq</filename> will + be made read-only to all processes of the unit. Usually, tunable kernel variables should be initialized only at + boot-time, for example with the + <citerefentry><refentrytitle>sysctl.d</refentrytitle><manvolnum>5</manvolnum></citerefentry> mechanism. Few + services need to write to these at runtime; it is hence recommended to turn this on for most services. For this + setting the same restrictions regarding mount propagation and privileges apply as for + <varname>ReadOnlyPaths=</varname> and related calls, see above. Defaults to off. If turned on and if running + in user mode, or in system mode, but without the <constant>CAP_SYS_ADMIN</constant> capability (e.g. services + for which <varname>User=</varname> is set), <varname>NoNewPrivileges=yes</varname> is implied. 
Note that this + option does not prevent indirect changes to kernel tunables effected by IPC calls to other processes. However, + <varname>InaccessiblePaths=</varname> may be used to make relevant IPC file system objects inaccessible. If + <varname>ProtectKernelTunables=</varname> is set, <varname>MountAPIVFS=yes</varname> is + implied.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>ProtectKernelModules=</varname></term> + + <listitem><para>Takes a boolean argument. If true, explicit module loading will be denied. This allows + module load and unload operations to be turned off on modular kernels. It is recommended to turn this on for most services + that do not need special file systems or extra kernel modules to work. Defaults to off. Enabling this option + removes <constant>CAP_SYS_MODULE</constant> from the capability bounding set for the unit, installs a + system call filter to block module system calls, and makes <filename>/usr/lib/modules</filename> + inaccessible. For this setting the same restrictions regarding mount propagation and privileges apply as for + <varname>ReadOnlyPaths=</varname> and related calls, see above. Note that limited automatic module loading due + to user configuration or kernel mapping tables might still happen as a side effect of requested user operations, + both privileged and unprivileged. To disable the module auto-load feature, see the + <citerefentry><refentrytitle>sysctl.d</refentrytitle><manvolnum>5</manvolnum></citerefentry> + <constant>kernel.modules_disabled</constant> mechanism and the + <filename>/proc/sys/kernel/modules_disabled</filename> documentation. If turned on and if running in user + mode, or in system mode, but without the <constant>CAP_SYS_ADMIN</constant> capability (e.g. 
setting + <varname>User=</varname>), <varname>NoNewPrivileges=yes</varname> is implied.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>ProtectKernelLogs=</varname></term> + + <listitem><para>Takes a boolean argument. If true, access to the kernel log ring buffer will be denied. It is + recommended to turn this on for most services that do not need to read from or write to the kernel log ring + buffer. Enabling this option removes <constant>CAP_SYSLOG</constant> from the capability bounding set for this + unit, and installs a system call filter to block the + <citerefentry project='man-pages'><refentrytitle>syslog</refentrytitle><manvolnum>2</manvolnum></citerefentry> + system call (not to be confused with the libc API + <citerefentry project='man-pages'><refentrytitle>syslog</refentrytitle><manvolnum>3</manvolnum></citerefentry> + for userspace logging). The kernel exposes its log buffer to userspace via <filename>/dev/kmsg</filename> and + <filename>/proc/kmsg</filename>. If enabled, these are made inaccessible to all the processes in the unit.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>ProtectControlGroups=</varname></term> + + <listitem><para>Takes a boolean argument. If true, the Linux Control Groups (<citerefentry + project='man-pages'><refentrytitle>cgroups</refentrytitle><manvolnum>7</manvolnum></citerefentry>) hierarchies + accessible through <filename>/sys/fs/cgroup/</filename> will be made read-only to all processes of the + unit. Except for container managers no services should require write access to the control groups hierarchies; + it is hence recommended to turn this on for most services. For this setting the same restrictions regarding + mount propagation and privileges apply as for <varname>ReadOnlyPaths=</varname> and related calls, see + above. Defaults to off. 
If <varname>ProtectControlGroups=</varname> is set, <varname>MountAPIVFS=yes</varname> + is implied.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RestrictAddressFamilies=</varname></term> + + <listitem><para>Restricts the set of socket address families accessible to the processes of this + unit. Takes a space-separated list of address family names to allow-list, such as + <constant>AF_UNIX</constant>, <constant>AF_INET</constant> or <constant>AF_INET6</constant>. When + prefixed with <constant>~</constant> the listed address families will be applied as deny list, + otherwise as allow list. Note that this restricts access to the <citerefentry + project='man-pages'><refentrytitle>socket</refentrytitle><manvolnum>2</manvolnum></citerefentry> + system call only. Sockets passed into the process by other means (for example, by using socket + activation with socket units, see + <citerefentry><refentrytitle>systemd.socket</refentrytitle><manvolnum>5</manvolnum></citerefentry>) + are unaffected. Also, sockets created with <function>socketpair()</function> (which creates connected + AF_UNIX sockets only) are unaffected. Note that this option has no effect on 32-bit x86, s390, s390x, + mips, mips-le, ppc, ppc-le, ppc64, ppc64-le and is ignored (but works correctly on other ABIs, + including x86-64). Note that on systems supporting multiple ABIs (such as x86/x86-64) it is + recommended to turn off alternative ABIs for services, so that they cannot be used to circumvent the + restrictions of this option. Specifically, it is recommended to combine this option with + <varname>SystemCallArchitectures=native</varname> or similar. If running in user mode, or in system + mode, but without the <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting + <varname>User=nobody</varname>), <varname>NoNewPrivileges=yes</varname> is implied. 
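+ As an illustrative sketch of an allow list combined with the ABI restriction recommended above (the
+ selection of address families is an assumption and depends on what the service actually needs):
+ <programlisting>[Service]
+RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6
+SystemCallArchitectures=native</programlisting>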
By default, no + restrictions apply: all address families are accessible to processes. If assigned the empty string, + any previous address family restriction changes are undone. This setting does not affect commands + prefixed with <literal>+</literal>.</para> + + <para>Use this option to limit exposure of processes to remote access, in particular via exotic and sensitive + network protocols, such as <constant>AF_PACKET</constant>. Note that in most cases, the local + <constant>AF_UNIX</constant> address family should be included in the configured allow list as it is frequently + used for local communication, including for + <citerefentry><refentrytitle>syslog</refentrytitle><manvolnum>3</manvolnum></citerefentry> + logging.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RestrictNamespaces=</varname></term> + + <listitem><para>Restricts access to Linux namespace functionality for the processes of this unit. For details + about Linux namespaces, see <citerefentry + project='man-pages'><refentrytitle>namespaces</refentrytitle><manvolnum>7</manvolnum></citerefentry>. Either + takes a boolean argument, or a space-separated list of namespace type identifiers. If false (the default), no + restrictions on namespace creation and switching are made. If true, access to any kind of namespacing is + prohibited. Otherwise, a space-separated list of namespace type identifiers must be specified, consisting of + any combination of: <constant>cgroup</constant>, <constant>ipc</constant>, <constant>net</constant>, + <constant>mnt</constant>, <constant>pid</constant>, <constant>user</constant> and <constant>uts</constant>. Any + namespace type listed is made accessible to the unit's processes; access to namespace types not listed is + prohibited (allow-listing). 
By prepending the list with a single tilde character (<literal>~</literal>) the + effect may be inverted: only the listed namespace types will be made inaccessible, all unlisted ones are + permitted (deny-listing). If the empty string is assigned, the default namespace restrictions are applied, + which is equivalent to false. This option may appear more than once, in which case the namespace types are + merged by <constant>OR</constant>, or by <constant>AND</constant> if the lines are prefixed with + <literal>~</literal> (see examples below). Internally, this setting limits access to the + <citerefentry><refentrytitle>unshare</refentrytitle><manvolnum>2</manvolnum></citerefentry>, + <citerefentry><refentrytitle>clone</refentrytitle><manvolnum>2</manvolnum></citerefentry> and + <citerefentry><refentrytitle>setns</refentrytitle><manvolnum>2</manvolnum></citerefentry> system calls, taking + the specified flags parameters into account. Note that — if this option is used — in addition to restricting + creation and switching of the specified types of namespaces (or all of them, if true) access to the + <function>setns()</function> system call with a zero flags parameter is prohibited. This setting is only + supported on x86, x86-64, mips, mips-le, mips64, mips64-le, mips64-n32, mips64-le-n32, ppc64, ppc64-le, s390 + and s390x, and enforces no restrictions on other architectures. If running in user mode, or in system mode, but + without the <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting <varname>User=</varname>), + <varname>NoNewPrivileges=yes</varname> is implied.</para> + + <para>Example: if a unit has the following, + <programlisting>RestrictNamespaces=cgroup ipc +RestrictNamespaces=cgroup net</programlisting> + then <constant>cgroup</constant>, <constant>ipc</constant>, and <constant>net</constant> are set. 
+ If the second line is prefixed with <literal>~</literal>, e.g., + <programlisting>RestrictNamespaces=cgroup ipc +RestrictNamespaces=~cgroup net</programlisting> + then only <constant>ipc</constant> is set.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>LockPersonality=</varname></term> + + <listitem><para>Takes a boolean argument. If set, locks down the <citerefentry + project='man-pages'><refentrytitle>personality</refentrytitle><manvolnum>2</manvolnum></citerefentry> system + call so that the kernel execution domain may not be changed from the default or the personality selected with + the <varname>Personality=</varname> directive. This may be useful to improve security, because odd personality + emulations may be poorly tested and a source of vulnerabilities. If running in user mode, or in system mode, but + without the <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting <varname>User=</varname>), + <varname>NoNewPrivileges=yes</varname> is implied.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>MemoryDenyWriteExecute=</varname></term> + + <listitem><para>Takes a boolean argument. If set, attempts to create memory mappings that are writable and + executable at the same time, or to change existing memory mappings to become executable, or to map shared + memory segments as executable, are prohibited. 
Specifically, a system call filter is added that rejects + <citerefentry><refentrytitle>mmap</refentrytitle><manvolnum>2</manvolnum></citerefentry> system calls with both + <constant>PROT_EXEC</constant> and <constant>PROT_WRITE</constant> set, + <citerefentry><refentrytitle>mprotect</refentrytitle><manvolnum>2</manvolnum></citerefentry> or + <citerefentry><refentrytitle>pkey_mprotect</refentrytitle><manvolnum>2</manvolnum></citerefentry> system calls + with <constant>PROT_EXEC</constant> set and + <citerefentry><refentrytitle>shmat</refentrytitle><manvolnum>2</manvolnum></citerefentry> system calls with + <constant>SHM_EXEC</constant> set. Note that this option is incompatible with programs and libraries that + generate program code dynamically at runtime, including JIT execution engines, executable stacks, and the code + "trampoline" feature of various C compilers. This option improves service security, as it makes it harder for + software exploits to change running code dynamically. However, the protection can be circumvented if + the service can write to a file system that is not mounted with <constant>noexec</constant> (such as + <filename>/dev/shm</filename>), or if it can use <function>memfd_create()</function>. This can be + prevented by making such file systems inaccessible to the service + (e.g. <varname>InaccessiblePaths=/dev/shm</varname>) and installing further system call filters + (<varname>SystemCallFilter=~memfd_create</varname>). Note that this feature is fully available on + x86-64, and partially on x86. Specifically, the <function>shmat()</function> protection is not + available on x86. Note that on systems supporting multiple ABIs (such as x86/x86-64) it is + recommended to turn off alternative ABIs for services, so that they cannot be used to circumvent the + restrictions of this option. Specifically, it is recommended to combine this option with + <varname>SystemCallArchitectures=native</varname> or similar. 
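+ As a minimal sketch of the combination recommended above (the unit fragment is illustrative; whether a
+ given service tolerates losing <filename>/dev/shm</filename> and <function>memfd_create()</function> is
+ an assumption that has to be verified per service):
+ <programlisting>[Service]
+MemoryDenyWriteExecute=yes
+SystemCallArchitectures=native
+InaccessiblePaths=/dev/shm
+SystemCallFilter=~memfd_create</programlisting>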
If running in user mode, or in system + mode, but without the <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting + <varname>User=</varname>), <varname>NoNewPrivileges=yes</varname> is implied.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RestrictRealtime=</varname></term> + + <listitem><para>Takes a boolean argument. If set, any attempts to enable realtime scheduling in a process of + the unit are refused. This restricts access to realtime task scheduling policies such as + <constant>SCHED_FIFO</constant>, <constant>SCHED_RR</constant> or <constant>SCHED_DEADLINE</constant>. See + <citerefentry project='man-pages'><refentrytitle>sched</refentrytitle><manvolnum>7</manvolnum></citerefentry> + for details about these scheduling policies. If running in user mode, or in system mode, but without the + <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting <varname>User=</varname>), + <varname>NoNewPrivileges=yes</varname> is implied. Realtime scheduling policies may be used to monopolize CPU + time for longer periods of time, and may hence be used to lock up or otherwise trigger Denial-of-Service + situations on the system. It is hence recommended to restrict access to realtime scheduling to the few programs + that actually require them. Defaults to off.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RestrictSUIDSGID=</varname></term> + + <listitem><para>Takes a boolean argument. If set, any attempts to set the set-user-ID (SUID) or + set-group-ID (SGID) bits on files or directories will be denied (for details on these bits see + <citerefentry + project='man-pages'><refentrytitle>inode</refentrytitle><manvolnum>7</manvolnum></citerefentry>). If + running in user mode, or in system mode, but without the <constant>CAP_SYS_ADMIN</constant> + capability (e.g. setting <varname>User=</varname>), <varname>NoNewPrivileges=yes</varname> is + implied. 
As the SUID/SGID bits are mechanisms to elevate privileges and allow users to acquire the + identity of other users, it is recommended to restrict creation of SUID/SGID files to the few + programs that actually require them. Note that this restricts marking of any type of file system + object with these bits, including both regular files and directories (where the SGID bit has a different + meaning than for files, see documentation). This option is implied if <varname>DynamicUser=</varname> + is enabled. Defaults to off.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>RemoveIPC=</varname></term> + + <listitem><para>Takes a boolean parameter. If set, all System V and POSIX IPC objects owned by the user and + group that the processes of this unit run as are removed when the unit is stopped. This setting only has an + effect if at least one of <varname>User=</varname>, <varname>Group=</varname> and + <varname>DynamicUser=</varname> is used. It has no effect on IPC objects owned by the root user. Specifically, + this removes System V semaphores, as well as System V and POSIX shared memory segments and message queues. If + multiple units use the same user or group, the IPC objects are removed when the last of these units is + stopped. This setting is implied if <varname>DynamicUser=</varname> is set.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>PrivateMounts=</varname></term> + + <listitem><para>Takes a boolean parameter. If set, the processes of this unit will be run in their own private + file system (mount) namespace with all mount propagation from the processes towards the host's main file system + namespace turned off. This means any file system mount points established or removed by the unit's processes + will be private to them and not be visible to the host. 
However, file system mount points established or + removed on the host will be propagated to the unit's processes. See <citerefentry + project='man-pages'><refentrytitle>mount_namespaces</refentrytitle><manvolnum>7</manvolnum></citerefentry> for + details on file system namespaces. Defaults to off.</para> + + <para>When turned on, this executes three operations for each invoked process: a new + <constant>CLONE_NEWNS</constant> namespace is created, after which all existing mounts are remounted to + <constant>MS_SLAVE</constant> to disable propagation from the unit's processes to the host (but leaving + propagation in the opposite direction in effect). Finally, the mounts are remounted again to the propagation + mode configured with <varname>MountFlags=</varname>, see below.</para> + + <para>File system namespaces are set up individually for each process forked off by the service manager. Mounts + established in the namespace of the process created by <varname>ExecStartPre=</varname> will hence be cleaned + up automatically as soon as that process exits and will not be available to subsequent processes forked off for + <varname>ExecStart=</varname> (and similar applies to the various other commands configured for + units). Similarly, <varname>JoinsNamespaceOf=</varname> does not permit sharing kernel mount namespaces between + units, it only enables sharing of the <filename>/tmp/</filename> and <filename>/var/tmp/</filename> + directories.</para> + + <para>Other file system namespace unit settings — <varname>PrivateMounts=</varname>, + <varname>PrivateTmp=</varname>, <varname>PrivateDevices=</varname>, <varname>ProtectSystem=</varname>, + <varname>ProtectHome=</varname>, <varname>ReadOnlyPaths=</varname>, <varname>InaccessiblePaths=</varname>, + <varname>ReadWritePaths=</varname>, … — also enable file system namespacing in a fashion equivalent to this + option. 
Hence it is primarily useful to explicitly request this behaviour if none of the other settings are + used.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>MountFlags=</varname></term> + + <listitem><para>Takes a mount propagation setting: <option>shared</option>, <option>slave</option> or + <option>private</option>, which controls whether file system mount points in the file system namespaces set up + for this unit's processes will receive or propagate mounts and unmounts from other file system namespaces. See + <citerefentry project='man-pages'><refentrytitle>mount</refentrytitle><manvolnum>2</manvolnum></citerefentry> + for details on mount propagation, and the three propagation flags in particular.</para> + + <para>This setting only controls the <emphasis>final</emphasis> propagation setting in effect on all mount + points of the file system namespace created for each process of this unit. Other file system namespacing unit + settings (see the discussion in <varname>PrivateMounts=</varname> above) will implicitly disable mount and + unmount propagation from the unit's processes towards the host by changing the propagation setting of all mount + points in the unit's file system namespace to <option>slave</option> first. 
Setting this option to + <option>shared</option> does not reestablish propagation in that case.</para> + + <para>If not set – but file system namespaces are enabled through another file system namespace unit setting – + <option>shared</option> mount propagation is used, but — as mentioned — as <option>slave</option> is applied + first, propagation from the unit's processes to the host is still turned off.</para> + + <para>It is not recommended to use <option>private</option> mount propagation for units, as this means + temporary mounts (such as removable media) of the host will stay mounted and thus indefinitely busy in forked + off processes, as unmount propagation events won't be received by the file system namespace of the unit.</para> + + <para>Usually, it is best to leave this setting unmodified, and use higher level file system namespacing + options instead, in particular <varname>PrivateMounts=</varname>, see above.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>System Call Filtering</title> + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>SystemCallFilter=</varname></term> + + <listitem><para>Takes a space-separated list of system call names. If this setting is used, all + system calls executed by the unit processes except for the listed ones will result in immediate + process termination with the <constant>SIGSYS</constant> signal (allow-listing). (See + <varname>SystemCallErrorNumber=</varname> below for changing the default action). If the first + character of the list is <literal>~</literal>, the effect is inverted: only the listed system calls + will result in immediate process termination (deny-listing). 
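+ As a hedged illustration of the deny-list form (the two sets named here are only an example, not a
+ recommendation made by this page):
+ <programlisting>[Service]
+SystemCallFilter=~@mount @swap</programlisting>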
Deny-listed system calls and system call + groups may optionally be suffixed with a colon (<literal>:</literal>) and <literal>errno</literal> + error number (between 0 and 4095) or errno name such as <constant>EPERM</constant>, + <constant>EACCES</constant> or <constant>EUCLEAN</constant> (see <citerefentry + project='man-pages'><refentrytitle>errno</refentrytitle><manvolnum>3</manvolnum></citerefentry> for a + full list). This value will be returned when a deny-listed system call is triggered, instead of + terminating the processes immediately. Special setting <literal>kill</literal> can be used to + explicitly specify killing. This value takes precedence over the one given in + <varname>SystemCallErrorNumber=</varname>, see below. If running in user mode, or in system mode, + but without the <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting + <varname>User=nobody</varname>), <varname>NoNewPrivileges=yes</varname> is implied. This feature + makes use of the Secure Computing Mode 2 interfaces of the kernel ('seccomp filtering') and is useful + for enforcing a minimal sandboxing environment. Note that the <function>execve()</function>, + <function>exit()</function>, <function>exit_group()</function>, <function>getrlimit()</function>, + <function>rt_sigreturn()</function>, <function>sigreturn()</function> system calls and the system calls + for querying time and sleeping are implicitly allow-listed and do not need to be listed + explicitly. This option may be specified more than once, in which case the filter masks are + merged. If the empty string is assigned, the filter is reset, all prior assignments will have no + effect. This does not affect commands prefixed with <literal>+</literal>.</para> + + <para>Note that on systems supporting multiple ABIs (such as x86/x86-64) it is recommended to turn off + alternative ABIs for services, so that they cannot be used to circumvent the restrictions of this + option. 
Specifically, it is recommended to combine this option with + <varname>SystemCallArchitectures=native</varname> or similar.</para> + + <para>Note that strict system call filters may impact execution and error handling code paths of the service + invocation. Specifically, access to the <function>execve()</function> system call is required for the execution + of the service binary; if it is blocked, service invocation will necessarily fail. Also, if execution of the + service binary fails for some reason (for example: missing service executable), the error handling logic might + require access to an additional set of system calls in order to process and log this failure correctly. It + might be necessary to temporarily disable system call filters in order to simplify debugging of such + failures.</para> + + <para>If you specify both types of this option (i.e. allow-listing and deny-listing), the first + encountered will take precedence and will dictate the default action (termination or approval of a + system call). Then the next occurrences of this option will add or delete the listed system calls + from the set of the filtered system calls, depending on their type and the default action. (For + example, if you have started with an allow list rule for <function>read()</function> and + <function>write()</function>, and right after it add a deny list rule for <function>write()</function>, + then <function>write()</function> will be removed from the set.)</para> + + <para>As the number of possible system calls is large, predefined sets of system calls are provided. A set + starts with the <literal>@</literal> character, followed by the name of the set. 
+ + <table> + <title>Currently predefined system call sets</title> + + <tgroup cols='2'> + <colspec colname='set' /> + <colspec colname='description' /> + <thead> + <row> + <entry>Set</entry> + <entry>Description</entry> + </row> + </thead> + <tbody> + <row> + <entry>@aio</entry> + <entry>Asynchronous I/O (<citerefentry project='man-pages'><refentrytitle>io_setup</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>io_submit</refentrytitle><manvolnum>2</manvolnum></citerefentry>, and related calls)</entry> + </row> + <row> + <entry>@basic-io</entry> + <entry>System calls for basic I/O: reading, writing, seeking, file descriptor duplication and closing (<citerefentry project='man-pages'><refentrytitle>read</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>write</refentrytitle><manvolnum>2</manvolnum></citerefentry>, and related calls)</entry> + </row> + <row> + <entry>@chown</entry> + <entry>Changing file ownership (<citerefentry project='man-pages'><refentrytitle>chown</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>fchownat</refentrytitle><manvolnum>2</manvolnum></citerefentry>, and related calls)</entry> + </row> + <row> + <entry>@clock</entry> + <entry>System calls for changing the system clock (<citerefentry project='man-pages'><refentrytitle>adjtimex</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>settimeofday</refentrytitle><manvolnum>2</manvolnum></citerefentry>, and related calls)</entry> + </row> + <row> + <entry>@cpu-emulation</entry> + <entry>System calls for CPU emulation functionality (<citerefentry project='man-pages'><refentrytitle>vm86</refentrytitle><manvolnum>2</manvolnum></citerefentry> and related calls)</entry> + </row> + <row> + <entry>@debug</entry> + <entry>Debugging, performance monitoring and tracing functionality 
(<citerefentry project='man-pages'><refentrytitle>ptrace</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>perf_event_open</refentrytitle><manvolnum>2</manvolnum></citerefentry> and related calls)</entry> + </row> + <row> + <entry>@file-system</entry> + <entry>File system operations: opening, creating files and directories for read and write, renaming and removing them, reading file properties, or creating hard and symbolic links</entry> + </row> + <row> + <entry>@io-event</entry> + <entry>Event loop system calls (<citerefentry project='man-pages'><refentrytitle>poll</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>select</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>epoll</refentrytitle><manvolnum>7</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>eventfd</refentrytitle><manvolnum>2</manvolnum></citerefentry> and related calls)</entry> + </row> + <row> + <entry>@ipc</entry> + <entry>Pipes, SysV IPC, POSIX Message Queues and other IPC (<citerefentry project='man-pages'><refentrytitle>mq_overview</refentrytitle><manvolnum>7</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>svipc</refentrytitle><manvolnum>7</manvolnum></citerefentry>)</entry> + </row> + <row> + <entry>@keyring</entry> + <entry>Kernel keyring access (<citerefentry project='man-pages'><refentrytitle>keyctl</refentrytitle><manvolnum>2</manvolnum></citerefentry> and related calls)</entry> + </row> + <row> + <entry>@memlock</entry> + <entry>Locking of memory in RAM (<citerefentry project='man-pages'><refentrytitle>mlock</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>mlockall</refentrytitle><manvolnum>2</manvolnum></citerefentry> and related calls)</entry> + </row> + <row> + <entry>@module</entry> + <entry>Loading and unloading 
of kernel modules (<citerefentry project='man-pages'><refentrytitle>init_module</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>delete_module</refentrytitle><manvolnum>2</manvolnum></citerefentry> and related calls)</entry> + </row> + <row> + <entry>@mount</entry> + <entry>Mounting and unmounting of file systems (<citerefentry project='man-pages'><refentrytitle>mount</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>chroot</refentrytitle><manvolnum>2</manvolnum></citerefentry>, and related calls)</entry> + </row> + <row> + <entry>@network-io</entry> + <entry>Socket I/O (including local AF_UNIX): <citerefentry project='man-pages'><refentrytitle>socket</refentrytitle><manvolnum>7</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>unix</refentrytitle><manvolnum>7</manvolnum></citerefentry></entry> + </row> + <row> + <entry>@obsolete</entry> + <entry>Unusual, obsolete or unimplemented (<citerefentry project='man-pages'><refentrytitle>create_module</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>gtty</refentrytitle><manvolnum>2</manvolnum></citerefentry>, …)</entry> + </row> + <row> + <entry>@privileged</entry> + <entry>All system calls which need super-user capabilities (<citerefentry project='man-pages'><refentrytitle>capabilities</refentrytitle><manvolnum>7</manvolnum></citerefentry>)</entry> + </row> + <row> + <entry>@process</entry> + <entry>Process control, execution, namespacing operations (<citerefentry project='man-pages'><refentrytitle>clone</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>kill</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>namespaces</refentrytitle><manvolnum>7</manvolnum></citerefentry>, …)</entry> + </row> + <row> + 
<entry>@raw-io</entry> + <entry>Raw I/O port access (<citerefentry project='man-pages'><refentrytitle>ioperm</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>iopl</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <function>pciconfig_read()</function>, …)</entry> + </row> + <row> + <entry>@reboot</entry> + <entry>System calls for rebooting and reboot preparation (<citerefentry project='man-pages'><refentrytitle>reboot</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <function>kexec()</function>, …)</entry> + </row> + <row> + <entry>@resources</entry> + <entry>System calls for changing resource limits, memory and scheduling parameters (<citerefentry project='man-pages'><refentrytitle>setrlimit</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>setpriority</refentrytitle><manvolnum>2</manvolnum></citerefentry>, …)</entry> + </row> + <row> + <entry>@setuid</entry> + <entry>System calls for changing user ID and group ID credentials, (<citerefentry project='man-pages'><refentrytitle>setuid</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>setgid</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>setresuid</refentrytitle><manvolnum>2</manvolnum></citerefentry>, …)</entry> + </row> + <row> + <entry>@signal</entry> + <entry>System calls for manipulating and handling process signals (<citerefentry project='man-pages'><refentrytitle>signal</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>sigprocmask</refentrytitle><manvolnum>2</manvolnum></citerefentry>, …)</entry> + </row> + <row> + <entry>@swap</entry> + <entry>System calls for enabling/disabling swap devices (<citerefentry project='man-pages'><refentrytitle>swapon</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry 
project='man-pages'><refentrytitle>swapoff</refentrytitle><manvolnum>2</manvolnum></citerefentry>)</entry> + </row> + <row> + <entry>@sync</entry> + <entry>Synchronizing files and memory to disk (<citerefentry project='man-pages'><refentrytitle>fsync</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>msync</refentrytitle><manvolnum>2</manvolnum></citerefentry>, and related calls)</entry> + </row> + <row> + <entry>@system-service</entry> + <entry>A reasonable set of system calls used by common system services, excluding any special purpose calls. This is the recommended starting point for allow-listing system calls for system services, as it contains what is typically needed by system services, but excludes overly specific interfaces. For example, the following APIs are excluded: <literal>@clock</literal>, <literal>@mount</literal>, <literal>@swap</literal>, <literal>@reboot</literal>.</entry> + </row> + <row> + <entry>@timer</entry> + <entry>System calls for scheduling operations by time (<citerefentry project='man-pages'><refentrytitle>alarm</refentrytitle><manvolnum>2</manvolnum></citerefentry>, <citerefentry project='man-pages'><refentrytitle>timer_create</refentrytitle><manvolnum>2</manvolnum></citerefentry>, …)</entry> + </row> + <row> + <entry>@known</entry> + <entry>All system calls defined by the kernel. This list is defined statically in systemd based on a kernel version that was available when this systemd version was released. It will become progressively more out-of-date as the kernel is updated.</entry> + </row> + </tbody> + </tgroup> + </table> + + Note, that as new system calls are added to the kernel, additional system calls might be added to the groups + above. Contents of the sets may also change between systemd versions. In addition, the list of system calls + depends on the kernel version and architecture for which systemd was compiled. 
Use + <command>systemd-analyze syscall-filter</command> to list the system calls actually contained in each + filter.</para> + + <para>Generally, allow-listing system calls (rather than deny-listing) is the safer mode of + operation. It is recommended to enforce system call allow lists for all long-running system + services. Specifically, the following lines are a relatively safe basic choice for the majority of + system services:</para> + + <programlisting>[Service] +SystemCallFilter=@system-service +SystemCallErrorNumber=EPERM</programlisting> + + <para>Note that various kernel system calls are defined redundantly: there are multiple system calls + for executing the same operation. For example, the <function>pidfd_send_signal()</function> system + call may be used to execute operations similar to what can be done with the older + <function>kill()</function> system call, hence blocking the latter without the former only provides + weak protection. Since new system calls are added regularly to the kernel as development progresses, + keeping system call deny lists comprehensive requires constant work. It is thus recommended to use + allow-listing instead, which offers the benefit that new system calls are by default implicitly + blocked until the allow list is updated.</para> + + <para>Also note that a number of system calls are required to be accessible for the dynamic linker to + work. The dynamic linker is required for running most regular programs (specifically: all dynamic ELF + binaries, which is how most distributions build packaged programs). 
This means that blocking these + system calls (which include <function>open()</function>, <function>openat()</function> or + <function>mmap()</function>) will make most programs typically shipped with generic distributions + unusable.</para> + + <para>It is recommended to combine the file system namespacing related options with + <varname>SystemCallFilter=~@mount</varname>, in order to prohibit the unit's processes to undo the + mappings. Specifically these are the options <varname>PrivateTmp=</varname>, + <varname>PrivateDevices=</varname>, <varname>ProtectSystem=</varname>, <varname>ProtectHome=</varname>, + <varname>ProtectKernelTunables=</varname>, <varname>ProtectControlGroups=</varname>, + <varname>ProtectKernelLogs=</varname>, <varname>ProtectClock=</varname>, <varname>ReadOnlyPaths=</varname>, + <varname>InaccessiblePaths=</varname> and <varname>ReadWritePaths=</varname>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>SystemCallErrorNumber=</varname></term> + + <listitem><para>Takes an <literal>errno</literal> error number (between 1 and 4095) or errno name + such as <constant>EPERM</constant>, <constant>EACCES</constant> or <constant>EUCLEAN</constant>, to + return when the system call filter configured with <varname>SystemCallFilter=</varname> is triggered, + instead of terminating the process immediately. See <citerefentry + project='man-pages'><refentrytitle>errno</refentrytitle><manvolnum>3</manvolnum></citerefentry> for a + full list of error codes. When this setting is not used, or when the empty string or the special + setting <literal>kill</literal> is assigned, the process will be terminated immediately when the + filter is triggered.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>SystemCallArchitectures=</varname></term> + + <listitem><para>Takes a space-separated list of architecture identifiers to include in the system call + filter. 
The known architecture identifiers are the same as for <varname>ConditionArchitecture=</varname> + described in <citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + as well as <constant>x32</constant>, <constant>mips64-n32</constant>, <constant>mips64-le-n32</constant>, and + the special identifier <constant>native</constant>. The special identifier <constant>native</constant> + implicitly maps to the native architecture of the system (or more precisely: to the architecture the system + manager is compiled for). If running in user mode, or in system mode, but without the + <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting <varname>User=nobody</varname>), + <varname>NoNewPrivileges=yes</varname> is implied. By default, this option is set to the empty list, i.e. no + filtering is applied.</para> + + <para>If this setting is used, processes of this unit will only be permitted to call native system calls, and + system calls of the specified architectures. For the purposes of this option, the x32 architecture is treated + as including x86-64 system calls. However, this setting still fulfills its purpose, as explained below, on + x32.</para> + + <para>System call filtering is not equally effective on all architectures. For example, on x86 + filtering of network socket-related calls is not possible, due to ABI limitations — a limitation that x86-64 + does not have, however. On systems supporting multiple ABIs at the same time — such as x86/x86-64 — it is hence + recommended to limit the set of permitted system call architectures so that secondary ABIs may not be used to + circumvent the restrictions applied to the native ABI of the system. 
In particular, setting + <varname>SystemCallArchitectures=native</varname> is a good choice for disabling non-native ABIs.</para> + + <para>System call architectures may also be restricted system-wide via the + <varname>SystemCallArchitectures=</varname> option in the global configuration. See + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> for + details.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>SystemCallLog=</varname></term> + + <listitem><para>Takes a space-separated list of system call names. If this setting is used, each + invocation of a listed system call by the unit's processes will be logged. If the first + character of the list is <literal>~</literal>, the effect is inverted: all system calls except the + listed system calls will be logged. If running in user mode, or in system mode, but without the + <constant>CAP_SYS_ADMIN</constant> capability (e.g. setting <varname>User=nobody</varname>), + <varname>NoNewPrivileges=yes</varname> is implied. This feature makes use of the Secure Computing + Mode 2 interfaces of the kernel ('seccomp filtering') and is useful for auditing or setting up a + minimal sandboxing environment. This option may be specified more than once, in which case the filter + masks are merged. If the empty string is assigned, the filter is reset and all prior assignments + have no effect. This does not affect commands prefixed with <literal>+</literal>.</para></listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>Environment</title> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>Environment=</varname></term> + + <listitem><para>Sets environment variables for executed processes. Takes a space-separated list of + variable assignments. This option may be specified more than once, in which case all listed variables + will be set. 
If the same variable is set twice, the later setting will override the earlier + setting. If the empty string is assigned to this option, the list of environment variables is reset, + all prior assignments have no effect. Variable expansion is not performed inside the strings, + however, specifier expansion is possible. The <literal>$</literal> character has no special + meaning. If you need to assign a value containing spaces or the equals sign to a variable, use double + quotes (") for the assignment.</para> + + <para>The names of the variables can contain ASCII letters, digits, and the underscore + character. Variable names cannot be empty or start with a digit. In variable values, most characters + are allowed, but non-printable characters are currently rejected.</para> + + <para>Example: + <programlisting>Environment="VAR1=word1 word2" VAR2=word3 "VAR3=$word 5 6"</programlisting> + gives three variables <literal>VAR1</literal>, + <literal>VAR2</literal>, <literal>VAR3</literal> + with the values <literal>word1 word2</literal>, + <literal>word3</literal>, <literal>$word 5 6</literal>. + </para> + + <para> + See <citerefentry + project='man-pages'><refentrytitle>environ</refentrytitle><manvolnum>7</manvolnum></citerefentry> for details + about environment variables.</para> + + <para>Note that environment variables are not suitable for passing secrets (such as passwords, key + material, …) to service processes. Environment variables set for a unit are exposed to unprivileged + clients via D-Bus IPC, and generally not understood as being data that requires protection. Moreover, + environment variables are propagated down the process tree, including across security boundaries + (such as setuid/setgid executables), and hence might leak to processes that should not have access to + the secret data. 
Use <varname>LoadCredential=</varname> (see below) to pass data to unit processes + securely.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>EnvironmentFile=</varname></term> + + <listitem><para>Similar to <varname>Environment=</varname> but reads the environment variables from a text + file. The text file should contain newline-separated variable assignments. Empty lines, lines without an + <literal>=</literal> separator, or lines starting with ; or # will be ignored, which may be used for + commenting. A line ending with a backslash will be concatenated with the following one, allowing multiline + variable definitions. The parser strips leading and trailing whitespace from the values of assignments, unless + you use double quotes (").</para> + + <para><ulink url="https://en.wikipedia.org/wiki/Escape_sequences_in_C#Table_of_escape_sequences">C escapes</ulink> + are supported, but not + <ulink url="https://en.wikipedia.org/wiki/Control_character#In_ASCII">most control characters</ulink>. + <literal>\t</literal> and <literal>\n</literal> can be used to insert tabs and newlines within + <varname>EnvironmentFile=</varname>.</para> + + <para>The argument passed should be an absolute filename or wildcard expression, optionally prefixed with + <literal>-</literal>, which indicates that if the file does not exist, it will not be read and no error or + warning message is logged. This option may be specified more than once, in which case all specified files are + read. If the empty string is assigned to this option, the list of files to read is reset and all prior assignments + have no effect.</para> + + <para>The files listed with this directive will be read shortly before the process is executed (more + specifically, after all processes from a previous unit state have terminated. This means you can generate these + files in one unit state, and read them with this option in the next. 
The files are read from the file + system of the service manager, before any file system changes like bind mounts take place).</para> + + <para>Settings from these files override settings made with <varname>Environment=</varname>. If the same + variable is set twice from these files, the files will be read in the order they are specified and the later + setting will override the earlier setting.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>PassEnvironment=</varname></term> + + <listitem><para>Pass environment variables set for the system service manager to executed processes. Takes a + space-separated list of variable names. This option may be specified more than once, in which case all listed + variables will be passed. If the empty string is assigned to this option, the list of environment variables to + pass is reset, all prior assignments have no effect. Variables specified that are not set for the system + manager will not be passed and will be silently ignored. Note that this option is only relevant for the system + service manager, as system services by default do not automatically inherit any environment variables set for + the service manager itself. However, in case of the user service manager all environment variables are passed + to the executed processes anyway, hence this option is without effect for the user service manager.</para> + + <para>Variables set for invoked processes due to this setting are subject to being overridden by those + configured with <varname>Environment=</varname> or <varname>EnvironmentFile=</varname>.</para> + + <para><ulink url="https://en.wikipedia.org/wiki/Escape_sequences_in_C#Table_of_escape_sequences">C escapes</ulink> + are supported, but not + <ulink url="https://en.wikipedia.org/wiki/Control_character#In_ASCII">most control characters</ulink>. 
+ <literal>\t</literal> and <literal>\n</literal> can be used to insert tabs and newlines within + <varname>EnvironmentFile=</varname>.</para> + + <para>Example: + <programlisting>PassEnvironment=VAR1 VAR2 VAR3</programlisting> + passes three variables <literal>VAR1</literal>, + <literal>VAR2</literal>, <literal>VAR3</literal> + with the values set for those variables in PID1.</para> + + <para> + See <citerefentry + project='man-pages'><refentrytitle>environ</refentrytitle><manvolnum>7</manvolnum></citerefentry> for details + about environment variables.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>UnsetEnvironment=</varname></term> + + <listitem><para>Explicitly unset environment variable assignments that would normally be passed from the + service manager to invoked processes of this unit. Takes a space-separated list of variable names or variable + assignments. This option may be specified more than once, in which case all listed variables/assignments will + be unset. If the empty string is assigned to this option, the list of environment variables/assignments to + unset is reset. If a variable assignment is specified (that is: a variable name, followed by + <literal>=</literal>, followed by its value), then any environment variable matching this precise assignment is + removed. If a variable name is specified (that is a variable name without any following <literal>=</literal> or + value), then any assignment matching the variable name, regardless of its value is removed. Note that the + effect of <varname>UnsetEnvironment=</varname> is applied as final step when the environment list passed to + executed processes is compiled. 
That means it may undo assignments from any configuration source, including + assignments made through <varname>Environment=</varname> or <varname>EnvironmentFile=</varname>, inherited from + the system manager's global set of environment variables, inherited via <varname>PassEnvironment=</varname>, + set by the service manager itself (such as <varname>$NOTIFY_SOCKET</varname> and such), or set by a PAM module + (in case <varname>PAMName=</varname> is used).</para> + + <para>See "Environment Variables in Spawned Processes" below for a description of how those + settings combine to form the inherited environment. See <citerefentry + project='man-pages'><refentrytitle>environ</refentrytitle><manvolnum>7</manvolnum></citerefentry> for general + information about environment variables.</para></listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>Logging and Standard Input/Output</title> + + <variablelist class='unit-directives'> + <varlistentry> + + <term><varname>StandardInput=</varname></term> + + <listitem><para>Controls where file descriptor 0 (STDIN) of the executed processes is connected to. Takes one + of <option>null</option>, <option>tty</option>, <option>tty-force</option>, <option>tty-fail</option>, + <option>data</option>, <option>file:<replaceable>path</replaceable></option>, <option>socket</option> or + <option>fd:<replaceable>name</replaceable></option>.</para> + + <para>If <option>null</option> is selected, standard input will be connected to <filename>/dev/null</filename>, + i.e. all read attempts by the process will result in immediate EOF.</para> + + <para>If <option>tty</option> is selected, standard input is connected to a TTY (as configured by + <varname>TTYPath=</varname>, see below) and the executed process becomes the controlling process of the + terminal. 
If the terminal is already being controlled by another process, the executed process waits until the + current controlling process releases the terminal.</para> + + <para><option>tty-force</option> is similar to <option>tty</option>, but the executed process is forcefully and + immediately made the controlling process of the terminal, potentially removing previous controlling processes + from the terminal.</para> + + <para><option>tty-fail</option> is similar to <option>tty</option>, but if the terminal already has a + controlling process start-up of the executed process fails.</para> + + <para>The <option>data</option> option may be used to configure arbitrary textual or binary data to pass via + standard input to the executed process. The data to pass is configured via + <varname>StandardInputText=</varname>/<varname>StandardInputData=</varname> (see below). Note that the actual + file descriptor type passed (memory file, regular file, UNIX pipe, …) might depend on the kernel and available + privileges. In any case, the file descriptor is read-only, and when read returns the specified data followed by + EOF.</para> + + <para>The <option>file:<replaceable>path</replaceable></option> option may be used to connect a specific file + system object to standard input. An absolute path following the <literal>:</literal> character is expected, + which may refer to a regular file, a FIFO or special file. If an <constant>AF_UNIX</constant> socket in the + file system is specified, a stream socket is connected to it. The latter is useful for connecting standard + input of processes to arbitrary system services.</para> + + <para>The <option>socket</option> option is valid in socket-activated services only, and requires the relevant + socket unit file (see + <citerefentry><refentrytitle>systemd.socket</refentrytitle><manvolnum>5</manvolnum></citerefentry> for details) + to have <varname>Accept=yes</varname> set, or to specify a single socket only. 
If this option is set, standard + input will be connected to the socket the service was activated from, which is primarily useful for + compatibility with daemons designed for use with the traditional <citerefentry + project='freebsd'><refentrytitle>inetd</refentrytitle><manvolnum>8</manvolnum></citerefentry> socket activation + daemon.</para> + + <para>The <option>fd:<replaceable>name</replaceable></option> option connects standard input to a specific, + named file descriptor provided by a socket unit. The name may be specified as part of this option, following a + <literal>:</literal> character (e.g. <literal>fd:foobar</literal>). If no name is specified, the name + <literal>stdin</literal> is implied (i.e. <literal>fd</literal> is equivalent to <literal>fd:stdin</literal>). + At least one socket unit defining the specified name must be provided via the <varname>Sockets=</varname> + option, and the file descriptor name may differ from the name of its containing socket unit. If multiple + matches are found, the first one will be used. See <varname>FileDescriptorName=</varname> in + <citerefentry><refentrytitle>systemd.socket</refentrytitle><manvolnum>5</manvolnum></citerefentry> for more + details about named file descriptors and their ordering.</para> + + <para>This setting defaults to <option>null</option>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>StandardOutput=</varname></term> + + <listitem><para>Controls where file descriptor 1 (stdout) of the executed processes is connected + to. 
Takes one of <option>inherit</option>, <option>null</option>, <option>tty</option>, + <option>journal</option>, <option>kmsg</option>, <option>journal+console</option>, + <option>kmsg+console</option>, <option>file:<replaceable>path</replaceable></option>, + <option>append:<replaceable>path</replaceable></option>, <option>socket</option> or + <option>fd:<replaceable>name</replaceable></option>.</para> + + <para><option>inherit</option> duplicates the file descriptor of standard input for standard output.</para> + + <para><option>null</option> connects standard output to <filename>/dev/null</filename>, i.e. everything written + to it will be lost.</para> + + <para><option>tty</option> connects standard output to a tty (as configured via <varname>TTYPath=</varname>, + see below). If the TTY is used for output only, the executed process will not become the controlling process of + the terminal, and will not fail or wait for other processes to release the terminal.</para> + + <para><option>journal</option> connects standard output with the journal, which is accessible via + <citerefentry><refentrytitle>journalctl</refentrytitle><manvolnum>1</manvolnum></citerefentry>. Note + that everything that is written to kmsg (see below) is implicitly stored in the journal as well, the + specific option listed below is hence a superset of this one. (Also note that any external, + additional syslog daemons receive their log data from the journal, too, hence this is the option to + use when logging shall be processed with such a daemon.)</para> + + <para><option>kmsg</option> connects standard output with the kernel log buffer which is accessible via + <citerefentry project='man-pages'><refentrytitle>dmesg</refentrytitle><manvolnum>1</manvolnum></citerefentry>, + in addition to the journal. 
The journal daemon might be configured to send all logs to kmsg anyway, in which + case this option is no different from <option>journal</option>.</para> + + <para><option>journal+console</option> and <option>kmsg+console</option> work in a similar way as the + two options above but copy the output to the system console as well.</para> + + <para>The <option>file:<replaceable>path</replaceable></option> option may be used to connect a specific file + system object to standard output. The semantics are similar to the same option of + <varname>StandardInput=</varname>, see above. If <replaceable>path</replaceable> refers to a regular file + on the filesystem, it is opened (created if it doesn't exist yet) for writing at the beginning of the file, + but without truncating it. + If standard input and output are directed to the same file path, it is opened only once, for reading as well + as writing and duplicated. This is particularly useful when the specified path refers to an + <constant>AF_UNIX</constant> socket in the file system, as in that case only a + single stream connection is created for both input and output.</para> + + <para><option>append:<replaceable>path</replaceable></option> is similar to + <option>file:<replaceable>path</replaceable></option> above, but it opens the file in append mode. + </para> + + <para><option>socket</option> connects standard output to a socket acquired via socket activation. The + semantics are similar to the same option of <varname>StandardInput=</varname>, see above.</para> + + <para>The <option>fd:<replaceable>name</replaceable></option> option connects standard output to a specific, + named file descriptor provided by a socket unit. A name may be specified as part of this option, following a + <literal>:</literal> character (e.g. <literal>fd:foobar</literal>). If no name is specified, the name + <literal>stdout</literal> is implied (i.e. <literal>fd</literal> is equivalent to + <literal>fd:stdout</literal>). 
At least one socket unit defining the specified name must be provided via the + <varname>Sockets=</varname> option, and the file descriptor name may differ from the name of its containing + socket unit. If multiple matches are found, the first one will be used. See + <varname>FileDescriptorName=</varname> in + <citerefentry><refentrytitle>systemd.socket</refentrytitle><manvolnum>5</manvolnum></citerefentry> for more + details about named descriptors and their ordering.</para> + + <para>If the standard output (or error output, see below) of a unit is connected to the journal or + the kernel log buffer, the unit will implicitly gain a dependency of type <varname>After=</varname> + on <filename>systemd-journald.socket</filename> (also see the "Implicit Dependencies" section + above). Also note that in this case stdout (or stderr, see below) will be an + <constant>AF_UNIX</constant> stream socket, and not a pipe or FIFO that can be re-opened. This means + when executing shell scripts the construct <command>echo "hello" > /dev/stderr</command> for + writing text to stderr will not work. To mitigate this use the construct <command>echo "hello" + >&2</command> instead, which is mostly equivalent and avoids this pitfall.</para> + + <para>This setting defaults to the value set with <varname>DefaultStandardOutput=</varname> in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>, which + defaults to <option>journal</option>. Note that setting this parameter might result in additional dependencies + to be added to the unit (see above).</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>StandardError=</varname></term> + + <listitem><para>Controls where file descriptor 2 (stderr) of the executed processes is connected to. 
The + available options are identical to those of <varname>StandardOutput=</varname>, with some exceptions: if set to + <option>inherit</option> the file descriptor used for standard output is duplicated for standard error, while + <option>fd:<replaceable>name</replaceable></option> will use a default file descriptor name of + <literal>stderr</literal>.</para> + + <para>This setting defaults to the value set with <varname>DefaultStandardError=</varname> in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>, which + defaults to <option>inherit</option>. Note that setting this parameter might result in additional dependencies + to be added to the unit (see above).</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>StandardInputText=</varname></term> + <term><varname>StandardInputData=</varname></term> + + <listitem><para>Configures arbitrary textual or binary data to pass via file descriptor 0 (STDIN) to the + executed processes. These settings have no effect unless <varname>StandardInput=</varname> is set to + <option>data</option>. Use this option to embed process input data directly in the unit file.</para> + + <para><varname>StandardInputText=</varname> accepts arbitrary textual data. C-style escapes for special + characters as well as the usual <literal>%</literal>-specifiers are resolved. Each time this setting is used + the specified text is appended to the per-unit data buffer, followed by a newline character (thus every use + appends a new line to the end of the buffer). Note that leading and trailing whitespace of lines configured + with this option is removed. 
If an empty line is specified the buffer is cleared (hence, in order to insert an + empty line, add an additional <literal>\n</literal> to the end or beginning of a line).</para> + + <para><varname>StandardInputData=</varname> accepts arbitrary binary data, encoded in <ulink + url="https://tools.ietf.org/html/rfc2045#section-6.8">Base64</ulink>. No escape sequences or specifiers are + resolved. Any whitespace in the encoded version is ignored during decoding.</para> + + <para>Note that <varname>StandardInputText=</varname> and <varname>StandardInputData=</varname> operate on the + same data buffer, and may be mixed in order to configure both binary and textual data for the same input + stream. The textual or binary data is joined strictly in the order the settings appear in the unit + file. Assigning an empty string to either will reset the data buffer.</para> + + <para>Please keep in mind that in order to maintain readability long unit file settings may be split into + multiple lines, by suffixing each line (except for the last) with a <literal>\</literal> character (see + <citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry> for + details). This is particularly useful for large data configured with these two options. Example:</para> + + <programlisting>… +StandardInput=data +StandardInputData=SWNrIHNpdHplIGRhIHVuJyBlc3NlIEtsb3BzLAp1ZmYgZWVtYWwga2xvcHAncy4KSWNrIGtpZWtl \ + LCBzdGF1bmUsIHd1bmRyZSBtaXIsCnVmZiBlZW1hbCBqZWh0IHNlIHVmZiBkaWUgVMO8ci4KTmFu \ + dSwgZGVuayBpY2ssIGljayBkZW5rIG5hbnUhCkpldHogaXNzZSB1ZmYsIGVyc2NodCB3YXIgc2Ug \ + enUhCkljayBqZWhlIHJhdXMgdW5kIGJsaWNrZSDigJQKdW5kIHdlciBzdGVodCBkcmF1w59lbj8g \ + SWNrZSEK +…</programlisting></listitem> + </varlistentry> + + <varlistentry> + <term><varname>LogLevelMax=</varname></term> + + <listitem><para>Configures filtering by log level of log messages generated by this unit. 
Takes a + <command>syslog</command> log level, one of <option>emerg</option> (lowest log level, only highest priority + messages), <option>alert</option>, <option>crit</option>, <option>err</option>, <option>warning</option>, + <option>notice</option>, <option>info</option>, <option>debug</option> (highest log level, also lowest priority + messages). See <citerefentry + project='man-pages'><refentrytitle>syslog</refentrytitle><manvolnum>3</manvolnum></citerefentry> for + details. By default no filtering is applied (i.e. the default maximum log level is <option>debug</option>). Use + this option to configure the logging system to drop log messages of a specific service above the specified + level. For example, set <varname>LogLevelMax=</varname><option>info</option> in order to turn off debug logging + of a particularly chatty unit. Note that the configured level is applied to any log messages written by any + of the processes belonging to this unit, sent via any supported logging protocol. The filtering is applied + early in the logging pipeline, before any kind of further processing is done. Moreover, messages which pass + through this filter successfully might still be dropped by filters applied at a later stage in the logging + subsystem. For example, <varname>MaxLevelStore=</varname> configured in + <citerefentry><refentrytitle>journald.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> might + prohibit messages of higher log levels to be stored on disk, even though the per-unit + <varname>LogLevelMax=</varname> permitted it to be processed.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>LogExtraFields=</varname></term> + + <listitem><para>Configures additional log metadata fields to include in all log records generated by + processes associated with this unit. This setting takes one or more journal field assignments in the + format <literal>FIELD=VALUE</literal> separated by whitespace. 
See + <citerefentry><refentrytitle>systemd.journal-fields</refentrytitle><manvolnum>7</manvolnum></citerefentry> + for details on the journal field concept. Even though the underlying journal implementation permits + binary field values, this setting accepts only valid UTF-8 values. To include space characters in a + journal field value, enclose the assignment in double quotes ("). <!-- " fake closing quote for emacs--> + The usual specifiers are expanded in all assignments (see below). Note that this setting is not only + useful for attaching additional metadata to log records of a unit, but given that all fields and + values are indexed may also be used to implement cross-unit log record matching. Assign an empty + string to reset the list.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>LogRateLimitIntervalSec=</varname></term> + <term><varname>LogRateLimitBurst=</varname></term> + + <listitem><para>Configures the rate limiting that is applied to messages generated by this unit. If, in the + time interval defined by <varname>LogRateLimitIntervalSec=</varname>, more messages than specified in + <varname>LogRateLimitBurst=</varname> are logged by a service, all further messages within the interval are + dropped until the interval is over. A message about the number of dropped messages is generated. The time + specification for <varname>LogRateLimitIntervalSec=</varname> may be specified in the following units: "s", + "min", "h", "ms", "us" (see + <citerefentry><refentrytitle>systemd.time</refentrytitle><manvolnum>7</manvolnum></citerefentry> for details). + The default settings are set by <varname>RateLimitIntervalSec=</varname> and <varname>RateLimitBurst=</varname> + configured in <citerefentry><refentrytitle>journald.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>. 
+ </para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>LogNamespace=</varname></term> + + <listitem><para>Run the unit's processes in the specified journal namespace. Expects a short + user-defined string identifying the namespace. If not used the processes of the service are run in + the default journal namespace, i.e. their log stream is collected and processed by + <filename>systemd-journald.service</filename>. If this option is used any log data generated by + processes of this unit (regardless if via the <function>syslog()</function>, journal native logging + or stdout/stderr logging) is collected and processed by an instance of the + <filename>systemd-journald@.service</filename> template unit, which manages the specified + namespace. The log data is stored in a data store independent from the default log namespace's data + store. See + <citerefentry><refentrytitle>systemd-journald.service</refentrytitle><manvolnum>8</manvolnum></citerefentry> + for details about journal namespaces.</para> + + <para>Internally, journal namespaces are implemented through Linux mount namespacing and + over-mounting the directory that contains the relevant <constant>AF_UNIX</constant> sockets used for + logging in the unit's mount namespace. Since mount namespaces are used this setting disconnects + propagation of mounts from the unit's processes to the host, similar to how + <varname>ReadOnlyPaths=</varname> and similar settings (see above) work. Journal namespaces may hence + not be used for services that need to establish mount points on the host.</para> + + <para>When this option is used the unit will automatically gain ordering and requirement dependencies + on the two socket units associated with the <filename>systemd-journald@.service</filename> instance + so that they are automatically established prior to the unit starting up. 
Note that when this option + is used log output of this service does not appear in the regular + <citerefentry><refentrytitle>journalctl</refentrytitle><manvolnum>1</manvolnum></citerefentry> + output, unless the <option>--namespace=</option> option is used.</para> + + <xi:include href="system-only.xml" xpointer="singular"/></listitem> + </varlistentry> + + <varlistentry> + <term><varname>SyslogIdentifier=</varname></term> + + <listitem><para>Sets the process name ("<command>syslog</command> tag") to prefix log lines sent to + the logging system or the kernel log buffer with. If not set, defaults to the process name of the + executed process. This option is only useful when <varname>StandardOutput=</varname> or + <varname>StandardError=</varname> are set to <option>journal</option> or <option>kmsg</option> (or to + the same settings in combination with <option>+console</option>) and only applies to log messages + written to stdout or stderr.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>SyslogFacility=</varname></term> + + <listitem><para>Sets the <command>syslog</command> facility identifier to use when logging. One of + <option>kern</option>, <option>user</option>, <option>mail</option>, <option>daemon</option>, + <option>auth</option>, <option>syslog</option>, <option>lpr</option>, <option>news</option>, + <option>uucp</option>, <option>cron</option>, <option>authpriv</option>, <option>ftp</option>, + <option>local0</option>, <option>local1</option>, <option>local2</option>, <option>local3</option>, + <option>local4</option>, <option>local5</option>, <option>local6</option> or + <option>local7</option>. See <citerefentry + project='man-pages'><refentrytitle>syslog</refentrytitle><manvolnum>3</manvolnum></citerefentry> for + details. 
This option is only useful when <varname>StandardOutput=</varname> or + <varname>StandardError=</varname> are set to <option>journal</option> or <option>kmsg</option> (or to + the same settings in combination with <option>+console</option>), and only applies to log messages + written to stdout or stderr. Defaults to <option>daemon</option>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>SyslogLevel=</varname></term> + + <listitem><para>The default <command>syslog</command> log level to use when logging to the logging system or + the kernel log buffer. One of <option>emerg</option>, <option>alert</option>, <option>crit</option>, + <option>err</option>, <option>warning</option>, <option>notice</option>, <option>info</option>, + <option>debug</option>. See <citerefentry + project='man-pages'><refentrytitle>syslog</refentrytitle><manvolnum>3</manvolnum></citerefentry> for + details. This option is only useful when <varname>StandardOutput=</varname> or + <varname>StandardError=</varname> are set to <option>journal</option> or + <option>kmsg</option> (or to the same settings in combination with <option>+console</option>), and only applies + to log messages written to stdout or stderr. Note that individual lines output by executed processes may be + prefixed with a different log level which can be used to override the default log level specified here. The + interpretation of these prefixes may be disabled with <varname>SyslogLevelPrefix=</varname>, see below. For + details, see <citerefentry><refentrytitle>sd-daemon</refentrytitle><manvolnum>3</manvolnum></citerefentry>. + Defaults to <option>info</option>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>SyslogLevelPrefix=</varname></term> + + <listitem><para>Takes a boolean argument. 
If true and <varname>StandardOutput=</varname> or + <varname>StandardError=</varname> are set to <option>journal</option> or <option>kmsg</option> (or to + the same settings in combination with <option>+console</option>), log lines written by the executed + process that are prefixed with a log level will be processed with this log level set but the prefix + removed. If set to false, the interpretation of these prefixes is disabled and the logged lines are + passed on as-is. This only applies to log messages written to stdout or stderr. For details about + this prefixing see + <citerefentry><refentrytitle>sd-daemon</refentrytitle><manvolnum>3</manvolnum></citerefentry>. + Defaults to true.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>TTYPath=</varname></term> + + <listitem><para>Sets the terminal device node to use if standard input, output, or error are connected to a TTY + (see above). Defaults to <filename>/dev/console</filename>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>TTYReset=</varname></term> + + <listitem><para>Reset the terminal device specified with <varname>TTYPath=</varname> before and after + execution. Defaults to <literal>no</literal>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>TTYVHangup=</varname></term> + + <listitem><para>Disconnect all clients which have opened the terminal device specified with + <varname>TTYPath=</varname> before and after execution. Defaults to <literal>no</literal>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>TTYVTDisallocate=</varname></term> + + <listitem><para>If the terminal device specified with <varname>TTYPath=</varname> is a virtual console + terminal, try to deallocate the TTY before and after execution. This ensures that the screen and scrollback + buffer is cleared. 
Defaults to <literal>no</literal>.</para></listitem> + </varlistentry> + </variablelist> + </refsect1> + + <refsect1> + <title>Credentials</title> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>LoadCredential=</varname><replaceable>ID</replaceable>:<replaceable>PATH</replaceable></term> + + <listitem><para>Pass a credential to the unit. Credentials are limited-size binary or textual objects + that may be passed to unit processes. They are primarily used for passing cryptographic keys (both + public and private) or certificates, user account information or identity information from host to + services. The data is accessible from the unit's processes via the file system, at a read-only + location that (if possible and permitted) is backed by non-swappable memory. The data is only + accessible to the user associated with the unit, via the + <varname>User=</varname>/<varname>DynamicUser=</varname> settings (as well as the superuser). When + available, the location of credentials is exported as the <varname>$CREDENTIALS_DIRECTORY</varname> + environment variable to the unit's processes.</para> + + <para>The <varname>LoadCredential=</varname> setting takes a textual ID to use as name for a + credential plus a file system path. The ID must be a short ASCII string suitable as filename in the + filesystem, and may be chosen freely by the user. If the specified path is absolute it is opened as + regular file and the credential data is read from it. If the absolute path refers to an + <constant>AF_UNIX</constant> stream socket in the file system a connection is made to it (only once + at unit start-up) and the credential data read from the connection, providing an easy IPC integration + point for dynamically providing credentials from other services. 
If the specified path is not + absolute and itself qualifies as valid credential identifier it is understood to refer to a + credential that the service manager itself received via the <varname>$CREDENTIALS_DIRECTORY</varname> + environment variable, which may be used to propagate credentials from an invoking environment (e.g. a + container manager that invoked the service manager) into a service. The contents of the file/socket + may be arbitrary binary or textual data, including newline characters and <constant>NUL</constant> + bytes. This option may be used multiple times, each time defining an additional credential to pass to + the unit.</para> + + <para>The credential files/IPC sockets must be accessible to the service manager, but don't have to + be directly accessible to the unit's processes: the credential data is read and copied into separate, + read-only copies for the unit that are accessible to appropriately privileged processes. This is + particularly useful in combination with <varname>DynamicUser=</varname> as this way privileged data + can be made available to processes running under a dynamic UID (i.e. not a previously known one) + without having to open up access to all users.</para> + + <para>In order to reference the path a credential may be read from within a + <varname>ExecStart=</varname> command line use <literal>${CREDENTIALS_DIRECTORY}/mycred</literal>, + e.g. <literal>ExecStart=cat ${CREDENTIALS_DIRECTORY}/mycred</literal>.</para> + + <para>Currently, an accumulated credential size limit of 1M bytes per unit is + enforced.</para> + + <para>If referencing an <constant>AF_UNIX</constant> stream socket to connect to, the connection will + originate from an abstract namespace socket, that includes information about the unit and the + credential ID in its socket name. Use <citerefentry + project='man-pages'><refentrytitle>getpeername</refentrytitle><manvolnum>2</manvolnum></citerefentry> + to query this information. 
The returned socket name is formatted as <constant>NUL</constant> + <replaceable>RANDOM</replaceable> <literal>/unit/</literal> <replaceable>UNIT</replaceable> + <literal>/</literal> <replaceable>ID</replaceable>, i.e. a <constant>NUL</constant> byte (as required + for abstract namespace socket names), followed by a random string (consisting of alphadecimal + characters), followed by the literal string <literal>/unit/</literal>, followed by the requesting + unit name, followed by the literal character <literal>/</literal>, followed by the textual credential + ID requested. Example: <literal>\0adf9d86b6eda275e/unit/foobar.service/credx</literal> in case the + credential <literal>credx</literal> is requested for a unit <literal>foobar.service</literal>. This + functionality is useful for using a single listening socket to serve credentials to multiple + consumers.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>SetCredential=</varname><replaceable>ID</replaceable>:<replaceable>VALUE</replaceable></term> + + <listitem><para>The <varname>SetCredential=</varname> setting is similar to + <varname>LoadCredential=</varname> but accepts a literal value to use as data for the credential, + instead of a file system path to read the data from. Do not use this option for data that is supposed + to be secret, as it is accessible to unprivileged processes via IPC. It's only safe to use this for + user IDs, public key material and similar non-sensitive data. For everything else use + <varname>LoadCredential=</varname>. In order to embed binary data into the credential data use + C-style escaping (i.e. <literal>\n</literal> to embed a newline, or <literal>\x00</literal> to embed + a <constant>NUL</constant> byte).</para> + + <para>If a credential of the same ID is listed in both <varname>LoadCredential=</varname> and + <varname>SetCredential=</varname>, the latter will act as default if the former cannot be + retrieved. 
In this case not being able to retrieve the credential from the path specified in + <varname>LoadCredential=</varname> is not considered fatal.</para></listitem> + </varlistentry> + </variablelist> + </refsect1> + + <refsect1> + <title>System V Compatibility</title> + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>UtmpIdentifier=</varname></term> + + <listitem><para>Takes a four character identifier string for an <citerefentry + project='man-pages'><refentrytitle>utmp</refentrytitle><manvolnum>5</manvolnum></citerefentry> and wtmp entry + for this service. This should only be set for services such as <command>getty</command> implementations (such + as <citerefentry + project='die-net'><refentrytitle>agetty</refentrytitle><manvolnum>8</manvolnum></citerefentry>) where utmp/wtmp + entries must be created and cleared before and after execution, or for services that shall be executed as if + they were run by a <command>getty</command> process (see below). If the configured string is longer than four + characters, it is truncated and the terminal four characters are used. This setting interprets %I style string + replacements. This setting is unset by default, i.e. no utmp/wtmp entries are created or cleaned up for this + service.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>UtmpMode=</varname></term> + + <listitem><para>Takes one of <literal>init</literal>, <literal>login</literal> or <literal>user</literal>. If + <varname>UtmpIdentifier=</varname> is set, controls which type of <citerefentry + project='man-pages'><refentrytitle>utmp</refentrytitle><manvolnum>5</manvolnum></citerefentry>/wtmp entries + for this service are generated. This setting has no effect unless <varname>UtmpIdentifier=</varname> is set + too. If <literal>init</literal> is set, only an <constant>INIT_PROCESS</constant> entry is generated and the + invoked process must implement a <command>getty</command>-compatible utmp/wtmp logic. 
If + <literal>login</literal> is set, first an <constant>INIT_PROCESS</constant> entry, followed by a + <constant>LOGIN_PROCESS</constant> entry is generated. In this case, the invoked process must implement a + <citerefentry + project='die-net'><refentrytitle>login</refentrytitle><manvolnum>1</manvolnum></citerefentry>-compatible + utmp/wtmp logic. If <literal>user</literal> is set, first an <constant>INIT_PROCESS</constant> entry, then a + <constant>LOGIN_PROCESS</constant> entry and finally a <constant>USER_PROCESS</constant> entry is + generated. In this case, the invoked process may be any process that is suitable to be run as session + leader. Defaults to <literal>init</literal>.</para></listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>Environment Variables in Spawned Processes</title> + + <para>Processes started by the service manager are executed with an environment variable block assembled from + multiple sources. Processes started by the system service manager generally do not inherit environment variables + set for the service manager itself (but this may be altered via <varname>PassEnvironment=</varname>), but processes + started by the user service manager instances generally do inherit all environment variables set for the service + manager itself.</para> + + <para>For each invoked process the list of environment variables set is compiled from the following sources:</para> + + <itemizedlist> + <listitem><para>Variables globally configured for the service manager, using the + <varname>DefaultEnvironment=</varname> setting in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + the kernel command line option <varname>systemd.setenv=</varname> understood by + <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>, or via + <citerefentry><refentrytitle>systemctl</refentrytitle><manvolnum>1</manvolnum></citerefentry> + 
<command>set-environment</command> verb.</para></listitem> + + <listitem><para>Variables defined by the service manager itself (see the list below).</para></listitem> + + <listitem><para>Variables set in the service manager's own environment variable block (subject to + <varname>PassEnvironment=</varname> for the system service manager).</para></listitem> + + <listitem><para>Variables set via <varname>Environment=</varname> in the unit file.</para></listitem> + + <listitem><para>Variables read from files specified via <varname>EnvironmentFile=</varname> in the unit + file.</para></listitem> + + <listitem><para>Variables set by any PAM modules in case <varname>PAMName=</varname> is in effect, + cf. <citerefentry + project='man-pages'><refentrytitle>pam_env</refentrytitle><manvolnum>8</manvolnum></citerefentry>. + </para></listitem> + </itemizedlist> + + <para>If the same environment variable is set by multiple of these sources, the later source — according + to the order of the list above — wins. Note that as the final step all variables listed in + <varname>UnsetEnvironment=</varname> are removed from the compiled environment variable list, immediately + before it is passed to the executed process.</para> + + <para>The general philosophy is to expose a small curated list of environment variables to processes. + Services started by the system manager (PID 1) will be started, without additional service-specific + configuration, with just a few environment variables. The user manager inherits environment variables as + any other system service, but in addition may receive additional environment variables from PAM, and, + typically, additional imported variables when the user starts a graphical session. 
It is recommended to + keep the environment blocks in both the system and user managers lean.</para> + + <para>Hint: <command>systemd-run -P env</command> and <command>systemd-run --user -P env</command> print + the effective system and user service environment blocks.</para> + + <refsect2> + <title>Environment Variables Set or Propagated by the Service Manager</title> + + <para>The following environment variables are propagated by the service manager or generated internally + for each invoked process:</para> + + <variablelist class='environment-variables'> + <varlistentry> + <term><varname>$PATH</varname></term> + + <listitem><para>Colon-separated list of directories to use when launching + executables. <command>systemd</command> uses a fixed value of + <literal><filename>/usr/local/sbin</filename>:<filename>/usr/local/bin</filename>:<filename>/usr/sbin</filename>:<filename>/usr/bin</filename></literal> + in the system manager. When compiled for systems with "unmerged <filename>/usr/</filename>" + (<filename>/bin</filename> is not a symlink to <filename>/usr/bin</filename>), + <literal>:<filename>/sbin</filename>:<filename>/bin</filename></literal> is appended. In case of + the user manager, a different path may be configured by the distribution. It is recommended to + not rely on the order of entries, and have only one program with a given name in + <varname>$PATH</varname>.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$LANG</varname></term> + + <listitem><para>Locale. Can be set in <citerefentry + project='man-pages'><refentrytitle>locale.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry> + or on the kernel command line (see + <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry> and + <citerefentry><refentrytitle>kernel-command-line</refentrytitle><manvolnum>7</manvolnum></citerefentry>).
+ </para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$USER</varname></term> + <term><varname>$LOGNAME</varname></term> + <term><varname>$HOME</varname></term> + <term><varname>$SHELL</varname></term> + + <listitem><para>User name (twice), home directory, and the + login shell. The variables are set for the units that have + <varname>User=</varname> set, which includes user + <command>systemd</command> instances. See + <citerefentry project='die-net'><refentrytitle>passwd</refentrytitle><manvolnum>5</manvolnum></citerefentry>. + </para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$INVOCATION_ID</varname></term> + + <listitem><para>Contains a randomized, unique 128bit ID identifying each runtime cycle of the unit, formatted + as 32 character hexadecimal string. A new ID is assigned each time the unit changes from an inactive state into + an activating or active state, and may be used to identify this specific runtime cycle, in particular in data + stored offline, such as the journal. The same ID is passed to all processes run as part of the + unit.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$XDG_RUNTIME_DIR</varname></term> + + <listitem><para>The directory to use for runtime objects (such as IPC objects) and volatile state. Set for all + services run by the user <command>systemd</command> instance, as well as any system services that use + <varname>PAMName=</varname> with a PAM stack that includes <command>pam_systemd</command>. 
See below and + <citerefentry><refentrytitle>pam_systemd</refentrytitle><manvolnum>8</manvolnum></citerefentry> for more + information.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$RUNTIME_DIRECTORY</varname></term> + <term><varname>$STATE_DIRECTORY</varname></term> + <term><varname>$CACHE_DIRECTORY</varname></term> + <term><varname>$LOGS_DIRECTORY</varname></term> + <term><varname>$CONFIGURATION_DIRECTORY</varname></term> + + <listitem><para>Absolute paths to the directories defined with + <varname>RuntimeDirectory=</varname>, <varname>StateDirectory=</varname>, + <varname>CacheDirectory=</varname>, <varname>LogsDirectory=</varname>, and + <varname>ConfigurationDirectory=</varname> when those settings are used.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>$CREDENTIALS_DIRECTORY</varname></term> + + <listitem><para>An absolute path to the per-unit directory with credentials configured via + <varname>LoadCredential=</varname>/<varname>SetCredential=</varname>. The directory is marked + read-only and is placed in unswappable memory (if supported and permitted), and is only accessible to + the UID associated with the unit via <varname>User=</varname> or <varname>DynamicUser=</varname> (and + the superuser).</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$MAINPID</varname></term> + + <listitem><para>The PID of the unit's main process if it is + known. This is only set for control processes as invoked by + <varname>ExecReload=</varname> and similar. </para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$MANAGERPID</varname></term> + + <listitem><para>The PID of the user <command>systemd</command> + instance, set for processes spawned by it. 
</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$LISTEN_FDS</varname></term> + <term><varname>$LISTEN_PID</varname></term> + <term><varname>$LISTEN_FDNAMES</varname></term> + + <listitem><para>Information about file descriptors passed to a + service for socket activation. See + <citerefentry><refentrytitle>sd_listen_fds</refentrytitle><manvolnum>3</manvolnum></citerefentry>. + </para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$NOTIFY_SOCKET</varname></term> + + <listitem><para>The socket + <function>sd_notify()</function> talks to. See + <citerefentry><refentrytitle>sd_notify</refentrytitle><manvolnum>3</manvolnum></citerefentry>. + </para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$WATCHDOG_PID</varname></term> + <term><varname>$WATCHDOG_USEC</varname></term> + + <listitem><para>Information about watchdog keep-alive notifications. See + <citerefentry><refentrytitle>sd_watchdog_enabled</refentrytitle><manvolnum>3</manvolnum></citerefentry>. + </para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$TERM</varname></term> + + <listitem><para>Terminal type, set only for units connected to + a terminal (<varname>StandardInput=tty</varname>, + <varname>StandardOutput=tty</varname>, or + <varname>StandardError=tty</varname>). See + <citerefentry project='man-pages'><refentrytitle>termcap</refentrytitle><manvolnum>5</manvolnum></citerefentry>. 
+ </para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$LOG_NAMESPACE</varname></term> + + <listitem><para>Contains the name of the selected logging namespace when the + <varname>LogNamespace=</varname> service setting is used.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$JOURNAL_STREAM</varname></term> + + <listitem><para>If the standard output or standard error output of the executed processes are connected to the + journal (for example, by setting <varname>StandardError=journal</varname>) <varname>$JOURNAL_STREAM</varname> + contains the device and inode numbers of the connection file descriptor, formatted in decimal, separated by a + colon (<literal>:</literal>). This permits invoked processes to safely detect whether their standard output or + standard error output are connected to the journal. The device and inode numbers of the file descriptors should + be compared with the values set in the environment variable to determine whether the process output is still + connected to the journal. Note that it is generally not sufficient to only check whether + <varname>$JOURNAL_STREAM</varname> is set at all as services might invoke external processes replacing their + standard output or standard error output, without unsetting the environment variable.</para> + + <para>If both standard output and standard error of the executed processes are connected to the journal via a + stream socket, this environment variable will contain information about the standard error stream, as that's + usually the preferred destination for log data. 
(Note that typically the same stream is used for both standard + output and standard error, hence very likely the environment variable contains device and inode information + matching both stream file descriptors.)</para> + + <para>This environment variable is primarily useful to allow services to optionally upgrade their used log + protocol to the native journal protocol (using + <citerefentry><refentrytitle>sd_journal_print</refentrytitle><manvolnum>3</manvolnum></citerefentry> and other + functions) if their standard output or standard error output is connected to the journal anyway, thus enabling + delivery of structured metadata along with logged messages.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$SERVICE_RESULT</varname></term> + + <listitem><para>Only defined for the service unit type, this environment variable is passed to all + <varname>ExecStop=</varname> and <varname>ExecStopPost=</varname> processes, and encodes the service + "result". Currently, the following values are defined:</para> + + <table> + <title>Defined <varname>$SERVICE_RESULT</varname> values</title> + <tgroup cols='2'> + <colspec colname='result'/> + <colspec colname='meaning'/> + <thead> + <row> + <entry>Value</entry> + <entry>Meaning</entry> + </row> + </thead> + + <tbody> + <row> + <entry><literal>success</literal></entry> + <entry>The service ran successfully and exited cleanly.</entry> + </row> + <row> + <entry><literal>protocol</literal></entry> + <entry>A protocol violation occurred: the service did not take the steps required by its unit configuration (specifically what is configured in its <varname>Type=</varname> setting).</entry> + </row> + <row> + <entry><literal>timeout</literal></entry> + <entry>One of the steps timed out.</entry> + </row> + <row> + <entry><literal>exit-code</literal></entry> + <entry>Service process exited with a non-zero exit code; see <varname>$EXIT_CODE</varname> below for the actual exit code returned.</entry> + </row> + 
<row> + <entry><literal>signal</literal></entry> + <entry>A service process was terminated abnormally by a signal, without dumping core. See <varname>$EXIT_CODE</varname> below for the actual signal causing the termination.</entry> + </row> + <row> + <entry><literal>core-dump</literal></entry> + <entry>A service process terminated abnormally with a signal and dumped core. See <varname>$EXIT_CODE</varname> below for the signal causing the termination.</entry> + </row> + <row> + <entry><literal>watchdog</literal></entry> + <entry>Watchdog keep-alive ping was enabled for the service, but the deadline was missed.</entry> + </row> + <row> + <entry><literal>start-limit-hit</literal></entry> + <entry>A start limit was defined for the unit and it was hit, causing the unit to fail to start. See <citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry>'s <varname>StartLimitIntervalSec=</varname> and <varname>StartLimitBurst=</varname> for details.</entry> + </row> + <row> + <entry><literal>resources</literal></entry> + <entry>A catch-all condition in case a system operation failed.</entry> + </row> + </tbody> + </tgroup> + </table> + + <para>This environment variable is useful to monitor failure or successful termination of a service. 
Even + though this variable is available in both <varname>ExecStop=</varname> and <varname>ExecStopPost=</varname>, it + is usually a better choice to place monitoring tools in the latter, as the former is only invoked for services + that managed to start up correctly, and the latter covers both services that failed during their start-up and + those which failed during their runtime.</para></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$EXIT_CODE</varname></term> + <term><varname>$EXIT_STATUS</varname></term> + + <listitem><para>Only defined for the service unit type, these environment variables are passed to all + <varname>ExecStop=</varname> and <varname>ExecStopPost=</varname> processes and contain exit status/code + information of the main process of the service. For the precise definition of the exit code and status, see + <citerefentry><refentrytitle>wait</refentrytitle><manvolnum>2</manvolnum></citerefentry>. <varname>$EXIT_CODE</varname> + is one of <literal>exited</literal>, <literal>killed</literal>, or + <literal>dumped</literal>. <varname>$EXIT_STATUS</varname> contains the numeric exit code formatted as a string + if <varname>$EXIT_CODE</varname> is <literal>exited</literal>, and the signal name in all other cases. 
Note + that these environment variables are only set if the service manager succeeded in starting and identifying the main + process of the service.</para> + + <table> + <title>Summary of possible service result variable values</title> + <tgroup cols='3'> + <colspec colname='result' /> + <colspec colname='code' /> + <colspec colname='status' /> + <thead> + <row> + <entry><varname>$SERVICE_RESULT</varname></entry> + <entry><varname>$EXIT_CODE</varname></entry> + <entry><varname>$EXIT_STATUS</varname></entry> + </row> + </thead> + + <tbody> + <row> + <entry morerows="1" valign="top"><literal>success</literal></entry> + <entry valign="top"><literal>killed</literal></entry> + <entry><literal>HUP</literal>, <literal>INT</literal>, <literal>TERM</literal>, <literal>PIPE</literal></entry> + </row> + <row> + <entry valign="top"><literal>exited</literal></entry> + <entry><literal>0</literal></entry> + </row> + <row> + <entry morerows="1" valign="top"><literal>protocol</literal></entry> + <entry valign="top">not set</entry> + <entry>not set</entry> + </row> + <row> + <entry><literal>exited</literal></entry> + <entry><literal>0</literal></entry> + </row> + <row> + <entry morerows="1" valign="top"><literal>timeout</literal></entry> + <entry valign="top"><literal>killed</literal></entry> + <entry><literal>TERM</literal>, <literal>KILL</literal></entry> + </row> + <row> + <entry valign="top"><literal>exited</literal></entry> + <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal>, <literal + >3</literal>, …, <literal>255</literal></entry> + </row> + <row> + <entry valign="top"><literal>exit-code</literal></entry> + <entry valign="top"><literal>exited</literal></entry> + <entry><literal>1</literal>, <literal>2</literal>, <literal + >3</literal>, …, <literal>255</literal></entry> + </row> + <row> + <entry valign="top"><literal>signal</literal></entry> + <entry valign="top"><literal>killed</literal></entry> + <entry><literal>HUP</literal>, <literal>INT</literal>, 
<literal>KILL</literal>, …</entry> + </row> + <row> + <entry valign="top"><literal>core-dump</literal></entry> + <entry valign="top"><literal>dumped</literal></entry> + <entry><literal>ABRT</literal>, <literal>SEGV</literal>, <literal>QUIT</literal>, …</entry> + </row> + <row> + <entry morerows="2" valign="top"><literal>watchdog</literal></entry> + <entry><literal>dumped</literal></entry> + <entry><literal>ABRT</literal></entry> + </row> + <row> + <entry><literal>killed</literal></entry> + <entry><literal>TERM</literal>, <literal>KILL</literal></entry> + </row> + <row> + <entry><literal>exited</literal></entry> + <entry><literal>0</literal>, <literal>1</literal>, <literal>2</literal>, <literal + >3</literal>, …, <literal>255</literal></entry> + </row> + <row> + <entry valign="top"><literal>exec-condition</literal></entry> + <entry><literal>exited</literal></entry> + <entry><literal>1</literal>, <literal>2</literal>, <literal>3</literal>, <literal + >4</literal>, …, <literal>254</literal></entry> + </row> + <row> + <entry valign="top"><literal>oom-kill</literal></entry> + <entry valign="top"><literal>killed</literal></entry> + <entry><literal>TERM</literal>, <literal>KILL</literal></entry> + </row> + <row> + <entry><literal>start-limit-hit</literal></entry> + <entry>not set</entry> + <entry>not set</entry> + </row> + <row> + <entry><literal>resources</literal></entry> + <entry>any of the above</entry> + <entry>any of the above</entry> + </row> + <row> + <entry namest="result" nameend="status">Note: the process may also be terminated by a signal not sent by systemd. In particular, the process may send an arbitrary signal to itself in a handler for any of the non-maskable signals. Nevertheless, in the <literal>timeout</literal> and <literal>watchdog</literal> rows above only the signals that systemd sends have been included. 
Moreover, using <varname>SuccessExitStatus=</varname>, additional exit statuses may be declared to indicate clean termination, which is not reflected by this table.</entry> + </row> + </tbody> + </tgroup> + </table></listitem> + </varlistentry> + + <varlistentry> + <term><varname>$PIDFILE</varname></term> + + <listitem><para>The path to the configured PID file, in case the process is forked off on behalf of + a service that uses the <varname>PIDFile=</varname> setting; see + <citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry> + for details. Service code may use this environment variable to automatically generate a PID file at + the location configured in the unit file. This variable is set to an absolute path in the file + system.</para></listitem> + </varlistentry> + + </variablelist> + + <para>For system services, when <varname>PAMName=</varname> is enabled and <command>pam_systemd</command> is part + of the selected PAM stack, additional environment variables defined by systemd may be set for + services. Specifically, these are <varname>$XDG_SEAT</varname> and <varname>$XDG_VTNR</varname>; see + <citerefentry><refentrytitle>pam_systemd</refentrytitle><manvolnum>8</manvolnum></citerefentry> for details.</para> + </refsect2> + + </refsect1> + + <refsect1> + <title>Process Exit Codes</title> + + <para>When invoking a unit process, the service manager may fail to apply the execution parameters configured + with the settings above. In that case, the already created service process will exit with a non-zero exit code + before the configured command line is executed. 
(Or in other words, the child process possibly exits with these + error codes, after having been created by the <citerefentry + project='man-pages'><refentrytitle>fork</refentrytitle><manvolnum>2</manvolnum></citerefentry> system call, but + before the matching <citerefentry + project='man-pages'><refentrytitle>execve</refentrytitle><manvolnum>2</manvolnum></citerefentry> system call is + called.) Specifically, exit codes defined by the C library, by the LSB specification and by the systemd service + manager itself are used.</para> + + <para>The following basic service exit codes are defined by the C library.</para> + + <table> + <title>Basic C library exit codes</title> + <tgroup cols='3'> + <thead> + <row> + <entry>Exit Code</entry> + <entry>Symbolic Name</entry> + <entry>Description</entry> + </row> + </thead> + <tbody> + <row> + <entry>0</entry> + <entry><constant>EXIT_SUCCESS</constant></entry> + <entry>Generic success code.</entry> + </row> + <row> + <entry>1</entry> + <entry><constant>EXIT_FAILURE</constant></entry> + <entry>Generic failure or unspecified error.</entry> + </row> + </tbody> + </tgroup> + </table> + + <para>The following service exit codes are defined by the <ulink + url="https://refspecs.linuxbase.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html">LSB specification</ulink>. 
+ </para> + + <table> + <title>LSB service exit codes</title> + <tgroup cols='3'> + <thead> + <row> + <entry>Exit Code</entry> + <entry>Symbolic Name</entry> + <entry>Description</entry> + </row> + </thead> + <tbody> + <row> + <entry>2</entry> + <entry><constant>EXIT_INVALIDARGUMENT</constant></entry> + <entry>Invalid or excess arguments.</entry> + </row> + <row> + <entry>3</entry> + <entry><constant>EXIT_NOTIMPLEMENTED</constant></entry> + <entry>Unimplemented feature.</entry> + </row> + <row> + <entry>4</entry> + <entry><constant>EXIT_NOPERMISSION</constant></entry> + <entry>The user has insufficient privileges.</entry> + </row> + <row> + <entry>5</entry> + <entry><constant>EXIT_NOTINSTALLED</constant></entry> + <entry>The program is not installed.</entry> + </row> + <row> + <entry>6</entry> + <entry><constant>EXIT_NOTCONFIGURED</constant></entry> + <entry>The program is not configured.</entry> + </row> + <row> + <entry>7</entry> + <entry><constant>EXIT_NOTRUNNING</constant></entry> + <entry>The program is not running.</entry> + </row> + </tbody> + </tgroup> + </table> + + <para> + The LSB specification suggests that error codes 200 and above are reserved for implementations. Some of them are + used by the service manager to indicate problems during process invocation: + </para> + <table> + <title>systemd-specific exit codes</title> + <tgroup cols='3'> + <thead> + <row> + <entry>Exit Code</entry> + <entry>Symbolic Name</entry> + <entry>Description</entry> + </row> + </thead> + <tbody> + <row> + <entry>200</entry> + <entry><constant>EXIT_CHDIR</constant></entry> + <entry>Changing to the requested working directory failed. See <varname>WorkingDirectory=</varname> above.</entry> + </row> + <row> + <entry>201</entry> + <entry><constant>EXIT_NICE</constant></entry> + <entry>Failed to set up process scheduling priority (nice level). 
See <varname>Nice=</varname> above.</entry> + </row> + <row> + <entry>202</entry> + <entry><constant>EXIT_FDS</constant></entry> + <entry>Failed to close unwanted file descriptors, or to adjust passed file descriptors.</entry> + </row> + <row> + <entry>203</entry> + <entry><constant>EXIT_EXEC</constant></entry> + <entry>The actual process execution failed (specifically, the <citerefentry project='man-pages'><refentrytitle>execve</refentrytitle><manvolnum>2</manvolnum></citerefentry> system call). Most likely this is caused by a missing or non-accessible executable file.</entry> + </row> + <row> + <entry>204</entry> + <entry><constant>EXIT_MEMORY</constant></entry> + <entry>Failed to perform an action due to memory shortage.</entry> + </row> + <row> + <entry>205</entry> + <entry><constant>EXIT_LIMITS</constant></entry> + <entry>Failed to adjust resource limits. See <varname>LimitCPU=</varname> and related settings above.</entry> + </row> + <row> + <entry>206</entry> + <entry><constant>EXIT_OOM_ADJUST</constant></entry> + <entry>Failed to adjust the OOM setting. See <varname>OOMScoreAdjust=</varname> above.</entry> + </row> + <row> + <entry>207</entry> + <entry><constant>EXIT_SIGNAL_MASK</constant></entry> + <entry>Failed to set process signal mask.</entry> + </row> + <row> + <entry>208</entry> + <entry><constant>EXIT_STDIN</constant></entry> + <entry>Failed to set up standard input. See <varname>StandardInput=</varname> above.</entry> + </row> + <row> + <entry>209</entry> + <entry><constant>EXIT_STDOUT</constant></entry> + <entry>Failed to set up standard output. See <varname>StandardOutput=</varname> above.</entry> + </row> + <row> + <entry>210</entry> + <entry><constant>EXIT_CHROOT</constant></entry> + <entry>Failed to change root directory (<citerefentry project='man-pages'><refentrytitle>chroot</refentrytitle><manvolnum>2</manvolnum></citerefentry>). 
See <varname>RootDirectory=</varname>/<varname>RootImage=</varname> above.</entry> + </row> + <row> + <entry>211</entry> + <entry><constant>EXIT_IOPRIO</constant></entry> + <entry>Failed to set up IO scheduling priority. See <varname>IOSchedulingClass=</varname>/<varname>IOSchedulingPriority=</varname> above.</entry> + </row> + <row> + <entry>212</entry> + <entry><constant>EXIT_TIMERSLACK</constant></entry> + <entry>Failed to set up timer slack. See <varname>TimerSlackNSec=</varname> above.</entry> + </row> + <row> + <entry>213</entry> + <entry><constant>EXIT_SECUREBITS</constant></entry> + <entry>Failed to set process secure bits. See <varname>SecureBits=</varname> above.</entry> + </row> + <row> + <entry>214</entry> + <entry><constant>EXIT_SETSCHEDULER</constant></entry> + <entry>Failed to set up CPU scheduling. See <varname>CPUSchedulingPolicy=</varname>/<varname>CPUSchedulingPriority=</varname> above.</entry> + </row> + <row> + <entry>215</entry> + <entry><constant>EXIT_CPUAFFINITY</constant></entry> + <entry>Failed to set up CPU affinity. See <varname>CPUAffinity=</varname> above.</entry> + </row> + <row> + <entry>216</entry> + <entry><constant>EXIT_GROUP</constant></entry> + <entry>Failed to determine or change group credentials. See <varname>Group=</varname>/<varname>SupplementaryGroups=</varname> above.</entry> + </row> + <row> + <entry>217</entry> + <entry><constant>EXIT_USER</constant></entry> + <entry>Failed to determine or change user credentials, or to set up user namespacing. See <varname>User=</varname>/<varname>PrivateUsers=</varname> above.</entry> + </row> + <row> + <entry>218</entry> + <entry><constant>EXIT_CAPABILITIES</constant></entry> + <entry>Failed to drop capabilities, or apply ambient capabilities. 
See <varname>CapabilityBoundingSet=</varname>/<varname>AmbientCapabilities=</varname> above.</entry> + </row> + <row> + <entry>219</entry> + <entry><constant>EXIT_CGROUP</constant></entry> + <entry>Setting up the service control group failed.</entry> + </row> + <row> + <entry>220</entry> + <entry><constant>EXIT_SETSID</constant></entry> + <entry>Failed to create new process session.</entry> + </row> + <row> + <entry>221</entry> + <entry><constant>EXIT_CONFIRM</constant></entry> + <entry>Execution has been cancelled by the user. See the <varname>systemd.confirm_spawn=</varname> kernel command line setting on <citerefentry><refentrytitle>kernel-command-line</refentrytitle><manvolnum>7</manvolnum></citerefentry> for details.</entry> + </row> + <row> + <entry>222</entry> + <entry><constant>EXIT_STDERR</constant></entry> + <entry>Failed to set up standard error output. See <varname>StandardError=</varname> above.</entry> + </row> + <row> + <entry>224</entry> + <entry><constant>EXIT_PAM</constant></entry> + <entry>Failed to set up PAM session. See <varname>PAMName=</varname> above.</entry> + </row> + <row> + <entry>225</entry> + <entry><constant>EXIT_NETWORK</constant></entry> + <entry>Failed to set up network namespacing. See <varname>PrivateNetwork=</varname> above.</entry> + </row> + <row> + <entry>226</entry> + <entry><constant>EXIT_NAMESPACE</constant></entry> + <entry>Failed to set up mount namespacing. See <varname>ReadOnlyPaths=</varname> and related settings above.</entry> + </row> + <row> + <entry>227</entry> + <entry><constant>EXIT_NO_NEW_PRIVILEGES</constant></entry> + <entry>Failed to disable new privileges. See <varname>NoNewPrivileges=yes</varname> above.</entry> + </row> + <row> + <entry>228</entry> + <entry><constant>EXIT_SECCOMP</constant></entry> + <entry>Failed to apply system call filters. 
See <varname>SystemCallFilter=</varname> and related settings above.</entry> + </row> + <row> + <entry>229</entry> + <entry><constant>EXIT_SELINUX_CONTEXT</constant></entry> + <entry>Determining or changing SELinux context failed. See <varname>SELinuxContext=</varname> above.</entry> + </row> + <row> + <entry>230</entry> + <entry><constant>EXIT_PERSONALITY</constant></entry> + <entry>Failed to set up an execution domain (personality). See <varname>Personality=</varname> above.</entry> + </row> + <row> + <entry>231</entry> + <entry><constant>EXIT_APPARMOR_PROFILE</constant></entry> + <entry>Failed to prepare changing AppArmor profile. See <varname>AppArmorProfile=</varname> above.</entry> + </row> + <row> + <entry>232</entry> + <entry><constant>EXIT_ADDRESS_FAMILIES</constant></entry> + <entry>Failed to restrict address families. See <varname>RestrictAddressFamilies=</varname> above.</entry> + </row> + <row> + <entry>233</entry> + <entry><constant>EXIT_RUNTIME_DIRECTORY</constant></entry> + <entry>Setting up runtime directory failed. See <varname>RuntimeDirectory=</varname> and related settings above.</entry> + </row> + <row> + <entry>235</entry> + <entry><constant>EXIT_CHOWN</constant></entry> + <entry>Failed to adjust socket ownership. Used for socket units only.</entry> + </row> + <row> + <entry>236</entry> + <entry><constant>EXIT_SMACK_PROCESS_LABEL</constant></entry> + <entry>Failed to set SMACK label. See <varname>SmackProcessLabel=</varname> above.</entry> + </row> + <row> + <entry>237</entry> + <entry><constant>EXIT_KEYRING</constant></entry> + <entry>Failed to set up kernel keyring.</entry> + </row> + <row> + <entry>238</entry> + <entry><constant>EXIT_STATE_DIRECTORY</constant></entry> + <entry>Failed to set up unit's state directory. See <varname>StateDirectory=</varname> above.</entry> + </row> + <row> + <entry>239</entry> + <entry><constant>EXIT_CACHE_DIRECTORY</constant></entry> + <entry>Failed to set up unit's cache directory. 
See <varname>CacheDirectory=</varname> above.</entry> + </row> + <row> + <entry>240</entry> + <entry><constant>EXIT_LOGS_DIRECTORY</constant></entry> + <entry>Failed to set up unit's logging directory. See <varname>LogsDirectory=</varname> above.</entry> + </row> + <row> + <entry>241</entry> + <entry><constant>EXIT_CONFIGURATION_DIRECTORY</constant></entry> + <entry>Failed to set up unit's configuration directory. See <varname>ConfigurationDirectory=</varname> above.</entry> + </row> + <row> + <entry>242</entry> + <entry><constant>EXIT_NUMA_POLICY</constant></entry> + <entry>Failed to set up unit's NUMA memory policy. See <varname>NUMAPolicy=</varname> and <varname>NUMAMask=</varname> above.</entry> + </row> + <row> + <entry>243</entry> + <entry><constant>EXIT_CREDENTIALS</constant></entry> + <entry>Failed to set up unit's credentials. See <varname>LoadCredential=</varname> and <varname>SetCredential=</varname> above.</entry> + </row> + </tbody> + </tgroup> + </table> + + <para>Finally, the BSD operating systems define a set of exit codes, typically defined on Linux systems too:</para> + + <table> + <title>BSD exit codes</title> + <tgroup cols='3'> + <thead> + <row> + <entry>Exit Code</entry> + <entry>Symbolic Name</entry> + <entry>Description</entry> + </row> + </thead> + <tbody> + <row> + <entry>64</entry> + <entry><constant>EX_USAGE</constant></entry> + <entry>Command line usage error</entry> + </row> + <row> + <entry>65</entry> + <entry><constant>EX_DATAERR</constant></entry> + <entry>Data format error</entry> + </row> + <row> + <entry>66</entry> + <entry><constant>EX_NOINPUT</constant></entry> + <entry>Cannot open input</entry> + </row> + <row> + <entry>67</entry> + <entry><constant>EX_NOUSER</constant></entry> + <entry>Addressee unknown</entry> + </row> + <row> + <entry>68</entry> + <entry><constant>EX_NOHOST</constant></entry> + <entry>Host name unknown</entry> + </row> + <row> + <entry>69</entry> + <entry><constant>EX_UNAVAILABLE</constant></entry> + 
<entry>Service unavailable</entry> + </row> + <row> + <entry>70</entry> + <entry><constant>EX_SOFTWARE</constant></entry> + <entry>Internal software error</entry> + </row> + <row> + <entry>71</entry> + <entry><constant>EX_OSERR</constant></entry> + <entry>System error (e.g., can't fork)</entry> + </row> + <row> + <entry>72</entry> + <entry><constant>EX_OSFILE</constant></entry> + <entry>Critical OS file missing</entry> + </row> + <row> + <entry>73</entry> + <entry><constant>EX_CANTCREAT</constant></entry> + <entry>Can't create (user) output file</entry> + </row> + <row> + <entry>74</entry> + <entry><constant>EX_IOERR</constant></entry> + <entry>Input/output error</entry> + </row> + <row> + <entry>75</entry> + <entry><constant>EX_TEMPFAIL</constant></entry> + <entry>Temporary failure; user is invited to retry</entry> + </row> + <row> + <entry>76</entry> + <entry><constant>EX_PROTOCOL</constant></entry> + <entry>Remote error in protocol</entry> + </row> + <row> + <entry>77</entry> + <entry><constant>EX_NOPERM</constant></entry> + <entry>Permission denied</entry> + </row> + <row> + <entry>78</entry> + <entry><constant>EX_CONFIG</constant></entry> + <entry>Configuration error</entry> + </row> + </tbody> + </tgroup> + </table> + </refsect1> + + <refsect1> + <title>See Also</title> + <para> + <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemctl</refentrytitle><manvolnum>1</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd-analyze</refentrytitle><manvolnum>1</manvolnum></citerefentry>, + <citerefentry><refentrytitle>journalctl</refentrytitle><manvolnum>1</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + 
<citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.socket</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.swap</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.mount</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.kill</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.resource-control</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.time</refentrytitle><manvolnum>7</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.directives</refentrytitle><manvolnum>7</manvolnum></citerefentry>, + <citerefentry><refentrytitle>tmpfiles.d</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry project='man-pages'><refentrytitle>exec</refentrytitle><manvolnum>3</manvolnum></citerefentry>, + <citerefentry project='man-pages'><refentrytitle>fork</refentrytitle><manvolnum>2</manvolnum></citerefentry> + </para> + </refsect1> + +</refentry> |