author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-06 02:25:50 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-06 02:25:50 +0000 |
commit | 19f4f86bfed21c5326ed2acebe1163f3a83e832b (patch) | |
tree | d59b9989ce55ed23693e80974d94c856f1c2c8b1 /man/systemd.resource-control.xml | |
parent | Initial commit. (diff) | |
Adding upstream version 241. (upstream/241, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'man/systemd.resource-control.xml')
-rw-r--r-- | man/systemd.resource-control.xml | 929 |
1 files changed, 929 insertions, 0 deletions
diff --git a/man/systemd.resource-control.xml b/man/systemd.resource-control.xml new file mode 100644 index 0000000..a4d793c --- /dev/null +++ b/man/systemd.resource-control.xml @@ -0,0 +1,929 @@ +<?xml version='1.0'?> +<!DOCTYPE refentry PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" +"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"> + +<!-- + SPDX-License-Identifier: LGPL-2.1+ +--> + +<refentry id="systemd.resource-control"> + <refentryinfo> + <title>systemd.resource-control</title> + <productname>systemd</productname> + </refentryinfo> + + <refmeta> + <refentrytitle>systemd.resource-control</refentrytitle> + <manvolnum>5</manvolnum> + </refmeta> + + <refnamediv> + <refname>systemd.resource-control</refname> + <refpurpose>Resource control unit settings</refpurpose> + </refnamediv> + + <refsynopsisdiv> + <para> + <filename><replaceable>slice</replaceable>.slice</filename>, + <filename><replaceable>scope</replaceable>.scope</filename>, + <filename><replaceable>service</replaceable>.service</filename>, + <filename><replaceable>socket</replaceable>.socket</filename>, + <filename><replaceable>mount</replaceable>.mount</filename>, + <filename><replaceable>swap</replaceable>.swap</filename> + </para> + </refsynopsisdiv> + + <refsect1> + <title>Description</title> + + <para>Unit configuration files for services, slices, scopes, sockets, mount points, and swap devices share a subset + of configuration options for resource control of spawned processes. Internally, this relies on the Linux Control + Groups (cgroups) kernel concept for organizing processes in a hierarchical tree of named groups for the purpose of + resource management.</para> + + <para>This man page lists the configuration options shared by + those six unit types. 
See + <citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry> + for the common options of all unit configuration files, and + <citerefentry><refentrytitle>systemd.slice</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.scope</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.socket</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.mount</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + and + <citerefentry><refentrytitle>systemd.swap</refentrytitle><manvolnum>5</manvolnum></citerefentry> + for more information on the specific unit configuration files. The + resource control configuration options are configured in the + [Slice], [Scope], [Service], [Socket], [Mount], or [Swap] + sections, depending on the unit type.</para> + + <para>In addition, options which control resources available to programs + <emphasis>executed</emphasis> by systemd are listed in + <citerefentry><refentrytitle>systemd.exec</refentrytitle><manvolnum>5</manvolnum></citerefentry>. + Those options complement options listed here.</para> + + <para>See the <ulink + url="https://www.freedesktop.org/wiki/Software/systemd/ControlGroupInterface/">New + Control Group Interfaces</ulink> for an introduction on how to make + use of resource control APIs from programs.</para> + </refsect1> + + <refsect1> + <title>Implicit Dependencies</title> + + <para>The following dependencies are implicitly added:</para> + + <itemizedlist> + <listitem><para>Units with the <varname>Slice=</varname> setting set automatically acquire + <varname>Requires=</varname> and <varname>After=</varname> dependencies on the specified + slice unit.</para></listitem> + </itemizedlist> + </refsect1> + + <!-- We don't have any default dependency here. 
--> + + <refsect1> + <title>Unified and Legacy Control Group Hierarchies</title> + + <para>The unified control group hierarchy is the new version of the kernel control group interface; see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>. Depending on the resource type, + there are differences in resource control capabilities. Also, because of interface changes, some resource types + have a separate set of options on the unified hierarchy.</para> + + <para> + <variablelist> + + <varlistentry> + <term><option>CPU</option></term> + <listitem> + <para><varname>CPUWeight=</varname> and <varname>StartupCPUWeight=</varname> replace + <varname>CPUShares=</varname> and <varname>StartupCPUShares=</varname>, respectively.</para> + + <para>The <literal>cpuacct</literal> controller does not exist separately on the unified hierarchy.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>Memory</option></term> + <listitem> + <para><varname>MemoryMax=</varname> replaces <varname>MemoryLimit=</varname>. <varname>MemoryLow=</varname> + and <varname>MemoryHigh=</varname> are effective only on the unified hierarchy.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>IO</option></term> + <listitem> + <para><varname>IO</varname> prefixed settings are a superset of and replace <varname>BlockIO</varname> + prefixed ones. On the unified hierarchy, IO resource control also applies to buffered writes.</para> + </listitem> + </varlistentry> + + </variablelist> + </para> + + <para>To ease the transition, there is a best-effort translation between the two versions of settings. For each + controller, if any of the settings for the unified hierarchy are present, all settings for the legacy hierarchy are + ignored. 
If the resulting settings are for the other type of hierarchy, the configurations are translated before + application.</para> + + <para>The legacy control group hierarchy (see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt">cgroups.txt</ulink>), also called cgroup-v1, + doesn't allow safe delegation of controllers to unprivileged processes. If the system uses the legacy control group + hierarchy, resource control is disabled for the systemd user instance; see + <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>. + </para> + </refsect1> + + <refsect1> + <title>Options</title> + + <para>Units of the types listed above can have settings + for resource control configuration:</para> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>CPUAccounting=</varname></term> + + <listitem> + <para>Turn on CPU usage accounting for this unit. Takes a + boolean argument. Note that turning on CPU accounting for + one unit will also implicitly turn it on for all units + contained in the same slice and for all its parent slices + and the units contained therein. The system default for this + setting may be controlled with + <varname>DefaultCPUAccounting=</varname> in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>CPUWeight=<replaceable>weight</replaceable></varname></term> + <term><varname>StartupCPUWeight=<replaceable>weight</replaceable></varname></term> + + <listitem> + <para>Assign the specified CPU time weight to the processes executed, if the unified control group hierarchy + is used on the system. These options take an integer value and control the <literal>cpu.weight</literal> + control group attribute. The allowed range is 1 to 10000. Defaults to 100. 
For details about this control + group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink> and <ulink + url="https://www.kernel.org/doc/Documentation/scheduler/sched-design-CFS.txt">sched-design-CFS.txt</ulink>. + The available CPU time is split up among all units within one slice relative to their CPU time weight.</para> + + <para>While <varname>StartupCPUWeight=</varname> only applies to the startup phase of the system, + <varname>CPUWeight=</varname> applies to normal runtime of the system, and if the former is not set also to + the startup phase. Using <varname>StartupCPUWeight=</varname> allows prioritizing specific services at + boot-up differently than during normal runtime.</para> + + <para>These settings replace <varname>CPUShares=</varname> and <varname>StartupCPUShares=</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>CPUQuota=</varname></term> + + <listitem> + <para>Assign the specified CPU time quota to the processes executed. Takes a percentage value, suffixed with + "%". The percentage specifies how much CPU time the unit shall get at maximum, relative to the total CPU time + available on one CPU. Use values > 100% for allotting CPU time on more than one CPU. This controls the + <literal>cpu.max</literal> attribute on the unified control group hierarchy and + <literal>cpu.cfs_quota_us</literal> on legacy. 
For details about these control group attributes, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink> and <ulink + url="https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt">sched-bwc.txt</ulink>.</para> + + <para>Example: <varname>CPUQuota=20%</varname> ensures that the executed processes will never get more than + 20% CPU time on one CPU.</para> + + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>MemoryAccounting=</varname></term> + + <listitem> + <para>Turn on process and kernel memory accounting for this + unit. Takes a boolean argument. Note that turning on memory + accounting for one unit will also implicitly turn it on for + all units contained in the same slice and for all its parent + slices and the units contained therein. The system default + for this setting may be controlled with + <varname>DefaultMemoryAccounting=</varname> in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>MemoryMin=<replaceable>bytes</replaceable></varname></term> + + <listitem> + <para>Specify the memory usage protection of the executed processes in this unit. If the memory usages of + this unit and all its ancestors are below their minimum boundaries, this unit's memory won't be reclaimed.</para> + + <para>Takes a memory size in bytes. If the value is suffixed with K, M, G or T, the specified memory size is + parsed as Kilobytes, Megabytes, Gigabytes, or Terabytes (with the base 1024), respectively. Alternatively, a + percentage value may be specified, which is taken relative to the installed physical memory on the + system. This controls the <literal>memory.min</literal> control group attribute. 
For details about this + control group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>.</para> + + <para>This setting is supported only if the unified control group hierarchy is used and disables + <varname>MemoryLimit=</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>MemoryLow=<replaceable>bytes</replaceable></varname></term> + + <listitem> + <para>Specify the best-effort memory usage protection of the executed processes in this unit. If the memory + usages of this unit and all its ancestors are below their low boundaries, this unit's memory won't be + reclaimed as long as memory can be reclaimed from unprotected units.</para> + + <para>Takes a memory size in bytes. If the value is suffixed with K, M, G or T, the specified memory size is + parsed as Kilobytes, Megabytes, Gigabytes, or Terabytes (with the base 1024), respectively. Alternatively, a + percentage value may be specified, which is taken relative to the installed physical memory on the + system. This controls the <literal>memory.low</literal> control group attribute. For details about this + control group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>.</para> + + <para>This setting is supported only if the unified control group hierarchy is used and disables + <varname>MemoryLimit=</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>MemoryHigh=<replaceable>bytes</replaceable></varname></term> + + <listitem> + <para>Specify the high limit on memory usage of the executed processes in this unit. Memory usage may go + above the limit if unavoidable, but the processes are heavily slowed down and memory is taken away + aggressively in such cases. This is the main mechanism to control memory usage of a unit.</para> + + <para>Takes a memory size in bytes. 
If the value is suffixed with K, M, G or T, the specified memory size is + parsed as Kilobytes, Megabytes, Gigabytes, or Terabytes (with the base 1024), respectively. Alternatively, a + percentage value may be specified, which is taken relative to the installed physical memory on the + system. If assigned the + special value <literal>infinity</literal>, no memory limit is applied. This controls the + <literal>memory.high</literal> control group attribute. For details about this control group attribute, see + <ulink url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>.</para> + + <para>This setting is supported only if the unified control group hierarchy is used and disables + <varname>MemoryLimit=</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>MemoryMax=<replaceable>bytes</replaceable></varname></term> + + <listitem> + <para>Specify the absolute limit on memory usage of the executed processes in this unit. If memory usage + cannot be contained under the limit, out-of-memory killer is invoked inside the unit. It is recommended to + use <varname>MemoryHigh=</varname> as the main control mechanism and use <varname>MemoryMax=</varname> as the + last line of defense.</para> + + <para>Takes a memory size in bytes. If the value is suffixed with K, M, G or T, the specified memory size is + parsed as Kilobytes, Megabytes, Gigabytes, or Terabytes (with the base 1024), respectively. Alternatively, a + percentage value may be specified, which is taken relative to the installed physical memory on the system. If + assigned the special value <literal>infinity</literal>, no memory limit is applied. This controls the + <literal>memory.max</literal> control group attribute. 
For details about this control group attribute, see + <ulink url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>.</para> + + <para>This setting replaces <varname>MemoryLimit=</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>MemorySwapMax=<replaceable>bytes</replaceable></varname></term> + + <listitem> + <para>Specify the absolute limit on swap usage of the executed processes in this unit.</para> + + <para>Takes a swap size in bytes. If the value is suffixed with K, M, G or T, the specified swap size is + parsed as Kilobytes, Megabytes, Gigabytes, or Terabytes (with the base 1024), respectively. If assigned the + special value <literal>infinity</literal>, no swap limit is applied. This controls the + <literal>memory.swap.max</literal> control group attribute. For details about this control group attribute, + see <ulink url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>.</para> + + <para>This setting is supported only if the unified control group hierarchy is used and disables + <varname>MemoryLimit=</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>TasksAccounting=</varname></term> + + <listitem> + <para>Turn on task accounting for this unit. Takes a + boolean argument. If enabled, the system manager will keep + track of the number of tasks in the unit. The number of + tasks accounted this way includes both kernel threads and + userspace processes, with each thread counting + individually. Note that turning on tasks accounting for one + unit will also implicitly turn it on for all units contained + in the same slice and for all its parent slices and the + units contained therein. 
The system default for this setting + may be controlled with + <varname>DefaultTasksAccounting=</varname> in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>TasksMax=<replaceable>N</replaceable></varname></term> + + <listitem> + <para>Specify the maximum number of tasks that may be created in the unit. This ensures that the number of + tasks accounted for the unit (see above) stays below a specific limit. This either takes an absolute number + of tasks or a percentage value that is taken relative to the configured maximum number of tasks on the + system. If assigned the special value <literal>infinity</literal>, no tasks limit is applied. This controls + the <literal>pids.max</literal> control group attribute. For details about this control group attribute, see + <ulink url="https://www.kernel.org/doc/Documentation/cgroup-v1/pids.txt">pids.txt</ulink>.</para> + + <para>The + system default for this setting may be controlled with + <varname>DefaultTasksMax=</varname> in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>IOAccounting=</varname></term> + + <listitem> + <para>Turn on block I/O accounting for this unit, if the unified control group hierarchy is used on the + system. Takes a boolean argument. Note that turning on block I/O accounting for one unit will also implicitly + turn it on for all units contained in the same slice and for all its parent slices and the units contained + therein. 
The system default for this setting may be controlled with <varname>DefaultIOAccounting=</varname> + in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para> + + <para>This setting replaces <varname>BlockIOAccounting=</varname> and disables settings prefixed with + <varname>BlockIO</varname> or <varname>StartupBlockIO</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>IOWeight=<replaceable>weight</replaceable></varname></term> + <term><varname>StartupIOWeight=<replaceable>weight</replaceable></varname></term> + + <listitem> + <para>Set the default overall block I/O weight for the executed processes, if the unified control group + hierarchy is used on the system. Takes a single weight value (between 1 and 10000) to set the default block + I/O weight. This controls the <literal>io.weight</literal> control group attribute, which defaults to + 100. For details about this control group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>. The available I/O + bandwidth is split up among all units within one slice relative to their block I/O weight.</para> + + <para>While <varname>StartupIOWeight=</varname> only applies + to the startup phase of the system, + <varname>IOWeight=</varname> applies to the later runtime of + the system, and if the former is not set also to the startup + phase. 
This allows prioritizing specific services at boot-up + differently than during runtime.</para> + + <para>These settings replace <varname>BlockIOWeight=</varname> and <varname>StartupBlockIOWeight=</varname> + and disable settings prefixed with <varname>BlockIO</varname> or <varname>StartupBlockIO</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>IODeviceWeight=<replaceable>device</replaceable> <replaceable>weight</replaceable></varname></term> + + <listitem> + <para>Set the per-device overall block I/O weight for the executed processes, if the unified control group + hierarchy is used on the system. Takes a space-separated pair of a file path and a weight value to specify + the device specific weight value, between 1 and 10000. (Example: <literal>/dev/sda 1000</literal>). The file + path may be specified as path to a block device node or as any other file, in which case the backing block + device of the file system of the file is determined. This controls the <literal>io.weight</literal> control + group attribute, which defaults to 100. Use this option multiple times to set weights for multiple devices. + For details about this control group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>.</para> + + <para>This setting replaces <varname>BlockIODeviceWeight=</varname> and disables settings prefixed with + <varname>BlockIO</varname> or <varname>StartupBlockIO</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>IOReadBandwidthMax=<replaceable>device</replaceable> <replaceable>bytes</replaceable></varname></term> + <term><varname>IOWriteBandwidthMax=<replaceable>device</replaceable> <replaceable>bytes</replaceable></varname></term> + + <listitem> + <para>Set the per-device overall block I/O bandwidth maximum limit for the executed processes, if the unified + control group hierarchy is used on the system. 
This limit is not work-conserving, and the executed processes + are not allowed to use more even if the device has idle capacity. Takes a space-separated pair of a file + path and a bandwidth value (in bytes per second) to specify the device-specific bandwidth. The file path may + be a path to a block device node, or any other file, in which case the backing block device of the file + system of the file is used. If the bandwidth is suffixed with K, M, G, or T, the specified bandwidth is + parsed as Kilobytes, Megabytes, Gigabytes, or Terabytes, respectively, to the base of 1000. (Example: + "/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0 5M"). This controls the <literal>io.max</literal> control + group attribute. Use this option multiple times to set bandwidth limits for multiple devices. For details + about this control group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>. + </para> + + <para>These settings replace <varname>BlockIOReadBandwidth=</varname> and + <varname>BlockIOWriteBandwidth=</varname> and disable settings prefixed with <varname>BlockIO</varname> or + <varname>StartupBlockIO</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>IOReadIOPSMax=<replaceable>device</replaceable> <replaceable>IOPS</replaceable></varname></term> + <term><varname>IOWriteIOPSMax=<replaceable>device</replaceable> <replaceable>IOPS</replaceable></varname></term> + + <listitem> + <para>Set the per-device overall block I/O IOs-Per-Second maximum limit for the executed processes, if the + unified control group hierarchy is used on the system. This limit is not work-conserving, and the executed + processes are not allowed to use more even if the device has idle capacity. Takes a space-separated pair of + a file path and an IOPS value to specify the device-specific IOPS. The file path may be a path to a block 
The file path may be a path to a block + device node, or as any other file in which case the backing block device of the file system of the file is + used. If the IOPS is suffixed with K, M, G, or T, the specified IOPS is parsed as KiloIOPS, MegaIOPS, + GigaIOPS, or TeraIOPS, respectively, to the base of 1000. (Example: + "/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0 1K"). This controls the <literal>io.max</literal> control + group attributes. Use this option multiple times to set IOPS limits for multiple devices. For details about + this control group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>. + </para> + + <para>These settings are supported only if the unified control group hierarchy is used and disable settings + prefixed with <varname>BlockIO</varname> or <varname>StartupBlockIO</varname>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>IODeviceLatencyTargetSec=<replaceable>device</replaceable> <replaceable>target</replaceable></varname></term> + + <listitem> + <para>Set the per-device average target I/O latency for the executed processes, if the unified control group + hierarchy is used on the system. Takes a file path and a timespan separated by a space to specify + the device specific latency target. (Example: "/dev/sda 25ms"). The file path may be specified + as path to a block device node or as any other file, in which case the backing block device of the file + system of the file is determined. This controls the <literal>io.latency</literal> control group + attribute. Use this option multiple times to set latency target for multiple devices. 
For details about this + control group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v2.txt">cgroup-v2.txt</ulink>.</para> + + <para>Implies <literal>IOAccounting=yes</literal>.</para> + + <para>These settings are supported only if the unified control group hierarchy is used.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>IPAccounting=</varname></term> + + <listitem> + <para>Takes a boolean argument. If true, turns on IPv4 and IPv6 network traffic accounting for packets sent + or received by the unit. When this option is turned on, all IPv4 and IPv6 sockets created by any process of + the unit are accounted for.</para> + + <para>When this option is used in socket units, it applies to all IPv4 and IPv6 sockets + associated with it (including both listening and connection sockets where this applies). Note that for + socket-activated services, this configuration setting and the accounting data of the service unit and the + socket unit are kept separate, and displayed separately. No propagation of the setting and the collected + statistics is done, in either direction. Moreover, any traffic sent or received on any of the socket unit's + sockets is accounted to the socket unit — and never to the service unit it might have activated, even if the + socket is used by it.</para> + + <para>The system default for this setting may be controlled with <varname>DefaultIPAccounting=</varname> in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>IPAddressAllow=<replaceable>ADDRESS[/PREFIXLENGTH]…</replaceable></varname></term> + <term><varname>IPAddressDeny=<replaceable>ADDRESS[/PREFIXLENGTH]…</replaceable></varname></term> + + <listitem> + <para>Turn on address range network traffic filtering for packets sent and received over AF_INET and AF_INET6 + sockets. 
Both directives take a space separated list of IPv4 or IPv6 addresses, each optionally suffixed + with an address prefix length (separated by a <literal>/</literal> character). If the latter is omitted, the + address is considered a host address, i.e. the prefix covers the whole address (32 for IPv4, 128 for IPv6). + </para> + + <para>The access lists configured with this option are applied to all sockets created by processes of this + unit (or in the case of socket units, associated with it). The lists are implicitly combined with any lists + configured for any of the parent slice units this unit might be a member of. By default all access lists are + empty. When configured the lists are enforced as follows:</para> + + <itemizedlist> + <listitem><para>Access will be granted in case its destination/source address matches any entry in the + <varname>IPAddressAllow=</varname> setting.</para></listitem> + + <listitem><para>Otherwise, access will be denied in case its destination/source address matches any entry + in the <varname>IPAddressDeny=</varname> setting.</para></listitem> + + <listitem><para>Otherwise, access will be granted.</para></listitem> + </itemizedlist> + + <para>In order to implement a whitelisting IP firewall, it is recommended to use a + <varname>IPAddressDeny=</varname><constant>any</constant> setting on an upper-level slice unit (such as the + root slice <filename>-.slice</filename> or the slice containing all system services + <filename>system.slice</filename> – see + <citerefentry><refentrytitle>systemd.special</refentrytitle><manvolnum>7</manvolnum></citerefentry> for + details on these slice units), plus individual per-service <varname>IPAddressAllow=</varname> lines + permitting network access to relevant services, and only them.</para> + + <para>Note that for socket-activated services, the IP access list configured on the socket unit applies to + all sockets associated with it directly, but not to any sockets created by the ultimately 
activated services + for it. Conversely, the IP access list configured for the service is not applied to any sockets passed into + the service via socket activation. Thus, it is usually a good idea to replicate the IP access lists on both + the socket and the service unit. However, it often makes sense to maintain one list more open and the other + one more restricted, depending on the use case.</para> + + <para>If these settings are used multiple times in the same unit, the specified lists are combined. If an + empty string is assigned to these settings, the specific access list is reset and all previous settings are undone.</para> + + <para>In place of explicit IPv4 or IPv6 address and prefix length specifications, a small set of symbolic + names may be used. The following names are defined:</para> + + <table> + <title>Special address/network names</title> + + <tgroup cols='3'> + <colspec colname='name'/> + <colspec colname='definition'/> + <colspec colname='meaning'/> + + <thead> + <row> + <entry>Symbolic Name</entry> + <entry>Definition</entry> + <entry>Meaning</entry> + </row> + </thead> + + <tbody> + <row> + <entry><constant>any</constant></entry> + <entry>0.0.0.0/0 ::/0</entry> + <entry>Any host</entry> + </row> + + <row> + <entry><constant>localhost</constant></entry> + <entry>127.0.0.0/8 ::1/128</entry> + <entry>All addresses on the local loopback</entry> + </row> + + <row> + <entry><constant>link-local</constant></entry> + <entry>169.254.0.0/16 fe80::/64</entry> + <entry>All link-local IP addresses</entry> + </row> + + <row> + <entry><constant>multicast</constant></entry> + <entry>224.0.0.0/4 ff00::/8</entry> + <entry>All IP multicasting addresses</entry> + </row> + </tbody> + </tgroup> + </table> + + <para>Note that these settings might not be supported on some systems (for example, if eBPF control group + support is not enabled in the underlying kernel or container manager). These settings will have no effect in + that case. 
If compatibility with such systems is desired it is hence recommended to not exclusively rely on + them for IP security.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>DeviceAllow=</varname></term> + + <listitem> + <para>Control access to specific device nodes by the + executed processes. Takes two space-separated strings: a + device node specifier followed by a combination of + <constant>r</constant>, <constant>w</constant>, + <constant>m</constant> to control + <emphasis>r</emphasis>eading, <emphasis>w</emphasis>riting, + or creation of the specific device node(s) by the unit + (<emphasis>m</emphasis>knod), respectively. This controls + the <literal>devices.allow</literal> and + <literal>devices.deny</literal> control group + attributes. For details about these control group + attributes, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v1/devices.txt">devices.txt</ulink>.</para> + + <para>The device node specifier is either a path to a device + node in the file system, starting with + <filename>/dev/</filename>, or a string starting with either + <literal>char-</literal> or <literal>block-</literal> + followed by a device group name, as listed in + <filename>/proc/devices</filename>. The latter is useful to + whitelist all current and future devices belonging to a + specific device group at once. The device group is matched + according to filename globbing rules, you may hence use the + <literal>*</literal> and <literal>?</literal> + wildcards. Examples: <filename>/dev/sda5</filename> is a + path to a device node, referring to an ATA or SCSI block + device. <literal>char-pts</literal> and + <literal>char-alsa</literal> are specifiers for all pseudo + TTYs and all ALSA sound devices, + respectively. 
<literal>char-cpu/*</literal> is a specifier + matching all CPU-related device groups.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>DevicePolicy=auto|closed|strict</varname></term> + + <listitem> + <para> + Control the policy for allowing device access: + </para> + <variablelist> + <varlistentry> + <term><option>strict</option></term> + <listitem> + <para>allows only the types of access that are + explicitly specified.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>closed</option></term> + <listitem> + <para>in addition, allows access to standard pseudo + devices including + <filename>/dev/null</filename>, + <filename>/dev/zero</filename>, + <filename>/dev/full</filename>, + <filename>/dev/random</filename>, and + <filename>/dev/urandom</filename>. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><option>auto</option></term> + <listitem> + <para> + in addition, allows access to all devices if no + explicit <varname>DeviceAllow=</varname> is present. + This is the default. + </para> + </listitem> + </varlistentry> + </variablelist> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>Slice=</varname></term> + + <listitem> + <para>The name of the slice unit to place the unit + in. Defaults to <filename>system.slice</filename> for all + non-instantiated units of all unit types (except for slice + units themselves, see below). Instance units are by default + placed in a subslice of <filename>system.slice</filename> + that is named after the template name.</para> + + <para>This option may be used to arrange systemd units in a + hierarchy of slices, each of which might have resource + settings applied.</para> + + <para>For units of type slice, the only accepted value for + this setting is the parent slice. 
Since the name of a slice + unit implies the parent slice, it is hence redundant to ever + set this parameter directly for slice units.</para> + + <para>Special care should be taken when relying on the default slice assignment in templated service units + that have <varname>DefaultDependencies=no</varname> set, see + <citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>, section + "Default Dependencies" for details.</para> + + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>Delegate=</varname></term> + + <listitem> + <para>Turns on delegation of further resource control partitioning to processes of the unit. Units where this + is enabled may create and manage their own private subhierarchy of control groups below the control group of + the unit itself. For unprivileged services (i.e. those using the <varname>User=</varname> setting) the unit's + control group will be made accessible to the relevant user. When enabled the service manager will refrain + from manipulating control groups or moving processes below the unit's control group, so that a clear concept + of ownership is established: the control group tree above the unit's control group (i.e. towards the root + control group) is owned and managed by the service manager of the host, while the control group tree below + the unit's control group is owned and managed by the unit itself. Takes either a boolean argument or a list + of control group controller names. If true, delegation is turned on, and all supported controllers are + enabled for the unit, making them available to the unit's processes for management. If false, delegation is + turned off entirely (and no additional controllers are enabled). If set to a list of controllers, delegation + is turned on, and the specified controllers are enabled for the unit. 
Note that controllers other than + the ones specified might be made available as well, depending on the configuration of the containing slice unit + or other units contained in it. Note that assigning the empty string will enable delegation, but reset the + list of controllers; all assignments prior to this will have no effect. Defaults to false.</para> + + <para>Note that controller delegation to less privileged code is only safe on the unified control group + hierarchy. Accordingly, access to the specified controllers will not be granted to unprivileged services on + the legacy hierarchy, even when requested.</para> + + <para>The following controller names may be specified: <option>cpu</option>, <option>cpuacct</option>, + <option>io</option>, <option>blkio</option>, <option>memory</option>, <option>devices</option>, + <option>pids</option>. Not all of these controllers are available on all kernels, however, and some are + specific to the unified hierarchy while others are specific to the legacy hierarchy. Also note that the + kernel might support further controllers, which aren't covered here yet as delegation is either not supported + at all for them or not defined cleanly.</para> + + <para>For further details on the delegation model, consult <ulink + url="https://systemd.io/CGROUP_DELEGATION">Control Group APIs and Delegation</ulink>.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>DisableControllers=</varname></term> + + <listitem> + <para>Prevents controllers from being enabled for a unit's children. If a controller listed is already in use + in its subtree, the controller will be removed from the subtree. This can be used to prevent child units from being + able to implicitly or explicitly enable a controller. 
Defaults to not disabling any controllers.</para> + + <para>It may not be possible to successfully disable a controller if the unit or any child of the unit in + question delegates controllers to its children, as any delegated subtree of the cgroup hierarchy is unmanaged + by systemd.</para> + + <para>Multiple controllers may be specified, separated by spaces. You may also pass + <varname>DisableControllers=</varname> multiple times, in which case each new instance adds another controller + to disable. Passing <varname>DisableControllers=</varname> by itself with no controller name present resets + the disabled controller list.</para> + + <para>Valid controllers are <option>cpu</option>, <option>cpuacct</option>, <option>io</option>, + <option>blkio</option>, <option>memory</option>, <option>devices</option>, and <option>pids</option>.</para> + </listitem> + </varlistentry> + </variablelist> + </refsect1> + + <refsect1> + <title>Deprecated Options</title> + + <para>The following options are deprecated. Use the indicated superseding options instead:</para> + + <variablelist class='unit-directives'> + + <varlistentry> + <term><varname>CPUShares=<replaceable>weight</replaceable></varname></term> + <term><varname>StartupCPUShares=<replaceable>weight</replaceable></varname></term> + + <listitem> + <para>Assign the specified CPU time share weight to the processes executed. These options take an integer + value and control the <literal>cpu.shares</literal> control group attribute. The allowed range is 2 to + 262144. Defaults to 1024. For details about this control group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/scheduler/sched-design-CFS.txt">sched-design-CFS.txt</ulink>. 
+ The available CPU time is split up among all units within one slice relative to their CPU time share + weight.</para> + + <para>While <varname>StartupCPUShares=</varname> only applies to the startup phase of the system, + <varname>CPUShares=</varname> applies to normal runtime of the system, and if the former is not set also to + the startup phase. Using <varname>StartupCPUShares=</varname> allows prioritizing specific services at + boot-up differently than during normal runtime.</para> + + <para>Implies <literal>CPUAccounting=yes</literal>.</para> + + <para>These settings are deprecated. Use <varname>CPUWeight=</varname> and + <varname>StartupCPUWeight=</varname> instead.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>MemoryLimit=<replaceable>bytes</replaceable></varname></term> + + <listitem> + <para>Specify the limit on maximum memory usage of the executed processes. The limit specifies how much + process and kernel memory can be used by tasks in this unit. Takes a memory size in bytes. If the value is + suffixed with K, M, G or T, the specified memory size is parsed as Kilobytes, Megabytes, Gigabytes, or + Terabytes (with the base 1024), respectively. Alternatively, a percentage value may be specified, which is + taken relative to the installed physical memory on the system. If assigned the special value + <literal>infinity</literal>, no memory limit is applied. This controls the + <literal>memory.limit_in_bytes</literal> control group attribute. For details about this control group + attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt">memory.txt</ulink>.</para> + + <para>Implies <literal>MemoryAccounting=yes</literal>.</para> + + <para>This setting is deprecated. 
Use <varname>MemoryMax=</varname> instead.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>BlockIOAccounting=</varname></term> + + <listitem> + <para>Turn on block I/O accounting for this unit, if the legacy control group hierarchy is used on the + system. Takes a boolean argument. Note that turning on block I/O accounting for one unit will also implicitly + turn it on for all units contained in the same slice and for all its parent slices and the units contained + therein. The system default for this setting may be controlled with + <varname>DefaultBlockIOAccounting=</varname> in + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>.</para> + + <para>This setting is deprecated. Use <varname>IOAccounting=</varname> instead.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>BlockIOWeight=<replaceable>weight</replaceable></varname></term> + <term><varname>StartupBlockIOWeight=<replaceable>weight</replaceable></varname></term> + + <listitem><para>Set the default overall block I/O weight for the executed processes, if the legacy control + group hierarchy is used on the system. Takes a single weight value (between 10 and 1000) to set the default + block I/O weight. This controls the <literal>blkio.weight</literal> control group attribute, which defaults to + 500. For details about this control group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v1/blkio-controller.txt">blkio-controller.txt</ulink>. + The available I/O bandwidth is split up among all units within one slice relative to their block I/O + weight.</para> + + <para>While <varname>StartupBlockIOWeight=</varname> only + applies to the startup phase of the system, + <varname>BlockIOWeight=</varname> applies to the later runtime + of the system, and, if the former is not set, also to the + startup phase. 
This allows prioritizing specific services at + boot-up differently than during runtime.</para> + + <para>Implies + <literal>BlockIOAccounting=yes</literal>.</para> + + <para>These settings are deprecated. Use <varname>IOWeight=</varname> and <varname>StartupIOWeight=</varname> + instead.</para> + + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>BlockIODeviceWeight=<replaceable>device</replaceable> <replaceable>weight</replaceable></varname></term> + + <listitem> + <para>Set the per-device overall block I/O weight for the executed processes, if the legacy control group + hierarchy is used on the system. Takes a space-separated pair of a file path and a weight value to specify + the device specific weight value, between 10 and 1000. (Example: "/dev/sda 500"). The file path may be + specified as path to a block device node or as any other file, in which case the backing block device of the + file system of the file is determined. This controls the <literal>blkio.weight_device</literal> control group + attribute, which defaults to 1000. Use this option multiple times to set weights for multiple devices. For + details about this control group attribute, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v1/blkio-controller.txt">blkio-controller.txt</ulink>.</para> + + <para>Implies + <literal>BlockIOAccounting=yes</literal>.</para> + + <para>This setting is deprecated. Use <varname>IODeviceWeight=</varname> instead.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term><varname>BlockIOReadBandwidth=<replaceable>device</replaceable> <replaceable>bytes</replaceable></varname></term> + <term><varname>BlockIOWriteBandwidth=<replaceable>device</replaceable> <replaceable>bytes</replaceable></varname></term> + + <listitem> + <para>Set the per-device overall block I/O bandwidth limit for the executed processes, if the legacy control + group hierarchy is used on the system. 
Takes a space-separated pair of a file path and a bandwidth value (in + bytes per second) to specify the device-specific bandwidth. The file path may be a path to a block device + node, or any other file, in which case the backing block device of the file system of the file is used. If + the bandwidth is suffixed with K, M, G, or T, the specified bandwidth is parsed as Kilobytes, Megabytes, + Gigabytes, or Terabytes, respectively, to the base of 1000. (Example: + "/dev/disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0 5M"). This controls the + <literal>blkio.throttle.read_bps_device</literal> and <literal>blkio.throttle.write_bps_device</literal> + control group attributes. Use these options multiple times to set bandwidth limits for multiple devices. For + details about these control group attributes, see <ulink + url="https://www.kernel.org/doc/Documentation/cgroup-v1/blkio-controller.txt">blkio-controller.txt</ulink>. + </para> + + <para>Implies + <literal>BlockIOAccounting=yes</literal>.</para> + + <para>These settings are deprecated. 
Use <varname>IOReadBandwidthMax=</varname> and + <varname>IOWriteBandwidthMax=</varname> instead.</para> + </listitem> + </varlistentry> + + </variablelist> + </refsect1> + + <refsect1> + <title>See Also</title> + <para> + <citerefentry><refentrytitle>systemd</refentrytitle><manvolnum>1</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd-system.conf</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.unit</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.service</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.slice</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.scope</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.socket</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.mount</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.swap</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.exec</refentrytitle><manvolnum>5</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.directives</refentrytitle><manvolnum>7</manvolnum></citerefentry>, + <citerefentry><refentrytitle>systemd.special</refentrytitle><manvolnum>7</manvolnum></citerefentry>, + The documentation for control groups and specific controllers in the Linux kernel: + <ulink url="https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt">cgroups.txt</ulink>, + <ulink url="https://www.kernel.org/doc/Documentation/cgroup-v1/cpuacct.txt">cpuacct.txt</ulink>, + <ulink url="https://www.kernel.org/doc/Documentation/cgroup-v1/memory.txt">memory.txt</ulink>, + <ulink url="https://www.kernel.org/doc/Documentation/cgroup-v1/blkio-controller.txt">blkio-controller.txt</ulink>, 
+ <ulink url="https://www.kernel.org/doc/Documentation/scheduler/sched-bwc.txt">sched-bwc.txt</ulink>. + </para> + </refsect1> +</refentry> |