summaryrefslogtreecommitdiffstats
path: root/man7/cgroups.7
diff options
context:
space:
mode:
Diffstat (limited to 'man7/cgroups.7')
-rw-r--r--man7/cgroups.71914
1 files changed, 0 insertions, 1914 deletions
diff --git a/man7/cgroups.7 b/man7/cgroups.7
deleted file mode 100644
index e2c3ec2..0000000
--- a/man7/cgroups.7
+++ /dev/null
@@ -1,1914 +0,0 @@
-.\" Copyright (C) 2015 Serge Hallyn <serge@hallyn.com>
-.\" and Copyright (C) 2016, 2017 Michael Kerrisk <mtk.manpages@gmail.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.TH cgroups 7 2024-03-05 "Linux man-pages 6.7"
-.SH NAME
-cgroups \- Linux control groups
-.SH DESCRIPTION
-Control groups, usually referred to as cgroups,
-are a Linux kernel feature which allow processes to
-be organized into hierarchical groups whose usage of
-various types of resources can then be limited and monitored.
-The kernel's cgroup interface is provided through
-a pseudo-filesystem called cgroupfs.
-Grouping is implemented in the core cgroup kernel code,
-while resource tracking and limits are implemented in
-a set of per-resource-type subsystems (memory, CPU, and so on).
-.\"
-.SS Terminology
-A
-.I cgroup
-is a collection of processes that are bound to a set of
-limits or parameters defined via the cgroup filesystem.
-.P
-A
-.I subsystem
-is a kernel component that modifies the behavior of
-the processes in a cgroup.
-Various subsystems have been implemented, making it possible to do things
-such as limiting the amount of CPU time and memory available to a cgroup,
-accounting for the CPU time used by a cgroup,
-and freezing and resuming execution of the processes in a cgroup.
-Subsystems are sometimes also known as
-.I resource controllers
-(or simply, controllers).
-.P
-The cgroups for a controller are arranged in a
-.IR hierarchy .
-This hierarchy is defined by creating, removing, and
-renaming subdirectories within the cgroup filesystem.
-At each level of the hierarchy, attributes (e.g., limits) can be defined.
-The limits, control, and accounting provided by cgroups generally have
-effect throughout the subhierarchy underneath the cgroup where the
-attributes are defined.
-Thus, for example, the limits placed on
-a cgroup at a higher level in the hierarchy cannot be exceeded
-by descendant cgroups.
-.\"
-.SS Cgroups version 1 and version 2
-The initial release of the cgroups implementation was in Linux 2.6.24.
-Over time, various cgroup controllers have been added
-to allow the management of various types of resources.
-However, the development of these controllers was largely uncoordinated,
-with the result that many inconsistencies arose between controllers
-and management of the cgroup hierarchies became rather complex.
-A longer description of these problems can be found in the kernel
-source file
-.I Documentation/admin\-guide/cgroup\-v2.rst
-(or
-.I Documentation/cgroup\-v2.txt
-in Linux 4.17 and earlier).
-.P
-Because of the problems with the initial cgroups implementation
-(cgroups version 1),
-starting in Linux 3.10, work began on a new,
-orthogonal implementation to remedy these problems.
-Initially marked experimental, and hidden behind the
-.I "\-o\ __DEVEL__sane_behavior"
-mount option, the new version (cgroups version 2)
-was eventually made official with the release of Linux 4.5.
-Differences between the two versions are described in the text below.
-The file
-.IR cgroup.sane_behavior ,
-present in cgroups v1, is a relic of this mount option.
-The file always reports "0" and is only retained for backward compatibility.
-.P
-Although cgroups v2 is intended as a replacement for cgroups v1,
-the older system continues to exist
-(and for compatibility reasons is unlikely to be removed).
-Currently, cgroups v2 implements only a subset of the controllers
-available in cgroups v1.
-The two systems are implemented so that both v1 controllers and
-v2 controllers can be mounted on the same system.
-Thus, for example, it is possible to use those controllers
-that are supported under version 2,
-while also using version 1 controllers
-where version 2 does not yet support those controllers.
-The only restriction here is that a controller can't be simultaneously
-employed in both a cgroups v1 hierarchy and in the cgroups v2 hierarchy.
-.\"
-.SH CGROUPS VERSION 1
-Under cgroups v1, each controller may be mounted against a separate
-cgroup filesystem that provides its own hierarchical organization of the
-processes on the system.
-It is also possible to comount multiple (or even all) cgroups v1 controllers
-against the same cgroup filesystem, meaning that the comounted controllers
-manage the same hierarchical organization of processes.
-.P
-For each mounted hierarchy,
-the directory tree mirrors the control group hierarchy.
-Each control group is represented by a directory, with each of its child
-control cgroups represented as a child directory.
-For instance,
-.I /user/joe/1.session
-represents control group
-.IR 1.session ,
-which is a child of cgroup
-.IR joe ,
-which is a child of
-.IR /user .
-Under each cgroup directory is a set of files which can be read or
-written to, reflecting resource limits and a few general cgroup
-properties.
-.\"
-.SS Tasks (threads) versus processes
-In cgroups v1, a distinction is drawn between
-.I processes
-and
-.IR tasks .
-In this view, a process can consist of multiple tasks
-(more commonly called threads, from a user-space perspective,
-and called such in the remainder of this man page).
-In cgroups v1, it is possible to independently manipulate
-the cgroup memberships of the threads in a process.
-.P
-The cgroups v1 ability to split threads across different cgroups
-caused problems in some cases.
-For example, it made no sense for the
-.I memory
-controller,
-since all of the threads of a process share a single address space.
-Because of these problems,
-the ability to independently manipulate the cgroup memberships
-of the threads in a process was removed in the initial cgroups v2
-implementation, and subsequently restored in a more limited form
-(see the discussion of "thread mode" below).
-.\"
-.SS Mounting v1 controllers
-The use of cgroups requires a kernel built with the
-.B CONFIG_CGROUP
-option.
-In addition, each of the v1 controllers has an associated
-configuration option that must be set in order to employ that controller.
-.P
-In order to use a v1 controller,
-it must be mounted against a cgroup filesystem.
-The usual place for such mounts is under a
-.BR tmpfs (5)
-filesystem mounted at
-.IR /sys/fs/cgroup .
-Thus, one might mount the
-.I cpu
-controller as follows:
-.P
-.in +4n
-.EX
-mount \-t cgroup \-o cpu none /sys/fs/cgroup/cpu
-.EE
-.in
-.P
-It is possible to comount multiple controllers against the same hierarchy.
-For example, here the
-.I cpu
-and
-.I cpuacct
-controllers are comounted against a single hierarchy:
-.P
-.in +4n
-.EX
-mount \-t cgroup \-o cpu,cpuacct none /sys/fs/cgroup/cpu,cpuacct
-.EE
-.in
-.P
-Comounting controllers has the effect that a process is in the same cgroup for
-all of the comounted controllers.
-Separately mounting controllers allows a process to
-be in cgroup
-.I /foo1
-for one controller while being in
-.I /foo2/foo3
-for another.
-.P
-It is possible to comount all v1 controllers against the same hierarchy:
-.P
-.in +4n
-.EX
-mount \-t cgroup \-o all cgroup /sys/fs/cgroup
-.EE
-.in
-.P
-(One can achieve the same result by omitting
-.IR "\-o all" ,
-since it is the default if no controllers are explicitly specified.)
-.P
-It is not possible to mount the same controller
-against multiple cgroup hierarchies.
-For example, it is not possible to mount both the
-.I cpu
-and
-.I cpuacct
-controllers against one hierarchy, and to mount the
-.I cpu
-controller alone against another hierarchy.
-It is possible to create multiple mount with exactly
-the same set of comounted controllers.
-However, in this case all that results is multiple mount points
-providing a view of the same hierarchy.
-.P
-Note that on many systems, the v1 controllers are automatically mounted under
-.IR /sys/fs/cgroup ;
-in particular,
-.BR systemd (1)
-automatically creates such mounts.
-.\"
-.SS Unmounting v1 controllers
-A mounted cgroup filesystem can be unmounted using the
-.BR umount (8)
-command, as in the following example:
-.P
-.in +4n
-.EX
-umount /sys/fs/cgroup/pids
-.EE
-.in
-.P
-.IR "But note well" :
-a cgroup filesystem is unmounted only if it is not busy,
-that is, it has no child cgroups.
-If this is not the case, then the only effect of the
-.BR umount (8)
-is to make the mount invisible.
-Thus, to ensure that the mount is really removed,
-one must first remove all child cgroups,
-which in turn can be done only after all member processes
-have been moved from those cgroups to the root cgroup.
-.\"
-.SS Cgroups version 1 controllers
-Each of the cgroups version 1 controllers is governed
-by a kernel configuration option (listed below).
-Additionally, the availability of the cgroups feature is governed by the
-.B CONFIG_CGROUPS
-kernel configuration option.
-.TP
-.IR cpu " (since Linux 2.6.24; " \fBCONFIG_CGROUP_SCHED\fP )
-Cgroups can be guaranteed a minimum number of "CPU shares"
-when a system is busy.
-This does not limit a cgroup's CPU usage if the CPUs are not busy.
-For further information, see
-.I Documentation/scheduler/sched\-design\-CFS.rst
-(or
-.I Documentation/scheduler/sched\-design\-CFS.txt
-in Linux 5.2 and earlier).
-.IP
-In Linux 3.2,
-this controller was extended to provide CPU "bandwidth" control.
-If the kernel is configured with
-.BR CONFIG_CFS_BANDWIDTH ,
-then within each scheduling period
-(defined via a file in the cgroup directory), it is possible to define
-an upper limit on the CPU time allocated to the processes in a cgroup.
-This upper limit applies even if there is no other competition for the CPU.
-Further information can be found in the kernel source file
-.I Documentation/scheduler/sched\-bwc.rst
-(or
-.I Documentation/scheduler/sched\-bwc.txt
-in Linux 5.2 and earlier).
-.TP
-.IR cpuacct " (since Linux 2.6.24; " \fBCONFIG_CGROUP_CPUACCT\fP )
-This provides accounting for CPU usage by groups of processes.
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/cpuacct.rst
-(or
-.I Documentation/cgroup\-v1/cpuacct.txt
-in Linux 5.2 and earlier).
-.TP
-.IR cpuset " (since Linux 2.6.24; " \fBCONFIG_CPUSETS\fP )
-This cgroup can be used to bind the processes in a cgroup to
-a specified set of CPUs and NUMA nodes.
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/cpusets.rst
-(or
-.I Documentation/cgroup\-v1/cpusets.txt
-in Linux 5.2 and earlier).
-.
-.TP
-.IR memory " (since Linux 2.6.25; " \fBCONFIG_MEMCG\fP )
-The memory controller supports reporting and limiting of process memory, kernel
-memory, and swap used by cgroups.
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/memory.rst
-(or
-.I Documentation/cgroup\-v1/memory.txt
-in Linux 5.2 and earlier).
-.TP
-.IR devices " (since Linux 2.6.26; " \fBCONFIG_CGROUP_DEVICE\fP )
-This supports controlling which processes may create (mknod) devices as
-well as open them for reading or writing.
-The policies may be specified as allow-lists and deny-lists.
-Hierarchy is enforced, so new rules must not
-violate existing rules for the target or ancestor cgroups.
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/devices.rst
-(or
-.I Documentation/cgroup\-v1/devices.txt
-in Linux 5.2 and earlier).
-.TP
-.IR freezer " (since Linux 2.6.28; " \fBCONFIG_CGROUP_FREEZER\fP )
-The
-.I freezer
-cgroup can suspend and restore (resume) all processes in a cgroup.
-Freezing a cgroup
-.I /A
-also causes its children, for example, processes in
-.IR /A/B ,
-to be frozen.
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/freezer\-subsystem.rst
-(or
-.I Documentation/cgroup\-v1/freezer\-subsystem.txt
-in Linux 5.2 and earlier).
-.TP
-.IR net_cls " (since Linux 2.6.29; " \fBCONFIG_CGROUP_NET_CLASSID\fP )
-This places a classid, specified for the cgroup, on network packets
-created by a cgroup.
-These classids can then be used in firewall rules,
-as well as used to shape traffic using
-.BR tc (8).
-This applies only to packets
-leaving the cgroup, not to traffic arriving at the cgroup.
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/net_cls.rst
-(or
-.I Documentation/cgroup\-v1/net_cls.txt
-in Linux 5.2 and earlier).
-.TP
-.IR blkio " (since Linux 2.6.33; " \fBCONFIG_BLK_CGROUP\fP )
-The
-.I blkio
-cgroup controls and limits access to specified block devices by
-applying IO control in the form of throttling and upper limits against leaf
-nodes and intermediate nodes in the storage hierarchy.
-.IP
-Two policies are available.
-The first is a proportional-weight time-based division
-of disk implemented with CFQ.
-This is in effect for leaf nodes using CFQ.
-The second is a throttling policy which specifies
-upper I/O rate limits on a device.
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/blkio\-controller.rst
-(or
-.I Documentation/cgroup\-v1/blkio\-controller.txt
-in Linux 5.2 and earlier).
-.TP
-.IR perf_event " (since Linux 2.6.39; " \fBCONFIG_CGROUP_PERF\fP )
-This controller allows
-.I perf
-monitoring of the set of processes grouped in a cgroup.
-.IP
-Further information can be found in the kernel source files
-.TP
-.IR net_prio " (since Linux 3.3; " \fBCONFIG_CGROUP_NET_PRIO\fP )
-This allows priorities to be specified, per network interface, for cgroups.
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/net_prio.rst
-(or
-.I Documentation/cgroup\-v1/net_prio.txt
-in Linux 5.2 and earlier).
-.TP
-.IR hugetlb " (since Linux 3.5; " \fBCONFIG_CGROUP_HUGETLB\fP )
-This supports limiting the use of huge pages by cgroups.
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/hugetlb.rst
-(or
-.I Documentation/cgroup\-v1/hugetlb.txt
-in Linux 5.2 and earlier).
-.TP
-.IR pids " (since Linux 4.3; " \fBCONFIG_CGROUP_PIDS\fP )
-This controller permits limiting the number of process that may be created
-in a cgroup (and its descendants).
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/pids.rst
-(or
-.I Documentation/cgroup\-v1/pids.txt
-in Linux 5.2 and earlier).
-.TP
-.IR rdma " (since Linux 4.11; " \fBCONFIG_CGROUP_RDMA\fP )
-The RDMA controller permits limiting the use of
-RDMA/IB-specific resources per cgroup.
-.IP
-Further information can be found in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v1/rdma.rst
-(or
-.I Documentation/cgroup\-v1/rdma.txt
-in Linux 5.2 and earlier).
-.\"
-.SS Creating cgroups and moving processes
-A cgroup filesystem initially contains a single root cgroup, '/',
-which all processes belong to.
-A new cgroup is created by creating a directory in the cgroup filesystem:
-.P
-.in +4n
-.EX
-mkdir /sys/fs/cgroup/cpu/cg1
-.EE
-.in
-.P
-This creates a new empty cgroup.
-.P
-A process may be moved to this cgroup by writing its PID into the cgroup's
-.I cgroup.procs
-file:
-.P
-.in +4n
-.EX
-echo $$ > /sys/fs/cgroup/cpu/cg1/cgroup.procs
-.EE
-.in
-.P
-Only one PID at a time should be written to this file.
-.P
-Writing the value 0 to a
-.I cgroup.procs
-file causes the writing process to be moved to the corresponding cgroup.
-.P
-When writing a PID into the
-.IR cgroup.procs ,
-all threads in the process are moved into the new cgroup at once.
-.P
-Within a hierarchy, a process can be a member of exactly one cgroup.
-Writing a process's PID to a
-.I cgroup.procs
-file automatically removes it from the cgroup of
-which it was previously a member.
-.P
-The
-.I cgroup.procs
-file can be read to obtain a list of the processes that are
-members of a cgroup.
-The returned list of PIDs is not guaranteed to be in order.
-Nor is it guaranteed to be free of duplicates.
-(For example, a PID may be recycled while reading from the list.)
-.P
-In cgroups v1, an individual thread can be moved to
-another cgroup by writing its thread ID
-(i.e., the kernel thread ID returned by
-.BR clone (2)
-and
-.BR gettid (2))
-to the
-.I tasks
-file in a cgroup directory.
-This file can be read to discover the set of threads
-that are members of the cgroup.
-.\"
-.SS Removing cgroups
-To remove a cgroup,
-it must first have no child cgroups and contain no (nonzombie) processes.
-So long as that is the case, one can simply
-remove the corresponding directory pathname.
-Note that files in a cgroup directory cannot and need not be
-removed.
-.\"
-.SS Cgroups v1 release notification
-Two files can be used to determine whether the kernel provides
-notifications when a cgroup becomes empty.
-A cgroup is considered to be empty when it contains no child
-cgroups and no member processes.
-.P
-A special file in the root directory of each cgroup hierarchy,
-.IR release_agent ,
-can be used to register the pathname of a program that may be invoked when
-a cgroup in the hierarchy becomes empty.
-The pathname of the newly empty cgroup (relative to the cgroup mount point)
-is provided as the sole command-line argument when the
-.I release_agent
-program is invoked.
-The
-.I release_agent
-program might remove the cgroup directory,
-or perhaps repopulate it with a process.
-.P
-The default value of the
-.I release_agent
-file is empty, meaning that no release agent is invoked.
-.P
-The content of the
-.I release_agent
-file can also be specified via a mount option when the
-cgroup filesystem is mounted:
-.P
-.in +4n
-.EX
-mount \-o release_agent=pathname ...
-.EE
-.in
-.P
-Whether or not the
-.I release_agent
-program is invoked when a particular cgroup becomes empty is determined
-by the value in the
-.I notify_on_release
-file in the corresponding cgroup directory.
-If this file contains the value 0, then the
-.I release_agent
-program is not invoked.
-If it contains the value 1, the
-.I release_agent
-program is invoked.
-The default value for this file in the root cgroup is 0.
-At the time when a new cgroup is created,
-the value in this file is inherited from the corresponding file
-in the parent cgroup.
-.\"
-.SS Cgroup v1 named hierarchies
-In cgroups v1,
-it is possible to mount a cgroup hierarchy that has no attached controllers:
-.P
-.in +4n
-.EX
-mount \-t cgroup \-o none,name=somename none /some/mount/point
-.EE
-.in
-.P
-Multiple instances of such hierarchies can be mounted;
-each hierarchy must have a unique name.
-The only purpose of such hierarchies is to track processes.
-(See the discussion of release notification below.)
-An example of this is the
-.I name=systemd
-cgroup hierarchy that is used by
-.BR systemd (1)
-to track services and user sessions.
-.P
-Since Linux 5.0, the
-.I cgroup_no_v1
-kernel boot option (described below) can be used to disable cgroup v1
-named hierarchies, by specifying
-.IR cgroup_no_v1=named .
-.\"
-.SH CGROUPS VERSION 2
-In cgroups v2,
-all mounted controllers reside in a single unified hierarchy.
-While (different) controllers may be simultaneously
-mounted under the v1 and v2 hierarchies,
-it is not possible to mount the same controller simultaneously
-under both the v1 and the v2 hierarchies.
-.P
-The new behaviors in cgroups v2 are summarized here,
-and in some cases elaborated in the following subsections.
-.IP \[bu] 3
-Cgroups v2 provides a unified hierarchy against
-which all controllers are mounted.
-.IP \[bu]
-"Internal" processes are not permitted.
-With the exception of the root cgroup, processes may reside
-only in leaf nodes (cgroups that do not themselves contain child cgroups).
-The details are somewhat more subtle than this, and are described below.
-.IP \[bu]
-Active cgroups must be specified via the files
-.I cgroup.controllers
-and
-.IR cgroup.subtree_control .
-.IP \[bu]
-The
-.I tasks
-file has been removed.
-In addition, the
-.I cgroup.clone_children
-file that is employed by the
-.I cpuset
-controller has been removed.
-.IP \[bu]
-An improved mechanism for notification of empty cgroups is provided by the
-.I cgroup.events
-file.
-.P
-For more changes, see the
-.I Documentation/admin\-guide/cgroup\-v2.rst
-file in the kernel source
-(or
-.I Documentation/cgroup\-v2.txt
-in Linux 4.17 and earlier).
-.
-.P
-Some of the new behaviors listed above saw subsequent modification with
-the addition in Linux 4.14 of "thread mode" (described below).
-.\"
-.SS Cgroups v2 unified hierarchy
-In cgroups v1, the ability to mount different controllers
-against different hierarchies was intended to allow great flexibility
-for application design.
-In practice, though,
-the flexibility turned out to be less useful than expected,
-and in many cases added complexity.
-Therefore, in cgroups v2,
-all available controllers are mounted against a single hierarchy.
-The available controllers are automatically mounted,
-meaning that it is not necessary (or possible) to specify the controllers
-when mounting the cgroup v2 filesystem using a command such as the following:
-.P
-.in +4n
-.EX
-mount \-t cgroup2 none /mnt/cgroup2
-.EE
-.in
-.P
-A cgroup v2 controller is available only if it is not currently in use
-via a mount against a cgroup v1 hierarchy.
-Or, to put things another way, it is not possible to employ
-the same controller against both a v1 hierarchy and the unified v2 hierarchy.
-This means that it may be necessary first to unmount a v1 controller
-(as described above) before that controller is available in v2.
-Since
-.BR systemd (1)
-makes heavy use of some v1 controllers by default,
-it can in some cases be simpler to boot the system with
-selected v1 controllers disabled.
-To do this, specify the
-.I cgroup_no_v1=list
-option on the kernel boot command line;
-.I list
-is a comma-separated list of the names of the controllers to disable,
-or the word
-.I all
-to disable all v1 controllers.
-(This situation is correctly handled by
-.BR systemd (1),
-which falls back to operating without the specified controllers.)
-.P
-Note that on many modern systems,
-.BR systemd (1)
-automatically mounts the
-.I cgroup2
-filesystem at
-.I /sys/fs/cgroup/unified
-during the boot process.
-.\"
-.SS Cgroups v2 mount options
-The following options
-.RI ( mount\~\-o )
-can be specified when mounting the group v2 filesystem:
-.TP
-.IR nsdelegate " (since Linux 4.15)"
-Treat cgroup namespaces as delegation boundaries.
-For details, see below.
-.TP
-.IR memory_localevents " (since Linux 5.2)"
-.\" commit 9852ae3fe5293264f01c49f2571ef7688f7823ce
-The
-.I memory.events
-should show statistics only for the cgroup itself,
-and not for any descendant cgroups.
-This was the behavior before Linux 5.2.
-Starting in Linux 5.2,
-the default behavior is to include statistics for descendant cgroups in
-.IR memory.events ,
-and this mount option can be used to revert to the legacy behavior.
-This option is system wide and can be set on mount or
-modified through remount only from the initial mount namespace;
-it is silently ignored in noninitial namespaces.
-.\"
-.SS Cgroups v2 controllers
-The following controllers, documented in the kernel source file
-.I Documentation/admin\-guide/cgroup\-v2.rst
-(or
-.I Documentation/cgroup\-v2.txt
-in Linux 4.17 and earlier),
-are supported in cgroups version 2:
-.TP
-.IR cpu " (since Linux 4.15)"
-This is the successor to the version 1
-.I cpu
-and
-.I cpuacct
-controllers.
-.TP
-.IR cpuset " (since Linux 5.0)"
-This is the successor of the version 1
-.I cpuset
-controller.
-.TP
-.IR freezer " (since Linux 5.2)"
-.\" commit 76f969e8948d82e78e1bc4beb6b9465908e74873
-This is the successor of the version 1
-.I freezer
-controller.
-.TP
-.IR hugetlb " (since Linux 5.6)"
-This is the successor of the version 1
-.I hugetlb
-controller.
-.TP
-.IR io " (since Linux 4.5)"
-This is the successor of the version 1
-.I blkio
-controller.
-.TP
-.IR memory " (since Linux 4.5)"
-This is the successor of the version 1
-.I memory
-controller.
-.TP
-.IR perf_event " (since Linux 4.11)"
-This is the same as the version 1
-.I perf_event
-controller.
-.TP
-.IR pids " (since Linux 4.5)"
-This is the same as the version 1
-.I pids
-controller.
-.TP
-.IR rdma " (since Linux 4.11)"
-This is the same as the version 1
-.I rdma
-controller.
-.P
-There is no direct equivalent of the
-.I net_cls
-and
-.I net_prio
-controllers from cgroups version 1.
-Instead, support has been added to
-.BR iptables (8)
-to allow eBPF filters that hook on cgroup v2 pathnames to make decisions
-about network traffic on a per-cgroup basis.
-.P
-The v2
-.I devices
-controller provides no interface files;
-instead, device control is gated by attaching an eBPF
-.RB ( BPF_CGROUP_DEVICE )
-program to a v2 cgroup.
-.\"
-.SS Cgroups v2 subtree control
-Each cgroup in the v2 hierarchy contains the following two files:
-.TP
-.I cgroup.controllers
-This read-only file exposes a list of the controllers that are
-.I available
-in this cgroup.
-The contents of this file match the contents of the
-.I cgroup.subtree_control
-file in the parent cgroup.
-.TP
-.I cgroup.subtree_control
-This is a list of controllers that are
-.I active
-.RI ( enabled )
-in the cgroup.
-The set of controllers in this file is a subset of the set in the
-.I cgroup.controllers
-of this cgroup.
-The set of active controllers is modified by writing strings to this file
-containing space-delimited controller names,
-each preceded by '+' (to enable a controller)
-or '\-' (to disable a controller), as in the following example:
-.IP
-.in +4n
-.EX
-echo \[aq]+pids \-memory\[aq] > x/y/cgroup.subtree_control
-.EE
-.in
-.IP
-An attempt to enable a controller
-that is not present in
-.I cgroup.controllers
-leads to an
-.B ENOENT
-error when writing to the
-.I cgroup.subtree_control
-file.
-.P
-Because the list of controllers in
-.I cgroup.subtree_control
-is a subset of those
-.IR cgroup.controllers ,
-a controller that has been disabled in one cgroup in the hierarchy
-can never be re-enabled in the subtree below that cgroup.
-.P
-A cgroup's
-.I cgroup.subtree_control
-file determines the set of controllers that are exercised in the
-.I child
-cgroups.
-When a controller (e.g.,
-.IR pids )
-is present in the
-.I cgroup.subtree_control
-file of a parent cgroup,
-then the corresponding controller-interface files (e.g.,
-.IR pids.max )
-are automatically created in the children of that cgroup
-and can be used to exert resource control in the child cgroups.
-.\"
-.SS Cgroups v2 \[dq]no internal processes\[dq] rule
-Cgroups v2 enforces a so-called "no internal processes" rule.
-Roughly speaking, this rule means that,
-with the exception of the root cgroup, processes may reside
-only in leaf nodes (cgroups that do not themselves contain child cgroups).
-This avoids the need to decide how to partition resources between
-processes which are members of cgroup A and processes in child cgroups of A.
-.P
-For instance, if cgroup
-.I /cg1/cg2
-exists, then a process may reside in
-.IR /cg1/cg2 ,
-but not in
-.IR /cg1 .
-This is to avoid an ambiguity in cgroups v1
-with respect to the delegation of resources between processes in
-.I /cg1
-and its child cgroups.
-The recommended approach in cgroups v2 is to create a subdirectory called
-.I leaf
-for any nonleaf cgroup which should contain processes, but no child cgroups.
-Thus, processes which previously would have gone into
-.I /cg1
-would now go into
-.IR /cg1/leaf .
-This has the advantage of making explicit
-the relationship between processes in
-.I /cg1/leaf
-and
-.IR /cg1 's
-other children.
-.P
-The "no internal processes" rule is in fact more subtle than stated above.
-More precisely, the rule is that a (nonroot) cgroup can't both
-(1) have member processes, and
-(2) distribute resources into child cgroups\[em]that is, have a nonempty
-.I cgroup.subtree_control
-file.
-Thus, it
-.I is
-possible for a cgroup to have both member processes and child cgroups,
-but before controllers can be enabled for that cgroup,
-the member processes must be moved out of the cgroup
-(e.g., perhaps into the child cgroups).
-.P
-With the Linux 4.14 addition of "thread mode" (described below),
-the "no internal processes" rule has been relaxed in some cases.
-.\"
-.SS Cgroups v2 cgroup.events file
-Each nonroot cgroup in the v2 hierarchy contains a read-only file,
-.IR cgroup.events ,
-whose contents are key-value pairs
-(delimited by newline characters, with the key and value separated by spaces)
-providing state information about the cgroup:
-.P
-.in +4n
-.EX
-$ \fBcat mygrp/cgroup.events\fP
-populated 1
-frozen 0
-.EE
-.in
-.P
-The following keys may appear in this file:
-.TP
-.I populated
-The value of this key is either 1,
-if this cgroup or any of its descendants has member processes,
-or otherwise 0.
-.TP
-.IR frozen " (since Linux 5.2)"
-.\" commit 76f969e8948d82e78e1bc4beb6b9465908e7487
-The value of this key is 1 if this cgroup is currently frozen,
-or 0 if it is not.
-.P
-The
-.I cgroup.events
-file can be monitored, in order to receive notification when the value of
-one of its keys changes.
-Such monitoring can be done using
-.BR inotify (7),
-which notifies changes as
-.B IN_MODIFY
-events, or
-.BR poll (2),
-which notifies changes by returning the
-.B POLLPRI
-and
-.B POLLERR
-bits in the
-.I revents
-field.
-.\"
-.SS Cgroup v2 release notification
-Cgroups v2 provides a new mechanism for obtaining notification
-when a cgroup becomes empty.
-The cgroups v1
-.I release_agent
-and
-.I notify_on_release
-files are removed, and replaced by the
-.I populated
-key in the
-.I cgroup.events
-file.
-This key either has the value 0,
-meaning that the cgroup (and its descendants)
-contain no (nonzombie) member processes,
-or 1, meaning that the cgroup (or one of its descendants)
-contains member processes.
-.P
-The cgroups v2 release-notification mechanism
-offers the following advantages over the cgroups v1
-.I release_agent
-mechanism:
-.IP \[bu] 3
-It allows for cheaper notification,
-since a single process can monitor multiple
-.I cgroup.events
-files (using the techniques described earlier).
-By contrast, the cgroups v1 mechanism requires the expense of creating
-a process for each notification.
-.IP \[bu]
-Notification for different cgroup subhierarchies can be delegated
-to different processes.
-By contrast, the cgroups v1 mechanism allows only one release agent
-for an entire hierarchy.
-.\"
-.SS Cgroups v2 cgroup.stat file
-.\" commit ec39225cca42c05ac36853d11d28f877fde5c42e
-Each cgroup in the v2 hierarchy contains a read-only
-.I cgroup.stat
-file (first introduced in Linux 4.14)
-that consists of lines containing key-value pairs.
-The following keys currently appear in this file:
-.TP
-.I nr_descendants
-This is the total number of visible (i.e., living) descendant cgroups
-underneath this cgroup.
-.TP
-.I nr_dying_descendants
-This is the total number of dying descendant cgroups
-underneath this cgroup.
-A cgroup enters the dying state after being deleted.
-It remains in that state for an undefined period
-(which will depend on system load)
-while resources are freed before the cgroup is destroyed.
-Note that the presence of some cgroups in the dying state is normal,
-and is not indicative of any problem.
-.IP
-A process can't be made a member of a dying cgroup,
-and a dying cgroup can't be brought back to life.
-.\"
-.SS Limiting the number of descendant cgroups
-Each cgroup in the v2 hierarchy contains the following files,
-which can be used to view and set limits on the number
-of descendant cgroups under that cgroup:
-.TP
-.IR cgroup.max.depth " (since Linux 4.14)"
-.\" commit 1a926e0bbab83bae8207d05a533173425e0496d1
-This file defines a limit on the depth of nesting of descendant cgroups.
-A value of 0 in this file means that no descendant cgroups can be created.
-An attempt to create a descendant whose nesting level exceeds
-the limit fails
-.RI ( mkdir (2)
-fails with the error
-.BR EAGAIN ).
-.IP
-Writing the string
-.I \[dq]max\[dq]
-to this file means that no limit is imposed.
-The default value in this file is
-.IR \[dq]max\[dq] .
-.TP
-.IR cgroup.max.descendants " (since Linux 4.14)"
-.\" commit 1a926e0bbab83bae8207d05a533173425e0496d1
-This file defines a limit on the number of live descendant cgroups that
-this cgroup may have.
-An attempt to create more descendants than allowed by the limit fails
-.RI ( mkdir (2)
-fails with the error
-.BR EAGAIN ).
-.IP
-Writing the string
-.I \[dq]max\[dq]
-to this file means that no limit is imposed.
-The default value in this file is
-.IR \[dq]max\[dq] .
-.\"
-.SH CGROUPS DELEGATION: DELEGATING A HIERARCHY TO A LESS PRIVILEGED USER
-In the context of cgroups,
-delegation means passing management of some subtree
-of the cgroup hierarchy to a nonprivileged user.
-Cgroups v1 provides support for delegation based on file permissions
-in the cgroup hierarchy but with less strict containment rules than v2
-(as noted below).
-Cgroups v2 supports delegation with containment by explicit design.
-The focus of the discussion in this section is on delegation in cgroups v2,
-with some differences for cgroups v1 noted along the way.
-.P
-Some terminology is required in order to describe delegation.
-A
-.I delegater
-is a privileged user (i.e., root) who owns a parent cgroup.
-A
-.I delegatee
-is a nonprivileged user who will be granted the permissions needed
-to manage some subhierarchy under that parent cgroup,
-known as the
-.IR "delegated subtree" .
-.P
-To perform delegation,
-the delegater makes certain directories and files writable by the delegatee,
-typically by changing the ownership of the objects to be the user ID
-of the delegatee.
-Assuming that we want to delegate the hierarchy rooted at (say)
-.I /dlgt_grp
-and that there are not yet any child cgroups under that cgroup,
-the ownership of the following is changed to the user ID of the delegatee:
-.TP
-.I /dlgt_grp
-Changing the ownership of the root of the subtree means that any new
-cgroups created under the subtree (and the files they contain)
-will also be owned by the delegatee.
-.TP
-.I /dlgt_grp/cgroup.procs
-Changing the ownership of this file means that the delegatee
-can move processes into the root of the delegated subtree.
-.TP
-.IR /dlgt_grp/cgroup.subtree_control " (cgroups v2 only)"
-Changing the ownership of this file means that the delegatee
-can enable controllers (that are present in
-.IR /dlgt_grp/cgroup.controllers )
-in order to further redistribute resources at lower levels in the subtree.
-(As an alternative to changing the ownership of this file,
-the delegater might instead add selected controllers to this file.)
-.TP
-.IR /dlgt_grp/cgroup.threads " (cgroups v2 only)"
-Changing the ownership of this file is necessary if a threaded subtree
-is being delegated (see the description of "thread mode", below).
-This permits the delegatee to write thread IDs to the file.
-(The ownership of this file can also be changed when delegating
-a domain subtree, but currently this serves no purpose,
-since, as described below, it is not possible to move a thread between
-domain cgroups by writing its thread ID to the
-.I cgroup.threads
-file.)
-.IP
-In cgroups v1, the corresponding file that should instead be delegated is the
-.I tasks
-file.
-.P
-The delegater should
-.I not
-change the ownership of any of the controller interfaces files (e.g.,
-.IR pids.max ,
-.IR memory.high )
-in
-.IR dlgt_grp .
-Those files are used from the next level above the delegated subtree
-in order to distribute resources into the subtree,
-and the delegatee should not have permission to change
-the resources that are distributed into the delegated subtree.
-.P
-See also the discussion of the
-.I /sys/kernel/cgroup/delegate
-file in NOTES for information about further delegatable files in cgroups v2.
-.P
-After the aforementioned steps have been performed,
-the delegatee can create child cgroups within the delegated subtree
-(the cgroup subdirectories and the files they contain
-will be owned by the delegatee)
-and move processes between cgroups in the subtree.
-If some controllers are present in
-.IR dlgt_grp/cgroup.subtree_control ,
-or the ownership of that file was passed to the delegatee,
-the delegatee can also control the further redistribution
-of the corresponding resources into the delegated subtree.
-.\"
-.SS Cgroups v2 delegation: nsdelegate and cgroup namespaces
-Starting with Linux 4.13,
-.\" commit 5136f6365ce3eace5a926e10f16ed2a233db5ba9
-there is a second way to perform cgroup delegation in the cgroups v2 hierarchy.
-This is done by mounting or remounting the cgroup v2 filesystem with the
-.I nsdelegate
-mount option.
-For example, if the cgroup v2 filesystem has already been mounted,
-we can remount it with the
-.I nsdelegate
-option as follows:
-.P
-.in +4n
-.EX
-mount \-t cgroup2 \-o remount,nsdelegate \e
- none /sys/fs/cgroup/unified
-.EE
-.in
-.\"
-.\" Alternatively, we could boot the kernel with the options:
-.\"
-.\" cgroup_no_v1=all systemd.legacy_systemd_cgroup_controller
-.\"
-.\" The effect of the latter option is to prevent systemd from employing
-.\" its "hybrid" cgroup mode, where it tries to make use of cgroups v2.
-.P
-The effect of this mount option is to cause cgroup namespaces
-to automatically become delegation boundaries.
-More specifically,
-the following restrictions apply for processes inside the cgroup namespace:
-.IP \[bu] 3
-Writes to controller interface files in the root directory of the namespace
-will fail with the error
-.BR EPERM .
-Processes inside the cgroup namespace can still write to delegatable
-files in the root directory of the cgroup namespace such as
-.I cgroup.procs
-and
-.IR cgroup.subtree_control ,
-and can create subhierarchy underneath the root directory.
-.IP \[bu]
-Attempts to migrate processes across the namespace boundary are denied
-(with the error
-.BR ENOENT ).
-Processes inside the cgroup namespace can still
-(subject to the containment rules described below)
-move processes between cgroups
-.I within
-the subhierarchy under the namespace root.
-.P
-The ability to define cgroup namespaces as delegation boundaries
-makes cgroup namespaces more useful.
-To understand why, suppose that we already have one cgroup hierarchy
-that has been delegated to a nonprivileged user,
-.IR cecilia ,
-using the older delegation technique described above.
-Suppose further that
-.I cecilia
-wanted to further delegate a subhierarchy
-under the existing delegated hierarchy.
-(For example, the delegated hierarchy might be associated with
-an unprivileged container run by
-.IR cecilia .)
-Even if a cgroup namespace was employed,
-because both hierarchies are owned by the unprivileged user
-.IR cecilia ,
-the following illegitimate actions could be performed:
-.IP \[bu] 3
-A process in the inferior hierarchy could change the
-resource controller settings in the root directory of that hierarchy.
-(These resource controller settings are intended to allow control to
-be exercised from the
-.I parent
-cgroup;
-a process inside the child cgroup should not be allowed to modify them.)
-.IP \[bu]
-A process inside the inferior hierarchy could move processes
-into and out of the inferior hierarchy if the cgroups in the
-superior hierarchy were somehow visible.
-.P
-Employing the
-.I nsdelegate
-mount option prevents both of these possibilities.
-.P
-The
-.I nsdelegate
-mount option only has an effect when performed in
-the initial mount namespace;
-in other mount namespaces, the option is silently ignored.
-.P
-.IR Note :
-On some systems,
-.BR systemd (1)
-automatically mounts the cgroup v2 filesystem.
-In order to experiment with the
-.I nsdelegate
-operation, it may be useful to boot the kernel with
-the following command-line options:
-.P
-.in +4n
-.EX
-cgroup_no_v1=all systemd.legacy_systemd_cgroup_controller
-.EE
-.in
-.P
-These options cause the kernel to boot with the cgroups v1 controllers
-disabled (meaning that the controllers are available in the v2 hierarchy),
-and tells
-.BR systemd (1)
-not to mount and use the cgroup v2 hierarchy,
-so that the v2 hierarchy can be manually mounted
-with the desired options after boot-up.
-.\"
-.SS Cgroup delegation containment rules
-Some delegation
-.I containment rules
-ensure that the delegatee can move processes between cgroups within the
-delegated subtree,
-but can't move processes from outside the delegated subtree into
-the subtree or vice versa.
-A nonprivileged process (i.e., the delegatee) can write the PID of
-a "target" process into a
-.I cgroup.procs
-file only if all of the following are true:
-.IP \[bu] 3
-The writer has write permission on the
-.I cgroup.procs
-file in the destination cgroup.
-.IP \[bu]
-The writer has write permission on the
-.I cgroup.procs
-file in the nearest common ancestor of the source and destination cgroups.
-Note that in some cases,
-the nearest common ancestor may be the source or destination cgroup itself.
-This requirement is not enforced for cgroups v1 hierarchies,
-with the consequence that containment in v1 is less strict than in v2.
-(For example, in cgroups v1 the user that owns two distinct
-delegated subhierarchies can move a process between the hierarchies.)
-.IP \[bu]
-If the cgroup v2 filesystem was mounted with the
-.I nsdelegate
-option, the writer must be able to see the source and destination cgroups
-from its cgroup namespace.
-.IP \[bu]
-In cgroups v1:
-the effective UID of the writer (i.e., the delegatee) matches the
-real user ID or the saved set-user-ID of the target process.
-Before Linux 4.11,
-.\" commit 576dd464505fc53d501bb94569db76f220104d28
-this requirement also applied in cgroups v2
-(This was a historical requirement inherited from cgroups v1
-that was later deemed unnecessary,
-since the other rules suffice for containment in cgroups v2.)
-.P
-.IR Note :
-one consequence of these delegation containment rules is that the
-unprivileged delegatee can't place the first process into
-the delegated subtree;
-instead, the delegater must place the first process
-(a process owned by the delegatee) into the delegated subtree.
-.\"
-.SH CGROUPS VERSION 2 THREAD MODE
-Among the restrictions imposed by cgroups v2 that were not present
-in cgroups v1 are the following:
-.IP \[bu] 3
-.IR "No thread-granularity control" :
-all of the threads of a process must be in the same cgroup.
-.IP \[bu]
-.IR "No internal processes" :
-a cgroup can't both have member processes and
-exercise controllers on child cgroups.
-.P
-Both of these restrictions were added because
-the lack of these restrictions had caused problems
-in cgroups v1.
-In particular, the cgroups v1 ability to allow thread-level granularity
-for cgroup membership made no sense for some controllers.
-(A notable example was the
-.I memory
-controller: since threads share an address space,
-it made no sense to split threads across different
-.I memory
-cgroups.)
-.P
-Notwithstanding the initial design decision in cgroups v2,
-there were use cases for certain controllers, notably the
-.I cpu
-controller,
-for which thread-level granularity of control was meaningful and useful.
-To accommodate such use cases, Linux 4.14 added
-.I "thread mode"
-for cgroups v2.
-.P
-Thread mode allows the following:
-.IP \[bu] 3
-The creation of
-.I threaded subtrees
-in which the threads of a process may
-be spread across cgroups inside the tree.
-(A threaded subtree may contain multiple multithreaded processes.)
-.IP \[bu]
-The concept of
-.IR "threaded controllers" ,
-which can distribute resources across the cgroups in a threaded subtree.
-.IP \[bu]
-A relaxation of the "no internal processes rule",
-so that, within a threaded subtree,
-a cgroup can both contain member threads and
-exercise resource control over child cgroups.
-.P
-With the addition of thread mode,
-each nonroot cgroup now contains a new file,
-.IR cgroup.type ,
-that exposes, and in some circumstances can be used to change,
-the "type" of a cgroup.
-This file contains one of the following type values:
-.TP
-.I domain
-This is a normal v2 cgroup that provides process-granularity control.
-If a process is a member of this cgroup,
-then all threads of the process are (by definition) in the same cgroup.
-This is the default cgroup type,
-and provides the same behavior that was provided for
-cgroups in the initial cgroups v2 implementation.
-.TP
-.I threaded
-This cgroup is a member of a threaded subtree.
-Threads can be added to this cgroup,
-and controllers can be enabled for the cgroup.
-.TP
-.I domain threaded
-This is a domain cgroup that serves as the root of a threaded subtree.
-This cgroup type is also known as "threaded root".
-.TP
-.I domain invalid
-This is a cgroup inside a threaded subtree
-that is in an "invalid" state.
-Processes can't be added to the cgroup,
-and controllers can't be enabled for the cgroup.
-The only thing that can be done with this cgroup (other than deleting it)
-is to convert it to a
-.I threaded
-cgroup by writing the string
-.I \[dq]threaded\[dq]
-to the
-.I cgroup.type
-file.
-.IP
-The rationale for the existence of this "interim" type
-during the creation of a threaded subtree
-(rather than the kernel simply immediately converting all cgroups
-under the threaded root to the type
-.IR threaded )
-is to allow for
-possible future extensions to the thread mode model
-.\"
-.SS Threaded versus domain controllers
-With the addition of threads mode,
-cgroups v2 now distinguishes two types of resource controllers:
-.IP \[bu] 3
-.I Threaded
-.\" In the kernel source, look for ".threaded[ \t]*= true" in
-.\" initializations of struct cgroup_subsys
-controllers: these controllers support thread-granularity for
-resource control and can be enabled inside threaded subtrees,
-with the result that the corresponding controller-interface files
-appear inside the cgroups in the threaded subtree.
-As at Linux 4.19, the following controllers are threaded:
-.IR cpu ,
-.IR perf_event ,
-and
-.IR pids .
-.IP \[bu]
-.I Domain
-controllers: these controllers support only process granularity
-for resource control.
-From the perspective of a domain controller,
-all threads of a process are always in the same cgroup.
-Domain controllers can't be enabled inside a threaded subtree.
-.\"
-.SS Creating a threaded subtree
-There are two pathways that lead to the creation of a threaded subtree.
-The first pathway proceeds as follows:
-.IP (1) 5
-We write the string
-.I \[dq]threaded\[dq]
-to the
-.I cgroup.type
-file of a cgroup
-.I y/z
-that currently has the type
-.IR domain .
-This has the following effects:
-.RS
-.IP \[bu] 3
-The type of the cgroup
-.I y/z
-becomes
-.IR threaded .
-.IP \[bu]
-The type of the parent cgroup,
-.IR y ,
-becomes
-.IR "domain threaded" .
-The parent cgroup is the root of a threaded subtree
-(also known as the "threaded root").
-.IP \[bu]
-All other cgroups under
-.I y
-that were not already of type
-.I threaded
-(because they were inside already existing threaded subtrees
-under the new threaded root)
-are converted to type
-.IR "domain invalid" .
-Any subsequently created cgroups under
-.I y
-will also have the type
-.IR "domain invalid" .
-.RE
-.IP (2)
-We write the string
-.I \[dq]threaded\[dq]
-to each of the
-.I domain invalid
-cgroups under
-.IR y ,
-in order to convert them to the type
-.IR threaded .
-As a consequence of this step, all threads under the threaded root
-now have the type
-.I threaded
-and the threaded subtree is now fully usable.
-The requirement to write
-.I \[dq]threaded\[dq]
-to each of these cgroups is somewhat cumbersome,
-but allows for possible future extensions to the thread-mode model.
-.P
-The second way of creating a threaded subtree is as follows:
-.IP (1) 5
-In an existing cgroup,
-.IR z ,
-that currently has the type
-.IR domain ,
-we (1.1) enable one or more threaded controllers and
-(1.2) make a process a member of
-.IR z .
-(These two steps can be done in either order.)
-This has the following consequences:
-.RS
-.IP \[bu] 3
-The type of
-.I z
-becomes
-.IR "domain threaded" .
-.IP \[bu]
-All of the descendant cgroups of
-.I z
-that were not already of type
-.I threaded
-are converted to type
-.IR "domain invalid" .
-.RE
-.IP (2)
-As before, we make the threaded subtree usable by writing the string
-.I \[dq]threaded\[dq]
-to each of the
-.I domain invalid
-cgroups under
-.IR z ,
-in order to convert them to the type
-.IR threaded .
-.P
-One of the consequences of the above pathways to creating a threaded subtree
-is that the threaded root cgroup can be a parent only to
-.I threaded
-(and
-.IR "domain invalid" )
-cgroups.
-The threaded root cgroup can't be a parent of a
-.I domain
-cgroups, and a
-.I threaded
-cgroup
-can't have a sibling that is a
-.I domain
-cgroup.
-.\"
-.SS Using a threaded subtree
-Within a threaded subtree, threaded controllers can be enabled
-in each subgroup whose type has been changed to
-.IR threaded ;
-upon doing so, the corresponding controller interface files
-appear in the children of that cgroup.
-.P
-A process can be moved into a threaded subtree by writing its PID to the
-.I cgroup.procs
-file in one of the cgroups inside the tree.
-This has the effect of making all of the threads
-in the process members of the corresponding cgroup
-and makes the process a member of the threaded subtree.
-The threads of the process can then be spread across
-the threaded subtree by writing their thread IDs (see
-.BR gettid (2))
-to the
-.I cgroup.threads
-files in different cgroups inside the subtree.
-The threads of a process must all reside in the same threaded subtree.
-.P
-As with writing to
-.IR cgroup.procs ,
-some containment rules apply when writing to the
-.I cgroup.threads
-file:
-.IP \[bu] 3
-The writer must have write permission on the
-cgroup.threads
-file in the destination cgroup.
-.IP \[bu]
-The writer must have write permission on the
-.I cgroup.procs
-file in the common ancestor of the source and destination cgroups.
-(In some cases,
-the common ancestor may be the source or destination cgroup itself.)
-.IP \[bu]
-The source and destination cgroups must be in the same threaded subtree.
-(Outside a threaded subtree, an attempt to move a thread by writing
-its thread ID to the
-.I cgroup.threads
-file in a different
-.I domain
-cgroup fails with the error
-.BR EOPNOTSUPP .)
-.P
-The
-.I cgroup.threads
-file is present in each cgroup (including
-.I domain
-cgroups) and can be read in order to discover the set of threads
-that is present in the cgroup.
-The set of thread IDs obtained when reading this file
-is not guaranteed to be ordered or free of duplicates.
-.P
-The
-.I cgroup.procs
-file in the threaded root shows the PIDs of all processes
-that are members of the threaded subtree.
-The
-.I cgroup.procs
-files in the other cgroups in the subtree are not readable.
-.P
-Domain controllers can't be enabled in a threaded subtree;
-no controller-interface files appear inside the cgroups underneath the
-threaded root.
-From the point of view of a domain controller,
-threaded subtrees are invisible:
-a multithreaded process inside a threaded subtree appears to a domain
-controller as a process that resides in the threaded root cgroup.
-.P
-Within a threaded subtree, the "no internal processes" rule does not apply:
-a cgroup can both contain member processes (or thread)
-and exercise controllers on child cgroups.
-.\"
-.SS Rules for writing to cgroup.type and creating threaded subtrees
-A number of rules apply when writing to the
-.I cgroup.type
-file:
-.IP \[bu] 3
-Only the string
-.I \[dq]threaded\[dq]
-may be written.
-In other words, the only explicit transition that is possible is to convert a
-.I domain
-cgroup to type
-.IR threaded .
-.IP \[bu]
-The effect of writing
-.I \[dq]threaded\[dq]
-depends on the current value in
-.IR cgroup.type ,
-as follows:
-.RS
-.IP \[bu] 3
-.I domain
-or
-.IR "domain threaded" :
-start the creation of a threaded subtree
-(whose root is the parent of this cgroup) via
-the first of the pathways described above;
-.IP \[bu]
-.IR "domain\ invalid" :
-convert this cgroup (which is inside a threaded subtree) to a usable (i.e.,
-.IR threaded )
-state;
-.IP \[bu]
-.IR threaded :
-no effect (a "no-op").
-.RE
-.IP \[bu]
-We can't write to a
-.I cgroup.type
-file if the parent's type is
-.IR "domain invalid" .
-In other words, the cgroups of a threaded subtree must be converted to the
-.I threaded
-state in a top-down manner.
-.P
-There are also some constraints that must be satisfied
-in order to create a threaded subtree rooted at the cgroup
-.IR x :
-.IP \[bu] 3
-There can be no member processes in the descendant cgroups of
-.IR x .
-(The cgroup
-.I x
-can itself have member processes.)
-.IP \[bu]
-No domain controllers may be enabled in
-.IR x 's
-.I cgroup.subtree_control
-file.
-.P
-If any of the above constraints is violated, then an attempt to write
-.I \[dq]threaded\[dq]
-to a
-.I cgroup.type
-file fails with the error
-.BR ENOTSUP .
-.\"
-.SS The \[dq]domain threaded\[dq] cgroup type
-According to the pathways described above,
-the type of a cgroup can change to
-.I domain threaded
-in either of the following cases:
-.IP \[bu] 3
-The string
-.I \[dq]threaded\[dq]
-is written to a child cgroup.
-.IP \[bu]
-A threaded controller is enabled inside the cgroup and
-a process is made a member of the cgroup.
-.P
-A
-.I domain threaded
-cgroup,
-.IR x ,
-can revert to the type
-.I domain
-if the above conditions no longer hold true\[em]that is, if all
-.I threaded
-child cgroups of
-.I x
-are removed and either
-.I x
-no longer has threaded controllers enabled or
-no longer has member processes.
-.P
-When a
-.I domain threaded
-cgroup
-.I x
-reverts to the type
-.IR domain :
-.IP \[bu] 3
-All
-.I domain invalid
-descendants of
-.I x
-that are not in lower-level threaded subtrees revert to the type
-.IR domain .
-.IP \[bu]
-The root cgroups in any lower-level threaded subtrees revert to the type
-.IR "domain threaded" .
-.\"
-.SS Exceptions for the root cgroup
-The root cgroup of the v2 hierarchy is treated exceptionally:
-it can be the parent of both
-.I domain
-and
-.I threaded
-cgroups.
-If the string
-.I \[dq]threaded\[dq]
-is written to the
-.I cgroup.type
-file of one of the children of the root cgroup, then
-.IP \[bu] 3
-The type of that cgroup becomes
-.IR threaded .
-.IP \[bu]
-The type of any descendants of that cgroup that
-are not part of lower-level threaded subtrees changes to
-.IR "domain invalid" .
-.P
-Note that in this case, there is no cgroup whose type becomes
-.IR "domain threaded" .
-(Notionally, the root cgroup can be considered as the threaded root
-for the cgroup whose type was changed to
-.IR threaded .)
-.P
-The aim of this exceptional treatment for the root cgroup is to
-allow a threaded cgroup that employs the
-.I cpu
-controller to be placed as high as possible in the hierarchy,
-so as to minimize the (small) cost of traversing the cgroup hierarchy.
-.\"
-.SS The cgroups v2 \[dq]cpu\[dq] controller and realtime threads
-As at Linux 4.19, the cgroups v2
-.I cpu
-controller does not support control of realtime threads
-(specifically threads scheduled under any of the policies
-.BR SCHED_FIFO ,
-.BR SCHED_RR ,
-described
-.BR SCHED_DEADLINE ;
-see
-.BR sched (7)).
-Therefore, the
-.I cpu
-controller can be enabled in the root cgroup only
-if all realtime threads are in the root cgroup.
-(If there are realtime threads in nonroot cgroups, then a
-.BR write (2)
-of the string
-.I \[dq]+cpu\[dq]
-to the
-.I cgroup.subtree_control
-file fails with the error
-.BR EINVAL .)
-.P
-On some systems,
-.BR systemd (1)
-places certain realtime threads in nonroot cgroups in the v2 hierarchy.
-On such systems,
-these threads must first be moved to the root cgroup before the
-.I cpu
-controller can be enabled.
-.\"
-.SH ERRORS
-The following errors can occur for
-.BR mount (2):
-.TP
-.B EBUSY
-An attempt to mount a cgroup version 1 filesystem specified neither the
-.I name=
-option (to mount a named hierarchy) nor a controller name (or
-.IR all ).
-.SH NOTES
-A child process created via
-.BR fork (2)
-inherits its parent's cgroup memberships.
-A process's cgroup memberships are preserved across
-.BR execve (2).
-.P
-The
-.BR clone3 (2)
-.B CLONE_INTO_CGROUP
-flag can be used to create a child process that begins its life in
-a different version 2 cgroup from the parent process.
-.\"
-.SS /proc files
-.TP
-.IR /proc/cgroups " (since Linux 2.6.24)"
-This file contains information about the controllers
-that are compiled into the kernel.
-An example of the contents of this file (reformatted for readability)
-is the following:
-.IP
-.in +4n
-.EX
-#subsys_name hierarchy num_cgroups enabled
-cpuset 4 1 1
-cpu 8 1 1
-cpuacct 8 1 1
-blkio 6 1 1
-memory 3 1 1
-devices 10 84 1
-freezer 7 1 1
-net_cls 9 1 1
-perf_event 5 1 1
-net_prio 9 1 1
-hugetlb 0 1 0
-pids 2 1 1
-.EE
-.in
-.IP
-The fields in this file are, from left to right:
-.RS
-.IP [1] 5
-The name of the controller.
-.IP [2]
-The unique ID of the cgroup hierarchy on which this controller is mounted.
-If multiple cgroups v1 controllers are bound to the same hierarchy,
-then each will show the same hierarchy ID in this field.
-The value in this field will be 0 if:
-.RS
-.IP \[bu] 3
-the controller is not mounted on a cgroups v1 hierarchy;
-.IP \[bu]
-the controller is bound to the cgroups v2 single unified hierarchy; or
-.IP \[bu]
-the controller is disabled (see below).
-.RE
-.IP [3]
-The number of control groups in this hierarchy using this controller.
-.IP [4]
-This field contains the value 1 if this controller is enabled,
-or 0 if it has been disabled (via the
-.I cgroup_disable
-kernel command-line boot parameter).
-.RE
-.TP
-.IR /proc/ pid /cgroup " (since Linux 2.6.24)"
-This file describes control groups to which the process
-with the corresponding PID belongs.
-The displayed information differs for
-cgroups version 1 and version 2 hierarchies.
-.IP
-For each cgroup hierarchy of which the process is a member,
-there is one entry containing three colon-separated fields:
-.IP
-.in +4n
-.EX
-hierarchy\-ID:controller\-list:cgroup\-path
-.EE
-.in
-.IP
-For example:
-.IP
-.in +4n
-.EX
-5:cpuacct,cpu,cpuset:/daemons
-.EE
-.in
-.IP
-The colon-separated fields are, from left to right:
-.RS
-.IP [1] 5
-For cgroups version 1 hierarchies,
-this field contains a unique hierarchy ID number
-that can be matched to a hierarchy ID in
-.IR /proc/cgroups .
-For the cgroups version 2 hierarchy, this field contains the value 0.
-.IP [2]
-For cgroups version 1 hierarchies,
-this field contains a comma-separated list of the controllers
-bound to the hierarchy.
-For the cgroups version 2 hierarchy, this field is empty.
-.IP [3]
-This field contains the pathname of the control group in the hierarchy
-to which the process belongs.
-This pathname is relative to the mount point of the hierarchy.
-.RE
-.\"
-.SS /sys/kernel/cgroup files
-.TP
-.IR /sys/kernel/cgroup/delegate " (since Linux 4.15)"
-.\" commit 01ee6cfb1483fe57c9cbd8e73817dfbf9bacffd3
-This file exports a list of the cgroups v2 files
-(one per line) that are delegatable
-(i.e., whose ownership should be changed to the user ID of the delegatee).
-In the future, the set of delegatable files may change or grow,
-and this file provides a way for the kernel to inform
-user-space applications of which files must be delegated.
-As at Linux 4.15, one sees the following when inspecting this file:
-.IP
-.in +4n
-.EX
-$ \fBcat /sys/kernel/cgroup/delegate\fP
-cgroup.procs
-cgroup.subtree_control
-cgroup.threads
-.EE
-.in
-.TP
-.IR /sys/kernel/cgroup/features " (since Linux 4.15)"
-.\" commit 5f2e673405b742be64e7c3604ed4ed3ac14f35ce
-Over time, the set of cgroups v2 features that are provided by the
-kernel may change or grow,
-or some features may not be enabled by default.
-This file provides a way for user-space applications to discover what
-features the running kernel supports and has enabled.
-Features are listed one per line:
-.IP
-.in +4n
-.EX
-$ \fBcat /sys/kernel/cgroup/features\fP
-nsdelegate
-memory_localevents
-.EE
-.in
-.IP
-The entries that can appear in this file are:
-.RS
-.TP
-.IR memory_localevents " (since Linux 5.2)"
-The kernel supports the
-.I memory_localevents
-mount option.
-.TP
-.IR nsdelegate " (since Linux 4.15)"
-The kernel supports the
-.I nsdelegate
-mount option.
-.TP
-.IR memory_recursiveprot " (since Linux 5.7)"
-.\" commit 8a931f801340c2be10552c7b5622d5f4852f3a36
-The kernel supports the
-.I memory_recursiveprot
-mount option.
-.RE
-.SH SEE ALSO
-.BR prlimit (1),
-.BR systemd (1),
-.BR systemd\-cgls (1),
-.BR systemd\-cgtop (1),
-.BR clone (2),
-.BR ioprio_set (2),
-.BR perf_event_open (2),
-.BR setrlimit (2),
-.BR cgroup_namespaces (7),
-.BR cpuset (7),
-.BR namespaces (7),
-.BR sched (7),
-.BR user_namespaces (7)
-.P
-The kernel source file
-.IR Documentation/admin\-guide/cgroup\-v2.rst .