diff options
Diffstat (limited to 'man7/pid_namespaces.7')
-rw-r--r-- | man7/pid_namespaces.7 | 388 |
1 files changed, 0 insertions, 388 deletions
diff --git a/man7/pid_namespaces.7 b/man7/pid_namespaces.7 deleted file mode 100644 index 90d062b..0000000 --- a/man7/pid_namespaces.7 +++ /dev/null @@ -1,388 +0,0 @@ -.\" Copyright (c) 2013 by Michael Kerrisk <mtk.manpages@gmail.com> -.\" and Copyright (c) 2012 by Eric W. Biederman <ebiederm@xmission.com> -.\" -.\" SPDX-License-Identifier: Linux-man-pages-copyleft -.\" -.\" -.TH pid_namespaces 7 2023-10-31 "Linux man-pages 6.7" -.SH NAME -pid_namespaces \- overview of Linux PID namespaces -.SH DESCRIPTION -For an overview of namespaces, see -.BR namespaces (7). -.P -PID namespaces isolate the process ID number space, -meaning that processes in different PID namespaces can have the same PID. -PID namespaces allow containers to provide functionality -such as suspending/resuming the set of processes in the container and -migrating the container to a new host -while the processes inside the container maintain the same PIDs. -.P -PIDs in a new PID namespace start at 1, -somewhat like a standalone system, and calls to -.BR fork (2), -.BR vfork (2), -or -.BR clone (2) -will produce processes with PIDs that are unique within the namespace. -.P -Use of PID namespaces requires a kernel that is configured with the -.B CONFIG_PID_NS -option. -.\" -.\" ============================================================ -.\" -.SS The namespace "init" process -The first process created in a new namespace -(i.e., the process created using -.BR clone (2) -with the -.B CLONE_NEWPID -flag, or the first child created by a process after a call to -.BR unshare (2) -using the -.B CLONE_NEWPID -flag) has the PID 1, and is the "init" process for the namespace (see -.BR init (1)). -This process becomes the parent of any child processes that are orphaned -because a process that resides in this PID namespace terminated -(see below for further details). -.P -If the "init" process of a PID namespace terminates, -the kernel terminates all of the processes in the namespace via a -.B SIGKILL -signal. -This behavior reflects the fact that the "init" process -is essential for the correct operation of a PID namespace. -In this case, a subsequent -.BR fork (2) -into this PID namespace fail with the error -.BR ENOMEM ; -it is not possible to create a new process in a PID namespace whose "init" -process has terminated. -Such scenarios can occur when, for example, -a process uses an open file descriptor for a -.IR /proc/ pid /ns/pid -file corresponding to a process that was in a namespace to -.BR setns (2) -into that namespace after the "init" process has terminated. -Another possible scenario can occur after a call to -.BR unshare (2): -if the first child subsequently created by a -.BR fork (2) -terminates, then subsequent calls to -.BR fork (2) -fail with -.BR ENOMEM . -.P -Only signals for which the "init" process has established a signal handler -can be sent to the "init" process by other members of the PID namespace. -This restriction applies even to privileged processes, -and prevents other members of the PID namespace from -accidentally killing the "init" process. -.P -Likewise, a process in an ancestor namespace -can\[em]subject to the usual permission checks described in -.BR kill (2)\[em]send -signals to the "init" process of a child PID namespace only -if the "init" process has established a handler for that signal. -(Within the handler, the -.I siginfo_t -.I si_pid -field described in -.BR sigaction (2) -will be zero.) -.B SIGKILL -or -.B SIGSTOP -are treated exceptionally: -these signals are forcibly delivered when sent from an ancestor PID namespace. -Neither of these signals can be caught by the "init" process, -and so will result in the usual actions associated with those signals -(respectively, terminating and stopping the process). -.P -Starting with Linux 3.4, the -.BR reboot (2) -system call causes a signal to be sent to the namespace "init" process. -See -.BR reboot (2) -for more details. -.\" -.\" ============================================================ -.\" -.SS Nesting PID namespaces -PID namespaces can be nested: -each PID namespace has a parent, -except for the initial ("root") PID namespace. -The parent of a PID namespace is the PID namespace of the process that -created the namespace using -.BR clone (2) -or -.BR unshare (2). -PID namespaces thus form a tree, -with all namespaces ultimately tracing their ancestry to the root namespace. -Since Linux 3.7, -.\" commit f2302505775fd13ba93f034206f1e2a587017929 -.\" The kernel constant MAX_PID_NS_LEVEL -the kernel limits the maximum nesting depth for PID namespaces to 32. -.P -A process is visible to other processes in its PID namespace, -and to the processes in each direct ancestor PID namespace -going back to the root PID namespace. -In this context, "visible" means that one process -can be the target of operations by another process using -system calls that specify a process ID. -Conversely, the processes in a child PID namespace can't see -processes in the parent and further removed ancestor namespaces. -More succinctly: a process can see (e.g., send signals with -.BR kill (2), -set nice values with -.BR setpriority (2), -etc.) only processes contained in its own PID namespace -and in descendants of that namespace. -.P -A process has one process ID in each of the layers of the PID -namespace hierarchy in which is visible, -and walking back though each direct ancestor namespace -through to the root PID namespace. -System calls that operate on process IDs always -operate using the process ID that is visible in the -PID namespace of the caller. -A call to -.BR getpid (2) -always returns the PID associated with the namespace in which -the process was created. -.P -Some processes in a PID namespace may have parents -that are outside of the namespace. -For example, the parent of the initial process in the namespace -(i.e., the -.BR init (1) -process with PID 1) is necessarily in another namespace. -Likewise, the direct children of a process that uses -.BR setns (2) -to cause its children to join a PID namespace are in a different -PID namespace from the caller of -.BR setns (2). -Calls to -.BR getppid (2) -for such processes return 0. -.P -While processes may freely descend into child PID namespaces -(e.g., using -.BR setns (2) -with a PID namespace file descriptor), -they may not move in the other direction. -That is to say, processes may not enter any ancestor namespaces -(parent, grandparent, etc.). -Changing PID namespaces is a one-way operation. -.P -The -.B NS_GET_PARENT -.BR ioctl (2) -operation can be used to discover the parental relationship -between PID namespaces; see -.BR ioctl_ns (2). -.\" -.\" ============================================================ -.\" -.SS setns(2) and unshare(2) semantics -Calls to -.BR setns (2) -that specify a PID namespace file descriptor -and calls to -.BR unshare (2) -with the -.B CLONE_NEWPID -flag cause children subsequently created -by the caller to be placed in a different PID namespace from the caller. -(Since Linux 4.12, that PID namespace is shown via the -.IR /proc/ pid /ns/pid_for_children -file, as described in -.BR namespaces (7).) -These calls do not, however, -change the PID namespace of the calling process, -because doing so would change the caller's idea of its own PID -(as reported by -.BR getpid ()), -which would break many applications and libraries. -.P -To put things another way: -a process's PID namespace membership is determined when the process is created -and cannot be changed thereafter. -Among other things, this means that the parental relationship -between processes mirrors the parental relationship between PID namespaces: -the parent of a process is either in the same namespace -or resides in the immediate parent PID namespace. -.P -A process may call -.BR unshare (2) -with the -.B CLONE_NEWPID -flag only once. -After it has performed this operation, its -.IR /proc/ pid /ns/pid_for_children -symbolic link will be empty until the first child is created in the namespace. -.\" -.\" ============================================================ -.\" -.SS Adoption of orphaned children -When a child process becomes orphaned, it is reparented to the "init" -process in the PID namespace of its parent -(unless one of the nearer ancestors of the parent employed the -.BR prctl (2) -.B PR_SET_CHILD_SUBREAPER -command to mark itself as the reaper of orphaned descendant processes). -Note that because of the -.BR setns (2) -and -.BR unshare (2) -semantics described above, this may be the "init" process in the PID -namespace that is the -.I parent -of the child's PID namespace, -rather than the "init" process in the child's own PID namespace. -.\" Furthermore, by definition, the parent of the "init" process -.\" of a PID namespace resides in the parent PID namespace. -.\" -.\" ============================================================ -.\" -.SS Compatibility of CLONE_NEWPID with other CLONE_* flags -In current versions of Linux, -.B CLONE_NEWPID -can't be combined with -.BR CLONE_THREAD . -Threads are required to be in the same PID namespace such that -the threads in a process can send signals to each other. -Similarly, it must be possible to see all of the threads -of a process in the -.BR proc (5) -filesystem. -Additionally, if two threads were in different PID -namespaces, the process ID of the process sending a signal -could not be meaningfully encoded when a signal is sent -(see the description of the -.I siginfo_t -type in -.BR sigaction (2)). -Since this is computed when a signal is enqueued, -a signal queue shared by processes in multiple PID namespaces -would defeat that. -.P -.\" Note these restrictions were all introduced in -.\" 8382fcac1b813ad0a4e68a838fc7ae93fa39eda0 -.\" when CLONE_NEWPID|CLONE_VM was disallowed -In earlier versions of Linux, -.B CLONE_NEWPID -was additionally disallowed (failing with the error -.BR EINVAL ) -in combination with -.B CLONE_SIGHAND -.\" (restriction lifted in faf00da544045fdc1454f3b9e6d7f65c841de302) -(before Linux 4.3) as well as -.\" (restriction lifted in e79f525e99b04390ca4d2366309545a836c03bf1) -.B CLONE_VM -(before Linux 3.12). -The changes that lifted these restrictions have also been ported to -earlier stable kernels. -.\" -.\" ============================================================ -.\" -.SS /proc and PID namespaces -A -.I /proc -filesystem shows (in the -.IR /proc/ pid -directories) only processes visible in the PID namespace -of the process that performed the mount, even if the -.I /proc -filesystem is viewed from processes in other namespaces. -.P -After creating a new PID namespace, -it is useful for the child to change its root directory -and mount a new procfs instance at -.I /proc -so that tools such as -.BR ps (1) -work correctly. -If a new mount namespace is simultaneously created by including -.B CLONE_NEWNS -in the -.I flags -argument of -.BR clone (2) -or -.BR unshare (2), -then it isn't necessary to change the root directory: -a new procfs instance can be mounted directly over -.IR /proc . -.P -From a shell, the command to mount -.I /proc -is: -.P -.in +4n -.EX -$ mount \-t proc proc /proc -.EE -.in -.P -Calling -.BR readlink (2) -on the path -.I /proc/self -yields the process ID of the caller in the PID namespace of the procfs mount -(i.e., the PID namespace of the process that mounted the procfs). -This can be useful for introspection purposes, -when a process wants to discover its PID in other namespaces. -.\" -.\" ============================================================ -.\" -.SS /proc files -.TP -.BR /proc/sys/kernel/ns_last_pid " (since Linux 3.3)" -.\" commit b8f566b04d3cddd192cfd2418ae6d54ac6353792 -This file -(which is virtualized per PID namespace) -displays the last PID that was allocated in this PID namespace. -When the next PID is allocated, -the kernel will search for the lowest unallocated PID -that is greater than this value, -and when this file is subsequently read it will show that PID. -.IP -This file is writable by a process that has the -.B CAP_SYS_ADMIN -or (since Linux 5.9) -.B CAP_CHECKPOINT_RESTORE -capability inside the user namespace that owns the PID namespace. -.\" This ability is necessary to support checkpoint restore in user-space -This makes it possible to determine the PID that is allocated -to the next process that is created inside this PID namespace. -.\" -.\" ============================================================ -.\" -.SS Miscellaneous -When a process ID is passed over a UNIX domain socket to a -process in a different PID namespace (see the description of -.B SCM_CREDENTIALS -in -.BR unix (7)), -it is translated into the corresponding PID value in -the receiving process's PID namespace. -.SH STANDARDS -Linux. -.SH EXAMPLES -See -.BR user_namespaces (7). -.SH SEE ALSO -.BR clone (2), -.BR reboot (2), -.BR setns (2), -.BR unshare (2), -.BR proc (5), -.BR capabilities (7), -.BR credentials (7), -.BR mount_namespaces (7), -.BR namespaces (7), -.BR user_namespaces (7), -.BR switch_root (8) |