diff options
Diffstat (limited to 'man7/mount_namespaces.7')
-rw-r--r-- | man7/mount_namespaces.7 | 1371 |
1 files changed, 0 insertions, 1371 deletions
diff --git a/man7/mount_namespaces.7 b/man7/mount_namespaces.7 deleted file mode 100644 index 475b44d..0000000 --- a/man7/mount_namespaces.7 +++ /dev/null @@ -1,1371 +0,0 @@ -'\" t -.\" Copyright (c) 2016, 2019, 2021 by Michael Kerrisk <mtk.manpages@gmail.com> -.\" -.\" SPDX-License-Identifier: Linux-man-pages-copyleft -.\" -.\" -.TH mount_namespaces 7 2023-10-31 "Linux man-pages 6.7" -.SH NAME -mount_namespaces \- overview of Linux mount namespaces -.SH DESCRIPTION -For an overview of namespaces, see -.BR namespaces (7). -.P -Mount namespaces provide isolation of the list of mounts seen -by the processes in each namespace instance. -Thus, the processes in each of the mount namespace instances -will see distinct single-directory hierarchies. -.P -The views provided by the -.IR /proc/ pid /mounts , -.IR /proc/ pid /mountinfo , -and -.IR /proc/ pid /mountstats -files (all described in -.BR proc (5)) -correspond to the mount namespace in which the process with the PID -.I pid -resides. -(All of the processes that reside in the same mount namespace -will see the same view in these files.) -.P -A new mount namespace is created using either -.BR clone (2) -or -.BR unshare (2) -with the -.B CLONE_NEWNS -flag. -When a new mount namespace is created, -its mount list is initialized as follows: -.IP \[bu] 3 -If the namespace is created using -.BR clone (2), -the mount list of the child's namespace is a copy -of the mount list in the parent process's mount namespace. -.IP \[bu] -If the namespace is created using -.BR unshare (2), -the mount list of the new namespace is a copy of -the mount list in the caller's previous mount namespace. -.P -Subsequent modifications to the mount list -.RB ( mount (2) -and -.BR umount (2)) -in either mount namespace will not (by default) affect the -mount list seen in the other namespace -(but see the following discussion of shared subtrees). -.\" -.SH SHARED SUBTREES -After the implementation of mount namespaces was completed, -experience showed that the isolation that they provided was, -in some cases, too great. -For example, in order to make a newly loaded optical disk -available in all mount namespaces, -a mount operation was required in each namespace. -For this use case, and others, -the shared subtree feature was introduced in Linux 2.6.15. -This feature allows for automatic, controlled propagation of -.BR mount (2) -and -.BR umount (2) -.I events -between namespaces -(or, more precisely, between the mounts that are members of a -.I peer group -that are propagating events to one another). -.P -Each mount is marked (via -.BR mount (2)) -as having one of the following -.IR "propagation types" : -.TP -.B MS_SHARED -This mount shares events with members of a peer group. -.BR mount (2) -and -.BR umount (2) -events immediately under this mount will propagate -to the other mounts that are members of the peer group. -.I Propagation -here means that the same -.BR mount (2) -or -.BR umount (2) -will automatically occur -under all of the other mounts in the peer group. -Conversely, -.BR mount (2) -and -.BR umount (2) -events that take place under -peer mounts will propagate to this mount. -.TP -.B MS_PRIVATE -This mount is private; it does not have a peer group. -.BR mount (2) -and -.BR umount (2) -events do not propagate into or out of this mount. -.TP -.B MS_SLAVE -.BR mount (2) -and -.BR umount (2) -events propagate into this mount from -a (master) shared peer group. -.BR mount (2) -and -.BR umount (2) -events under this mount do not propagate to any peer. -.IP -Note that a mount can be the slave of another peer group -while at the same time sharing -.BR mount (2) -and -.BR umount (2) -events -with a peer group of which it is a member. -(More precisely, one peer group can be the slave of another peer group.) -.TP -.B MS_UNBINDABLE -This is like a private mount, -and in addition this mount can't be bind mounted. -Attempts to bind mount this mount -.RB ( mount (2) -with the -.B MS_BIND -flag) will fail. -.IP -When a recursive bind mount -.RB ( mount (2) -with the -.B MS_BIND -and -.B MS_REC -flags) is performed on a directory subtree, -any bind mounts within the subtree are automatically pruned -(i.e., not replicated) -when replicating that subtree to produce the target subtree. -.P -For a discussion of the propagation type assigned to a new mount, -see NOTES. -.P -The propagation type is a per-mount-point setting; -some mounts may be marked as shared -(with each shared mount being a member of a distinct peer group), -while others are private -(or slaved or unbindable). -.P -Note that a mount's propagation type determines whether -.BR mount (2) -and -.BR umount (2) -of mounts -.I immediately under -the mount are propagated. -Thus, the propagation type does not affect propagation of events for -grandchildren and further removed descendant mounts. -What happens if the mount itself is unmounted is determined by -the propagation type that is in effect for the -.I parent -of the mount. -.P -Members are added to a -.I peer group -when a mount is marked as shared and either: -.IP (a) 5 -the mount is replicated during the creation of a new mount namespace; or -.IP (b) -a new bind mount is created from the mount. -.P -In both of these cases, the new mount joins the peer group -of which the existing mount is a member. -.P -A new peer group is also created when a child mount is created under -an existing mount that is marked as shared. -In this case, the new child mount is also marked as shared and -the resulting peer group consists of all the mounts -that are replicated under the peers of parent mounts. -.P -A mount ceases to be a member of a peer group when either -the mount is explicitly unmounted, -or when the mount is implicitly unmounted because a mount namespace is removed -(because it has no more member processes). -.P -The propagation type of the mounts in a mount namespace -can be discovered via the "optional fields" exposed in -.IR /proc/ pid /mountinfo . -(See -.BR proc (5) -for details of this file.) -The following tags can appear in the optional fields -for a record in that file: -.TP -.I shared:X -This mount is shared in peer group -.IR X . -Each peer group has a unique ID that is automatically -generated by the kernel, -and all mounts in the same peer group will show the same ID. -(These IDs are assigned starting from the value 1, -and may be recycled when a peer group ceases to have any members.) -.TP -.I master:X -This mount is a slave to shared peer group -.IR X . -.TP -.IR propagate_from:X " (since Linux 2.6.26)" -.\" commit 97e7e0f71d6d948c25f11f0a33878d9356d9579e -This mount is a slave and receives propagation from shared peer group -.IR X . -This tag will always appear in conjunction with a -.I master:X -tag. -Here, -.I X -is the closest dominant peer group under the process's root directory. -If -.I X -is the immediate master of the mount, -or if there is no dominant peer group under the same root, -then only the -.I master:X -field is present and not the -.I propagate_from:X -field. -For further details, see below. -.TP -.I unbindable -This is an unbindable mount. -.P -If none of the above tags is present, then this is a private mount. -.SS MS_SHARED and MS_PRIVATE example -Suppose that on a terminal in the initial mount namespace, -we mark one mount as shared and another as private, -and then view the mounts in -.IR /proc/self/mountinfo : -.P -.in +4n -.EX -sh1# \fBmount \-\-make\-shared /mntS\fP -sh1# \fBmount \-\-make\-private /mntP\fP -sh1# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -77 61 8:17 / /mntS rw,relatime shared:1 -83 61 8:15 / /mntP rw,relatime -.EE -.in -.P -From the -.I /proc/self/mountinfo -output, we see that -.I /mntS -is a shared mount in peer group 1, and that -.I /mntP -has no optional tags, indicating that it is a private mount. -The first two fields in each record in this file are the unique -ID for this mount, and the mount ID of the parent mount. -We can further inspect this file to see that the parent mount of -.I /mntS -and -.I /mntP -is the root directory, -.IR / , -which is mounted as private: -.P -.in +4n -.EX -sh1# \fBcat /proc/self/mountinfo | awk \[aq]$1 == 61\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -61 0 8:2 / / rw,relatime -.EE -.in -.P -On a second terminal, -we create a new mount namespace where we run a second shell -and inspect the mounts: -.P -.in +4n -.EX -$ \fBPS1=\[aq]sh2# \[aq] sudo unshare \-m \-\-propagation unchanged sh\fP -sh2# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -222 145 8:17 / /mntS rw,relatime shared:1 -225 145 8:15 / /mntP rw,relatime -.EE -.in -.P -The new mount namespace received a copy of the initial mount namespace's -mounts. -These new mounts maintain the same propagation types, -but have unique mount IDs. -(The -.I \-\-propagation\~unchanged -option prevents -.BR unshare (1) -from marking all mounts as private when creating a new mount namespace, -.\" Since util-linux 2.27 -which it does by default.) -.P -In the second terminal, we then create submounts under each of -.I /mntS -and -.I /mntP -and inspect the set-up: -.P -.in +4n -.EX -sh2# \fBmkdir /mntS/a\fP -sh2# \fBmount /dev/sdb6 /mntS/a\fP -sh2# \fBmkdir /mntP/b\fP -sh2# \fBmount /dev/sdb7 /mntP/b\fP -sh2# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -222 145 8:17 / /mntS rw,relatime shared:1 -225 145 8:15 / /mntP rw,relatime -178 222 8:22 / /mntS/a rw,relatime shared:2 -230 225 8:23 / /mntP/b rw,relatime -.EE -.in -.P -From the above, it can be seen that -.I /mntS/a -was created as shared (inheriting this setting from its parent mount) and -.I /mntP/b -was created as a private mount. -.P -Returning to the first terminal and inspecting the set-up, -we see that the new mount created under the shared mount -.I /mntS -propagated to its peer mount (in the initial mount namespace), -but the new mount created under the private mount -.I /mntP -did not propagate: -.P -.in +4n -.EX -sh1# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -77 61 8:17 / /mntS rw,relatime shared:1 -83 61 8:15 / /mntP rw,relatime -179 77 8:22 / /mntS/a rw,relatime shared:2 -.EE -.in -.\" -.SS MS_SLAVE example -Making a mount a slave allows it to receive propagated -.BR mount (2) -and -.BR umount (2) -events from a master shared peer group, -while preventing it from propagating events to that master. -This is useful if we want to (say) receive a mount event when -an optical disk is mounted in the master shared peer group -(in another mount namespace), -but want to prevent -.BR mount (2) -and -.BR umount (2) -events under the slave mount -from having side effects in other namespaces. -.P -We can demonstrate the effect of slaving by first marking -two mounts as shared in the initial mount namespace: -.P -.in +4n -.EX -sh1# \fBmount \-\-make\-shared /mntX\fP -sh1# \fBmount \-\-make\-shared /mntY\fP -sh1# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -132 83 8:23 / /mntX rw,relatime shared:1 -133 83 8:22 / /mntY rw,relatime shared:2 -.EE -.in -.P -On a second terminal, -we create a new mount namespace and inspect the mounts: -.P -.in +4n -.EX -sh2# \fBunshare \-m \-\-propagation unchanged sh\fP -sh2# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -168 167 8:23 / /mntX rw,relatime shared:1 -169 167 8:22 / /mntY rw,relatime shared:2 -.EE -.in -.P -In the new mount namespace, we then mark one of the mounts as a slave: -.P -.in +4n -.EX -sh2# \fBmount \-\-make\-slave /mntY\fP -sh2# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -168 167 8:23 / /mntX rw,relatime shared:1 -169 167 8:22 / /mntY rw,relatime master:2 -.EE -.in -.P -From the above output, we see that -.I /mntY -is now a slave mount that is receiving propagation events from -the shared peer group with the ID 2. -.P -Continuing in the new namespace, we create submounts under each of -.I /mntX -and -.IR /mntY : -.P -.in +4n -.EX -sh2# \fBmkdir /mntX/a\fP -sh2# \fBmount /dev/sda3 /mntX/a\fP -sh2# \fBmkdir /mntY/b\fP -sh2# \fBmount /dev/sda5 /mntY/b\fP -.EE -.in -.P -When we inspect the state of the mounts in the new mount namespace, -we see that -.I /mntX/a -was created as a new shared mount -(inheriting the "shared" setting from its parent mount) and -.I /mntY/b -was created as a private mount: -.P -.in +4n -.EX -sh2# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -168 167 8:23 / /mntX rw,relatime shared:1 -169 167 8:22 / /mntY rw,relatime master:2 -173 168 8:3 / /mntX/a rw,relatime shared:3 -175 169 8:5 / /mntY/b rw,relatime -.EE -.in -.P -Returning to the first terminal (in the initial mount namespace), -we see that the mount -.I /mntX/a -propagated to the peer (the shared -.IR /mntX ), -but the mount -.I /mntY/b -was not propagated: -.P -.in +4n -.EX -sh1# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -132 83 8:23 / /mntX rw,relatime shared:1 -133 83 8:22 / /mntY rw,relatime shared:2 -174 132 8:3 / /mntX/a rw,relatime shared:3 -.EE -.in -.P -Now we create a new mount under -.I /mntY -in the first shell: -.P -.in +4n -.EX -sh1# \fBmkdir /mntY/c\fP -sh1# \fBmount /dev/sda1 /mntY/c\fP -sh1# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -132 83 8:23 / /mntX rw,relatime shared:1 -133 83 8:22 / /mntY rw,relatime shared:2 -174 132 8:3 / /mntX/a rw,relatime shared:3 -178 133 8:1 / /mntY/c rw,relatime shared:4 -.EE -.in -.P -When we examine the mounts in the second mount namespace, -we see that in this case the new mount has been propagated -to the slave mount, -and that the new mount is itself a slave mount (to peer group 4): -.P -.in +4n -.EX -sh2# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -168 167 8:23 / /mntX rw,relatime shared:1 -169 167 8:22 / /mntY rw,relatime master:2 -173 168 8:3 / /mntX/a rw,relatime shared:3 -175 169 8:5 / /mntY/b rw,relatime -179 169 8:1 / /mntY/c rw,relatime master:4 -.EE -.in -.\" -.SS MS_UNBINDABLE example -One of the primary purposes of unbindable mounts is to avoid -the "mount explosion" problem when repeatedly performing bind mounts -of a higher-level subtree at a lower-level mount. -The problem is illustrated by the following shell session. -.P -Suppose we have a system with the following mounts: -.P -.in +4n -.EX -# \fBmount | awk \[aq]{print $1, $2, $3}\[aq]\fP -/dev/sda1 on / -/dev/sdb6 on /mntX -/dev/sdb7 on /mntY -.EE -.in -.P -Suppose furthermore that we wish to recursively bind mount -the root directory under several users' home directories. -We do this for the first user, and inspect the mounts: -.P -.in +4n -.EX -# \fBmount \-\-rbind / /home/cecilia/\fP -# \fBmount | awk \[aq]{print $1, $2, $3}\[aq]\fP -/dev/sda1 on / -/dev/sdb6 on /mntX -/dev/sdb7 on /mntY -/dev/sda1 on /home/cecilia -/dev/sdb6 on /home/cecilia/mntX -/dev/sdb7 on /home/cecilia/mntY -.EE -.in -.P -When we repeat this operation for the second user, -we start to see the explosion problem: -.P -.in +4n -.EX -# \fBmount \-\-rbind / /home/henry\fP -# \fBmount | awk \[aq]{print $1, $2, $3}\[aq]\fP -/dev/sda1 on / -/dev/sdb6 on /mntX -/dev/sdb7 on /mntY -/dev/sda1 on /home/cecilia -/dev/sdb6 on /home/cecilia/mntX -/dev/sdb7 on /home/cecilia/mntY -/dev/sda1 on /home/henry -/dev/sdb6 on /home/henry/mntX -/dev/sdb7 on /home/henry/mntY -/dev/sda1 on /home/henry/home/cecilia -/dev/sdb6 on /home/henry/home/cecilia/mntX -/dev/sdb7 on /home/henry/home/cecilia/mntY -.EE -.in -.P -Under -.IR /home/henry , -we have not only recursively added the -.I /mntX -and -.I /mntY -mounts, but also the recursive mounts of those directories under -.I /home/cecilia -that were created in the previous step. -Upon repeating the step for a third user, -it becomes obvious that the explosion is exponential in nature: -.P -.in +4n -.EX -# \fBmount \-\-rbind / /home/otto\fP -# \fBmount | awk \[aq]{print $1, $2, $3}\[aq]\fP -/dev/sda1 on / -/dev/sdb6 on /mntX -/dev/sdb7 on /mntY -/dev/sda1 on /home/cecilia -/dev/sdb6 on /home/cecilia/mntX -/dev/sdb7 on /home/cecilia/mntY -/dev/sda1 on /home/henry -/dev/sdb6 on /home/henry/mntX -/dev/sdb7 on /home/henry/mntY -/dev/sda1 on /home/henry/home/cecilia -/dev/sdb6 on /home/henry/home/cecilia/mntX -/dev/sdb7 on /home/henry/home/cecilia/mntY -/dev/sda1 on /home/otto -/dev/sdb6 on /home/otto/mntX -/dev/sdb7 on /home/otto/mntY -/dev/sda1 on /home/otto/home/cecilia -/dev/sdb6 on /home/otto/home/cecilia/mntX -/dev/sdb7 on /home/otto/home/cecilia/mntY -/dev/sda1 on /home/otto/home/henry -/dev/sdb6 on /home/otto/home/henry/mntX -/dev/sdb7 on /home/otto/home/henry/mntY -/dev/sda1 on /home/otto/home/henry/home/cecilia -/dev/sdb6 on /home/otto/home/henry/home/cecilia/mntX -/dev/sdb7 on /home/otto/home/henry/home/cecilia/mntY -.EE -.in -.P -The mount explosion problem in the above scenario can be avoided -by making each of the new mounts unbindable. -The effect of doing this is that recursive mounts of the root -directory will not replicate the unbindable mounts. -We make such a mount for the first user: -.P -.in +4n -.EX -# \fBmount \-\-rbind \-\-make\-unbindable / /home/cecilia\fP -.EE -.in -.P -Before going further, we show that unbindable mounts are indeed unbindable: -.P -.in +4n -.EX -# \fBmkdir /mntZ\fP -# \fBmount \-\-bind /home/cecilia /mntZ\fP -mount: wrong fs type, bad option, bad superblock on /home/cecilia, - missing codepage or helper program, or other error -\& - In some cases useful info is found in syslog \- try - dmesg | tail or so. -.EE -.in -.P -Now we create unbindable recursive bind mounts for the other two users: -.P -.in +4n -.EX -# \fBmount \-\-rbind \-\-make\-unbindable / /home/henry\fP -# \fBmount \-\-rbind \-\-make\-unbindable / /home/otto\fP -.EE -.in -.P -Upon examining the list of mounts, -we see there has been no explosion of mounts, -because the unbindable mounts were not replicated -under each user's directory: -.P -.in +4n -.EX -# \fBmount | awk \[aq]{print $1, $2, $3}\[aq]\fP -/dev/sda1 on / -/dev/sdb6 on /mntX -/dev/sdb7 on /mntY -/dev/sda1 on /home/cecilia -/dev/sdb6 on /home/cecilia/mntX -/dev/sdb7 on /home/cecilia/mntY -/dev/sda1 on /home/henry -/dev/sdb6 on /home/henry/mntX -/dev/sdb7 on /home/henry/mntY -/dev/sda1 on /home/otto -/dev/sdb6 on /home/otto/mntX -/dev/sdb7 on /home/otto/mntY -.EE -.in -.\" -.SS Propagation type transitions -The following table shows the effect that applying a new propagation type -(i.e., -.IR mount\~\-\-make\-xxxx ) -has on the existing propagation type of a mount. -The rows correspond to existing propagation types, -and the columns are the new propagation settings. -For reasons of space, "private" is abbreviated as "priv" and -"unbindable" as "unbind". -.TS -lb2 lb2 lb2 lb2 lb1 -lb | l l l l l. - make-shared make-slave make-priv make-unbind -_ -shared shared slave/priv [1] priv unbind -slave slave+shared slave [2] priv unbind -slave+shared slave+shared slave priv unbind -private shared priv [2] priv unbind -unbindable shared unbind [2] priv unbind -.TE -.P -Note the following details to the table: -.IP [1] 5 -If a shared mount is the only mount in its peer group, -making it a slave automatically makes it private. -.IP [2] -Slaving a nonshared mount has no effect on the mount. -.\" -.SS Bind (MS_BIND) semantics -Suppose that the following command is performed: -.P -.in +4n -.EX -mount \-\-bind A/a B/b -.EE -.in -.P -Here, -.I A -is the source mount, -.I B -is the destination mount, -.I a -is a subdirectory path under the mount point -.IR A , -and -.I b -is a subdirectory path under the mount point -.IR B . -The propagation type of the resulting mount, -.IR B/b , -depends on the propagation types of the mounts -.I A -and -.IR B , -and is summarized in the following table. -.P -.TS -lb2 lb1 lb2 lb2 lb2 lb0 -lb2 lb1 lb2 lb2 lb2 lb0 -lb lb | l l l l l. - source(A) - shared private slave unbind -_ -dest(B) shared shared shared slave+shared invalid - nonshared shared private slave invalid -.TE -.P -Note that a recursive bind of a subtree follows the same semantics -as for a bind operation on each mount in the subtree. -(Unbindable mounts are automatically pruned at the target mount point.) -.P -For further details, see -.I Documentation/filesystems/sharedsubtree.rst -in the kernel source tree. -.\" -.SS Move (MS_MOVE) semantics -Suppose that the following command is performed: -.P -.in +4n -.EX -mount \-\-move A B/b -.EE -.in -.P -Here, -.I A -is the source mount, -.I B -is the destination mount, and -.I b -is a subdirectory path under the mount point -.IR B . -The propagation type of the resulting mount, -.IR B/b , -depends on the propagation types of the mounts -.I A -and -.IR B , -and is summarized in the following table. -.P -.TS -lb2 lb1 lb2 lb2 lb2 lb0 -lb2 lb1 lb2 lb2 lb2 lb0 -lb lb | l l l l l. - source(A) - shared private slave unbind -_ -dest(B) shared shared shared slave+shared invalid - nonshared shared private slave unbindable -.TE -.P -Note: moving a mount that resides under a shared mount is invalid. -.P -For further details, see -.I Documentation/filesystems/sharedsubtree.rst -in the kernel source tree. -.\" -.SS Mount semantics -Suppose that we use the following command to create a mount: -.P -.in +4n -.EX -mount device B/b -.EE -.in -.P -Here, -.I B -is the destination mount, and -.I b -is a subdirectory path under the mount point -.IR B . -The propagation type of the resulting mount, -.IR B/b , -follows the same rules as for a bind mount, -where the propagation type of the source mount -is considered always to be private. -.\" -.SS Unmount semantics -Suppose that we use the following command to tear down a mount: -.P -.in +4n -.EX -umount A -.EE -.in -.P -Here, -.I A -is a mount on -.IR B/b , -where -.I B -is the parent mount and -.I b -is a subdirectory path under the mount point -.IR B . -If -.B B -is shared, then all most-recently-mounted mounts at -.I b -on mounts that receive propagation from mount -.I B -and do not have submounts under them are unmounted. -.\" -.SS The /proc/ pid /mountinfo "propagate_from" tag -The -.I propagate_from:X -tag is shown in the optional fields of a -.IR /proc/ pid /mountinfo -record in cases where a process can't see a slave's immediate master -(i.e., the pathname of the master is not reachable from -the filesystem root directory) -and so cannot determine the -chain of propagation between the mounts it can see. -.P -In the following example, we first create a two-link master-slave chain -between the mounts -.IR /mnt , -.IR /tmp/etc , -and -.IR /mnt/tmp/etc . -Then the -.BR chroot (1) -command is used to make the -.I /tmp/etc -mount point unreachable from the root directory, -creating a situation where the master of -.I /mnt/tmp/etc -is not reachable from the (new) root directory of the process. -.P -First, we bind mount the root directory onto -.I /mnt -and then bind mount -.I /proc -at -.I /mnt/proc -so that after the later -.BR chroot (1) -the -.BR proc (5) -filesystem remains visible at the correct location -in the chroot-ed environment. -.P -.in +4n -.EX -# \fBmkdir \-p /mnt/proc\fP -# \fBmount \-\-bind / /mnt\fP -# \fBmount \-\-bind /proc /mnt/proc\fP -.EE -.in -.P -Next, we ensure that the -.I /mnt -mount is a shared mount in a new peer group (with no peers): -.P -.in +4n -.EX -# \fBmount \-\-make\-private /mnt\fP # Isolate from any previous peer group -# \fBmount \-\-make\-shared /mnt\fP -# \fBcat /proc/self/mountinfo | grep \[aq]/mnt\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -239 61 8:2 / /mnt ... shared:102 -248 239 0:4 / /mnt/proc ... shared:5 -.EE -.in -.P -Next, we bind mount -.I /mnt/etc -onto -.IR /tmp/etc : -.P -.in +4n -.EX -# \fBmkdir \-p /tmp/etc\fP -# \fBmount \-\-bind /mnt/etc /tmp/etc\fP -# \fBcat /proc/self/mountinfo | egrep \[aq]/mnt|/tmp/\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -239 61 8:2 / /mnt ... shared:102 -248 239 0:4 / /mnt/proc ... shared:5 -267 40 8:2 /etc /tmp/etc ... shared:102 -.EE -.in -.P -Initially, these two mounts are in the same peer group, -but we then make the -.I /tmp/etc -a slave of -.IR /mnt/etc , -and then make -.I /tmp/etc -shared as well, -so that it can propagate events to the next slave in the chain: -.P -.in +4n -.EX -# \fBmount \-\-make\-slave /tmp/etc\fP -# \fBmount \-\-make\-shared /tmp/etc\fP -# \fBcat /proc/self/mountinfo | egrep \[aq]/mnt|/tmp/\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -239 61 8:2 / /mnt ... shared:102 -248 239 0:4 / /mnt/proc ... shared:5 -267 40 8:2 /etc /tmp/etc ... shared:105 master:102 -.EE -.in -.P -Then we bind mount -.I /tmp/etc -onto -.IR /mnt/tmp/etc . -Again, the two mounts are initially in the same peer group, -but we then make -.I /mnt/tmp/etc -a slave of -.IR /tmp/etc : -.P -.in +4n -.EX -# \fBmkdir \-p /mnt/tmp/etc\fP -# \fBmount \-\-bind /tmp/etc /mnt/tmp/etc\fP -# \fBmount \-\-make\-slave /mnt/tmp/etc\fP -# \fBcat /proc/self/mountinfo | egrep \[aq]/mnt|/tmp/\[aq] | sed \[aq]s/ \- .*//\[aq]\fP -239 61 8:2 / /mnt ... shared:102 -248 239 0:4 / /mnt/proc ... shared:5 -267 40 8:2 /etc /tmp/etc ... shared:105 master:102 -273 239 8:2 /etc /mnt/tmp/etc ... master:105 -.EE -.in -.P -From the above, we see that -.I /mnt -is the master of the slave -.IR /tmp/etc , -which in turn is the master of the slave -.IR /mnt/tmp/etc . -.P -We then -.BR chroot (1) -to the -.I /mnt -directory, which renders the mount with ID 267 unreachable -from the (new) root directory: -.P -.in +4n -.EX -# \fBchroot /mnt\fP -.EE -.in -.P -When we examine the state of the mounts inside the chroot-ed environment, -we see the following: -.P -.in +4n -.EX -# \fBcat /proc/self/mountinfo | sed \[aq]s/ \- .*//\[aq]\fP -239 61 8:2 / / ... shared:102 -248 239 0:4 / /proc ... shared:5 -273 239 8:2 /etc /tmp/etc ... master:105 propagate_from:102 -.EE -.in -.P -Above, we see that the mount with ID 273 -is a slave whose master is the peer group 105. -The mount point for that master is unreachable, and so a -.I propagate_from -tag is displayed, indicating that the closest dominant peer group -(i.e., the nearest reachable mount in the slave chain) -is the peer group with the ID 102 (corresponding to the -.I /mnt -mount point before the -.BR chroot (1) -was performed). -.\" -.SH STANDARDS -Linux. -.SH HISTORY -Linux 2.4.19. -.\" -.SH NOTES -The propagation type assigned to a new mount depends -on the propagation type of the parent mount. -If the mount has a parent (i.e., it is a non-root mount -point) and the propagation type of the parent is -.BR MS_SHARED , -then the propagation type of the new mount is also -.BR MS_SHARED . -Otherwise, the propagation type of the new mount is -.BR MS_PRIVATE . -.P -Notwithstanding the fact that the default propagation type -for new mount is in many cases -.BR MS_PRIVATE , -.B MS_SHARED -is typically more useful. -For this reason, -.BR systemd (1) -automatically remounts all mounts as -.B MS_SHARED -on system startup. -Thus, on most modern systems, the default propagation type is in practice -.BR MS_SHARED . -.P -Since, when one uses -.BR unshare (1) -to create a mount namespace, -the goal is commonly to provide full isolation of the mounts -in the new namespace, -.BR unshare (1) -(since -.I util\-linux -2.27) in turn reverses the step performed by -.BR systemd (1), -by making all mounts private in the new namespace. -That is, -.BR unshare (1) -performs the equivalent of the following in the new mount namespace: -.P -.in +4n -.EX -mount \-\-make\-rprivate / -.EE -.in -.P -To prevent this, one can use the -.I \-\-propagation\~unchanged -option to -.BR unshare (1). -.P -An application that creates a new mount namespace directly using -.BR clone (2) -or -.BR unshare (2) -may desire to prevent propagation of mount events to other mount namespaces -(as is done by -.BR unshare (1)). -This can be done by changing the propagation type of -mounts in the new namespace to either -.B MS_SLAVE -or -.BR MS_PRIVATE , -using a call such as the following: -.P -.in +4n -.EX -mount(NULL, "/", MS_SLAVE | MS_REC, NULL); -.EE -.in -.P -For a discussion of propagation types when moving mounts -.RB ( MS_MOVE ) -and creating bind mounts -.RB ( MS_BIND ), -see -.IR Documentation/filesystems/sharedsubtree.rst . -.\" -.\" ============================================================ -.\" -.SS Restrictions on mount namespaces -Note the following points with respect to mount namespaces: -.IP [1] 5 -Each mount namespace has an owner user namespace. -As explained above, when a new mount namespace is created, -its mount list is initialized as a copy of the mount list -of another mount namespace. -If the new namespace and the namespace from which the mount list -was copied are owned by different user namespaces, -then the new mount namespace is considered -.IR "less privileged" . -.IP [2] -When creating a less privileged mount namespace, -shared mounts are reduced to slave mounts. -This ensures that mappings performed in less -privileged mount namespaces will not propagate to more privileged -mount namespaces. -.IP [3] -Mounts that come as a single unit from a more privileged mount namespace are -locked together and may not be separated in a less privileged mount -namespace. -(The -.BR unshare (2) -.B CLONE_NEWNS -operation brings across all of the mounts from the original -mount namespace as a single unit, -and recursive mounts that propagate between -mount namespaces propagate as a single unit.) -.IP -In this context, "may not be separated" means that the mounts -are locked so that they may not be individually unmounted. -Consider the following example: -.IP -.in +4n -.EX -$ \fBsudo sh\fP -# \fBmount \-\-bind /dev/null /etc/shadow\fP -# \fBcat /etc/shadow\fP # Produces no output -.EE -.in -.IP -The above steps, performed in a more privileged mount namespace, -have created a bind mount that -obscures the contents of the shadow password file, -.IR /etc/shadow . -For security reasons, it should not be possible to -.BR umount (2) -that mount in a less privileged mount namespace, -since that would reveal the contents of -.IR /etc/shadow . -.IP -Suppose we now create a new mount namespace -owned by a new user namespace. -The new mount namespace will inherit copies of all of the mounts -from the previous mount namespace. -However, those mounts will be locked because the new mount namespace -is less privileged. -Consequently, an attempt to -.BR umount (2) -the mount fails as show -in the following step: -.IP -.in +4n -.EX -# \fBunshare \-\-user \-\-map\-root\-user \-\-mount \e\fP - \fBstrace \-o /tmp/log \e\fP - \fBumount /mnt/dir\fP -umount: /etc/shadow: not mounted. -# \fBgrep \[aq]\[ha]umount\[aq] /tmp/log\fP -umount2("/etc/shadow", 0) = \-1 EINVAL (Invalid argument) -.EE -.in -.IP -The error message from -.BR mount (8) -is a little confusing, but the -.BR strace (1) -output reveals that the underlying -.BR umount2 (2) -system call failed with the error -.BR EINVAL , -which is the error that the kernel returns to indicate that -the mount is locked. -.IP -Note, however, that it is possible to stack (and unstack) a -mount on top of one of the inherited locked mounts in a -less privileged mount namespace: -.IP -.in +4n -.EX -# \fBecho \[aq]aaaaa\[aq] > /tmp/a\fP # File to mount onto /etc/shadow -# \fBunshare \-\-user \-\-map\-root\-user \-\-mount \e\fP - \fBsh \-c \[aq]mount \-\-bind /tmp/a /etc/shadow; cat /etc/shadow\[aq]\fP -aaaaa -# \fBumount /etc/shadow\fP -.EE -.in -.IP -The final -.BR umount (8) -command above, which is performed in the initial mount namespace, -makes the original -.I /etc/shadow -file once more visible in that namespace. -.IP [4] -Following on from point [3], -note that it is possible to -.BR umount (2) -an entire subtree of mounts that -propagated as a unit into a less privileged mount namespace, -as illustrated in the following example. -.IP -First, we create new user and mount namespaces using -.BR unshare (1). -In the new mount namespace, -the propagation type of all mounts is set to private. -We then create a shared bind mount at -.IR /mnt , -and a small hierarchy of mounts underneath that mount. -.IP -.in +4n -.EX -$ \fBPS1=\[aq]ns1# \[aq] sudo unshare \-\-user \-\-map\-root\-user \e\fP - \fB\-\-mount \-\-propagation private bash\fP -ns1# \fBecho $$\fP # We need the PID of this shell later -778501 -ns1# \fBmount \-\-make\-shared \-\-bind /mnt /mnt\fP -ns1# \fBmkdir /mnt/x\fP -ns1# \fBmount \-\-make\-private \-t tmpfs none /mnt/x\fP -ns1# \fBmkdir /mnt/x/y\fP -ns1# \fBmount \-\-make\-private \-t tmpfs none /mnt/x/y\fP -ns1# \fBgrep /mnt /proc/self/mountinfo | sed \[aq]s/ \- .*//\[aq]\fP -986 83 8:5 /mnt /mnt rw,relatime shared:344 -989 986 0:56 / /mnt/x rw,relatime -990 989 0:57 / /mnt/x/y rw,relatime -.EE -.in -.IP -Continuing in the same shell session, -we then create a second shell in a new user namespace and a new -(less privileged) mount namespace and -check the state of the propagated mounts rooted at -.IR /mnt . -.IP -.in +4n -.EX -ns1# \fBPS1=\[aq]ns2# \[aq] unshare \-\-user \-\-map\-root\-user \e\fP - \fB\-\-mount \-\-propagation unchanged bash\fP -ns2# \fBgrep /mnt /proc/self/mountinfo | sed \[aq]s/ \- .*//\[aq]\fP -1239 1204 8:5 /mnt /mnt rw,relatime master:344 -1240 1239 0:56 / /mnt/x rw,relatime -1241 1240 0:57 / /mnt/x/y rw,relatime -.EE -.in -.IP -Of note in the above output is that the propagation type of the mount -.I /mnt -has been reduced to slave, as explained in point [2]. -This means that submount events will propagate from the master -.I /mnt -in "ns1", but propagation will not occur in the opposite direction. -.IP -From a separate terminal window, we then use -.BR nsenter (1) -to enter the mount and user namespaces corresponding to "ns1". -In that terminal window, we then recursively bind mount -.I /mnt/x -at the location -.IR /mnt/ppp . -.IP -.in +4n -.EX -$ \fBPS1=\[aq]ns3# \[aq] sudo nsenter \-t 778501 \-\-user \-\-mount\fP -ns3# \fBmount \-\-rbind \-\-make\-private /mnt/x /mnt/ppp\fP -ns3# \fBgrep /mnt /proc/self/mountinfo | sed \[aq]s/ \- .*//\[aq]\fP -986 83 8:5 /mnt /mnt rw,relatime shared:344 -989 986 0:56 / /mnt/x rw,relatime -990 989 0:57 / /mnt/x/y rw,relatime -1242 986 0:56 / /mnt/ppp rw,relatime -1243 1242 0:57 / /mnt/ppp/y rw,relatime shared:518 -.EE -.in -.IP -Because the propagation type of the parent mount, -.IR /mnt , -was shared, the recursive bind mount propagated a small subtree of -mounts under the slave mount -.I /mnt -into "ns2", -as can be verified by executing the following command in that shell session: -.IP -.in +4n -.EX -ns2# \fBgrep /mnt /proc/self/mountinfo | sed \[aq]s/ \- .*//\[aq]\fP -1239 1204 8:5 /mnt /mnt rw,relatime master:344 -1240 1239 0:56 / /mnt/x rw,relatime -1241 1240 0:57 / /mnt/x/y rw,relatime -1244 1239 0:56 / /mnt/ppp rw,relatime -1245 1244 0:57 / /mnt/ppp/y rw,relatime master:518 -.EE -.in -.IP -While it is not possible to -.BR umount (2) -a part of the propagated subtree -.RI ( /mnt/ppp/y ) -in "ns2", -it is possible to -.BR umount (2) -the entire subtree, -as shown by the following commands: -.IP -.in +4n -.EX -ns2# \fBumount /mnt/ppp/y\fP -umount: /mnt/ppp/y: not mounted. -ns2# \fBumount \-l /mnt/ppp | sed \[aq]s/ \- .*//\[aq]\fP # Succeeds... -ns2# \fBgrep /mnt /proc/self/mountinfo\fP -1239 1204 8:5 /mnt /mnt rw,relatime master:344 -1240 1239 0:56 / /mnt/x rw,relatime -1241 1240 0:57 / /mnt/x/y rw,relatime -.EE -.in -.IP [5] -The -.BR mount (2) -flags -.BR MS_RDONLY , -.BR MS_NOSUID , -.BR MS_NOEXEC , -and the "atime" flags -.RB ( MS_NOATIME , -.BR MS_NODIRATIME , -.BR MS_RELATIME ) -settings become locked -.\" commit 9566d6742852c527bf5af38af5cbb878dad75705 -.\" Author: Eric W. Biederman <ebiederm@xmission.com> -.\" Date: Mon Jul 28 17:26:07 2014 -0700 -.\" -.\" mnt: Correct permission checks in do_remount -.\" -when propagated from a more privileged to -a less privileged mount namespace, -and may not be changed in the less privileged mount namespace. -.IP -This point is illustrated in the following example where, -in a more privileged mount namespace, -we create a bind mount that is marked as read-only. -For security reasons, -it should not be possible to make the mount writable in -a less privileged mount namespace, and indeed the kernel prevents this: -.IP -.in +4n -.EX -$ \fBsudo mkdir /mnt/dir\fP -$ \fBsudo mount \-\-bind \-o ro /some/path /mnt/dir\fP -$ \fBsudo unshare \-\-user \-\-map\-root\-user \-\-mount \e\fP - \fBmount \-o remount,rw /mnt/dir\fP -mount: /mnt/dir: permission denied. -.EE -.in -.IP [6] -.\" (As of 3.18-rc1 (in Al Viro's 2014-08-30 vfs.git#for-next tree)) -A file or directory that is a mount point in one namespace that is not -a mount point in another namespace, may be renamed, unlinked, or removed -.RB ( rmdir (2)) -in the mount namespace in which it is not a mount point -(subject to the usual permission checks). -Consequently, the mount point is removed in the mount namespace -where it was a mount point. -.IP -Previously (before Linux 3.18), -.\" mtk: The change was in Linux 3.18, I think, with this commit: -.\" commit 8ed936b5671bfb33d89bc60bdcc7cf0470ba52fe -.\" Author: Eric W. Biederman <ebiederman@twitter.com> -.\" Date: Tue Oct 1 18:33:48 2013 -0700 -.\" -.\" vfs: Lazily remove mounts on unlinked files and directories. -attempting to unlink, rename, or remove a file or directory -that was a mount point in another mount namespace would result in the error -.BR EBUSY . -That behavior had technical problems of enforcement (e.g., for NFS) -and permitted denial-of-service attacks against more privileged users -(i.e., preventing individual files from being updated -by bind mounting on top of them). -.SH EXAMPLES -See -.BR pivot_root (2). -.SH SEE ALSO -.BR unshare (1), -.BR clone (2), -.BR mount (2), -.BR mount_setattr (2), -.BR pivot_root (2), -.BR setns (2), -.BR umount (2), -.BR unshare (2), -.BR proc (5), -.BR namespaces (7), -.BR user_namespaces (7), -.BR findmnt (8), -.BR mount (8), -.BR pam_namespace (8), -.BR pivot_root (8), -.BR umount (8) -.P -.I Documentation/filesystems/sharedsubtree.rst -in the kernel source tree. |