author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-24 04:52:24 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-24 04:52:24 +0000 |
commit | 100d1b33f088fd38f69129afff7f9c2a1e084a57 (patch) | |
tree | 5bf6b0bb14f22ecf0a5e9439fdd4c4758402400c /man7/capabilities.7 | |
parent | Releasing progress-linux version 6.7-2~progress7.99u1. (diff) | |
Merging upstream version 6.8.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'man7/capabilities.7')
-rw-r--r-- | man7/capabilities.7 | 1872 |
1 files changed, 0 insertions, 1872 deletions
diff --git a/man7/capabilities.7 b/man7/capabilities.7 deleted file mode 100644 index ad087f9..0000000 --- a/man7/capabilities.7 +++ /dev/null @@ -1,1872 +0,0 @@ -.\" Copyright (c) 2002 by Michael Kerrisk <mtk.manpages@gmail.com> -.\" -.\" SPDX-License-Identifier: Linux-man-pages-copyleft -.\" -.\" 6 Aug 2002 - Initial Creation -.\" Modified 2003-05-23, Michael Kerrisk, <mtk.manpages@gmail.com> -.\" Modified 2004-05-27, Michael Kerrisk, <mtk.manpages@gmail.com> -.\" 2004-12-08, mtk Added O_NOATIME for CAP_FOWNER -.\" 2005-08-16, mtk, Added CAP_AUDIT_CONTROL and CAP_AUDIT_WRITE -.\" 2008-07-15, Serge Hallyn <serue@us.bbm.com> -.\" Document file capabilities, per-process capability -.\" bounding set, changed semantics for CAP_SETPCAP, -.\" and other changes in Linux 2.6.2[45]. -.\" Add CAP_MAC_ADMIN, CAP_MAC_OVERRIDE, CAP_SETFCAP. -.\" 2008-07-15, mtk -.\" Add text describing circumstances in which CAP_SETPCAP -.\" (theoretically) permits a thread to change the -.\" capability sets of another thread. -.\" Add section describing rules for programmatically -.\" adjusting thread capability sets. -.\" Describe rationale for capability bounding set. -.\" Document "securebits" flags. -.\" Add text noting that if we set the effective flag for one file -.\" capability, then we must also set the effective flag for all -.\" other capabilities where the permitted or inheritable bit is set. -.\" 2011-09-07, mtk/Serge hallyn: Add CAP_SYSLOG -.\" -.TH Capabilities 7 2024-02-25 "Linux man-pages 6.7" -.SH NAME -capabilities \- overview of Linux capabilities -.SH DESCRIPTION -For the purpose of performing permission checks, -traditional UNIX implementations distinguish two categories of processes: -.I privileged -processes (whose effective user ID is 0, referred to as superuser or root), -and -.I unprivileged -processes (whose effective UID is nonzero). 
-Privileged processes bypass all kernel permission checks, -while unprivileged processes are subject to full permission -checking based on the process's credentials -(usually: effective UID, effective GID, and supplementary group list). -.P -Starting with Linux 2.2, Linux divides the privileges traditionally -associated with superuser into distinct units, known as -.IR capabilities , -which can be independently enabled and disabled. -Capabilities are a per-thread attribute. -.\" -.SS Capabilities list -The following list shows the capabilities implemented on Linux, -and the operations or behaviors that each capability permits: -.TP -.BR CAP_AUDIT_CONTROL " (since Linux 2.6.11)" -Enable and disable kernel auditing; change auditing filter rules; -retrieve auditing status and filtering rules. -.TP -.BR CAP_AUDIT_READ " (since Linux 3.16)" -.\" commit a29b694aa1739f9d76538e34ae25524f9c549d59 -.\" commit 3a101b8de0d39403b2c7e5c23fd0b005668acf48 -Allow reading the audit log via a multicast netlink socket. -.TP -.BR CAP_AUDIT_WRITE " (since Linux 2.6.11)" -Write records to kernel auditing log. -.\" FIXME Add FAN_ENABLE_AUDIT -.TP -.BR CAP_BLOCK_SUSPEND " (since Linux 3.5)" -Employ features that can block system suspend -.RB ( epoll (7) -.BR EPOLLWAKEUP , -.IR /proc/sys/wake_lock ). -.TP -.BR CAP_BPF " (since Linux 5.8)" -Employ privileged BPF operations; see -.BR bpf (2) -and -.BR bpf\-helpers (7). -.IP -This capability was added in Linux 5.8 to separate out -BPF functionality from the overloaded -.B CAP_SYS_ADMIN -capability. -.TP -.BR CAP_CHECKPOINT_RESTORE " (since Linux 5.9)" -.\" commit 124ea650d3072b005457faed69909221c2905a1f -.PD 0 -.RS -.IP \[bu] 3 -Update -.I /proc/sys/kernel/ns_last_pid -(see -.BR pid_namespaces (7)); -.IP \[bu] -employ the -.I set_tid -feature of -.BR clone3 (2); -.\" FIXME There is also some use case relating to -.\" prctl_set_mm_exe_file(); in the 5.9 sources, see -.\" prctl_set_mm_map(). 
-.IP \[bu] -read the contents of the symbolic links in -.IR /proc/ pid /map_files -for other processes. -.RE -.PD -.IP -This capability was added in Linux 5.9 to separate out -checkpoint/restore functionality from the overloaded -.B CAP_SYS_ADMIN -capability. -.TP -.B CAP_CHOWN -Make arbitrary changes to file UIDs and GIDs (see -.BR chown (2)). -.TP -.B CAP_DAC_OVERRIDE -Bypass file read, write, and execute permission checks. -(DAC is an abbreviation of "discretionary access control".) -.TP -.B CAP_DAC_READ_SEARCH -.PD 0 -.RS -.IP \[bu] 3 -Bypass file read permission checks and -directory read and execute permission checks; -.IP \[bu] -invoke -.BR open_by_handle_at (2); -.IP \[bu] -use the -.BR linkat (2) -.B AT_EMPTY_PATH -flag to create a link to a file referred to by a file descriptor. -.RE -.PD -.TP -.B CAP_FOWNER -.PD 0 -.RS -.IP \[bu] 3 -Bypass permission checks on operations that normally -require the filesystem UID of the process to match the UID of -the file (e.g., -.BR chmod (2), -.BR utime (2)), -excluding those operations covered by -.B CAP_DAC_OVERRIDE -and -.BR CAP_DAC_READ_SEARCH ; -.IP \[bu] -set inode flags (see -.BR ioctl_iflags (2)) -on arbitrary files; -.IP \[bu] -set Access Control Lists (ACLs) on arbitrary files; -.IP \[bu] -ignore directory sticky bit on file deletion; -.IP \[bu] -modify -.I user -extended attributes on sticky directory owned by any user; -.IP \[bu] -specify -.B O_NOATIME -for arbitrary files in -.BR open (2) -and -.BR fcntl (2). -.RE -.PD -.TP -.B CAP_FSETID -.PD 0 -.RS -.IP \[bu] 3 -Don't clear set-user-ID and set-group-ID mode -bits when a file is modified; -.IP \[bu] -set the set-group-ID bit for a file whose GID does not match -the filesystem or any of the supplementary GIDs of the calling process. -.RE -.PD -.TP -.B CAP_IPC_LOCK -.\" FIXME . As at Linux 3.2, there are some strange uses of this capability -.\" in other places; they probably should be replaced with something else. 
-.PD 0 -.RS -.IP \[bu] 3 -Lock memory -.RB ( mlock (2), -.BR mlockall (2), -.BR mmap (2), -.BR shmctl (2)); -.IP \[bu] -Allocate memory using huge pages -.RB ( memfd_create (2), -.BR mmap (2), -.BR shmctl (2)). -.RE -.PD -.TP -.B CAP_IPC_OWNER -Bypass permission checks for operations on System V IPC objects. -.TP -.B CAP_KILL -Bypass permission checks for sending signals (see -.BR kill (2)). -This includes use of the -.BR ioctl (2) -.B KDSIGACCEPT -operation. -.\" FIXME . CAP_KILL also has an effect for threads + setting child -.\" termination signal to other than SIGCHLD: without this -.\" capability, the termination signal reverts to SIGCHLD -.\" if the child does an exec(). What is the rationale -.\" for this? -.TP -.BR CAP_LEASE " (since Linux 2.4)" -Establish leases on arbitrary files (see -.BR fcntl (2)). -.TP -.B CAP_LINUX_IMMUTABLE -Set the -.B FS_APPEND_FL -and -.B FS_IMMUTABLE_FL -inode flags (see -.BR ioctl_iflags (2)). -.TP -.BR CAP_MAC_ADMIN " (since Linux 2.6.25)" -Allow MAC configuration or state changes. -Implemented for the Smack Linux Security Module (LSM). -.TP -.BR CAP_MAC_OVERRIDE " (since Linux 2.6.25)" -Override Mandatory Access Control (MAC). -Implemented for the Smack LSM. -.TP -.BR CAP_MKNOD " (since Linux 2.4)" -Create special files using -.BR mknod (2). -.TP -.B CAP_NET_ADMIN -Perform various network-related operations: -.PD 0 -.RS -.IP \[bu] 3 -interface configuration; -.IP \[bu] -administration of IP firewall, masquerading, and accounting; -.IP \[bu] -modify routing tables; -.IP \[bu] -bind to any address for transparent proxying; -.IP \[bu] -set type-of-service (TOS); -.IP \[bu] -clear driver statistics; -.IP \[bu] -set promiscuous mode; -.IP \[bu] -enabling multicasting; -.IP \[bu] -use -.BR setsockopt (2) -to set the following socket options: -.BR SO_DEBUG , -.BR SO_MARK , -.B SO_PRIORITY -(for a priority outside the range 0 to 6), -.BR SO_RCVBUFFORCE , -and -.BR SO_SNDBUFFORCE . 
-.RE -.PD -.TP -.B CAP_NET_BIND_SERVICE -Bind a socket to Internet domain privileged ports -(port numbers less than 1024). -.TP -.B CAP_NET_BROADCAST -(Unused) Make socket broadcasts, and listen to multicasts. -.\" FIXME Since Linux 4.2, there are use cases for netlink sockets -.\" commit 59324cf35aba5336b611074028777838a963d03b -.TP -.B CAP_NET_RAW -.PD 0 -.RS -.IP \[bu] 3 -Use RAW and PACKET sockets; -.IP \[bu] -bind to any address for transparent proxying. -.RE -.PD -.\" Also various IP options and setsockopt(SO_BINDTODEVICE) -.TP -.BR CAP_PERFMON " (since Linux 5.8)" -Employ various performance-monitoring mechanisms, including: -.RS -.IP \[bu] 3 -.PD 0 -call -.BR perf_event_open (2); -.IP \[bu] -employ various BPF operations that have performance implications. -.RE -.PD -.IP -This capability was added in Linux 5.8 to separate out -performance monitoring functionality from the overloaded -.B CAP_SYS_ADMIN -capability. -See also the kernel source file -.IR Documentation/admin\-guide/perf\-security.rst . -.TP -.B CAP_SETGID -.RS -.PD 0 -.IP \[bu] 3 -Make arbitrary manipulations of process GIDs and supplementary GID list; -.IP \[bu] -forge GID when passing socket credentials via UNIX domain sockets; -.IP \[bu] -write a group ID mapping in a user namespace (see -.BR user_namespaces (7)). -.PD -.RE -.TP -.BR CAP_SETFCAP " (since Linux 2.6.24)" -Set arbitrary capabilities on a file. -.IP -.\" commit db2e718a47984b9d71ed890eb2ea36ecf150de18 -Since Linux 5.12, this capability is -also needed to map user ID 0 in a new user namespace; see -.BR user_namespaces (7) -for details. -.TP -.B CAP_SETPCAP -If file capabilities are supported (i.e., since Linux 2.6.24): -add any capability from the calling thread's bounding set -to its inheritable set; -drop capabilities from the bounding set (via -.BR prctl (2) -.BR PR_CAPBSET_DROP ); -make changes to the -.I securebits -flags. 
-.IP -If file capabilities are not supported (i.e., before Linux 2.6.24): -grant or remove any capability in the -caller's permitted capability set to or from any other process. -(This property of -.B CAP_SETPCAP -is not available when the kernel is configured to support -file capabilities, since -.B CAP_SETPCAP -has entirely different semantics for such kernels.) -.TP -.B CAP_SETUID -.RS -.PD 0 -.IP \[bu] 3 -Make arbitrary manipulations of process UIDs -.RB ( setuid (2), -.BR setreuid (2), -.BR setresuid (2), -.BR setfsuid (2)); -.IP \[bu] -forge UID when passing socket credentials via UNIX domain sockets; -.IP \[bu] -write a user ID mapping in a user namespace (see -.BR user_namespaces (7)). -.PD -.RE -.\" FIXME CAP_SETUID also an effect in exec(); document this. -.TP -.B CAP_SYS_ADMIN -.IR Note : -this capability is overloaded; see -.I Notes to kernel developers -below. -.IP -.PD 0 -.RS -.IP \[bu] 3 -Perform a range of system administration operations including: -.BR quotactl (2), -.BR mount (2), -.BR umount (2), -.BR pivot_root (2), -.BR swapon (2), -.BR swapoff (2), -.BR sethostname (2), -and -.BR setdomainname (2); -.IP \[bu] -perform privileged -.BR syslog (2) -operations (since Linux 2.6.37, -.B CAP_SYSLOG -should be used to permit such operations); -.IP \[bu] -perform -.B VM86_REQUEST_IRQ -.BR vm86 (2) -command; -.IP \[bu] -access the same checkpoint/restore functionality that is governed by -.B CAP_CHECKPOINT_RESTORE -(but the latter, weaker capability is preferred for accessing -that functionality). -.IP \[bu] -perform the same BPF operations as are governed by -.B CAP_BPF -(but the latter, weaker capability is preferred for accessing -that functionality). -.IP \[bu] -employ the same performance monitoring mechanisms as are governed by -.B CAP_PERFMON -(but the latter, weaker capability is preferred for accessing -that functionality). 
-.IP \[bu] -perform -.B IPC_SET -and -.B IPC_RMID -operations on arbitrary System V IPC objects; -.IP \[bu] -override -.B RLIMIT_NPROC -resource limit; -.IP \[bu] -perform operations on -.I trusted -and -.I security -extended attributes (see -.BR xattr (7)); -.IP \[bu] -use -.BR lookup_dcookie (2); -.IP \[bu] -use -.BR ioprio_set (2) -to assign -.B IOPRIO_CLASS_RT -and (before Linux 2.6.25) -.B IOPRIO_CLASS_IDLE -I/O scheduling classes; -.IP \[bu] -forge PID when passing socket credentials via UNIX domain sockets; -.IP \[bu] -exceed -.IR /proc/sys/fs/file\-max , -the system-wide limit on the number of open files, -in system calls that open files (e.g., -.BR accept (2), -.BR execve (2), -.BR open (2), -.BR pipe (2)); -.IP \[bu] -employ -.B CLONE_* -flags that create new namespaces with -.BR clone (2) -and -.BR unshare (2) -(but, since Linux 3.8, -creating user namespaces does not require any capability); -.IP \[bu] -access privileged -.I perf -event information; -.IP \[bu] -call -.BR setns (2) -(requires -.B CAP_SYS_ADMIN -in the -.I target -namespace); -.IP \[bu] -call -.BR fanotify_init (2); -.IP \[bu] -perform privileged -.B KEYCTL_CHOWN -and -.B KEYCTL_SETPERM -.BR keyctl (2) -operations; -.IP \[bu] -perform -.BR madvise (2) -.B MADV_HWPOISON -operation; -.IP \[bu] -employ the -.B TIOCSTI -.BR ioctl (2) -to insert characters into the input queue of a terminal other than -the caller's controlling terminal; -.IP \[bu] -employ the obsolete -.BR nfsservctl (2) -system call; -.IP \[bu] -employ the obsolete -.BR bdflush (2) -system call; -.IP \[bu] -perform various privileged block-device -.BR ioctl (2) -operations; -.IP \[bu] -perform various privileged filesystem -.BR ioctl (2) -operations; -.IP \[bu] -perform privileged -.BR ioctl (2) -operations on the -.I /dev/random -device (see -.BR random (4)); -.IP \[bu] -install a -.BR seccomp (2) -filter without first having to set the -.I no_new_privs -thread attribute; -.IP \[bu] -modify allow/deny rules for device 
control groups; -.IP \[bu] -employ the -.BR ptrace (2) -.B PTRACE_SECCOMP_GET_FILTER -operation to dump tracee's seccomp filters; -.IP \[bu] -employ the -.BR ptrace (2) -.B PTRACE_SETOPTIONS -operation to suspend the tracee's seccomp protections (i.e., the -.B PTRACE_O_SUSPEND_SECCOMP -flag); -.IP \[bu] -perform administrative operations on many device drivers; -.IP \[bu] -modify autogroup nice values by writing to -.IR /proc/ pid /autogroup -(see -.BR sched (7)). -.RE -.PD -.TP -.B CAP_SYS_BOOT -Use -.BR reboot (2) -and -.BR kexec_load (2). -.TP -.B CAP_SYS_CHROOT -.RS -.PD 0 -.IP \[bu] 3 -Use -.BR chroot (2); -.IP \[bu] -change mount namespaces using -.BR setns (2). -.PD -.RE -.TP -.B CAP_SYS_MODULE -.RS -.PD 0 -.IP \[bu] 3 -Load and unload kernel modules -(see -.BR init_module (2) -and -.BR delete_module (2)); -.IP \[bu] -before Linux 2.6.25: -drop capabilities from the system-wide capability bounding set. -.PD -.RE -.TP -.B CAP_SYS_NICE -.PD 0 -.RS -.IP \[bu] 3 -Lower the process nice value -.RB ( nice (2), -.BR setpriority (2)) -and change the nice value for arbitrary processes; -.IP \[bu] -set real-time scheduling policies for calling process, -and set scheduling policies and priorities for arbitrary processes -.RB ( sched_setscheduler (2), -.BR sched_setparam (2), -.BR sched_setattr (2)); -.IP \[bu] -set CPU affinity for arbitrary processes -.RB ( sched_setaffinity (2)); -.IP \[bu] -set I/O scheduling class and priority for arbitrary processes -.RB ( ioprio_set (2)); -.IP \[bu] -apply -.BR migrate_pages (2) -to arbitrary processes and allow processes -to be migrated to arbitrary nodes; -.\" FIXME CAP_SYS_NICE also has the following effect for -.\" migrate_pages(2): -.\" do_migrate_pages(mm, &old, &new, -.\" capable(CAP_SYS_NICE) ? MPOL_MF_MOVE_ALL : MPOL_MF_MOVE); -.\" -.\" Document this. -.IP \[bu] -apply -.BR move_pages (2) -to arbitrary processes; -.IP \[bu] -use the -.B MPOL_MF_MOVE_ALL -flag with -.BR mbind (2) -and -.BR move_pages (2). 
-.RE -.PD -.TP -.B CAP_SYS_PACCT -Use -.BR acct (2). -.TP -.B CAP_SYS_PTRACE -.PD 0 -.RS -.IP \[bu] 3 -Trace arbitrary processes using -.BR ptrace (2); -.IP \[bu] -apply -.BR get_robust_list (2) -to arbitrary processes; -.IP \[bu] -transfer data to or from the memory of arbitrary processes using -.BR process_vm_readv (2) -and -.BR process_vm_writev (2); -.IP \[bu] -inspect processes using -.BR kcmp (2). -.RE -.PD -.TP -.B CAP_SYS_RAWIO -.PD 0 -.RS -.IP \[bu] 3 -Perform I/O port operations -.RB ( iopl (2) -and -.BR ioperm (2)); -.IP \[bu] -access -.IR /proc/kcore ; -.IP \[bu] -employ the -.B FIBMAP -.BR ioctl (2) -operation; -.IP \[bu] -open devices for accessing x86 model-specific registers (MSRs, see -.BR msr (4)); -.IP \[bu] -update -.IR /proc/sys/vm/mmap_min_addr ; -.IP \[bu] -create memory mappings at addresses below the value specified by -.IR /proc/sys/vm/mmap_min_addr ; -.IP \[bu] -map files in -.IR /proc/bus/pci ; -.IP \[bu] -open -.I /dev/mem -and -.IR /dev/kmem ; -.IP \[bu] -perform various SCSI device commands; -.IP \[bu] -perform certain operations on -.BR hpsa (4) -and -.BR cciss (4) -devices; -.IP \[bu] -perform a range of device-specific operations on other devices. 
-.RE -.PD -.TP -.B CAP_SYS_RESOURCE -.PD 0 -.RS -.IP \[bu] 3 -Use reserved space on ext2 filesystems; -.IP \[bu] -make -.BR ioctl (2) -calls controlling ext3 journaling; -.IP \[bu] -override disk quota limits; -.IP \[bu] -increase resource limits (see -.BR setrlimit (2)); -.IP \[bu] -override -.B RLIMIT_NPROC -resource limit; -.IP \[bu] -override maximum number of consoles on console allocation; -.IP \[bu] -override maximum number of keymaps; -.IP \[bu] -allow more than 64hz interrupts from the real-time clock; -.IP \[bu] -raise -.I msg_qbytes -limit for a System V message queue above the limit in -.I /proc/sys/kernel/msgmnb -(see -.BR msgop (2) -and -.BR msgctl (2)); -.IP \[bu] -allow the -.B RLIMIT_NOFILE -resource limit on the number of "in-flight" file descriptors -to be bypassed when passing file descriptors to another process -via a UNIX domain socket (see -.BR unix (7)); -.IP \[bu] -override the -.I /proc/sys/fs/pipe\-size\-max -limit when setting the capacity of a pipe using the -.B F_SETPIPE_SZ -.BR fcntl (2) -command; -.IP \[bu] -use -.B F_SETPIPE_SZ -to increase the capacity of a pipe above the limit specified by -.IR /proc/sys/fs/pipe\-max\-size ; -.IP \[bu] -override -.IR /proc/sys/fs/mqueue/queues_max , -.IR /proc/sys/fs/mqueue/msg_max , -and -.I /proc/sys/fs/mqueue/msgsize_max -limits when creating POSIX message queues (see -.BR mq_overview (7)); -.IP \[bu] -employ the -.BR prctl (2) -.B PR_SET_MM -operation; -.IP \[bu] -set -.IR /proc/ pid /oom_score_adj -to a value lower than the value last set by a process with -.BR CAP_SYS_RESOURCE . -.RE -.PD -.TP -.B CAP_SYS_TIME -Set system clock -.RB ( settimeofday (2), -.BR stime (2), -.BR adjtimex (2)); -set real-time (hardware) clock. -.TP -.B CAP_SYS_TTY_CONFIG -Use -.BR vhangup (2); -employ various privileged -.BR ioctl (2) -operations on virtual terminals. -.TP -.BR CAP_SYSLOG " (since Linux 2.6.37)" -.RS -.PD 0 -.IP \[bu] 3 -Perform privileged -.BR syslog (2) -operations. 
-See -.BR syslog (2) -for information on which operations require privilege. -.IP \[bu] -View kernel addresses exposed via -.I /proc -and other interfaces when -.I /proc/sys/kernel/kptr_restrict -has the value 1. -(See the discussion of the -.I kptr_restrict -in -.BR proc (5).) -.PD -.RE -.TP -.BR CAP_WAKE_ALARM " (since Linux 3.0)" -Trigger something that will wake up the system (set -.B CLOCK_REALTIME_ALARM -and -.B CLOCK_BOOTTIME_ALARM -timers). -.\" -.SS Past and current implementation -A full implementation of capabilities requires that: -.IP \[bu] 3 -For all privileged operations, -the kernel must check whether the thread has the required -capability in its effective set. -.IP \[bu] -The kernel must provide system calls allowing a thread's capability sets to -be changed and retrieved. -.IP \[bu] -The filesystem must support attaching capabilities to an executable file, -so that a process gains those capabilities when the file is executed. -.P -Before Linux 2.6.24, only the first two of these requirements are met; -since Linux 2.6.24, all three requirements are met. -.\" -.SS Notes to kernel developers -When adding a new kernel feature that should be governed by a capability, -consider the following points. -.IP \[bu] 3 -The goal of capabilities is divide the power of superuser into pieces, -such that if a program that has one or more capabilities is compromised, -its power to do damage to the system would be less than the same program -running with root privilege. -.IP \[bu] -You have the choice of either creating a new capability for your new feature, -or associating the feature with one of the existing capabilities. -In order to keep the set of capabilities to a manageable size, -the latter option is preferable, -unless there are compelling reasons to take the former option. -(There is also a technical limit: -the size of capability sets is currently limited to 64 bits.) 
-.IP \[bu] -To determine which existing capability might best be associated -with your new feature, review the list of capabilities above in order -to find a "silo" into which your new feature best fits. -One approach to take is to determine if there are other features -requiring capabilities that will always be used along with the new feature. -If the new feature is useless without these other features, -you should use the same capability as the other features. -.IP \[bu] -.I Don't -choose -.B CAP_SYS_ADMIN -if you can possibly avoid it! -A vast proportion of existing capability checks are associated -with this capability (see the partial list above). -It can plausibly be called "the new root", -since on the one hand, it confers a wide range of powers, -and on the other hand, -its broad scope means that this is the capability -that is required by many privileged programs. -Don't make the problem worse. -The only new features that should be associated with -.B CAP_SYS_ADMIN -are ones that -.I closely -match existing uses in that silo. -.IP \[bu] -If you have determined that it really is necessary to create -a new capability for your feature, -don't make or name it as a "single-use" capability. -Thus, for example, the addition of the highly specific -.B CAP_SYS_PACCT -was probably a mistake. -Instead, try to identify and name your new capability as a broader -silo into which other related future use cases might fit. -.\" -.SS Thread capability sets -Each thread has the following capability sets containing zero or more -of the above capabilities: -.TP -.I Permitted -This is a limiting superset for the effective -capabilities that the thread may assume. -It is also a limiting superset for the capabilities that -may be added to the inheritable set by a thread that does not have the -.B CAP_SETPCAP -capability in its effective set. 
-.IP -If a thread drops a capability from its permitted set, -it can never reacquire that capability (unless it -.BR execve (2)s -either a set-user-ID-root program, or -a program whose associated file capabilities grant that capability). -.TP -.I Inheritable -This is a set of capabilities preserved across an -.BR execve (2). -Inheritable capabilities remain inheritable when executing any program, -and inheritable capabilities are added to the permitted set when executing -a program that has the corresponding bits set in the file inheritable set. -.IP -Because inheritable capabilities are not generally preserved across -.BR execve (2) -when running as a non-root user, applications that wish to run helper -programs with elevated capabilities should consider using -ambient capabilities, described below. -.TP -.I Effective -This is the set of capabilities used by the kernel to -perform permission checks for the thread. -.TP -.IR Bounding " (per-thread since Linux 2.6.25)" -The capability bounding set is a mechanism that can be used -to limit the capabilities that are gained during -.BR execve (2). -.IP -Since Linux 2.6.25, this is a per-thread capability set. -In older kernels, the capability bounding set was a system wide attribute -shared by all threads on the system. -.IP -For more details, see -.I Capability bounding set -below. -.TP -.IR Ambient " (since Linux 4.3)" -.\" commit 58319057b7847667f0c9585b9de0e8932b0fdb08 -This is a set of capabilities that are preserved across an -.BR execve (2) -of a program that is not privileged. -The ambient capability set obeys the invariant that no capability -can ever be ambient if it is not both permitted and inheritable. -.IP -The ambient capability set can be directly modified using -.BR prctl (2). -Ambient capabilities are automatically lowered if either of -the corresponding permitted or inheritable capabilities is lowered. 
-.IP -Executing a program that changes UID or GID due to the -set-user-ID or set-group-ID bits or executing a program that has -any file capabilities set will clear the ambient set. -Ambient capabilities are added to the permitted set and -assigned to the effective set when -.BR execve (2) -is called. -If ambient capabilities cause a process's permitted and effective -capabilities to increase during an -.BR execve (2), -this does not trigger the secure-execution mode described in -.BR ld.so (8). -.P -A child created via -.BR fork (2) -inherits copies of its parent's capability sets. -For details on how -.BR execve (2) -affects capabilities, see -.I Transformation of capabilities during execve() -below. -.P -Using -.BR capset (2), -a thread may manipulate its own capability sets; see -.I Programmatically adjusting capability sets -below. -.P -Since Linux 3.2, the file -.I /proc/sys/kernel/cap_last_cap -.\" commit 73efc0394e148d0e15583e13712637831f926720 -exposes the numerical value of the highest capability -supported by the running kernel; -this can be used to determine the highest bit -that may be set in a capability set. -.\" -.SS File capabilities -Since Linux 2.6.24, the kernel supports -associating capability sets with an executable file using -.BR setcap (8). -The file capability sets are stored in an extended attribute (see -.BR setxattr (2) -and -.BR xattr (7)) -named -.IR "security.capability" . -Writing to this extended attribute requires the -.B CAP_SETFCAP -capability. -The file capability sets, -in conjunction with the capability sets of the thread, -determine the capabilities of a thread after an -.BR execve (2). -.P -The three file capability sets are: -.TP -.IR Permitted " (formerly known as " forced ): -These capabilities are automatically permitted to the thread, -regardless of the thread's inheritable capabilities. 
-.TP -.IR Inheritable " (formerly known as " allowed ): -This set is ANDed with the thread's inheritable set to determine which -inheritable capabilities are enabled in the permitted set of -the thread after the -.BR execve (2). -.TP -.IR Effective : -This is not a set, but rather just a single bit. -If this bit is set, then during an -.BR execve (2) -all of the new permitted capabilities for the thread are -also raised in the effective set. -If this bit is not set, then after an -.BR execve (2), -none of the new permitted capabilities is in the new effective set. -.IP -Enabling the file effective capability bit implies -that any file permitted or inheritable capability that causes a -thread to acquire the corresponding permitted capability during an -.BR execve (2) -(see -.I Transformation of capabilities during execve() -below) will also acquire that -capability in its effective set. -Therefore, when assigning capabilities to a file -.RB ( setcap (8), -.BR cap_set_file (3), -.BR cap_set_fd (3)), -if we specify the effective flag as being enabled for any capability, -then the effective flag must also be specified as enabled -for all other capabilities for which the corresponding permitted or -inheritable flag is enabled. -.\" -.SS File capability extended attribute versioning -To allow extensibility, -the kernel supports a scheme to encode a version number inside the -.I security.capability -extended attribute that is used to implement file capabilities. -These version numbers are internal to the implementation, -and not directly visible to user-space applications. -To date, the following versions are supported: -.TP -.B VFS_CAP_REVISION_1 -This was the original file capability implementation, -which supported 32-bit masks for file capabilities. 
-.TP -.BR VFS_CAP_REVISION_2 " (since Linux 2.6.25)" -.\" commit e338d263a76af78fe8f38a72131188b58fceb591 -This version allows for file capability masks that are 64 bits in size, -and was necessary as the number of supported capabilities grew beyond 32. -The kernel transparently continues to support the execution of files -that have 32-bit version 1 capability masks, -but when adding capabilities to files that did not previously -have capabilities, or modifying the capabilities of existing files, -it automatically uses the version 2 scheme -(or possibly the version 3 scheme, as described below). -.TP -.BR VFS_CAP_REVISION_3 " (since Linux 4.14)" -.\" commit 8db6c34f1dbc8e06aa016a9b829b06902c3e1340 -Version 3 file capabilities are provided -to support namespaced file capabilities (described below). -.IP -As with version 2 file capabilities, -version 3 capability masks are 64 bits in size. -But in addition, the root user ID of namespace is encoded in the -.I security.capability -extended attribute. -(A namespace's root user ID is the value that user ID 0 -inside that namespace maps to in the initial user namespace.) -.IP -Version 3 file capabilities are designed to coexist -with version 2 capabilities; -that is, on a modern Linux system, -there may be some files with version 2 capabilities -while others have version 3 capabilities. -.P -Before Linux 4.14, -the only kind of file capability extended attribute -that could be attached to a file was a -.B VFS_CAP_REVISION_2 -attribute. -Since Linux 4.14, -the version of the -.I security.capability -extended attribute that is attached to a file -depends on the circumstances in which the attribute was created. -.P -Starting with Linux 4.14, a -.I security.capability -extended attribute is automatically created as (or converted to) -a version 3 -.RB ( VFS_CAP_REVISION_3 ) -attribute if both of the following are true: -.IP \[bu] 3 -The thread writing the attribute resides in a noninitial user namespace. 
-(More precisely: the thread resides in a user namespace other -than the one from which the underlying filesystem was mounted.) -.IP \[bu] -The thread has the -.B CAP_SETFCAP -capability over the file inode, -meaning that (a) the thread has the -.B CAP_SETFCAP -capability in its own user namespace; -and (b) the UID and GID of the file inode have mappings in -the writer's user namespace. -.P -When a -.B VFS_CAP_REVISION_3 -.I security.capability -extended attribute is created, the root user ID of the creating thread's -user namespace is saved in the extended attribute. -.P -By contrast, creating or modifying a -.I security.capability -extended attribute from a privileged -.RB ( CAP_SETFCAP ) -thread that resides in the -namespace where the underlying filesystem was mounted -(this normally means the initial user namespace) -automatically results in the creation of a version 2 -.RB ( VFS_CAP_REVISION_2 ) -attribute. -.P -Note that the creation of a version 3 -.I security.capability -extended attribute is automatic. -That is to say, when a user-space application writes -.RB ( setxattr (2)) -a -.I security.capability -attribute in the version 2 format, -the kernel will automatically create a version 3 attribute -if the attribute is created in the circumstances described above. -Correspondingly, when a version 3 -.I security.capability -attribute is retrieved -.RB ( getxattr (2)) -by a process that resides inside a user namespace that was created by the -root user ID (or a descendant of that user namespace), -the returned attribute is (automatically) -simplified to appear as a version 2 attribute -(i.e., the returned value is the size of a version 2 attribute and does -not include the root user ID). -These automatic translations mean that no changes are required to -user-space tools (e.g., -.BR setcap (1) -and -.BR getcap (1)) -in order for those tools to be used to create and retrieve version 3 -.I security.capability -attributes. 
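As a rough illustration of the three attribute versions described above, the following sketch decodes the raw bytes of a `security.capability` value. The field layout and the constant values are assumptions taken from the kernel's `linux/capability.h` header, not from the man page text: a little-endian 32-bit magic word, then one or two pairs of 32-bit permitted/inheritable words, then a trailing 32-bit root user ID for version 3. The function name `decode_security_capability` is hypothetical.

```python
import struct

# Constants as defined in the kernel's linux/capability.h (assumed here).
VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001
REVISIONS = {0x01000000: 1, 0x02000000: 2, 0x03000000: 3}


def decode_security_capability(raw: bytes) -> dict:
    """Decode a raw security.capability value into its component fields."""
    if len(raw) < 4:
        raise ValueError("attribute too short")
    (magic_etc,) = struct.unpack_from("<I", raw, 0)
    version = REVISIONS.get(magic_etc & VFS_CAP_REVISION_MASK)
    if version is None:
        raise ValueError("unknown file capability revision")

    # Version 1 carries 32-bit masks; versions 2 and 3 carry 64-bit masks
    # stored as two 32-bit halves, low half first.
    words = 1 if version == 1 else 2
    expected = 4 + 8 * words + (4 if version == 3 else 0)
    if len(raw) != expected:
        raise ValueError("unexpected attribute size for this revision")

    permitted = inheritable = 0
    for i in range(words):
        perm, inh = struct.unpack_from("<II", raw, 4 + 8 * i)
        permitted |= perm << (32 * i)
        inheritable |= inh << (32 * i)

    # Only version 3 records the root user ID of the creating namespace.
    rootid = None
    if version == 3:
        (rootid,) = struct.unpack_from("<I", raw, 4 + 8 * words)

    return {
        "version": version,
        "effective": bool(magic_etc & VFS_CAP_FLAGS_EFFECTIVE),
        "permitted": permitted,
        "inheritable": inheritable,
        "rootid": rootid,
    }
```

On Linux, the raw value could be read with `os.getxattr(path, "security.capability")`. As the text above notes, a reader inside the relevant user namespace may be handed a version 3 attribute in version 2 form, so the decoded version does not always reflect what is stored on disk.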
-.P -Note that a file can have either a version 2 or a version 3 -.I security.capability -extended attribute associated with it, but not both: -creation or modification of the -.I security.capability -extended attribute will automatically modify the version -according to the circumstances in which the extended attribute is -created or modified. -.\" -.SS Transformation of capabilities during execve() -During an -.BR execve (2), -the kernel calculates the new capabilities of -the process using the following algorithm: -.P -.in +4n -.EX -P'(ambient) = (file is privileged) ? 0 : P(ambient) -\& -P'(permitted) = (P(inheritable) & F(inheritable)) | - (F(permitted) & P(bounding)) | P'(ambient) -\& -P'(effective) = F(effective) ? P'(permitted) : P'(ambient) -\& -P'(inheritable) = P(inheritable) [i.e., unchanged] -\& -P'(bounding) = P(bounding) [i.e., unchanged] -.EE -.in -.P -where: -.RS 4 -.TP -P() -denotes the value of a thread capability set before the -.BR execve (2) -.TP -P'() -denotes the value of a thread capability set after the -.BR execve (2) -.TP -F() -denotes a file capability set -.RE -.P -Note the following details relating to the above capability -transformation rules: -.IP \[bu] 3 -The ambient capability set is present only since Linux 4.3. -When determining the transformation of the ambient set during -.BR execve (2), -a privileged file is one that has capabilities or -has the set-user-ID or set-group-ID bit set. -.IP \[bu] -Prior to Linux 2.6.25, -the bounding set was a system-wide attribute shared by all threads. -That system-wide value was employed to calculate the new permitted set during -.BR execve (2) -in the same manner as shown above for -.IR P(bounding) . -.P -.IR Note : -during the capability transitions described above, -file capabilities may be ignored (treated as empty) for the same reasons -that the set-user-ID and set-group-ID bits are ignored; see -.BR execve (2). 
-File capabilities are similarly ignored if the kernel was booted with the -.I no_file_caps -option. -.P -.IR Note : -according to the rules above, -if a process with nonzero user IDs performs an -.BR execve (2) -then any capabilities that are present in -its permitted and effective sets will be cleared. -For the treatment of capabilities when a process with a -user ID of zero performs an -.BR execve (2), -see -.I Capabilities and execution of programs by root -below. -.\" -.SS Safety checking for capability-dumb binaries -A capability-dumb binary is an application that has been -marked to have file capabilities, but has not been converted to use the -.BR libcap (3) -API to manipulate its capabilities. -(In other words, this is a traditional set-user-ID-root program -that has been switched to use file capabilities, -but whose code has not been modified to understand capabilities.) -For such applications, -the effective capability bit is set on the file, -so that the file permitted capabilities are automatically -enabled in the process effective set when executing the file. -The kernel recognizes a file which has the effective capability bit set -as capability-dumb for the purpose of the check described here. -.P -When executing a capability-dumb binary, -the kernel checks if the process obtained all permitted capabilities -that were specified in the file permitted set, -after the capability transformations described above have been performed. -(The typical reason why this might -.I not -occur is that the capability bounding set masked out some -of the capabilities in the file permitted set.) -If the process did not obtain the full set of -file permitted capabilities, then -.BR execve (2) -fails with the error -.BR EPERM . -This prevents possible security risks that could arise when -a capability-dumb application is executed with less privilege than it needs. 
-Note that, by definition, -the application could not itself recognize this problem, -since it does not employ the -.BR libcap (3) -API. -.\" -.SS Capabilities and execution of programs by root -.\" See cap_bprm_set_creds(), bprm_caps_from_vfs_cap() and -.\" handle_privileged_root() in security/commoncap.c (Linux 5.0 source) -In order to mirror traditional UNIX semantics, -the kernel performs special treatment of file capabilities when -a process with UID 0 (root) executes a program and -when a set-user-ID-root program is executed. -.P -After having performed any changes to the process effective ID that -were triggered by the set-user-ID mode bit of the binary\[em]e.g., -switching the effective user ID to 0 (root) because -a set-user-ID-root program was executed\[em]the -kernel calculates the file capability sets as follows: -.IP (1) 5 -If the real or effective user ID of the process is 0 (root), -then the file inheritable and permitted sets are ignored; -instead they are notionally considered to be all ones -(i.e., all capabilities enabled). -(There is one exception to this behavior, described in -.I Set-user-ID-root programs that have file capabilities -below.) -.IP (2) -If the effective user ID of the process is 0 (root) or -the file effective bit is in fact enabled, -then the file effective bit is notionally defined to be one (enabled). -.P -These notional values for the file's capability sets are then used -as described above to calculate the transformation of the process's -capabilities during -.BR execve (2). 
-.P -Thus, when a process with nonzero UIDs -.BR execve (2)s -a set-user-ID-root program that does not have capabilities attached, -or when a process whose real and effective UIDs are zero -.BR execve (2)s -a program, the calculation of the process's new -permitted capabilities simplifies to: -.P -.in +4n -.EX -P'(permitted) = P(inheritable) | P(bounding) -\& -P'(effective) = P'(permitted) -.EE -.in -.P -Consequently, the process gains all capabilities in its permitted and -effective capability sets, -except those masked out by the capability bounding set. -(In the calculation of P'(permitted), -the P'(ambient) term can be simplified away because it is by -definition a subset of P(inheritable).) -.P -The special treatments of user ID 0 (root) described in this subsection -can be disabled using the securebits mechanism described below. -.\" -.\" -.SS Set-user-ID-root programs that have file capabilities -There is one exception to the behavior described in -.I Capabilities and execution of programs by root -above. -If (a) the binary that is being executed has capabilities attached and -(b) the real user ID of the process is -.I not -0 (root) and -(c) the effective user ID of the process -.I is -0 (root), then the file capability bits are honored -(i.e., they are not notionally considered to be all ones). -The usual way in which this situation can arise is when executing -a set-user-ID-root program that also has file capabilities. -When such a program is executed, -the process gains just the capabilities granted by the program -(i.e., not all capabilities, -as would occur when executing a set-user-ID-root program -that does not have any associated file capabilities). -.P -Note that one can assign empty capability sets to a program file, -and thus it is possible to create a set-user-ID-root program that -changes the effective and saved set-user-ID of the process -that executes the program to 0, -but confers no capabilities to that process. 
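The execve() transformation and the root-specific rules above can be modelled on plain integer bit masks. The following is a minimal sketch, not kernel code: the names `FileCaps`, `ThreadCaps`, `notional_file_sets`, and `execve_transform` are hypothetical, the 64-bit `ALL` mask is only an illustrative stand-in for "all ones", and the securebits flags and the capability-dumb check are deliberately left out.

```python
from dataclasses import dataclass

ALL = (1 << 64) - 1  # illustrative "all ones" mask (assumption, not a kernel constant)


@dataclass
class FileCaps:
    permitted: int = 0
    inheritable: int = 0
    effective: bool = False  # the single file effective bit
    has_xattr: bool = False  # True if file capabilities are attached (possibly empty)
    set_id: bool = False     # set-user-ID or set-group-ID mode bit


@dataclass
class ThreadCaps:
    permitted: int = 0
    inheritable: int = 0
    effective: int = 0
    bounding: int = ALL
    ambient: int = 0


def notional_file_sets(f, ruid, euid):
    """Apply root rules (1) and (2); euid is the value after any set-user-ID switch."""
    f_perm, f_inh, f_eff = f.permitted, f.inheritable, f.effective
    # Exception: file capabilities attached, real UID nonzero, effective UID 0.
    honor_file_caps = f.has_xattr and ruid != 0 and euid == 0
    if (ruid == 0 or euid == 0) and not honor_file_caps:
        f_perm = f_inh = ALL
    if euid == 0:
        f_eff = True
    return f_perm, f_inh, f_eff


def execve_transform(p, f, ruid, euid):
    """Return the thread capability sets after execve(), per the formulas above."""
    f_perm, f_inh, f_eff = notional_file_sets(f, ruid, euid)
    privileged_file = f.has_xattr or f.set_id
    ambient = 0 if privileged_file else p.ambient
    permitted = (p.inheritable & f_inh) | (f_perm & p.bounding) | ambient
    effective = permitted if f_eff else ambient
    return ThreadCaps(permitted, p.inheritable, effective, p.bounding, ambient)
```

For instance, calling `execve_transform(ThreadCaps(), FileCaps(has_xattr=True, set_id=True), 1000, 0)` models a set-user-ID-root program with empty file capability sets and yields an empty permitted set, matching the last paragraph above.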
-.\" -.SS Capability bounding set -The capability bounding set is a security mechanism that can be used -to limit the capabilities that can be gained during an -.BR execve (2). -The bounding set is used in the following ways: -.IP \[bu] 3 -During an -.BR execve (2), -the capability bounding set is ANDed with the file permitted -capability set, and the result of this operation is assigned to the -thread's permitted capability set. -The capability bounding set thus places a limit on the permitted -capabilities that may be granted by an executable file. -.IP \[bu] -(Since Linux 2.6.25) -The capability bounding set acts as a limiting superset for -the capabilities that a thread can add to its inheritable set using -.BR capset (2). -This means that if a capability is not in the bounding set, -then a thread can't add this capability to its -inheritable set, even if it was in its permitted capabilities, -and thereby cannot have this capability preserved in its -permitted set when it -.BR execve (2)s -a file that has the capability in its inheritable set. -.P -Note that the bounding set masks the file permitted capabilities, -but not the inheritable capabilities. -If a thread maintains a capability in its inheritable set -that is not in its bounding set, -then it can still gain that capability in its permitted set -by executing a file that has the capability in its inheritable set. -.P -Depending on the kernel version, the capability bounding set is either -a system-wide attribute, or a per-process attribute. -.P -.B "Capability bounding set from Linux 2.6.25 onward" -.P -From Linux 2.6.25, the -.I "capability bounding set" -is a per-thread attribute. -(The system-wide capability bounding set described below no longer exists.) -.P -The bounding set is inherited at -.BR fork (2) -from the thread's parent, and is preserved across an -.BR execve (2). 
-.P -A thread may remove capabilities from its capability bounding set using the -.BR prctl (2) -.B PR_CAPBSET_DROP -operation, provided it has the -.B CAP_SETPCAP -capability. -Once a capability has been dropped from the bounding set, -it cannot be restored to that set. -A thread can determine if a capability is in its bounding set using the -.BR prctl (2) -.B PR_CAPBSET_READ -operation. -.P -Removing capabilities from the bounding set is supported only if file -capabilities are compiled into the kernel. -Before Linux 2.6.33, -file capabilities were an optional feature configurable via the -.B CONFIG_SECURITY_FILE_CAPABILITIES -option. -Since Linux 2.6.33, -.\" commit b3a222e52e4d4be77cc4520a57af1a4a0d8222d1 -the configuration option has been removed -and file capabilities are always part of the kernel. -When file capabilities are compiled into the kernel, the -.B init -process (the ancestor of all processes) begins with a full bounding set. -If file capabilities are not compiled into the kernel, then -.B init -begins with a full bounding set minus -.BR CAP_SETPCAP , -because this capability has a different meaning when there are -no file capabilities. -.P -Removing a capability from the bounding set does not remove it -from the thread's inheritable set. -However, it does prevent the capability from being added -back into the thread's inheritable set in the future. -.P -.B "Capability bounding set prior to Linux 2.6.25" -.P -Before Linux 2.6.25, the capability bounding set was a system-wide -attribute that affected all threads on the system. -The bounding set was accessible via the file -.IR /proc/sys/kernel/cap\-bound . -(Confusingly, this bit mask parameter was expressed as a -signed decimal number in -.IR /proc/sys/kernel/cap\-bound .) -.P -Only the -.B init -process could set capabilities in the capability bounding set; -other than that, the superuser (more precisely: a process with the -.B CAP_SYS_MODULE -capability) could only clear capabilities from this set. 
-.P -On a standard system the capability bounding set always masks out the -.B CAP_SETPCAP -capability. -To remove this restriction (dangerous!), modify the definition of -.B CAP_INIT_EFF_SET -in -.I include/linux/capability.h -and rebuild the kernel. -.P -The system-wide capability bounding set feature was added -to Linux 2.2.11. -.\" -.\" -.\" -.SS Effect of user ID changes on capabilities -To preserve the traditional semantics for transitions between -0 and nonzero user IDs, -the kernel makes the following changes to a thread's capability -sets on changes to the thread's real, effective, saved set, -and filesystem user IDs (using -.BR setuid (2), -.BR setresuid (2), -or similar): -.IP \[bu] 3 -If one or more of the real, effective, or saved set user IDs -was previously 0, and as a result of the UID changes all of these IDs -have a nonzero value, -then all capabilities are cleared from the permitted, effective, and ambient -capability sets. -.IP \[bu] -If the effective user ID is changed from 0 to nonzero, -then all capabilities are cleared from the effective set. -.IP \[bu] -If the effective user ID is changed from nonzero to 0, -then the permitted set is copied to the effective set. -.IP \[bu] -If the filesystem user ID is changed from 0 to nonzero (see -.BR setfsuid (2)), -then the following capabilities are cleared from the effective set: -.BR CAP_CHOWN , -.BR CAP_DAC_OVERRIDE , -.BR CAP_DAC_READ_SEARCH , -.BR CAP_FOWNER , -.BR CAP_FSETID , -.B CAP_LINUX_IMMUTABLE -(since Linux 2.6.30), -.BR CAP_MAC_OVERRIDE , -and -.B CAP_MKNOD -(since Linux 2.6.30). -If the filesystem UID is changed from nonzero to 0, -then any of these capabilities that are enabled in the permitted set -are enabled in the effective set. 
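The four rules above can be sketched as a pure function over bit masks. This is an illustrative model only: `apply_uid_change` and `FS_MASK` are hypothetical names, the capability bit numbers are assumed from the kernel's capability header, and the `SECBIT_KEEP_CAPS` and `SECBIT_NO_SETUID_FIXUP` flags are not modelled.

```python
# Bit numbers assumed from the kernel header: CAP_CHOWN=0, CAP_DAC_OVERRIDE=1,
# CAP_DAC_READ_SEARCH=2, CAP_FOWNER=3, CAP_FSETID=4, CAP_LINUX_IMMUTABLE=9,
# CAP_MKNOD=27, CAP_MAC_OVERRIDE=32.
FS_MASK = sum(1 << n for n in (0, 1, 2, 3, 4, 9, 27, 32))


def apply_uid_change(caps, old, new):
    """Return adjusted capability sets after a UID change.

    caps: dict with integer masks "permitted", "effective", "ambient".
    old, new: dicts with "ruid", "euid", "suid", "fsuid" before and after.
    """
    c = dict(caps)
    ids = ("ruid", "euid", "suid")
    # Rule 1: at least one ID was 0 and now all are nonzero.
    if any(old[k] == 0 for k in ids) and all(new[k] != 0 for k in ids):
        c["permitted"] = c["effective"] = c["ambient"] = 0
    # Rule 2: effective UID changes from 0 to nonzero.
    if old["euid"] == 0 and new["euid"] != 0:
        c["effective"] = 0
    # Rule 3: effective UID changes from nonzero to 0.
    if old["euid"] != 0 and new["euid"] == 0:
        c["effective"] = c["permitted"]
    # Rule 4: filesystem UID transitions affect only the filesystem-related bits.
    if old["fsuid"] == 0 and new["fsuid"] != 0:
        c["effective"] &= ~FS_MASK
    if old["fsuid"] != 0 and new["fsuid"] == 0:
        c["effective"] |= c["permitted"] & FS_MASK
    return c
```

In this model, a root process that changes only its effective and filesystem UIDs keeps its permitted set and can regain its effective set by switching the effective UID back to 0.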
-.P -If a thread that has a 0 value for one or more of its user IDs wants -to prevent its permitted capability set being cleared when it resets -all of its user IDs to nonzero values, it can do so using the -.B SECBIT_KEEP_CAPS -securebits flag described below. -.\" -.SS Programmatically adjusting capability sets -A thread can retrieve and change its permitted, effective, and inheritable -capability sets using the -.BR capget (2) -and -.BR capset (2) -system calls. -However, the use of -.BR cap_get_proc (3) -and -.BR cap_set_proc (3), -both provided in the -.I libcap -package, -is preferred for this purpose. -The following rules govern changes to the thread capability sets: -.IP \[bu] 3 -If the caller does not have the -.B CAP_SETPCAP -capability, -the new inheritable set must be a subset of the combination -of the existing inheritable and permitted sets. -.IP \[bu] -(Since Linux 2.6.25) -The new inheritable set must be a subset of the combination of the -existing inheritable set and the capability bounding set. -.IP \[bu] -The new permitted set must be a subset of the existing permitted set -(i.e., it is not possible to acquire permitted capabilities -that the thread does not currently have). -.IP \[bu] -The new effective set must be a subset of the new permitted set. -.SS The securebits flags: establishing a capabilities-only environment -.\" For some background: -.\" see http://lwn.net/Articles/280279/ and -.\" http://article.gmane.org/gmane.linux.kernel.lsm/5476/ -Starting with Linux 2.6.26, -and with a kernel in which file capabilities are enabled, -Linux implements a set of per-thread -.I securebits -flags that can be used to disable special handling of capabilities for UID 0 -.RI ( root ). -These flags are as follows: -.TP -.B SECBIT_KEEP_CAPS -Setting this flag allows a thread that has one or more 0 UIDs to retain -capabilities in its permitted set -when it switches all of its UIDs to nonzero values. 
-If this flag is not set, -then such a UID switch causes the thread to lose all permitted capabilities. -This flag is always cleared on an -.BR execve (2). -.IP -Note that even with the -.B SECBIT_KEEP_CAPS -flag set, the effective capabilities of a thread are cleared when it -switches its effective UID to a nonzero value. -However, -if the thread has set this flag and its effective UID is already nonzero, -and the thread subsequently switches all other UIDs to nonzero values, -then the effective capabilities will not be cleared. -.IP -The setting of the -.B SECBIT_KEEP_CAPS -flag is ignored if the -.B SECBIT_NO_SETUID_FIXUP -flag is set. -(The latter flag provides a superset of the effect of the former flag.) -.IP -This flag provides the same functionality as the older -.BR prctl (2) -.B PR_SET_KEEPCAPS -operation. -.TP -.B SECBIT_NO_SETUID_FIXUP -Setting this flag stops the kernel from adjusting the process's -permitted, effective, and ambient capability sets when -the thread's effective and filesystem UIDs are switched between -zero and nonzero values. -See -.I Effect of user ID changes on capabilities -above. -.TP -.B SECBIT_NOROOT -If this bit is set, then the kernel does not grant capabilities -when a set-user-ID-root program is executed, or when a process with -an effective or real UID of 0 calls -.BR execve (2). -(See -.I Capabilities and execution of programs by root -above.) -.TP -.B SECBIT_NO_CAP_AMBIENT_RAISE -Setting this flag disallows raising ambient capabilities via the -.BR prctl (2) -.B PR_CAP_AMBIENT_RAISE -operation. -.P -Each of the above "base" flags has a companion "locked" flag. -Setting any of the "locked" flags is irreversible, -and has the effect of preventing further changes to the -corresponding "base" flag. -The locked flags are: -.BR SECBIT_KEEP_CAPS_LOCKED , -.BR SECBIT_NO_SETUID_FIXUP_LOCKED , -.BR SECBIT_NOROOT_LOCKED , -and -.BR SECBIT_NO_CAP_AMBIENT_RAISE_LOCKED . 
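The relationship between each "base" flag and its "locked" companion can be illustrated with a small model. This is a sketch under assumptions: the numeric bit positions are assumed from the kernel's securebits header (base flag at an even bit, its lock at the next bit), and `update_securebits` is a hypothetical helper that mimics only the locking rule, not the privilege check.

```python
# Assumed bit layout: each lock bit sits directly above its base bit.
SECBIT_NOROOT = 1 << 0
SECBIT_NOROOT_LOCKED = 1 << 1
SECBIT_NO_SETUID_FIXUP = 1 << 2
SECBIT_NO_SETUID_FIXUP_LOCKED = 1 << 3
SECBIT_KEEP_CAPS = 1 << 4
SECBIT_KEEP_CAPS_LOCKED = 1 << 5
SECBIT_NO_CAP_AMBIENT_RAISE = 1 << 6
SECBIT_NO_CAP_AMBIENT_RAISE_LOCKED = 1 << 7

ALL_LOCKS = (SECBIT_NOROOT_LOCKED | SECBIT_NO_SETUID_FIXUP_LOCKED |
             SECBIT_KEEP_CAPS_LOCKED | SECBIT_NO_CAP_AMBIENT_RAISE_LOCKED)


def update_securebits(old, new):
    """Return new if the change respects existing locks, else raise PermissionError."""
    locked = old & ALL_LOCKS
    # A base flag whose lock is set must keep its current value.
    if (locked >> 1) & (old ^ new):
        raise PermissionError("a locked base flag would change")
    # Setting a lock is irreversible.
    if locked & ~new:
        raise PermissionError("a locked flag cannot be cleared")
    return new
```

Under this model, once `SECBIT_NOROOT | SECBIT_NOROOT_LOCKED` is in effect, any later value that clears either bit is rejected, while unrelated unlocked flags can still be changed.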
-.P -The -.I securebits -flags can be modified and retrieved using the -.BR prctl (2) -.B PR_SET_SECUREBITS -and -.B PR_GET_SECUREBITS -operations. -The -.B CAP_SETPCAP -capability is required to modify the flags. -Note that the -.B SECBIT_* -constants are available only after including the -.I <linux/securebits.h> -header file. -.P -The -.I securebits -flags are inherited by child processes. -During an -.BR execve (2), -all of the flags are preserved, except -.B SECBIT_KEEP_CAPS -which is always cleared. -.P -An application can use the following call to lock itself, -and all of its descendants, -into an environment where the only way of gaining capabilities -is by executing a program with associated file capabilities: -.P -.in +4n -.EX -prctl(PR_SET_SECUREBITS, - /* SECBIT_KEEP_CAPS off */ - SECBIT_KEEP_CAPS_LOCKED | - SECBIT_NO_SETUID_FIXUP | - SECBIT_NO_SETUID_FIXUP_LOCKED | - SECBIT_NOROOT | - SECBIT_NOROOT_LOCKED); - /* Setting/locking SECBIT_NO_CAP_AMBIENT_RAISE - is not required */ -.EE -.in -.\" -.\" -.SS Per-user-namespace \[dq]set-user-ID-root\[dq] programs -A set-user-ID program whose UID matches the UID that -created a user namespace will confer capabilities -in the process's permitted and effective sets -when executed by any process inside that namespace -or any descendant user namespace. -.P -The rules about the transformation of the process's capabilities during the -.BR execve (2) -are exactly as described in -.I Transformation of capabilities during execve() -and -.I Capabilities and execution of programs by root -above, -with the difference that, in the latter subsection, "root" -is the UID of the creator of the user namespace. -.\" -.\" -.SS Namespaced file capabilities -.\" commit 8db6c34f1dbc8e06aa016a9b829b06902c3e1340 -Traditional (i.e., version 2) file capabilities associate -only a set of capability masks with a binary executable file. 
-When a process executes a binary with such capabilities, -it gains the associated capabilities (within its user namespace) -as per the rules described in -.I Transformation of capabilities during execve() -above. -.P -Because version 2 file capabilities confer capabilities to -the executing process regardless of which user namespace it resides in, -only privileged processes are permitted to associate capabilities with a file. -Here, "privileged" means a process that has the -.B CAP_SETFCAP -capability in the user namespace where the filesystem was mounted -(normally the initial user namespace). -This limitation renders file capabilities useless for certain use cases. -For example, in user-namespaced containers, -it can be desirable to be able to create a binary that -confers capabilities only to processes executed inside that container, -but not to processes that are executed outside the container. -.P -Linux 4.14 added so-called namespaced file capabilities -to support such use cases. -Namespaced file capabilities are recorded as version 3 (i.e., -.BR VFS_CAP_REVISION_3 ) -.I security.capability -extended attributes. -Such an attribute is automatically created in the circumstances described -in -.I File capability extended attribute versioning -above. -When a version 3 -.I security.capability -extended attribute is created, -the kernel records not just the capability masks in the extended attribute, -but also the namespace root user ID. -.P -As with a binary that has -.B VFS_CAP_REVISION_2 -file capabilities, a binary with -.B VFS_CAP_REVISION_3 -file capabilities confers capabilities to a process during -.BR execve (2). -However, capabilities are conferred only if the binary is executed by -a process that resides in a user namespace whose -UID 0 maps to the root user ID that is saved in the extended attribute, -or when executed by a process that resides in a descendant of such a namespace. 
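To make the difference between version 2 and version 3 attributes concrete, the raw bytes of a security.capability attribute can be decoded as follows. This is a sketch under assumptions that the text above does not spell out: the byte layout (a little-endian 32-bit magic/flags word, then pairs of 32-bit permitted and inheritable words, then a 32-bit root user ID for version 3) and the constant values are assumed from the kernel's capability header, and `decode_security_capability` is a hypothetical helper name.

```python
import struct

VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001
# Revision magic -> (version number, count of 32-bit words per mask); assumed values.
REVISIONS = {
    0x01000000: (1, 1),
    0x02000000: (2, 2),
    0x03000000: (3, 2),
}


def decode_security_capability(raw):
    """Decode a raw security.capability attribute value into a dict."""
    (magic_etc,) = struct.unpack_from("<I", raw, 0)
    version, words = REVISIONS[magic_etc & VFS_CAP_REVISION_MASK]
    permitted = inheritable = 0
    for i in range(words):
        # Each 32-bit slice stores the permitted word, then the inheritable word.
        perm, inh = struct.unpack_from("<II", raw, 4 + 8 * i)
        permitted |= perm << (32 * i)
        inheritable |= inh << (32 * i)
    rootid = None
    if version == 3:
        # Version 3 appends the namespace root user ID after the masks.
        (rootid,) = struct.unpack_from("<I", raw, 4 + 8 * words)
    return {
        "version": version,
        "effective": bool(magic_etc & VFS_CAP_FLAGS_EFFECTIVE),
        "permitted": permitted,
        "inheritable": inheritable,
        "rootid": rootid,
    }
```

On a real system the bytes could be obtained with `os.getxattr(path, "security.capability")`; as noted earlier, a reader inside the matching user namespace would normally see the attribute in its simplified version 2 form, so `rootid` would be `None` there.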
-.\" -.\" -.SS Interaction with user namespaces -For further information on the interaction of -capabilities and user namespaces, see -.BR user_namespaces (7). -.SH STANDARDS -No standards govern capabilities, but the Linux capability implementation -is based on the withdrawn -.UR https://archive.org\:/details\:/posix_1003.1e\-990310 -POSIX.1e draft standard -.UE . -.SH NOTES -When attempting to -.BR strace (1) -binaries that have capabilities (or set-user-ID-root binaries), -you may find the -.I \-u <username> -option useful. -Something like: -.P -.in +4n -.EX -$ \fBsudo strace \-o trace.log \-u ceci ./myprivprog\fP -.EE -.in -.P -From Linux 2.5.27 to Linux 2.6.26, -.\" commit 5915eb53861c5776cfec33ca4fcc1fd20d66dd27 removed -.\" CONFIG_SECURITY_CAPABILITIES -capabilities were an optional kernel component, -and could be enabled/disabled via the -.B CONFIG_SECURITY_CAPABILITIES -kernel configuration option. -.P -The -.IR /proc/ pid /task/TID/status -file can be used to view the capability sets of a thread. -The -.IR /proc/ pid /status -file shows the capability sets of a process's main thread. -Before Linux 3.8, nonexistent capabilities were shown as being -enabled (1) in these sets. -Since Linux 3.8, -.\" 7b9a7ec565505699f503b4fcf61500dceb36e744 -all nonexistent capabilities (above -.BR CAP_LAST_CAP ) -are shown as disabled (0). -.P -The -.I libcap -package provides a suite of routines for setting and -getting capabilities that is more comfortable and less likely -to change than the interface provided by -.BR capset (2) -and -.BR capget (2). -This package also provides the -.BR setcap (8) -and -.BR getcap (8) -programs. -It can be found at -.br -.UR https://git.kernel.org\:/pub\:/scm\:/libs\:/libcap\:/libcap.git\:/refs/ -.UE . -.P -Before Linux 2.6.24, and from Linux 2.6.24 to Linux 2.6.32 if -file capabilities are not enabled, a thread with the -.B CAP_SETPCAP -capability can manipulate the capabilities of threads other than itself. 
-However, this is only theoretically possible, -since no thread ever has -.B CAP_SETPCAP -in either of these cases: -.IP \[bu] 3 -In the pre-2.6.25 implementation the system-wide capability bounding set, -.IR /proc/sys/kernel/cap\-bound , -always masks out the -.B CAP_SETPCAP -capability, and this cannot be changed -without modifying the kernel source and rebuilding the kernel. -.IP \[bu] -If file capabilities are disabled (i.e., the kernel -.B CONFIG_SECURITY_FILE_CAPABILITIES -option is disabled), then -.B init -starts out with the -.B CAP_SETPCAP -capability removed from its per-process bounding -set, and that bounding set is inherited by all other processes -created on the system. -.SH SEE ALSO -.BR capsh (1), -.BR setpriv (1), -.BR prctl (2), -.BR setfsuid (2), -.BR cap_clear (3), -.BR cap_copy_ext (3), -.BR cap_from_text (3), -.BR cap_get_file (3), -.BR cap_get_proc (3), -.BR cap_init (3), -.BR capgetp (3), -.BR capsetp (3), -.BR libcap (3), -.BR proc (5), -.BR credentials (7), -.BR pthreads (7), -.BR user_namespaces (7), -.BR captest (8), \" from libcap-ng -.BR filecap (8), \" from libcap-ng -.BR getcap (8), -.BR getpcaps (8), -.BR netcap (8), \" from libcap-ng -.BR pscap (8), \" from libcap-ng -.BR setcap (8) -.P -.I include/linux/capability.h -in the Linux kernel source tree