Diffstat (limited to 'man2/openat2.2')
-rw-r--r-- man2/openat2.2 582
1 files changed, 582 insertions, 0 deletions
diff --git a/man2/openat2.2 b/man2/openat2.2
new file mode 100644
index 0000000..b98bbaf
--- /dev/null
+++ b/man2/openat2.2
@@ -0,0 +1,582 @@
+.\" Copyright (C) 2019 Aleksa Sarai <cyphar@cyphar.com>
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.TH openat2 2 2023-04-23 "Linux man-pages 6.05.01"
+.SH NAME
+openat2 \- open and possibly create a file (extended)
+.SH LIBRARY
+Standard C library
+.RI ( libc ", " \-lc )
+.SH SYNOPSIS
+.nf
+.BR "#include <fcntl.h>" \
+" /* Definition of " O_* " and " S_* " constants */"
+.BR "#include <linux/openat2.h>" " /* Definition of " RESOLVE_* " constants */"
+.BR "#include <sys/syscall.h>" " /* Definition of " SYS_* " constants */"
+.B #include <unistd.h>
+.PP
+.BI "long syscall(SYS_openat2, int " dirfd ", const char *" pathname ,
+.BI "             struct open_how *" how ", size_t " size );
+.fi
+.PP
+.IR Note :
+glibc provides no wrapper for
+.BR openat2 (),
+necessitating the use of
+.BR syscall (2).
+.SH DESCRIPTION
+The
+.BR openat2 ()
+system call is an extension of
+.BR openat (2)
+and provides a superset of its functionality.
+.PP
+The
+.BR openat2 ()
+system call opens the file specified by
+.IR pathname .
+If the specified file does not exist, it may optionally (if
+.B O_CREAT
+is specified in
+.IR how.flags )
+be created.
+.PP
+As with
+.BR openat (2),
+if
+.I pathname
+is a relative pathname, then it is interpreted relative to the
+directory referred to by the file descriptor
+.I dirfd
+(or the current working directory of the calling process, if
+.I dirfd
+is the special value
+.BR AT_FDCWD ).
+If
+.I pathname
+is an absolute pathname, then
+.I dirfd
+is ignored (unless
+.I how.resolve
+contains
+.BR RESOLVE_IN_ROOT ,
+in which case
+.I pathname
+is resolved relative to
+.IR dirfd ).
+.PP
+Rather than taking a single
+.I flags
+argument, an extensible structure (\fIhow\fP) is passed to allow for
+future extensions.
+The
+.I size
+argument must be specified as
+.IR "sizeof(struct open_how)" .
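The calling convention described above can be sketched as follows. This is a minimal illustration, not an official wrapper: the helper name open_readonly_sketch is hypothetical, and the choice of O_RDONLY | O_CLOEXEC is only an example. The structure is zero-filled by the initializer and its size is passed explicitly.

```c
#define _GNU_SOURCE
#include <errno.h>          /* errno, for callers inspecting failures */
#include <fcntl.h>          /* O_* constants, AT_FDCWD */
#include <linux/openat2.h>  /* struct open_how, RESOLVE_* constants */
#include <sys/syscall.h>    /* SYS_openat2 */
#include <unistd.h>         /* syscall(), close() */

/* Hypothetical helper (not part of any library): open pathname read-only,
 * relative to dirfd. Returns a file descriptor, or -1 with errno set. */
static int open_readonly_sketch(int dirfd, const char *pathname)
{
    /* The initializer zero-fills every field that is not named. */
    struct open_how how = { .flags = O_RDONLY | O_CLOEXEC };

    /* size is always sizeof(struct open_how) as seen at compile time. */
    return (int) syscall(SYS_openat2, dirfd, pathname, &how, sizeof(how));
}
```

A caller might use it as open_readonly_sketch(AT_FDCWD, "some/relative/path") and check for \-1 as with any other system call.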
+.\"
+.SS The open_how structure
+The
+.I how
+argument specifies how
+.I pathname
+should be opened, and acts as a superset of the
+.I flags
+and
+.I mode
+arguments to
+.BR openat (2).
+This argument is a pointer to an
+.I open_how
+structure,
+described in
+.BR open_how (2type).
+.PP
+Any future extensions to
+.BR openat2 ()
+will be implemented as new fields appended to the
+.I open_how
+structure,
+with a zero value in a new field resulting in the kernel behaving
+as though that extension field was not present.
+Therefore, the caller
+.I must
+zero-fill this structure on
+initialization.
+(See the "Extensibility" subsection of
+.B NOTES
+for more detail on why this is necessary.)
+.PP
+The fields of the
+.I open_how
+structure are as follows:
+.TP
+.I flags
+This field specifies
+the file creation and file status flags to use when opening the file.
+All of the
+.B O_*
+flags defined for
+.BR openat (2)
+are valid
+.BR openat2 ()
+flag values.
+.IP
+Whereas
+.BR openat (2)
+ignores unknown bits in its
+.I flags
+argument,
+.BR openat2 ()
+returns an error if unknown or conflicting flags are specified in
+.IR how.flags .
+.TP
+.I mode
+This field specifies the
+mode for the new file, with identical semantics to the
+.I mode
+argument of
+.BR openat (2).
+.IP
+Whereas
+.BR openat (2)
+ignores bits other than those in the range
+.I 07777
+in its
+.I mode
+argument,
+.BR openat2 ()
+returns an error if
+.I how.mode
+contains bits other than
+.IR 07777 .
+Similarly, an error is returned if
+.BR openat2 ()
+is called with a nonzero
+.I how.mode
+and
+.I how.flags
+does not contain
+.B O_CREAT
+or
+.BR O_TMPFILE .
+.TP
+.I resolve
+This is a bit-mask of flags that modify the way in which
+.B all
+components of
+.I pathname
+will be resolved.
+(See
+.BR path_resolution (7)
+for background information.)
+.IP
+The primary use case for these flags is to allow trusted programs to restrict
+how untrusted paths (or paths inside untrusted directories) are resolved.
+The full list of
+.I resolve
+flags is as follows:
+.RS
+.TP
+.B RESOLVE_BENEATH
+.\" commit adb21d2b526f7f196b2f3fdca97d80ba05dd14a0
+Do not permit the path resolution to succeed if any component of the resolution
+is not a descendant of the directory indicated by
+.IR dirfd .
+This causes absolute symbolic links (and absolute values of
+.IR pathname )
+to be rejected.
+.IP
+Currently, this flag also disables magic-link resolution (see below).
+However, this may change in the future.
+Therefore, to ensure that magic links are not resolved,
+the caller should explicitly specify
+.BR RESOLVE_NO_MAGICLINKS .
+.TP
+.B RESOLVE_IN_ROOT
+.\" commit 8db52c7e7ee1bd861b6096fcafc0fe7d0f24a994
+Treat the directory referred to by
+.I dirfd
+as the root directory while resolving
+.IR pathname .
+Absolute symbolic links are interpreted relative to
+.IR dirfd .
+If a prefix component of
+.I pathname
+equates to
+.IR dirfd ,
+then an immediately following
+.I ..\&
+component likewise equates to
+.I dirfd
+(just as
+.I /..\&
+is traditionally equivalent to
+.IR / ).
+If
+.I pathname
+is an absolute path, it is also interpreted relative to
+.IR dirfd .
+.IP
+The effect of this flag is as though the calling process had used
+.BR chroot (2)
+to (temporarily) modify its root directory (to the directory
+referred to by
+.IR dirfd ).
+However, unlike
+.BR chroot (2)
+(which changes the filesystem root permanently for a process),
+.B RESOLVE_IN_ROOT
+allows a program to efficiently restrict path resolution on a per-open basis.
+.IP
+Currently, this flag also disables magic-link resolution.
+However, this may change in the future.
+Therefore, to ensure that magic links are not resolved,
+the caller should explicitly specify
+.BR RESOLVE_NO_MAGICLINKS .
+.TP
+.B RESOLVE_NO_MAGICLINKS
+.\" commit 278121417a72d87fb29dd8c48801f80821e8f75a
+Disallow all magic-link resolution during path resolution.
+.IP
+Magic links are symbolic link-like objects that are most notably found in
+.BR proc (5);
+examples include
+.IR /proc/ pid /exe
+and
+.IR /proc/ pid /fd/* .
+(See
+.BR symlink (7)
+for more details.)
+.IP
+Unknowingly opening magic links can be risky for some applications.
+Examples of such risks include the following:
+.RS
+.IP \[bu] 3
+If the process opening a pathname is a controlling process that
+currently has no controlling terminal (see
+.BR credentials (7)),
+then opening a magic link inside
+.IR /proc/ pid /fd
+that happens to refer to a terminal
+would cause the process to acquire a controlling terminal.
+.IP \[bu]
+.\" From https://lwn.net/Articles/796868/:
+.\" The presence of this flag will prevent a path lookup operation
+.\" from traversing through one of these magic links, thus blocking
+.\" (for example) attempts to escape from a container via a /proc
+.\" entry for an open file descriptor.
+In a containerized environment,
+a magic link inside
+.I /proc
+may refer to an object outside the container,
+and thus may provide a means to escape from the container.
+.RE
+.IP
+Because of such risks,
+an application may prefer to disable magic link resolution using the
+.B RESOLVE_NO_MAGICLINKS
+flag.
+.IP
+If the trailing component (i.e., basename) of
+.I pathname
+is a magic link,
+.I how.resolve
+contains
+.BR RESOLVE_NO_MAGICLINKS ,
+and
+.I how.flags
+contains both
+.B O_PATH
+and
+.BR O_NOFOLLOW ,
+then an
+.B O_PATH
+file descriptor referencing the magic link will be returned.
+.TP
+.B RESOLVE_NO_SYMLINKS
+.\" commit 278121417a72d87fb29dd8c48801f80821e8f75a
+Disallow resolution of symbolic links during path resolution.
+This option implies
+.BR RESOLVE_NO_MAGICLINKS .
+.IP
+If the trailing component (i.e., basename) of
+.I pathname
+is a symbolic link,
+.I how.resolve
+contains
+.BR RESOLVE_NO_SYMLINKS ,
+and
+.I how.flags
+contains both
+.B O_PATH
+and
+.BR O_NOFOLLOW ,
+then an
+.B O_PATH
+file descriptor referencing the symbolic link will be returned.
+.IP
+Note that the effect of the
+.B RESOLVE_NO_SYMLINKS
+flag,
+which affects the treatment of symbolic links in all of the components of
+.IR pathname ,
+differs from the effect of the
+.B O_NOFOLLOW
+file creation flag (in
+.IR how.flags ),
+which affects the handling of symbolic links only in the final component of
+.IR pathname .
+.IP
+Applications that employ the
+.B RESOLVE_NO_SYMLINKS
+flag are encouraged to make its use configurable
+(unless it is used for a specific security purpose),
+as symbolic links are very widely used by end-users.
+Setting this flag indiscriminately\[em]i.e.,
+for purposes not specifically related to security\[em]for all uses of
+.BR openat2 ()
+may result in spurious errors on previously functional systems.
+This may occur if, for example,
+a system pathname that is used by an application is modified
+(e.g., in a new distribution release)
+so that a pathname component (now) contains a symbolic link.
+.TP
+.B RESOLVE_NO_XDEV
+.\" commit 72ba29297e1439efaa54d9125b866ae9d15df339
+Disallow traversal of mount points during path resolution (including all bind
+mounts).
+Consequently,
+.I pathname
+must either be on the same mount as the directory referred to by
+.IR dirfd ,
+or on the same mount as the current working directory if
+.I dirfd
+is specified as
+.BR AT_FDCWD .
+.IP
+Applications that employ the
+.B RESOLVE_NO_XDEV
+flag are encouraged to make its use configurable (unless it is
+used for a specific security purpose),
+as bind mounts are widely used by end-users.
+Setting this flag indiscriminately\[em]i.e.,
+for purposes not specifically related to security\[em]for all uses of
+.BR openat2 ()
+may result in spurious errors on previously functional systems.
+This may occur if, for example,
+a system pathname that is used by an application is modified
+(e.g., in a new distribution release)
+so that a pathname component (now) contains a bind mount.
+.TP
+.B RESOLVE_CACHED
+Make the open operation fail unless all path components are already present
+in the kernel's lookup cache.
+If any kind of revalidation or I/O is needed to satisfy the lookup,
+.BR openat2 ()
+fails with the error
+.BR EAGAIN .
+This is useful in providing a fast-path open that can be performed without
+resorting to thread offload, or other mechanisms that an application might
+use to offload slower operations.
+.RE
+.IP
+If any bits other than those listed above are set in
+.IR how.resolve ,
+an error is returned.
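As an illustration of the resolve flags listed above, the following sketch opens an untrusted path so that resolution cannot leave the directory referred to by dirfd. The helper name open_beneath_sketch is hypothetical. RESOLVE_NO_MAGICLINKS is requested explicitly, following the recommendation in the description of RESOLVE_BENEATH.

```c
#define _GNU_SOURCE
#include <errno.h>          /* errno, EXDEV */
#include <fcntl.h>          /* O_* constants, open() */
#include <linux/openat2.h>  /* struct open_how, RESOLVE_* constants */
#include <sys/syscall.h>    /* SYS_openat2 */
#include <unistd.h>         /* syscall(), close() */

/* Hypothetical helper: open an untrusted path read-only such that no
 * component of the resolution may escape the directory behind dirfd.
 * Returns a file descriptor, or -1 with errno set. */
static int open_beneath_sketch(int dirfd, const char *untrusted)
{
    struct open_how how = {
        .flags   = O_RDONLY | O_CLOEXEC,
        /* Do not rely on RESOLVE_BENEATH implying the second flag. */
        .resolve = RESOLVE_BENEATH | RESOLVE_NO_MAGICLINKS,
    };

    return (int) syscall(SYS_openat2, dirfd, untrusted, &how, sizeof(how));
}
```

With dirfd referring to some directory, a pathname such as "../other" or an absolute pathname is expected to fail with EXDEV, as described under ERRORS.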
+.SH RETURN VALUE
+On success, a new file descriptor is returned.
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+.SH ERRORS
+The set of errors returned by
+.BR openat2 ()
+includes all of the errors returned by
+.BR openat (2),
+as well as the following additional errors:
+.TP
+.B E2BIG
+An extension that this kernel does not support was specified in
+.IR how .
+(See the "Extensibility" section of
+.B NOTES
+for more detail on how extensions are handled.)
+.TP
+.B EAGAIN
+.I how.resolve
+contains either
+.B RESOLVE_IN_ROOT
+or
+.BR RESOLVE_BENEATH ,
+and the kernel could not ensure that a ".." component didn't escape (due to a
+race condition or potential attack).
+The caller may choose to retry the
+.BR openat2 ()
+call.
+.TP
+.B EAGAIN
+.B RESOLVE_CACHED
+was set, and the open operation cannot be performed using only cached
+information.
+The caller should retry without
+.B RESOLVE_CACHED
+set in
+.IR how.resolve .
+.TP
+.B EINVAL
+An unknown flag or invalid value was specified in
+.IR how .
+.TP
+.B EINVAL
+.I how.mode
+is nonzero, but
+.I how.flags
+does not contain
+.B O_CREAT
+or
+.BR O_TMPFILE .
+.TP
+.B EINVAL
+.I size
+was smaller than any known version of
+.IR "struct open_how" .
+.TP
+.B ELOOP
+.I how.resolve
+contains
+.BR RESOLVE_NO_SYMLINKS ,
+and one of the path components was a symbolic link (or magic link).
+.TP
+.B ELOOP
+.I how.resolve
+contains
+.BR RESOLVE_NO_MAGICLINKS ,
+and one of the path components was a magic link.
+.TP
+.B EXDEV
+.I how.resolve
+contains either
+.B RESOLVE_IN_ROOT
+or
+.BR RESOLVE_BENEATH ,
+and an escape from the root during path resolution was detected.
+.TP
+.B EXDEV
+.I how.resolve
+contains
+.BR RESOLVE_NO_XDEV ,
+and a path component crosses a mount point.
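The EAGAIN entry for RESOLVE_CACHED above suggests a two-step pattern: attempt a cached-only open first, then retry without the flag. The sketch below shows that pattern under some assumptions: the helper name is hypothetical, the fallback definition of RESOLVE_CACHED is only for builds against older kernel headers, and the retry is done inline for brevity, whereas a real application would probably hand the slow path to a worker thread.

```c
#define _GNU_SOURCE
#include <errno.h>          /* errno, EAGAIN */
#include <fcntl.h>          /* O_* constants, AT_FDCWD */
#include <linux/openat2.h>  /* struct open_how, RESOLVE_* constants */
#include <sys/syscall.h>    /* SYS_openat2 */
#include <unistd.h>         /* syscall(), close() */

#ifndef RESOLVE_CACHED
#define RESOLVE_CACHED 0x20 /* fallback for older headers (assumption) */
#endif

/* Hypothetical helper: try a cached-only lookup, and fall back to a
 * normal (possibly blocking) lookup if the kernel reports EAGAIN.
 * Returns a file descriptor, or -1 with errno set. */
static int open_cached_then_blocking_sketch(int dirfd, const char *pathname)
{
    struct open_how how = {
        .flags   = O_RDONLY | O_CLOEXEC,
        .resolve = RESOLVE_CACHED,
    };
    int fd = (int) syscall(SYS_openat2, dirfd, pathname, &how, sizeof(how));

    if (fd == -1 && errno == EAGAIN) {
        /* Slow path: clear the flag and let the kernel do the I/O. */
        how.resolve &= ~(unsigned long long) RESOLVE_CACHED;
        fd = (int) syscall(SYS_openat2, dirfd, pathname, &how, sizeof(how));
    }
    return fd;
}
```

On kernels that predate RESOLVE_CACHED, the first call is expected to fail with EINVAL rather than EAGAIN, so this sketch returns that error unchanged.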
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.6.
+.\" commit fddb5d430ad9fa91b49b1d34d0202ffe2fa0e179
+.PP
+The semantics of
+.B RESOLVE_BENEATH
+were modeled after FreeBSD's
+.BR O_BENEATH .
+.SH NOTES
+.SS Extensibility
+In order to allow for future extensibility,
+.BR openat2 ()
+requires the user-space application to specify the size of the
+.I open_how
+structure that it is passing.
+By providing this information, it is possible for
+.BR openat2 ()
+to provide both forwards- and backwards-compatibility, with
+.I size
+acting as an implicit version number.
+(Because new extension fields will always
+be appended, the structure size will always increase.)
+This extensibility design is very similar to other system calls such as
+.BR sched_setattr (2),
+.BR perf_event_open (2),
+and
+.BR clone3 (2).
+.PP
+If we let
+.I usize
+be the size of the structure as specified by the user-space application, and
+.I ksize
+be the size of the structure which the kernel supports, then there are
+three cases to consider:
+.IP \[bu] 3
+If
+.I ksize
+equals
+.IR usize ,
+then there is no version mismatch and
+.I how
+can be used verbatim.
+.IP \[bu]
+If
+.I ksize
+is larger than
+.IR usize ,
+then there are some extension fields that the kernel supports
+which the user-space application
+is unaware of.
+Because a zero value in any added extension field signifies a no-op,
+the kernel
+treats all of the extension fields not provided by the user-space application
+as having zero values.
+This provides backwards-compatibility.
+.IP \[bu]
+If
+.I ksize
+is smaller than
+.IR usize ,
+then there are some extension fields which the user-space application
+is aware of but which the kernel does not support.
+Because any extension field must have its zero values signify a no-op,
+the kernel can
+safely ignore the unsupported extension fields if they are all-zero.
+If any unsupported extension fields are nonzero, then \-1 is returned and
+.I errno
+is set to
+.BR E2BIG .
+This provides forwards-compatibility.
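The three cases above can be modeled in user space as follows. This is an illustrative sketch of the rules as stated, not the kernel's implementation; the function name copy_struct_sketch is hypothetical, and it returns \-E2BIG directly instead of setting errno.

```c
#include <errno.h>   /* E2BIG */
#include <stddef.h>  /* size_t */
#include <string.h>  /* memcpy(), memset() */

/* User-space model of the size handling described above.
 * dst:  buffer of ksize bytes (the structure the "kernel" knows about)
 * src:  buffer of usize bytes (the structure the caller passed)
 * Returns 0 on success, or -E2BIG if the caller set a nonzero byte in a
 * part of the structure that the "kernel" does not support. */
static int copy_struct_sketch(void *dst, size_t ksize,
                              const void *src, size_t usize)
{
    const unsigned char *s = src;
    size_t common = usize < ksize ? usize : ksize;

    /* usize > ksize: unsupported trailing fields must all be zero. */
    for (size_t i = ksize; i < usize; i++) {
        if (s[i] != 0)
            return -E2BIG;
    }

    /* Copy the part that both sides understand. */
    memcpy(dst, src, common);

    /* ksize > usize: fields the caller did not provide read as zero. */
    if (ksize > usize)
        memset((unsigned char *) dst + usize, 0, ksize - usize);

    return 0;
}
```

When ksize equals usize, neither the loop nor the memset() does anything, and the structure is copied verbatim.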
+.PP
+Because the definition of
+.I struct open_how
+may change in the future (with new fields being added when system headers are
+updated), user-space applications should zero-fill
+.I struct open_how
+to ensure that recompiling the program with new headers will not result in
+spurious errors at run time.
+The simplest way is to use a designated
+initializer:
+.PP
+.in +4n
+.EX
+struct open_how how = { .flags = O_RDWR,
+                        .resolve = RESOLVE_IN_ROOT };
+.EE
+.in
+.PP
+or explicitly using
+.BR memset (3)
+or similar:
+.PP
+.in +4n
+.EX
+struct open_how how;
+memset(&how, 0, sizeof(how));
+how.flags = O_RDWR;
+how.resolve = RESOLVE_IN_ROOT;
+.EE
+.in
+.PP
+A user-space application that wishes to determine which extensions
+the running kernel supports can do so by conducting a binary search on
+.I size
+with a structure which has every byte nonzero (to find the largest value
+which doesn't produce an error of
+.BR E2BIG ).
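The probing technique just described might look like the sketch below. Several details are assumptions made for illustration: the function names are hypothetical, the lower bound of 24 bytes assumes the initial structure consists of three 64-bit fields, and the upper bound of 4096 bytes is an arbitrary search limit. Calls that do not fail with E2BIG are expected to fail for other reasons (such as EINVAL, because the all-ones buffer contains unknown flags), which is harmless for the search.

```c
#define _GNU_SOURCE
#include <errno.h>        /* errno, E2BIG, ENOSYS */
#include <fcntl.h>        /* AT_FDCWD */
#include <string.h>       /* memset() */
#include <sys/syscall.h>  /* SYS_openat2 */
#include <unistd.h>       /* syscall(), close() */

#define PROBE_MIN_SIZE 24    /* assumed size of the initial open_how */
#define PROBE_MAX_SIZE 4096  /* assumed upper bound for the search */

/* Returns nonzero if the kernel rejects this size with E2BIG. */
static int size_too_big_sketch(const unsigned char *buf, size_t size)
{
    long fd = syscall(SYS_openat2, AT_FDCWD, ".", buf, size);

    if (fd >= 0) {           /* not expected, but avoid leaking an fd */
        close((int) fd);
        return 0;
    }
    return errno == E2BIG;
}

/* Hypothetical probe: returns the largest size that does not produce
 * E2BIG, or -1 if openat2() is unavailable (ENOSYS). */
static long probe_open_how_size_sketch(void)
{
    unsigned char buf[PROBE_MAX_SIZE];
    size_t lo = PROBE_MIN_SIZE, hi = PROBE_MAX_SIZE;

    memset(buf, 0xff, sizeof(buf));  /* every byte nonzero */

    if (syscall(SYS_openat2, AT_FDCWD, ".", buf, lo) == -1 &&
        errno == ENOSYS)
        return -1;

    /* Invariant: size lo is accepted; sizes above hi yield E2BIG. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;

        if (size_too_big_sketch(buf, mid))
            hi = mid - 1;
        else
            lo = mid;
    }
    return (long) lo;
}
```

The returned value corresponds to ksize in the discussion above; comparing it against the sizes of known structure versions indicates which extension fields the running kernel accepts.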
+.SH SEE ALSO
+.BR openat (2),
+.BR open_how (2type),
+.BR path_resolution (7),
+.BR symlink (7)