Diffstat (limited to 'man2/open.2')
-rw-r--r-- man2/open.2 1941
1 files changed, 0 insertions, 1941 deletions
diff --git a/man2/open.2 b/man2/open.2
deleted file mode 100644
index a6982fa..0000000
--- a/man2/open.2
+++ /dev/null
@@ -1,1941 +0,0 @@
-.\" This manpage is Copyright (C) 1992 Drew Eckhardt;
-.\" and Copyright (C) 1993 Michael Haardt, Ian Jackson.
-.\" and Copyright (C) 2008 Greg Banks
-.\" and Copyright (C) 2006, 2008, 2013, 2014 Michael Kerrisk <mtk.manpages@gmail.com>
-.\"
-.\" SPDX-License-Identifier: Linux-man-pages-copyleft
-.\"
-.\" Modified 1993-07-21 by Rik Faith <faith@cs.unc.edu>
-.\" Modified 1994-08-21 by Michael Haardt
-.\" Modified 1996-04-13 by Andries Brouwer <aeb@cwi.nl>
-.\" Modified 1996-05-13 by Thomas Koenig
-.\" Modified 1996-12-20 by Michael Haardt
-.\" Modified 1999-02-19 by Andries Brouwer <aeb@cwi.nl>
-.\" Modified 1998-11-28 by Joseph S. Myers <jsm28@hermes.cam.ac.uk>
-.\" Modified 1999-06-03 by Michael Haardt
-.\" Modified 2002-05-07 by Michael Kerrisk <mtk.manpages@gmail.com>
-.\" Modified 2004-06-23 by Michael Kerrisk <mtk.manpages@gmail.com>
-.\" 2004-12-08, mtk, reordered flags list alphabetically
-.\" 2004-12-08, Martin Pool <mbp@sourcefrog.net> (& mtk), added O_NOATIME
-.\" 2007-09-18, mtk, Added description of O_CLOEXEC + other minor edits
-.\" 2008-01-03, mtk, with input from Trond Myklebust
-.\" <trond.myklebust@fys.uio.no> and Timo Sirainen <tss@iki.fi>
-.\" Rewrite description of O_EXCL.
-.\" 2008-01-11, Greg Banks <gnb@melbourne.sgi.com>: add more detail
-.\" on O_DIRECT.
-.\" 2008-02-26, Michael Haardt: Reorganized text for O_CREAT and mode
-.\"
-.\" FIXME . Apr 08: The next POSIX revision has O_EXEC, O_SEARCH, and
-.\" O_TTYINIT. Eventually these may need to be documented. --mtk
-.\"
-.TH open 2 2024-01-16 "Linux man-pages 6.7"
-.SH NAME
-open, openat, creat \- open and possibly create a file
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.B #include <fcntl.h>
-.P
-.BI "int open(const char *" pathname ", int " flags ", ..."
-.BI "           \fR/*\fP mode_t " mode " \fR*/\fP );"
-.P
-.BI "int creat(const char *" pathname ", mode_t " mode );
-.P
-.BI "int openat(int " dirfd ", const char *" pathname ", int " flags ", ..."
-.BI "           \fR/*\fP mode_t " mode " \fR*/\fP );"
-.P
-/* Documented separately, in \c
-.BR openat2 (2):\c
-\& */
-.BI "int openat2(int " dirfd ", const char *" pathname ,
-.BI "            const struct open_how *" how ", size_t " size );
-.fi
-.P
-.RS -4
-Feature Test Macro Requirements for glibc (see
-.BR feature_test_macros (7)):
-.RE
-.P
-.BR openat ():
-.nf
-    Since glibc 2.10:
-        _POSIX_C_SOURCE >= 200809L
-    Before glibc 2.10:
-        _ATFILE_SOURCE
-.fi
-.SH DESCRIPTION
-The
-.BR open ()
-system call opens the file specified by
-.IR pathname .
-If the specified file does not exist,
-it may optionally (if
-.B O_CREAT
-is specified in
-.IR flags )
-be created by
-.BR open ().
-.P
-The return value of
-.BR open ()
-is a file descriptor, a small, nonnegative integer that is an index
-to an entry in the process's table of open file descriptors.
-The file descriptor is used
-in subsequent system calls
-(\c
-.BR read (2),
-.BR write (2),
-.BR lseek (2),
-.BR fcntl (2),
-etc.)
-to refer to the open file.
-The file descriptor returned by a successful call will be
-the lowest-numbered file descriptor not currently open for the process.
-.P
-By default, the new file descriptor is set to remain open across an
-.BR execve (2)
-(i.e., the
-.B FD_CLOEXEC
-file descriptor flag described in
-.BR fcntl (2)
-is initially disabled); the
-.B O_CLOEXEC
-flag, described below, can be used to change this default.
-The file offset is set to the beginning of the file (see
-.BR lseek (2)).
-.P
-A call to
-.BR open ()
-creates a new
-.IR "open file description" ,
-an entry in the system-wide table of open files.
-The open file description records the file offset and the file status flags
-(see below).
-A file descriptor is a reference to an open file description;
-this reference is unaffected if
-.I pathname
-is subsequently removed or modified to refer to a different file.
-For further details on open file descriptions, see NOTES.
-.P
-The argument
-.I flags
-must include one of the following
-.IR "access modes" :
-.BR O_RDONLY ", " O_WRONLY ", or " O_RDWR .
-These request opening the file read-only, write-only, or read/write,
-respectively.
-.P
-In addition, zero or more file creation flags and file status flags
-can be
-bitwise ORed
-in
-.IR flags .
-The
-.I file creation flags
-are
-.BR O_CLOEXEC ,
-.BR O_CREAT ,
-.BR O_DIRECTORY ,
-.BR O_EXCL ,
-.BR O_NOCTTY ,
-.BR O_NOFOLLOW ,
-.BR O_TMPFILE ,
-and
-.BR O_TRUNC .
-The
-.I file status flags
-are all of the remaining flags listed below.
-.\" SUSv4 divides the flags into:
-.\" * Access mode
-.\" * File creation
-.\" * File status
-.\" * Other (O_CLOEXEC, O_DIRECTORY, O_NOFOLLOW)
-.\" though it's not clear what the difference between "other" and
-.\" "File creation" flags is. I raised an Aardvark to see if this
-.\" can be clarified in SUSv4; 10 Oct 2008.
-.\" http://thread.gmane.org/gmane.comp.standards.posix.austin.general/64/focus=67
-.\" TC1 (balloted in 2013), resolved this, so that those three constants
-.\" are also categorized" as file status flags.
-.\"
-The distinction between these two groups of flags is that
-the file creation flags affect the semantics of the open operation itself,
-while the file status flags affect the semantics of subsequent I/O operations.
-The file status flags can be retrieved and (in some cases)
-modified; see
-.BR fcntl (2)
-for details.
-.P
-The full list of file creation flags and file status flags is as follows:
-.TP
-.B O_APPEND
-The file is opened in append mode.
-Before each
-.BR write (2),
-the file offset is positioned at the end of the file,
-as if with
-.BR lseek (2).
-The modification of the file offset and the write operation
-are performed as a single atomic step.
-.IP
-.B O_APPEND
-may lead to corrupted files on NFS filesystems if more than one process
-appends data to a file at once.
-.\" For more background, see
-.\" http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=453946
-.\" http://nfs.sourceforge.net/
-This is because NFS does not support
-appending to a file, so the client kernel has to simulate it, which
-can't be done without a race condition.
-.TP
-.B O_ASYNC
-Enable signal-driven I/O:
-generate a signal
-.RB ( SIGIO
-by default, but this can be changed via
-.BR fcntl (2))
-when input or output becomes possible on this file descriptor.
-This feature is available only for terminals, pseudoterminals,
-sockets, and (since Linux 2.6) pipes and FIFOs.
-See
-.BR fcntl (2)
-for further details.
-See also BUGS, below.
-.TP
-.BR O_CLOEXEC " (since Linux 2.6.23)"
-.\" NOTE! several other man pages refer to this text
-Enable the close-on-exec flag for the new file descriptor.
-.\" FIXME . for later review when Issue 8 is one day released...
-.\" POSIX proposes to fix many APIs that provide hidden FDs
-.\" http://austingroupbugs.net/tag_view_page.php?tag_id=8
-.\" http://austingroupbugs.net/view.php?id=368
-Specifying this flag permits a program to avoid additional
-.BR fcntl (2)
-.B F_SETFD
-operations to set the
-.B FD_CLOEXEC
-flag.
-.IP
-Note that the use of this flag is essential in some multithreaded programs,
-because using a separate
-.BR fcntl (2)
-.B F_SETFD
-operation to set the
-.B FD_CLOEXEC
-flag does not suffice to avoid race conditions
-where one thread opens a file descriptor and
-attempts to set its close-on-exec flag using
-.BR fcntl (2)
-at the same time as another thread does a
-.BR fork (2)
-plus
-.BR execve (2).
-Depending on the order of execution,
-the race may lead to the file descriptor returned by
-.BR open ()
-being unintentionally leaked to the program executed by the child process
-created by
-.BR fork (2).
-(This kind of race is in principle possible for any system call
-that creates a file descriptor whose close-on-exec flag should be set,
-and various other Linux system calls provide an equivalent of the
-.B O_CLOEXEC
-flag to deal with this problem.)
-.\" This flag fixes only one form of the race condition;
-.\" The race can also occur with, for example, file descriptors
-.\" returned by accept(), pipe(), etc.
-.TP
-.B O_CREAT
-If
-.I pathname
-does not exist, create it as a regular file.
-.IP
-The owner (user ID) of the new file is set to the effective user ID
-of the process.
-.IP
-The group ownership (group ID) of the new file is set either to
-the effective group ID of the process (System V semantics)
-or to the group ID of the parent directory (BSD semantics).
-On Linux, the behavior depends on whether the
-set-group-ID mode bit is set on the parent directory:
-if that bit is set, then BSD semantics apply;
-otherwise, System V semantics apply.
-For some filesystems, the behavior also depends on the
-.I bsdgroups
-and
-.I sysvgroups
-mount options described in
-.BR mount (8).
-.\" As at Linux 2.6.25, bsdgroups is supported by ext2, ext3, ext4, and
-.\" XFS (since Linux 2.6.14).
-.IP
-The
-.I mode
-argument specifies the file mode bits to be applied when a new file is created.
-If neither
-.B O_CREAT
-nor
-.B O_TMPFILE
-is specified in
-.IR flags ,
-then
-.I mode
-is ignored (and can thus be specified as 0, or simply omitted).
-The
-.I mode
-argument
-.B must
-be supplied if
-.B O_CREAT
-or
-.B O_TMPFILE
-is specified in
-.IR flags ;
-if it is not supplied,
-some arbitrary bytes from the stack will be applied as the file mode.
-.IP
-The effective mode is modified by the process's
-.I umask
-in the usual way: in the absence of a default ACL, the mode of the
-created file is
-.IR "(mode\ &\ \[ti]umask)" .
-.IP
-Note that
-.I mode
-applies only to future accesses of the
-newly created file; the
-.BR open ()
-call that creates a read-only file may well return a read/write
-file descriptor.
-.IP
-The following symbolic constants are provided for
-.IR mode :
-.RS
-.TP 9
-.B S_IRWXU
-00700 user (file owner) has read, write, and execute permission
-.TP
-.B S_IRUSR
-00400 user has read permission
-.TP
-.B S_IWUSR
-00200 user has write permission
-.TP
-.B S_IXUSR
-00100 user has execute permission
-.TP
-.B S_IRWXG
-00070 group has read, write, and execute permission
-.TP
-.B S_IRGRP
-00040 group has read permission
-.TP
-.B S_IWGRP
-00020 group has write permission
-.TP
-.B S_IXGRP
-00010 group has execute permission
-.TP
-.B S_IRWXO
-00007 others have read, write, and execute permission
-.TP
-.B S_IROTH
-00004 others have read permission
-.TP
-.B S_IWOTH
-00002 others have write permission
-.TP
-.B S_IXOTH
-00001 others have execute permission
-.RE
-.IP
-According to POSIX, the effect when other bits are set in
-.I mode
-is unspecified.
-On Linux, the following bits are also honored in
-.IR mode :
-.RS
-.TP 9
-.B S_ISUID
-0004000 set-user-ID bit
-.TP
-.B S_ISGID
-0002000 set-group-ID bit (see
-.BR inode (7)).
-.TP
-.B S_ISVTX
-0001000 sticky bit (see
-.BR inode (7)).
-.RE
-.TP
-.BR O_DIRECT " (since Linux 2.4.10)"
-Try to minimize cache effects of the I/O to and from this file.
-In general this will degrade performance, but it is useful in
-special situations, such as when applications do their own caching.
-File I/O is done directly to/from user-space buffers.
-The
-.B O_DIRECT
-flag on its own makes an effort to transfer data synchronously,
-but does not give the guarantees of the
-.B O_SYNC
-flag that data and necessary metadata are transferred.
-To guarantee synchronous I/O,
-.B O_SYNC
-must be used in addition to
-.BR O_DIRECT .
-See NOTES below for further discussion.
-.IP
-A semantically similar (but deprecated) interface for block devices
-is described in
-.BR raw (8).
-.TP
-.B O_DIRECTORY
-If \fIpathname\fP is not a directory, cause the open to fail.
-.\" But see the following and its replies:
-.\" http://marc.theaimsgroup.com/?t=112748702800001&r=1&w=2
-.\" [PATCH] open: O_DIRECTORY and O_CREAT together should fail
-.\" O_DIRECTORY | O_CREAT causes O_DIRECTORY to be ignored.
-This flag was added in Linux 2.1.126, to
-avoid denial-of-service problems if
-.BR opendir (3)
-is called on a
-FIFO or tape device.
-.TP
-.B O_DSYNC
-Write operations on the file will complete according to the requirements of
-synchronized I/O
-.I data
-integrity completion.
-.IP
-By the time
-.BR write (2)
-(and similar)
-return, the output data
-has been transferred to the underlying hardware,
-along with any file metadata that would be required to retrieve that data
-(i.e., as though each
-.BR write (2)
-was followed by a call to
-.BR fdatasync (2)).
-.IR "See NOTES below" .
-.TP
-.B O_EXCL
-Ensure that this call creates the file:
-if this flag is specified in conjunction with
-.BR O_CREAT ,
-and
-.I pathname
-already exists, then
-.BR open ()
-fails with the error
-.BR EEXIST .
-.IP
-When these two flags are specified, symbolic links are not followed:
-.\" POSIX.1-2001 explicitly requires this behavior.
-if
-.I pathname
-is a symbolic link, then
-.BR open ()
-fails regardless of where the symbolic link points.
-.IP
-In general, the behavior of
-.B O_EXCL
-is undefined if it is used without
-.BR O_CREAT .
-There is one exception: on Linux 2.6 and later,
-.B O_EXCL
-can be used without
-.B O_CREAT
-if
-.I pathname
-refers to a block device.
-If the block device is in use by the system (e.g., mounted),
-.BR open ()
-fails with the error
-.BR EBUSY .
-.IP
-On NFS,
-.B O_EXCL
-is supported only when using NFSv3 or later on kernel 2.6 or later.
-In NFS environments where
-.B O_EXCL
-support is not provided, programs that rely on it
-for performing locking tasks will contain a race condition.
-Portable programs that want to perform atomic file locking using a lockfile,
-and need to avoid reliance on NFS support for
-.BR O_EXCL ,
-can create a unique file on
-the same filesystem (e.g., incorporating hostname and PID), and use
-.BR link (2)
-to make a link to the lockfile.
-If
-.BR link (2)
-returns 0, the lock is successful.
-Otherwise, use
-.BR stat (2)
-on the unique file to check if its link count has increased to 2,
-in which case the lock is also successful.
-.TP
-.B O_LARGEFILE
-(LFS)
-Allow files whose sizes cannot be represented in an
-.I off_t
-(but can be represented in an
-.IR off64_t )
-to be opened.
-The
-.B _LARGEFILE64_SOURCE
-macro must be defined
-(before including
-.I any
-header files)
-in order to obtain this definition.
-Setting the
-.B _FILE_OFFSET_BITS
-feature test macro to 64 (rather than using
-.BR O_LARGEFILE )
-is the preferred
-method of accessing large files on 32-bit systems (see
-.BR feature_test_macros (7)).
-.TP
-.BR O_NOATIME " (since Linux 2.6.8)"
-Do not update the file last access time
-.RI ( st_atime
-in the inode)
-when the file is
-.BR read (2).
-.IP
-This flag can be employed only if one of the following conditions is true:
-.RS
-.IP \[bu] 3
-The effective UID of the process
-.\" Strictly speaking: the filesystem UID
-matches the owner UID of the file.
-.IP \[bu]
-The calling process has the
-.B CAP_FOWNER
-capability in its user namespace and
-the owner UID of the file has a mapping in the namespace.
-.RE
-.IP
-This flag is intended for use by indexing or backup programs,
-where its use can significantly reduce the amount of disk activity.
-This flag may not be effective on all filesystems.
-One example is NFS, where the server maintains the access time.
-.\" The O_NOATIME flag also affects the treatment of st_atime
-.\" by mmap() and readdir(2), MTK, Dec 04.
-.TP
-.B O_NOCTTY
-If
-.I pathname
-refers to a terminal device\[em]see
-.BR tty (4)\[em]it
-will not become the process's controlling terminal even if the
-process does not have one.
-.TP
-.B O_NOFOLLOW
-If the trailing component (i.e., basename) of
-.I pathname
-is a symbolic link, then the open fails, with the error
-.BR ELOOP .
-Symbolic links in earlier components of the pathname will still be
-followed.
-(Note that the
-.B ELOOP
-error that can occur in this case is indistinguishable from the case where
-an open fails because there are too many symbolic links found
-while resolving components in the prefix part of the pathname.)
-.IP
-This flag is a FreeBSD extension, which was added in Linux 2.1.126,
-and has subsequently been standardized in POSIX.1-2008.
-.IP
-See also
-.B O_PATH
-below.
-.\" The headers from glibc 2.0.100 and later include a
-.\" definition of this flag; \fIkernels before Linux 2.1.126 will ignore it if
-.\" used\fP.
-.TP
-.BR O_NONBLOCK " or " O_NDELAY
-When possible, the file is opened in nonblocking mode.
-Neither the
-.BR open ()
-nor any subsequent I/O operations on the file descriptor which is
-returned will cause the calling process to wait.
-.IP
-Note that the setting of this flag has no effect on the operation of
-.BR poll (2),
-.BR select (2),
-.BR epoll (7),
-and similar,
-since those interfaces merely inform the caller about whether
-a file descriptor is "ready",
-meaning that an I/O operation performed on
-the file descriptor with the
-.B O_NONBLOCK
-flag
-.I clear
-would not block.
-.IP
-Note that this flag has no effect for regular files and block devices;
-that is, I/O operations will (briefly) block when device activity
-is required, regardless of whether
-.B O_NONBLOCK
-is set.
-Since
-.B O_NONBLOCK
-semantics might eventually be implemented,
-applications should not depend upon blocking behavior
-when specifying this flag for regular files and block devices.
-.IP
-For the handling of FIFOs (named pipes), see also
-.BR fifo (7).
-For a discussion of the effect of
-.B O_NONBLOCK
-in conjunction with mandatory file locks and with file leases, see
-.BR fcntl (2).
-.TP
-.BR O_PATH " (since Linux 2.6.39)"
-.\" commit 1abf0c718f15a56a0a435588d1b104c7a37dc9bd
-.\" commit 326be7b484843988afe57566b627fb7a70beac56
-.\" commit 65cfc6722361570bfe255698d9cd4dccaf47570d
-.\"
-.\" http://thread.gmane.org/gmane.linux.man/2790/focus=3496
-.\" Subject: Re: [PATCH] open(2): document O_PATH
-.\" Newsgroups: gmane.linux.man, gmane.linux.kernel
-.\"
-Obtain a file descriptor that can be used for two purposes:
-to indicate a location in the filesystem tree and
-to perform operations that act purely at the file descriptor level.
-The file itself is not opened, and other file operations (e.g.,
-.BR read (2),
-.BR write (2),
-.BR fchmod (2),
-.BR fchown (2),
-.BR fgetxattr (2),
-.BR ioctl (2),
-.BR mmap (2))
-fail with the error
-.BR EBADF .
-.IP
-The following operations
-.I can
-be performed on the resulting file descriptor:
-.RS
-.IP \[bu] 3
-.BR close (2).
-.IP \[bu]
-.BR fchdir (2),
-if the file descriptor refers to a directory
-(since Linux 3.5).
-.\" commit 332a2e1244bd08b9e3ecd378028513396a004a24
-.IP \[bu]
-.BR fstat (2)
-(since Linux 3.6).
-.IP \[bu]
-.\" fstat(): commit 55815f70147dcfa3ead5738fd56d3574e2e3c1c2
-.BR fstatfs (2)
-(since Linux 3.12).
-.\" fstatfs(): commit 9d05746e7b16d8565dddbe3200faa1e669d23bbf
-.IP \[bu]
-Duplicating the file descriptor
-.RB ( dup (2),
-.BR fcntl (2)
-.BR F_DUPFD ,
-etc.).
-.IP \[bu]
-Getting and setting file descriptor flags
-.RB ( fcntl (2)
-.B F_GETFD
-and
-.BR F_SETFD ).
-.IP \[bu]
-Retrieving open file status flags using the
-.BR fcntl (2)
-.B F_GETFL
-operation: the returned flags will include the bit
-.BR O_PATH .
-.IP \[bu]
-Passing the file descriptor as the
-.I dirfd
-argument of
-.BR openat ()
-and the other "*at()" system calls.
-This includes
-.BR linkat (2)
-with
-.B AT_EMPTY_PATH
-(or via procfs using
-.BR AT_SYMLINK_FOLLOW )
-even if the file is not a directory.
-.IP \[bu]
-Passing the file descriptor to another process via a UNIX domain socket
-(see
-.B SCM_RIGHTS
-in
-.BR unix (7)).
-.RE
-.IP
-When
-.B O_PATH
-is specified in
-.IR flags ,
-flag bits other than
-.BR O_CLOEXEC ,
-.BR O_DIRECTORY ,
-and
-.B O_NOFOLLOW
-are ignored.
-.IP
-Opening a file or directory with the
-.B O_PATH
-flag requires no permissions on the object itself
-(but does require execute permission on the directories in the path prefix).
-Depending on the subsequent operation,
-a check for suitable file permissions may be performed (e.g.,
-.BR fchdir (2)
-requires execute permission on the directory referred to
-by its file descriptor argument).
-By contrast,
-obtaining a reference to a filesystem object by opening it with the
-.B O_RDONLY
-flag requires that the caller have read permission on the object,
-even when the subsequent operation (e.g.,
-.BR fchdir (2),
-.BR fstat (2))
-does not require read permission on the object.
-.IP
-If
-.I pathname
-is a symbolic link and the
-.B O_NOFOLLOW
-flag is also specified,
-then the call returns a file descriptor referring to the symbolic link.
-This file descriptor can be used as the
-.I dirfd
-argument in calls to
-.BR fchownat (2),
-.BR fstatat (2),
-.BR linkat (2),
-and
-.BR readlinkat (2)
-with an empty pathname to have the calls operate on the symbolic link.
-.IP
-If
-.I pathname
-refers to an automount point that has not yet been triggered, so no
-other filesystem is mounted on it, then the call returns a file
-descriptor referring to the automount directory without triggering a mount.
-.BR fstatfs (2)
-can then be used to determine if it is, in fact, an untriggered
-automount point
-.RB ( ".f_type == AUTOFS_SUPER_MAGIC" ).
-.IP
-One use of
-.B O_PATH
-for regular files is to provide the equivalent of POSIX.1's
-.B O_EXEC
-functionality.
-This permits us to open a file for which we have execute
-permission but not read permission, and then execute that file,
-with steps something like the following:
-.IP
-.in +4n
-.EX
-char buf[PATH_MAX];
-fd = open("some_prog", O_PATH);
-snprintf(buf, PATH_MAX, "/proc/self/fd/%d", fd);
-execl(buf, "some_prog", (char *) NULL);
-.EE
-.in
-.IP
-An
-.B O_PATH
-file descriptor can also be passed as the argument of
-.BR fexecve (3).
-.TP
-.B O_SYNC
-Write operations on the file will complete according to the requirements of
-synchronized I/O
-.I file
-integrity completion
-(by contrast with the
-synchronized I/O
-.I data
-integrity completion
-provided by
-.BR O_DSYNC .)
-.IP
-By the time
-.BR write (2)
-(or similar)
-returns, the output data and associated file metadata
-have been transferred to the underlying hardware
-(i.e., as though each
-.BR write (2)
-was followed by a call to
-.BR fsync (2)).
-.IR "See NOTES below" .
-.TP
-.BR O_TMPFILE " (since Linux 3.11)"
-.\" commit 60545d0d4610b02e55f65d141c95b18ccf855b6e
-.\" commit f4e0c30c191f87851c4a53454abb55ee276f4a7e
-.\" commit bb458c644a59dbba3a1fe59b27106c5e68e1c4bd
-Create an unnamed temporary regular file.
-The
-.I pathname
-argument specifies a directory;
-an unnamed inode will be created in that directory's filesystem.
-Anything written to the resulting file will be lost when
-the last file descriptor is closed, unless the file is given a name.
-.IP
-.B O_TMPFILE
-must be specified with one of
-.B O_RDWR
-or
-.B O_WRONLY
-and, optionally,
-.BR O_EXCL .
-If
-.B O_EXCL
-is not specified, then
-.BR linkat (2)
-can be used to link the temporary file into the filesystem, making it
-permanent, using code like the following:
-.IP
-.in +4n
-.EX
-char path[PATH_MAX];
-fd = open("/path/to/dir", O_TMPFILE | O_RDWR,
-                        S_IRUSR | S_IWUSR);
-\&
-/* File I/O on \[aq]fd\[aq]... */
-\&
-linkat(fd, "", AT_FDCWD, "/path/for/file", AT_EMPTY_PATH);
-\&
-/* If the caller doesn\[aq]t have the CAP_DAC_READ_SEARCH
-   capability (needed to use AT_EMPTY_PATH with linkat(2)),
-   and there is a proc(5) filesystem mounted, then the
-   linkat(2) call above can be replaced with:
-\&
-snprintf(path, PATH_MAX, "/proc/self/fd/%d", fd);
-linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file",
-                        AT_SYMLINK_FOLLOW);
-*/
-.EE
-.in
-.IP
-In this case,
-the
-.BR open ()
-.I mode
-argument determines the file permission mode, as with
-.BR O_CREAT .
-.IP
-Specifying
-.B O_EXCL
-in conjunction with
-.B O_TMPFILE
-prevents a temporary file from being linked into the filesystem
-in the above manner.
-(Note that the meaning of
-.B O_EXCL
-in this case is different from the meaning of
-.B O_EXCL
-otherwise.)
-.IP
-There are two main use cases for
-.\" Inspired by http://lwn.net/Articles/559147/
-.BR O_TMPFILE :
-.RS
-.IP \[bu] 3
-Improved
-.BR tmpfile (3)
-functionality: race-free creation of temporary files that
-(1) are automatically deleted when closed;
-(2) can never be reached via any pathname;
-(3) are not subject to symlink attacks; and
-(4) do not require the caller to devise unique names.
-.IP \[bu]
-Creating a file that is initially invisible, which is then populated
-with data and adjusted to have appropriate filesystem attributes
-.RB ( fchown (2),
-.BR fchmod (2),
-.BR fsetxattr (2),
-etc.)
-before being atomically linked into the filesystem
-in a fully formed state (using
-.BR linkat (2)
-as described above).
-.RE
-.IP
-.B O_TMPFILE
-requires support by the underlying filesystem;
-only a subset of Linux filesystems provide that support.
-In the initial implementation, support was provided in
-the ext2, ext3, ext4, UDF, Minix, and tmpfs filesystems.
-.\" To check for support, grep for "tmpfile" in kernel sources
-Support for other filesystems has subsequently been added as follows:
-XFS (Linux 3.15);
-.\" commit 99b6436bc29e4f10e4388c27a3e4810191cc4788
-.\" commit ab29743117f9f4c22ac44c13c1647fb24fb2bafe
-Btrfs (Linux 3.16);
-.\" commit ef3b9af50bfa6a1f02cd7b3f5124b712b1ba3e3c
-F2FS (Linux 3.16);
-.\" commit 50732df02eefb39ab414ef655979c2c9b64ad21c
-and ubifs (Linux 4.9)
-.TP
-.B O_TRUNC
-If the file already exists and is a regular file and the access mode allows
-writing (i.e., is
-.B O_RDWR
-or
-.BR O_WRONLY )
-it will be truncated to length 0.
-If the file is a FIFO or terminal device file, the
-.B O_TRUNC
-flag is ignored.
-Otherwise, the effect of
-.B O_TRUNC
-is unspecified.
-.SS creat()
-A call to
-.BR creat ()
-is equivalent to calling
-.BR open ()
-with
-.I flags
-equal to
-.BR O_CREAT|O_WRONLY|O_TRUNC .
-.SS openat()
-The
-.BR openat ()
-system call operates in exactly the same way as
-.BR open (),
-except for the differences described here.
-.P
-The
-.I dirfd
-argument is used in conjunction with the
-.I pathname
-argument as follows:
-.IP \[bu] 3
-If the pathname given in
-.I pathname
-is absolute, then
-.I dirfd
-is ignored.
-.IP \[bu]
-If the pathname given in
-.I pathname
-is relative and
-.I dirfd
-is the special value
-.BR AT_FDCWD ,
-then
-.I pathname
-is interpreted relative to the current working
-directory of the calling process (like
-.BR open ()).
-.IP \[bu]
-If the pathname given in
-.I pathname
-is relative, then it is interpreted relative to the directory
-referred to by the file descriptor
-.I dirfd
-(rather than relative to the current working directory of
-the calling process, as is done by
-.BR open ()
-for a relative pathname).
-In this case,
-.I dirfd
-must be a directory that was opened for reading
-.RB ( O_RDONLY )
-or using the
-.B O_PATH
-flag.
-.P
-If the pathname given in
-.I pathname
-is relative, and
-.I dirfd
-is not a valid file descriptor, an error
-.RB ( EBADF )
-results.
-(Specifying an invalid file descriptor number in
-.I dirfd
-can be used as a means to ensure that
-.I pathname
-is absolute.)
-.\"
-.SS openat2(2)
-The
-.BR openat2 (2)
-system call is an extension of
-.BR openat (),
-and provides a superset of the features of
-.BR openat ().
-It is documented separately, in
-.BR openat2 (2).
-.SH RETURN VALUE
-On success,
-.BR open (),
-.BR openat (),
-and
-.BR creat ()
-return the new file descriptor (a nonnegative integer).
-On error, \-1 is returned and
-.I errno
-is set to indicate the error.
-.SH ERRORS
-.BR open (),
-.BR openat (),
-and
-.BR creat ()
-can fail with the following errors:
-.TP
-.B EACCES
-The requested access to the file is not allowed, or search permission
-is denied for one of the directories in the path prefix of
-.IR pathname ,
-or the file did not exist yet and write access to the parent directory
-is not allowed.
-(See also
-.BR path_resolution (7).)
-.TP
-.B EACCES
-.\" commit 30aba6656f61ed44cba445a3c0d38b296fa9e8f5
-Where
-.B O_CREAT
-is specified, the
-.I protected_fifos
-or
-.I protected_regular
-sysctl is enabled, the file already exists and is a FIFO or regular file, the
-owner of the file is neither the current user nor the owner of the
-containing directory, and the containing directory is both world- or
-group-writable and sticky.
-For details, see the descriptions of
-.I /proc/sys/fs/protected_fifos
-and
-.I /proc/sys/fs/protected_regular
-in
-.BR proc (5).
-.TP
-.B EBADF
-.RB ( openat ())
-.I pathname
-is relative but
-.I dirfd
-is neither
-.B AT_FDCWD
-nor a valid file descriptor.
-.TP
-.B EBUSY
-.B O_EXCL
-was specified in
-.I flags
-and
-.I pathname
-refers to a block device that is in use by the system (e.g., it is mounted).
-.TP
-.B EDQUOT
-Where
-.B O_CREAT
-is specified, the file does not exist, and the user's quota of disk
-blocks or inodes on the filesystem has been exhausted.
-.TP
-.B EEXIST
-.I pathname
-already exists and
-.BR O_CREAT " and " O_EXCL
-were used.
-.TP
-.B EFAULT
-.I pathname
-points outside your accessible address space.
-.TP
-.B EFBIG
-See
-.BR EOVERFLOW .
-.TP
-.B EINTR
-While blocked waiting to complete an open of a slow device
-(e.g., a FIFO; see
-.BR fifo (7)),
-the call was interrupted by a signal handler; see
-.BR signal (7).
-.TP
-.B EINVAL
-The filesystem does not support the
-.B O_DIRECT
-flag.
-See
-.B NOTES
-for more information.
-.TP
-.B EINVAL
-Invalid value in
-.\" In particular, __O_TMPFILE instead of O_TMPFILE
-.IR flags .
-.TP
-.B EINVAL
-.B O_TMPFILE
-was specified in
-.IR flags ,
-but neither
-.B O_WRONLY
-nor
-.B O_RDWR
-was specified.
-.TP
-.B EINVAL
-.B O_CREAT
-was specified in
-.I flags
-and the final component ("basename") of the new file's
-.I pathname
-is invalid
-(e.g., it contains characters not permitted by the underlying filesystem).
-.TP
-.B EINVAL
-The final component ("basename") of
-.I pathname
-is invalid
-(e.g., it contains characters not permitted by the underlying filesystem).
-.TP
-.B EISDIR
-.I pathname
-refers to a directory and the access requested involved writing
-(that is,
-.B O_WRONLY
-or
-.B O_RDWR
-is set).
-.TP
-.B EISDIR
-.I pathname
-refers to an existing directory,
-.B O_TMPFILE
-and one of
-.B O_WRONLY
-or
-.B O_RDWR
-were specified in
-.IR flags ,
-but this kernel version does not provide the
-.B O_TMPFILE
-functionality.
-.TP
-.B ELOOP
-Too many symbolic links were encountered in resolving
-.IR pathname .
-.TP
-.B ELOOP
-.I pathname
-was a symbolic link, and
-.I flags
-specified
-.B O_NOFOLLOW
-but not
-.BR O_PATH .
-.TP
-.B EMFILE
-The per-process limit on the number of open file descriptors has been reached
-(see the description of
-.B RLIMIT_NOFILE
-in
-.BR getrlimit (2)).
-.TP
-.B ENAMETOOLONG
-.I pathname
-was too long.
-.TP
-.B ENFILE
-The system-wide limit on the total number of open files has been reached.
-.TP
-.B ENODEV
-.I pathname
-refers to a device special file and no corresponding device exists.
-(This is a Linux kernel bug; in this situation
-.B ENXIO
-must be returned.)
-.TP
-.B ENOENT
-.B O_CREAT
-is not set and the named file does not exist.
-.TP
-.B ENOENT
-A directory component in
-.I pathname
-does not exist or is a dangling symbolic link.
-.TP
-.B ENOENT
-.I pathname
-refers to a nonexistent directory,
-.B O_TMPFILE
-and one of
-.B O_WRONLY
-or
-.B O_RDWR
-were specified in
-.IR flags ,
-but this kernel version does not provide the
-.B O_TMPFILE
-functionality.
-.TP
-.B ENOMEM
-The named file is a FIFO,
-but memory for the FIFO buffer can't be allocated because
-the per-user hard limit on memory allocation for pipes has been reached
-and the caller is not privileged; see
-.BR pipe (7).
-.TP
-.B ENOMEM
-Insufficient kernel memory was available.
-.TP
-.B ENOSPC
-.I pathname
-was to be created but the device containing
-.I pathname
-has no room for the new file.
-.TP
-.B ENOTDIR
-A component used as a directory in
-.I pathname
-is not, in fact, a directory, or \fBO_DIRECTORY\fP was specified and
-.I pathname
-was not a directory.
-.TP
-.B ENOTDIR
-.RB ( openat ())
-.I pathname
-is a relative pathname and
-.I dirfd
-is a file descriptor referring to a file other than a directory.
-.TP
-.B ENXIO
-.BR O_NONBLOCK " | " O_WRONLY
-is set, the named file is a FIFO, and
-no process has the FIFO open for reading.
-.TP
-.B ENXIO
-The file is a device special file and no corresponding device exists.
-.TP
-.B ENXIO
-The file is a UNIX domain socket.
-.TP
-.B EOPNOTSUPP
-The filesystem containing
-.I pathname
-does not support
-.BR O_TMPFILE .
-.TP
-.B EOVERFLOW
-.I pathname
-refers to a regular file that is too large to be opened.
-The usual scenario here is that an application compiled
-on a 32-bit platform without
-.I \-D_FILE_OFFSET_BITS=64
-tried to open a file whose size exceeds
-.I (1<<31)\-1
-bytes;
-see also
-.B O_LARGEFILE
-above.
-This is the error specified by POSIX.1;
-before Linux 2.6.24, Linux gave the error
-.B EFBIG
-for this case.
-.\" See http://bugzilla.kernel.org/show_bug.cgi?id=7253
-.\" "Open of a large file on 32-bit fails with EFBIG, should be EOVERFLOW"
-.\" Reported 2006-10-03
-.TP
-.B EPERM
-The
-.B O_NOATIME
-flag was specified, but the effective user ID of the caller
-.\" Strictly speaking, it's the filesystem UID... (MTK)
-did not match the owner of the file and the caller was not privileged.
-.TP
-.B EPERM
-The operation was prevented by a file seal; see
-.BR fcntl (2).
-.TP
-.B EROFS
-.I pathname
-refers to a file on a read-only filesystem and write access was
-requested.
-.TP
-.B ETXTBSY
-.I pathname
-refers to an executable image which is currently being executed and
-write access was requested.
-.TP
-.B ETXTBSY
-.I pathname
-refers to a file that is currently in use as a swap file, and the
-.B O_TRUNC
-flag was specified.
-.TP
-.B ETXTBSY
-.I pathname
-refers to a file that is currently being read by the kernel (e.g., for
-module/firmware loading), and write access was requested.
-.TP
-.B EWOULDBLOCK
-The
-.B O_NONBLOCK
-flag was specified, and an incompatible lease was held on the file
-(see
-.BR fcntl (2)).
-.SH VERSIONS
-The (undefined) effect of
-.B O_RDONLY | O_TRUNC
-varies among implementations.
-On many systems the file is actually truncated.
-.\" Linux 2.0, 2.5: truncate
-.\" Solaris 5.7, 5.8: truncate
-.\" Irix 6.5: truncate
-.\" Tru64 5.1B: truncate
-.\" HP-UX 11.22: truncate
-.\" FreeBSD 4.7: truncate
-.SS Synchronized I/O
-The POSIX.1-2008 "synchronized I/O" option
-specifies different variants of synchronized I/O,
-and specifies the
-.BR open ()
-flags
-.BR O_SYNC ,
-.BR O_DSYNC ,
-and
-.B O_RSYNC
-for controlling the behavior.
-Regardless of whether an implementation supports this option,
-it must at least support the use of
-.B O_SYNC
-for regular files.
-.P
-Linux implements
-.B O_SYNC
-and
-.BR O_DSYNC ,
-but not
-.BR O_RSYNC .
-Somewhat incorrectly, glibc defines
-.B O_RSYNC
-to have the same value as
-.BR O_SYNC .
-.RB ( O_RSYNC
-is defined in the Linux header file
-.I <asm/fcntl.h>
-on HP PA-RISC, but it is not used.)
-.P
-.B O_SYNC
-provides synchronized I/O
-.I file
-integrity completion,
-meaning write operations will flush data and all associated metadata
-to the underlying hardware.
-.B O_DSYNC
-provides synchronized I/O
-.I data
-integrity completion,
-meaning write operations will flush data
-to the underlying hardware,
-but will only flush metadata updates that are required
-to allow a subsequent read operation to complete successfully.
-Data integrity completion can reduce the number of disk operations
-that are required for applications that don't need the guarantees
-of file integrity completion.
-.P
-To understand the difference between the two types of completion,
-consider two pieces of file metadata:
-the file last modification timestamp
-.RI ( st_mtime )
-and the file length.
-All write operations will update the last modification timestamp,
-but only writes that add data to the end of the
-file will change the file length.
-The last modification timestamp is not needed to ensure that
-a read completes successfully, but the file length is.
-Thus,
-.B O_DSYNC
-would only guarantee to flush updates to the file length metadata
-(whereas
-.B O_SYNC
-would also always flush the last modification timestamp metadata).
-.P
-Before Linux 2.6.33, Linux implemented only the
-.B O_SYNC
-flag for
-.BR open ().
-However, when that flag was specified,
-most filesystems actually provided the equivalent of synchronized I/O
-.I data
-integrity completion (i.e.,
-.B O_SYNC
-was actually implemented as the equivalent of
-.BR O_DSYNC ).
-.P
-Since Linux 2.6.33, proper
-.B O_SYNC
-support is provided.
-However, to ensure backward binary compatibility,
-.B O_DSYNC
-was defined with the same value as the historical
-.BR O_SYNC ,
-and
-.B O_SYNC
-was defined as a new (two-bit) flag value that includes the
-.B O_DSYNC
-flag value.
-This ensures that applications compiled against
-new headers get at least
-.B O_DSYNC
-semantics when run on kernels before Linux 2.6.33.
-.\"
-.SS C library/kernel differences
-Since glibc 2.26,
-the glibc wrapper function for
-.BR open ()
-employs the
-.BR openat ()
-system call, rather than the kernel's
-.BR open ()
-system call.
-For certain architectures, this is also true before glibc 2.26.
-.\"
-.SH STANDARDS
-.TP
-.BR open ()
-.TQ
-.BR creat ()
-.TQ
-.BR openat ()
-POSIX.1-2008.
-.TP
-.BR openat2 (2)
-Linux.
-.P
-The
-.BR O_DIRECT ,
-.BR O_NOATIME ,
-.BR O_PATH ,
-and
-.B O_TMPFILE
-flags are Linux-specific.
-One must define
-.B _GNU_SOURCE
-to obtain their definitions.
-.P
-The
-.BR O_CLOEXEC ,
-.BR O_DIRECTORY ,
-and
-.B O_NOFOLLOW
-flags are not specified in POSIX.1-2001,
-but are specified in POSIX.1-2008.
-Since glibc 2.12, one can obtain their definitions by defining either
-.B _POSIX_C_SOURCE
-with a value greater than or equal to 200809L or
-.B _XOPEN_SOURCE
-with a value greater than or equal to 700.
-In glibc 2.11 and earlier, one obtains the definitions by defining
-.BR _GNU_SOURCE .
-.SH HISTORY
-.TP
-.BR open ()
-.TQ
-.BR creat ()
-SVr4, 4.3BSD, POSIX.1-2001.
-.TP
-.BR openat ()
-POSIX.1-2008.
-Linux 2.6.16,
-glibc 2.4.
-.SH NOTES
-Under Linux, the
-.B O_NONBLOCK
-flag is sometimes used in cases where one wants to open
-but does not necessarily have the intention to read or write.
-For example,
-this may be used to open a device in order to get a file descriptor
-for use with
-.BR ioctl (2).
-.P
-Note that
-.BR open ()
-can open device special files, but
-.BR creat ()
-cannot create them; use
-.BR mknod (2)
-instead.
-.P
-If the file is newly created, its
-.IR st_atime ,
-.IR st_ctime ,
-and
-.I st_mtime
-fields
-(respectively, time of last access, time of last status change, and
-time of last modification; see
-.BR stat (2))
-are set
-to the current time, and so are the
-.I st_ctime
-and
-.I st_mtime
-fields of the
-parent directory.
-Otherwise, if the file is modified because of the
-.B O_TRUNC
-flag, its
-.I st_ctime
-and
-.I st_mtime
-fields are set to the current time.
-.P
-The files in the
-.IR /proc/ pid /fd
-directory show the open file descriptors of the process with the PID
-.IR pid .
-The files in the
-.IR /proc/ pid /fdinfo
-directory show even more information about these file descriptors.
-See
-.BR proc (5)
-for further details of both of these directories.
-.P
-The Linux header file
-.I <asm/fcntl.h>
-doesn't define
-.BR O_ASYNC ;
-the (BSD-derived)
-.B FASYNC
-synonym is defined instead.
-.\"
-.\"
-.SS Open file descriptions
-The term open file description is the one used by POSIX to refer to the
-entries in the system-wide table of open files.
-In other contexts, this object is
-variously also called an "open file object",
-a "file handle", an "open file table entry",
-or\[em]in kernel-developer parlance\[em]a
-.IR "struct file" .
-.P
-When a file descriptor is duplicated (using
-.BR dup (2)
-or similar),
-the duplicate refers to the same open file description
-as the original file descriptor,
-and the two file descriptors consequently share
-the file offset and file status flags.
-Such sharing can also occur between processes:
-a child process created via
-.BR fork (2)
-inherits duplicates of its parent's file descriptors,
-and those duplicates refer to the same open file descriptions.
-.P
-Each
-.BR open ()
-of a file creates a new open file description;
-thus, there may be multiple open file descriptions
-corresponding to a file inode.
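-.P
-The sharing of the file offset can be observed with a small experiment.
-The following sketch uses a hypothetical helper, offset_seen_after_write(),
-which writes through one file descriptor and
-reports the offset as seen through another.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes through fd, then return the
 * current file offset as seen through other_fd.  Returns -1 on error
 * or short write.
 *
 * If other_fd was obtained with dup(fd), both descriptors refer to the
 * same open file description, so the offset has advanced by len.
 * If other_fd came from a separate open() of the same file, it has its
 * own open file description and its offset is unchanged.
 */
off_t offset_seen_after_write(int fd, int other_fd,
                              const char *buf, size_t len)
{
    if (write(fd, buf, len) != (ssize_t) len)
        return -1;
    return lseek(other_fd, 0, SEEK_CUR);
}
```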
-.P
-On Linux, one can use the
-.BR kcmp (2)
-.B KCMP_FILE
-operation to test whether two file descriptors
-(in the same process or in two different processes)
-refer to the same open file description.
-.\"
-.SS NFS
-There are many infelicities in the protocol underlying NFS, affecting,
-among others,
-.BR O_SYNC " and " O_NDELAY .
-.P
-On NFS filesystems with UID mapping enabled,
-.BR open ()
-may
-return a file descriptor but, for example,
-.BR read (2)
-requests are denied
-with
-.BR EACCES .
-This is because the client performs
-.BR open ()
-by checking the
-permissions, but UID mapping is performed by the server upon
-read and write requests.
-.\"
-.\"
-.SS FIFOs
-Opening the read or write end of a FIFO blocks until the other
-end is also opened (by another process or thread).
-See
-.BR fifo (7)
-for further details.
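-.P
-A writer that must not block can combine
-.B O_WRONLY
-with
-.B O_NONBLOCK
-and treat
-.B ENXIO
-as "no reader yet", as described under ERRORS.
-The following is a minimal sketch;
-the helper name and its return convention are assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open the write end of a FIFO without
 * blocking.  Returns a file descriptor on success, -2 if no process
 * has the FIFO open for reading (ENXIO), or -1 on any other error
 * with errno set.
 */
int open_fifo_writer_nonblock(const char *path)
{
    int fd = open(path, O_WRONLY | O_NONBLOCK);

    if (fd == -1 && errno == ENXIO)
        return -2;
    return fd;
}
```

-A caller might retry after a delay when \-2 is returned.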
-.\"
-.\"
-.SS File access mode
-Unlike the other values that can be specified in
-.IR flags ,
-the
-.I "access mode"
-values
-.BR O_RDONLY ", " O_WRONLY ", and " O_RDWR
-do not specify individual bits.
-Rather, they define the low order two bits of
-.IR flags ,
-and are defined respectively as 0, 1, and 2.
-In other words, the combination
-.B "O_RDONLY | O_WRONLY"
-is a logical error, and certainly does not have the same meaning as
-.BR O_RDWR .
-.P
-Linux reserves the special, nonstandard access mode 3 (binary 11) in
-.I flags
-to mean:
-check for read and write permission on the file and return a file descriptor
-that can't be used for reading or writing.
-This nonstandard access mode is used by some Linux drivers to return a
-file descriptor that is to be used only for device-specific
-.BR ioctl (2)
-operations.
-.\" See for example util-linux's disk-utils/setfdprm.c
-.\" For some background on access mode 3, see
-.\" http://thread.gmane.org/gmane.linux.kernel/653123
-.\" "[RFC] correct flags to f_mode conversion in __dentry_open"
-.\" LKML, 12 Mar 2008
-.\"
-.\"
-.SS Rationale for openat() and other "directory file descriptor" APIs
-.BR openat ()
-and the other system calls and library functions that take
-a directory file descriptor argument
-(i.e.,
-.BR execveat (2),
-.BR faccessat (2),
-.BR fanotify_mark (2),
-.BR fchmodat (2),
-.BR fchownat (2),
-.BR fspick (2),
-.BR fstatat (2),
-.BR futimesat (2),
-.BR linkat (2),
-.BR mkdirat (2),
-.BR mknodat (2),
-.BR mount_setattr (2),
-.BR move_mount (2),
-.BR name_to_handle_at (2),
-.BR open_tree (2),
-.BR openat2 (2),
-.BR readlinkat (2),
-.BR renameat (2),
-.BR renameat2 (2),
-.BR statx (2),
-.BR symlinkat (2),
-.BR unlinkat (2),
-.BR utimensat (2),
-.BR mkfifoat (3),
-and
-.BR scandirat (3))
-address two problems with the older interfaces that preceded them.
-Here, the explanation is in terms of the
-.BR openat ()
-call, but the rationale is analogous for the other interfaces.
-.P
-First,
-.BR openat ()
-allows an application to avoid race conditions that could
-occur when using
-.BR open ()
-to open files in directories other than the current working directory.
-These race conditions result from the fact that some component
-of the directory prefix given to
-.BR open ()
-could be changed in parallel with the call to
-.BR open ().
-Suppose, for example, that we wish to create the file
-.I dir1/dir2/xxx.dep
-if the file
-.I dir1/dir2/xxx
-exists.
-The problem is that between the existence check and the file-creation step,
-.I dir1
-or
-.I dir2
-(which might be symbolic links)
-could be modified to point to a different location.
-Such races can be avoided by
-opening a file descriptor for the target directory,
-and then specifying that file descriptor as the
-.I dirfd
-argument of (say)
-.BR fstatat (2)
-and
-.BR openat ().
-The use of the
-.I dirfd
-file descriptor also has other benefits:
-.IP \[bu] 3
-the file descriptor is a stable reference to the directory,
-even if the directory is renamed; and
-.IP \[bu]
-the open file descriptor prevents the underlying filesystem from
-being dismounted,
-just as when a process has a current working directory on a filesystem.
-.P
-Second,
-.BR openat ()
-allows the implementation of a per-thread "current working
-directory", via file descriptor(s) maintained by the application.
-(This functionality can also be obtained by tricks based
-on the use of
-.IR /proc/self/fd/ dirfd,
-but less efficiently.)
-.P
-The
-.I dirfd
-argument for these APIs can be obtained by using
-.BR open ()
-or
-.BR openat ()
-to open a directory (with either the
-.B O_RDONLY
-or the
-.B O_PATH
-flag).
-Alternatively, such a file descriptor can be obtained by applying
-.BR dirfd (3)
-to a directory stream created using
-.BR opendir (3).
-.P
-When these APIs are given a
-.I dirfd
-argument of
-.B AT_FDCWD
-or the specified pathname is absolute,
-then they handle their pathname argument in the same way as
-the corresponding conventional APIs.
-However, in this case, several of the APIs have a
-.I flags
-argument that provides access to functionality that is not available with
-the corresponding conventional APIs.
-.\"
-.\"
-.SS O_DIRECT
-The
-.B O_DIRECT
-flag may impose alignment restrictions on the length and address
-of user-space buffers and the file offset of I/Os.
-In Linux, alignment
-restrictions vary by filesystem and kernel version and might be
-absent entirely.
-The handling of misaligned
-.B O_DIRECT
-I/Os also varies;
-they can either fail with
-.B EINVAL
-or fall back to buffered I/O.
-.P
-Since Linux 6.1,
-.B O_DIRECT
-support and alignment restrictions for a file can be queried using
-.BR statx (2),
-using the
-.B STATX_DIOALIGN
-flag.
-Support for
-.B STATX_DIOALIGN
-varies by filesystem;
-see
-.BR statx (2).
-.P
-Some filesystems provide their own interfaces for querying
-.B O_DIRECT
-alignment restrictions,
-for example the
-.B XFS_IOC_DIOINFO
-operation in
-.BR xfsctl (3).
-.B STATX_DIOALIGN
-should be used instead when it is available.
-.P
-If none of the above is available,
-then direct I/O support and alignment restrictions
-can only be assumed from known characteristics of the filesystem,
-the individual file,
-the underlying storage device(s),
-and the kernel version.
-In Linux 2.4,
-most filesystems based on block devices require that
-the file offset and the length and memory address of all I/O segments
-be multiples of the filesystem block size
-(typically 4096 bytes).
-In Linux 2.6.0,
-this was relaxed to the logical block size of the block device
-(typically 512 bytes).
-A block device's logical block size can be determined using the
-.BR ioctl (2)
-.B BLKSSZGET
-operation or from the shell using the command:
-.P
-.in +4n
-.EX
-blockdev \-\-getss
-.EE
-.in
-.P
-.B O_DIRECT
-I/Os should never be run concurrently with the
-.BR fork (2)
-system call,
-if the memory buffer is a private mapping
-(i.e., any mapping created with the
-.BR mmap (2)
-.B MAP_PRIVATE
-flag;
-this includes memory allocated on the heap and statically allocated buffers).
-Any such I/Os, whether submitted via an asynchronous I/O interface or from
-another thread in the process,
-should be completed before
-.BR fork (2)
-is called.
-Failure to do so can result in data corruption and undefined behavior in
-parent and child processes.
-This restriction does not apply when the memory buffer for the
-.B O_DIRECT
-I/Os was created using
-.BR shmat (2)
-or
-.BR mmap (2)
-with the
-.B MAP_SHARED
-flag.
-Nor does this restriction apply when the memory buffer has been advised as
-.B MADV_DONTFORK
-with
-.BR madvise (2),
-ensuring that it will not be available
-to the child after
-.BR fork (2).
-.P
-The
-.B O_DIRECT
-flag was introduced in SGI IRIX, where it has alignment
-restrictions similar to those of Linux 2.4.
-IRIX also has a
-.BR fcntl (2)
-call to query appropriate alignments and sizes.
-FreeBSD 4.x introduced
-a flag of the same name, but without alignment restrictions.
-.P
-.B O_DIRECT
-support was added in Linux 2.4.10.
-Older Linux kernels simply ignore this flag.
-Some filesystems may not implement the flag, in which case
-.BR open ()
-fails with the error
-.B EINVAL
-if it is used.
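-.P
-An application that treats
-.B O_DIRECT
-as an optional optimization might retry without the flag when
-.BR open ()
-fails with
-.BR EINVAL .
-The following is a sketch under that assumption;
-the helper name open_prefer_direct() is hypothetical.

```c
#define _GNU_SOURCE             /* needed for the O_DIRECT definition */
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Hypothetical helper: open path with O_DIRECT if possible, otherwise
 * fall back to buffered I/O.  On success, *direct is set to 1 if the
 * descriptor was opened with O_DIRECT and to 0 if not.
 * Returns a file descriptor, or -1 with errno set.
 *
 * Assumption: flags does not include O_EXCL.  A failed first attempt
 * with O_CREAT may already have created the file, so a retry with
 * O_EXCL could fail with EEXIST.
 */
int open_prefer_direct(const char *path, int flags, mode_t mode,
                       int *direct)
{
    int fd = open(path, flags | O_DIRECT, mode);

    if (fd != -1) {
        *direct = 1;
        return fd;
    }
    if (errno != EINVAL)
        return -1;              /* unrelated failure: do not retry */

    *direct = 0;
    return open(path, flags, mode);
}
```

-Even when the open succeeds with
-.BR O_DIRECT ,
-individual I/Os remain subject to the alignment restrictions described above.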
-.P
-Applications should avoid mixing
-.B O_DIRECT
-and normal I/O to the same file,
-and especially to overlapping byte regions in the same file.
-Even when the filesystem correctly handles the coherency issues in
-this situation, overall I/O throughput is likely to be slower than
-using either mode alone.
-Likewise, applications should avoid mixing
-.BR mmap (2)
-of files with direct I/O to the same files.
-.P
-The behavior of
-.B O_DIRECT
-with NFS differs from its behavior on local filesystems.
-Older kernels, or
-kernels configured in certain ways, may not support this combination.
-The NFS protocol does not support passing the flag to the server, so
-.B O_DIRECT
-I/O will bypass the page cache only on the client; the server may
-still cache the I/O.
-The client asks the server to make the I/O
-synchronous to preserve the synchronous semantics of
-.BR O_DIRECT .
-Some servers will perform poorly under these circumstances, especially
-if the I/O size is small.
-Some servers may also be configured to
-lie to clients about the I/O having reached stable storage; this
-will avoid the performance penalty at some risk to data integrity
-in the event of server power failure.
-The Linux NFS client places no alignment restrictions on
-.B O_DIRECT
-I/O.
-.P
-In summary,
-.B O_DIRECT
-is a potentially powerful tool that should be used with caution.
-It is recommended that applications treat use of
-.B O_DIRECT
-as a performance option which is disabled by default.
-.SH BUGS
-Currently, it is not possible to enable signal-driven
-I/O by specifying
-.B O_ASYNC
-when calling
-.BR open ();
-use
-.BR fcntl (2)
-to enable this flag.
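-.P
-A minimal sketch of enabling the flag with
-.BR fcntl (2)
-after the file has been opened follows.
-The helper name enable_async() is hypothetical,
-and the sketch assumes the caller wants
-.B SIGIO
-delivered to the calling process.

```c
#define _GNU_SOURCE             /* ensures O_ASYNC and F_SETOWN are visible */
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: enable signal-driven I/O on an open descriptor.
 * The caller should install a SIGIO handler first, because the default
 * action of SIGIO terminates the process.
 * Returns 0 on success, -1 with errno set on failure.
 */
int enable_async(int fd)
{
    int flags;

    /* Direct I/O-possible signals to the calling process. */
    if (fcntl(fd, F_SETOWN, getpid()) == -1)
        return -1;

    /* Add O_ASYNC to the existing file status flags. */
    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_ASYNC) == -1)
        return -1;
    return 0;
}
```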
-.\" FIXME . Check bugzilla report on open(O_ASYNC)
-.\" See http://bugzilla.kernel.org/show_bug.cgi?id=5993
-.P
-One must check for two different error codes,
-.B EISDIR
-and
-.BR ENOENT ,
-when trying to determine whether the kernel supports
-.B O_TMPFILE
-functionality.
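-.P
-A check along these lines might look as follows.
-This is a sketch only;
-the helper name open_tmpfile_in() is hypothetical, and it also treats
-.B EOPNOTSUPP
-(see ERRORS) as "not available".

```c
#define _GNU_SOURCE             /* needed for the O_TMPFILE definition */
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical helper: try to create an unnamed temporary file in dir.
 * Returns a file descriptor on success, or -1 on failure.
 * *unsupported is set to 1 if the failure suggests that O_TMPFILE is
 * unavailable (EISDIR or ENOENT from a kernel without the feature,
 * EOPNOTSUPP from a filesystem without it), and to 0 otherwise.
 *
 * Caveat: ENOENT is ambiguous, since a kernel that does provide
 * O_TMPFILE also returns it when dir does not exist.
 */
int open_tmpfile_in(const char *dir, int *unsupported)
{
    int fd = open(dir, O_TMPFILE | O_RDWR, 0600);

    *unsupported = (fd == -1 &&
                    (errno == EISDIR || errno == ENOENT ||
                     errno == EOPNOTSUPP));
    return fd;
}
```

-When *unsupported is set, a caller could fall back to another method
-of creating a temporary file.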
-.P
-When both
-.B O_CREAT
-and
-.B O_DIRECTORY
-are specified in
-.I flags
-and the file specified by
-.I pathname
-does not exist,
-.BR open ()
-will create a regular file (i.e.,
-.B O_DIRECTORY
-is ignored).
-.SH SEE ALSO
-.BR chmod (2),
-.BR chown (2),
-.BR close (2),
-.BR dup (2),
-.BR fcntl (2),
-.BR link (2),
-.BR lseek (2),
-.BR mknod (2),
-.BR mmap (2),
-.BR mount (2),
-.BR open_by_handle_at (2),
-.BR openat2 (2),
-.BR read (2),
-.BR socket (2),
-.BR stat (2),
-.BR umask (2),
-.BR unlink (2),
-.BR write (2),
-.BR fopen (3),
-.BR acl (5),
-.BR fifo (7),
-.BR inode (7),
-.BR path_resolution (7),
-.BR symlink (7)