Diffstat (limited to 'man2/futex.2')
-rw-r--r-- man2/futex.2 1976
1 files changed, 0 insertions, 1976 deletions
diff --git a/man2/futex.2 b/man2/futex.2
deleted file mode 100644
index 2ff300b..0000000
--- a/man2/futex.2
+++ /dev/null
@@ -1,1976 +0,0 @@
-.\" Page by b.hubert
-.\" and Copyright (C) 2015, Thomas Gleixner <tglx@linutronix.de>
-.\" and Copyright (C) 2015, Michael Kerrisk <mtk.manpages@gmail.com>
-.\"
-.\" %%%LICENSE_START(FREELY_REDISTRIBUTABLE)
-.\" may be freely modified and distributed
-.\" %%%LICENSE_END
-.\"
-.\" Niki A. Rahimi (LTC Security Development, narahimi@us.ibm.com)
-.\" added ERRORS section.
-.\"
-.\" Modified 2004-06-17 mtk
-.\" Modified 2004-10-07 aeb, added FUTEX_REQUEUE, FUTEX_CMP_REQUEUE
-.\"
-.\" FIXME Still to integrate are some points from Torvald Riegel's mail of
-.\" 2015-01-23:
-.\" http://thread.gmane.org/gmane.linux.kernel/1703405/focus=7977
-.\"
-.\" FIXME Do we need to add some text regarding Torvald Riegel's 2015-01-24 mail
-.\" http://thread.gmane.org/gmane.linux.kernel/1703405/focus=1873242
-.\"
-.TH futex 2 2023-10-31 "Linux man-pages 6.7"
-.SH NAME
-futex \- fast user-space locking
-.SH LIBRARY
-Standard C library
-.RI ( libc ", " \-lc )
-.SH SYNOPSIS
-.nf
-.P
-.BR "#include <linux/futex.h>" " /* Definition of " FUTEX_* " constants */"
-.BR "#include <sys/syscall.h>" " /* Definition of " SYS_* " constants */"
-.B #include <unistd.h>
-.P
-.BI "long syscall(SYS_futex, uint32_t *" uaddr ", int " futex_op \
-", uint32_t " val ,
-.BI " const struct timespec *" timeout , \
-" \fR /* or: \fBuint32_t \fIval2\fP */"
-.BI " uint32_t *" uaddr2 ", uint32_t " val3 );
-.fi
-.P
-.IR Note :
-glibc provides no wrapper for
-.BR futex (),
-necessitating the use of
-.BR syscall (2).
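Because no wrapper exists, callers usually define a small one themselves. The following is a minimal sketch, assuming a Linux target; the name futex_call is hypothetical and is not provided by glibc or the kernel headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>          /* errno is set when the call fails */
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical convenience wrapper (not provided by glibc): passes the
   six futex arguments straight through to syscall(2). */
static long
futex_call(uint32_t *uaddr, int futex_op, uint32_t val,
           const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3)
{
    /* Futex words must be aligned on a four-byte boundary. */
    assert(((uintptr_t) uaddr & 3) == 0);

    return syscall(SYS_futex, uaddr, futex_op, val, timeout, uaddr2, val3);
}
```

A typed wrapper like this keeps every argument at a known width, which a direct variadic syscall() call does not guarantee. As with any system call made through syscall(2), a failure is reported as -1 with the error code in errno.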
-.SH DESCRIPTION
-The
-.BR futex ()
-system call provides a method for waiting until a certain condition becomes
-true.
-It is typically used as a blocking construct in the context of
-shared-memory synchronization.
-When using futexes, the majority of
-the synchronization operations are performed in user space.
-A user-space program employs the
-.BR futex ()
-system call only when it is likely that the program has to block for
-a longer time until the condition becomes true.
-Other
-.BR futex ()
-operations can be used to wake any processes or threads waiting
-for a particular condition.
-.P
-A futex is a 32-bit value\[em]referred to below as a
-.IR "futex word" \[em]whose
-address is supplied to the
-.BR futex ()
-system call.
-(Futexes are 32 bits in size on all platforms, including 64-bit systems.)
-All futex operations are governed by this value.
-In order to share a futex between processes,
-the futex is placed in a region of shared memory,
-created using (for example)
-.BR mmap (2)
-or
-.BR shmat (2).
-(Thus, the futex word may have different
-virtual addresses in different processes,
-but these addresses all refer to the same location in physical memory.)
-In a multithreaded program, it is sufficient to place the futex word
-in a global variable shared by all threads.
-.P
-When executing a futex operation that requests to block a thread,
-the kernel will block only if the futex word has the value that the
-calling thread supplied (as one of the arguments of the
-.BR futex ()
-call) as the expected value of the futex word.
-The loading of the futex word's value,
-the comparison of that value with the expected value,
-and the actual blocking will happen atomically and will be totally ordered
-with respect to concurrent operations performed by other threads
-on the same futex word.
-.\" Notes from Darren Hart (Dec 2015):
-.\" Totally ordered with respect to futex operations refers to semantics
-.\" of the ACQUIRE/RELEASE operations and how they impact ordering of
-.\" memory reads and writes. The kernel futex operations are protected
-.\" by spinlocks, which ensure that all operations are serialized
-.\" with respect to one another.
-.\"
-.\" This is a lot to attempt to define in this document. Perhaps a
-.\" reference to linux/Documentation/memory-barriers.txt as a footnote
-.\" would be sufficient? Or perhaps for this manual, "serialized" would
-.\" be sufficient, with a footnote regarding "totally ordered" and a
-.\" pointer to the memory-barrier documentation?
-Thus, the futex word is used to connect the synchronization in user space
-with the implementation of blocking by the kernel.
-Analogously to an atomic
-compare-and-exchange operation that potentially changes shared memory,
-blocking via a futex is an atomic compare-and-block operation.
-.\" FIXME(Torvald Riegel):
-.\" Eventually we want to have some text in NOTES to satisfy
-.\" the reference in the following sentence
-.\" See NOTES for a detailed specification of
-.\" the synchronization semantics.
-.P
-One use of futexes is for implementing locks.
-The state of the lock (i.e., acquired or not acquired)
-can be represented as an atomically accessed flag in shared memory.
-In the uncontended case,
-a thread can access or modify the lock state with atomic instructions,
-for example atomically changing it from not acquired to acquired
-using an atomic compare-and-exchange instruction.
-(Such instructions are performed entirely in user mode,
-and the kernel maintains no information about the lock state.)
-On the other hand, a thread may be unable to acquire a lock because
-it is already acquired by another thread.
-It then may pass the lock's flag as a futex word and the value
-representing the acquired state as the expected value to a
-.BR futex ()
-wait operation.
-This
-.BR futex ()
-operation will block if and only if the lock is still acquired
-(i.e., the value in the futex word still matches the "acquired state").
-When releasing the lock, a thread has to first reset the
-lock state to not acquired and then execute a futex
-operation that wakes threads blocked on the lock flag used as a futex word
-(this can be further optimized to avoid unnecessary wake-ups).
-See
-.BR futex (7)
-for more detail on how to use futexes.
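The locking scheme described above can be sketched in C11 as follows. This is an illustration only, not the implementation used by any library: the names sketch_lock, sketch_unlock, NOT_ACQUIRED, and ACQUIRED are hypothetical, and the code assumes that an _Atomic uint32_t has the same size and representation as a plain uint32_t, which holds for common Linux toolchains.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

enum { NOT_ACQUIRED = 0, ACQUIRED = 1 };    /* Hypothetical state values */

static void
sketch_lock(_Atomic uint32_t *state)
{
    for (;;) {
        uint32_t expected = NOT_ACQUIRED;

        /* Uncontended case: one atomic instruction, no system call. */
        if (atomic_compare_exchange_strong(state, &expected, ACQUIRED))
            return;

        /* Contended case: sleep only if the word still reads ACQUIRED.
           Any return (wake-up, EAGAIN, EINTR) leads to a retry. */
        syscall(SYS_futex, (uint32_t *) state, FUTEX_WAIT, ACQUIRED,
                NULL, NULL, 0);
    }
}

static void
sketch_unlock(_Atomic uint32_t *state)
{
    /* First reset the state, then wake one waiter (if there is one). */
    uint32_t prev = atomic_exchange(state, NOT_ACQUIRED);

    assert(prev == ACQUIRED);   /* Releasing a free lock is a caller bug */
    syscall(SYS_futex, (uint32_t *) state, FUTEX_WAKE, 1, NULL, NULL, 0);
}
```

This sketch issues a FUTEX_WAKE on every release, even when nobody is waiting. Avoiding that system call is the optimization alluded to above, and it typically needs a third state value that records contention.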
-.P
-Besides the basic wait and wake-up futex functionality, there are further
-futex operations aimed at supporting more complex use cases.
-.P
-Note that
-no explicit initialization or destruction is necessary to use futexes;
-the kernel maintains a futex
-(i.e., the kernel-internal implementation artifact)
-only while operations such as
-.BR FUTEX_WAIT ,
-described below, are being performed on a particular futex word.
-.\"
-.SS Arguments
-The
-.I uaddr
-argument points to the futex word.
-On all platforms, futexes are four-byte
-integers that must be aligned on a four-byte boundary.
-The operation to perform on the futex is specified in the
-.I futex_op
-argument;
-.I val
-is a value whose meaning and purpose depend on
-.IR futex_op .
-.P
-The remaining arguments
-.RI ( timeout ,
-.IR uaddr2 ,
-and
-.IR val3 )
-are required only for certain of the futex operations described below.
-Where one of these arguments is not required, it is ignored.
-.P
-For several blocking operations, the
-.I timeout
-argument is a pointer to a
-.I timespec
-structure that specifies a timeout for the operation.
-However, notwithstanding the prototype shown above, for some operations,
-the least significant four bytes of this argument are instead
-used as an integer whose meaning is determined by the operation.
-For these operations, the kernel casts the
-.I timeout
-value first to
-.IR "unsigned long",
-then to
-.IR uint32_t ,
-and in the remainder of this page, this argument is referred to as
-.I val2
-when interpreted in this fashion.
-.P
-Where it is required, the
-.I uaddr2
-argument is a pointer to a second futex word that is employed
-by the operation.
-.P
-The interpretation of the final integer argument,
-.IR val3 ,
-depends on the operation.
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.SS Futex operations
-The
-.I futex_op
-argument consists of two parts:
-a command that specifies the operation to be performed,
-bitwise ORed with zero or more options that
-modify the behaviour of the operation.
-The options that may be included in
-.I futex_op
-are as follows:
-.TP
-.BR FUTEX_PRIVATE_FLAG " (since Linux 2.6.22)"
-.\" commit 34f01cc1f512fa783302982776895c73714ebbc2
-This option bit can be employed with all futex operations.
-It tells the kernel that the futex is process-private and not shared
-with another process (i.e., it is being used for synchronization
-only between threads of the same process).
-This allows the kernel to make some additional performance optimizations.
-.\" I.e., It allows the kernel to choose the fast path for validating
-.\" the user-space address and avoids expensive VMA lookups,
-.\" taking reference counts on file backing store, and so on.
-.IP
-As a convenience,
-.I <linux/futex.h>
-defines a set of constants with the suffix
-.B _PRIVATE
-that are equivalents of all of the operations listed below,
-.\" except the obsolete FUTEX_FD, for which the "private" flag was
-.\" meaningless
-but with the
-.B FUTEX_PRIVATE_FLAG
-ORed into the constant value.
-Thus, there are
-.BR FUTEX_WAIT_PRIVATE ,
-.BR FUTEX_WAKE_PRIVATE ,
-and so on.
-.TP
-.BR FUTEX_CLOCK_REALTIME " (since Linux 2.6.28)"
-.\" commit 1acdac104668a0834cfa267de9946fac7764d486
-This option bit can be employed only with the
-.BR FUTEX_WAIT_BITSET ,
-.BR FUTEX_WAIT_REQUEUE_PI ,
-(since Linux 4.5)
-.\" commit 337f13046ff03717a9e99675284a817527440a49
-.BR FUTEX_WAIT ,
-and
-(since Linux 5.14)
-.\" commit bf22a6976897977b0a3f1aeba6823c959fc4fdae
-.B FUTEX_LOCK_PI2
-operations.
-.IP
-If this option is set, the kernel measures the
-.I timeout
-against the
-.B CLOCK_REALTIME
-clock.
-.IP
-If this option is not set, the kernel measures the
-.I timeout
-against the
-.B CLOCK_MONOTONIC
-clock.
-.P
-The operation specified in
-.I futex_op
-is one of the following:
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_WAIT " (since Linux 2.6.0)"
-.\" Strictly speaking, since some time in Linux 2.5.x
-This operation tests that the value at the
-futex word pointed to by the address
-.I uaddr
-still contains the expected value
-.IR val ,
-and if so, then sleeps waiting for a
-.B FUTEX_WAKE
-operation on the futex word.
-The load of the value of the futex word is an atomic memory
-access (i.e., using atomic machine instructions of the respective
-architecture).
-This load, the comparison with the expected value, and
-starting to sleep are performed atomically
-.\" FIXME: Torvald, I think we may need to add some explanation of
-.\" "totally ordered" here.
-and totally ordered
-with respect to other futex operations on the same futex word.
-If the thread starts to sleep,
-it is considered a waiter on this futex word.
-If the futex value does not match
-.IR val ,
-then the call fails immediately with the error
-.BR EAGAIN .
-.IP
-The purpose of the comparison with the expected value is to prevent lost
-wake-ups.
-If another thread changed the value of the futex word after the
-calling thread decided to block based on the prior value,
-and if the other thread executed a
-.B FUTEX_WAKE
-operation (or similar wake-up) after the value change and before this
-.B FUTEX_WAIT
-operation, then the calling thread will observe the
-value change and will not start to sleep.
-.IP
-If the
-.I timeout
-is not NULL, the structure it points to specifies a
-timeout for the wait.
-(This interval will be rounded up to the system clock granularity,
-and is guaranteed not to expire early.)
-The timeout is by default measured according to the
-.B CLOCK_MONOTONIC
-clock, but, since Linux 4.5, the
-.B CLOCK_REALTIME
-clock can be selected by specifying
-.B FUTEX_CLOCK_REALTIME
-in
-.IR futex_op .
-If
-.I timeout
-is NULL, the call blocks indefinitely.
-.IP
-.IR Note :
-for
-.BR FUTEX_WAIT ,
-.I timeout
-is interpreted as a
-.I relative
-value.
-This differs from other futex operations, where
-.I timeout
-is interpreted as an absolute value.
-To obtain the equivalent of
-.B FUTEX_WAIT
-with an absolute timeout, employ
-.B FUTEX_WAIT_BITSET
-with
-.I val3
-specified as
-.BR FUTEX_BITSET_MATCH_ANY .
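The absolute-timeout variant mentioned in the note can be sketched as follows. The helper name futex_wait_until is hypothetical; the sketch assumes the deadline is expressed on the CLOCK_MONOTONIC clock, which is what the kernel uses when FUTEX_CLOCK_REALTIME is not set.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>          /* errno is set when the call fails */
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: block while *uaddr == expected, until the absolute
   CLOCK_MONOTONIC time *deadline.  Returns 0 if woken, or -1 with errno
   set (for example ETIMEDOUT, EAGAIN, or EINTR). */
static long
futex_wait_until(uint32_t *uaddr, uint32_t expected,
                 const struct timespec *deadline)
{
    assert(deadline != NULL);

    return syscall(SYS_futex, uaddr, FUTEX_WAIT_BITSET, expected,
                   deadline, NULL, FUTEX_BITSET_MATCH_ANY);
}
```

A caller would typically compute the deadline once with clock_gettime(CLOCK_MONOTONIC, ...) and reuse it across retries, so that spurious wake-ups and EINTR returns do not extend the total wait.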
-.IP
-The arguments
-.I uaddr2
-and
-.I val3
-are ignored.
-.\" FIXME . (Torvald) I think we should remove this. Or maybe adapt to a
-.\" different example.
-.\"
-.\" For
-.\" .BR futex (7),
-.\" this call is executed if decrementing the count gave a negative value
-.\" (indicating contention),
-.\" and will sleep until another process or thread releases
-.\" the futex and executes the
-.\" .B FUTEX_WAKE
-.\" operation.
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_WAKE " (since Linux 2.6.0)"
-.\" Strictly speaking, since Linux 2.5.x
-This operation wakes at most
-.I val
-of the waiters that are waiting (e.g., inside
-.BR FUTEX_WAIT )
-on the futex word at the address
-.IR uaddr .
-Most commonly,
-.I val
-is specified as either 1 (wake up a single waiter) or
-.B INT_MAX
-(wake up all waiters).
-No guarantee is provided about which waiters are awoken
-(e.g., a waiter with a higher scheduling priority is not guaranteed
-to be awoken in preference to a waiter with a lower priority).
-.IP
-The arguments
-.IR timeout ,
-.IR uaddr2 ,
-and
-.I val3
-are ignored.
-.\" FIXME . (Torvald) I think we should remove this. Or maybe adapt to
-.\" a different example.
-.\"
-.\" For
-.\" .BR futex (7),
-.\" this is executed if incrementing the count showed that
-.\" there were waiters,
-.\" once the futex value has been set to 1
-.\" (indicating that it is available).
-.\"
-.\" How does "incrementing the count show that there were waiters"?
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_FD " (from Linux 2.6.0 up to and including Linux 2.6.25)"
-.\" Strictly speaking, from Linux 2.5.x to Linux 2.6.25
-This operation creates a file descriptor that is associated with
-the futex at
-.IR uaddr .
-The caller must close the returned file descriptor after use.
-When another process or thread performs a
-.B FUTEX_WAKE
-on the futex word, the file descriptor is reported as readable by
-.BR select (2),
-.BR poll (2),
-and
-.BR epoll (7).
-.IP
-The file descriptor can be used to obtain asynchronous notifications: if
-.I val
-is nonzero, then, when another process or thread executes a
-.BR FUTEX_WAKE ,
-the caller will receive the signal number that was passed in
-.IR val .
-.IP
-The arguments
-.IR timeout ,
-.IR uaddr2 ,
-and
-.I val3
-are ignored.
-.IP
-Because it was inherently racy,
-.B FUTEX_FD
-has been removed
-.\" commit 82af7aca56c67061420d618cc5a30f0fd4106b80
-from Linux 2.6.26 onward.
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_REQUEUE " (since Linux 2.6.0)"
-This operation performs the same task as
-.B FUTEX_CMP_REQUEUE
-(see below), except that no check is made using the value in
-.IR val3 .
-(The argument
-.I val3
-is ignored.)
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_CMP_REQUEUE " (since Linux 2.6.7)"
-This operation first checks whether the location
-.I uaddr
-still contains the value
-.IR val3 .
-If not, the operation fails with the error
-.BR EAGAIN .
-Otherwise, the operation wakes up a maximum of
-.I val
-waiters that are waiting on the futex at
-.IR uaddr .
-If there are more than
-.I val
-waiters, then the remaining waiters are removed
-from the wait queue of the source futex at
-.I uaddr
-and added to the wait queue of the target futex at
-.IR uaddr2 .
-The
-.I val2
-argument specifies an upper limit on the number of waiters
-that are requeued to the futex at
-.IR uaddr2 .
-.IP
-.\" FIXME(Torvald) Is the following correct? Or is just the decision
-.\" which threads to wake or requeue part of the atomic operation?
-The load from
-.I uaddr
-is an atomic memory access (i.e., using atomic machine instructions of
-the respective architecture).
-This load, the comparison with
-.IR val3 ,
-and the requeueing of any waiters are performed atomically and totally
-ordered with respect to other operations on the same futex word.
-.\" Notes from a f2f conversation with Thomas Gleixner (Aug 2015): ###
-.\" The operation is serialized with respect to operations on both
-.\" source and target futex. No other waiter can enqueue itself
-.\" for waiting and no other waiter can dequeue itself because of
-.\" a timeout or signal.
-.IP
-Typical values to specify for
-.I val
-are 0 or 1.
-(Specifying
-.B INT_MAX
-is not useful, because it would make the
-.B FUTEX_CMP_REQUEUE
-operation equivalent to
-.BR FUTEX_WAKE .)
-The limit value specified via
-.I val2
-is typically either 1 or
-.BR INT_MAX .
-(Specifying the argument as 0 is not useful, because no waiters would be
-requeued, which would make the
-.B FUTEX_CMP_REQUEUE
-operation equivalent to
-.BR FUTEX_WAKE .)
-.IP
-The
-.B FUTEX_CMP_REQUEUE
-operation was added as a replacement for the earlier
-.BR FUTEX_REQUEUE .
-The difference is that the check of the value at
-.I uaddr
-can be used to ensure that requeueing happens only under certain
-conditions, which allows race conditions to be avoided in certain use cases.
-.\" But, as Rich Felker points out, there remain valid use cases for
-.\" FUTEX_REQUEUE, for example, when the calling thread is requeuing
-.\" the target(s) to a lock that the calling thread owns
-.\" From: Rich Felker <dalias@libc.org>
-.\" Date: Wed, 29 Oct 2014 22:43:17 -0400
-.\" To: Darren Hart <dvhart@infradead.org>
-.\" CC: libc-alpha@sourceware.org, ...
-.\" Subject: Re: Add futex wrapper to glibc?
-.IP
-Both
-.B FUTEX_REQUEUE
-and
-.B FUTEX_CMP_REQUEUE
-can be used to avoid "thundering herd" wake-ups that could occur when using
-.B FUTEX_WAKE
-in cases where all of the waiters that are woken need to acquire
-another futex.
-Consider the following scenario,
-where multiple waiter threads are waiting on B,
-a wait queue implemented using a futex:
-.IP
-.in +4n
-.EX
-lock(A);
-while (!check_value(V)) {
-    unlock(A);
-    block_on(B);
-    lock(A);
-}
-unlock(A);
-.EE
-.in
-.IP
-If a waker thread used
-.BR FUTEX_WAKE ,
-then all waiters waiting on B would be woken up,
-and they would all try to acquire lock A.
-However, waking all of the threads in this manner would be pointless because
-all except one of the threads would immediately block on lock A again.
-By contrast, a requeue operation wakes just one waiter and moves
-the other waiters to lock A,
-and when the woken waiter unlocks A then the next waiter can proceed.
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_WAKE_OP " (since Linux 2.6.14)"
-.\" commit 4732efbeb997189d9f9b04708dc26bf8613ed721
-.\" Author: Jakub Jelinek <jakub@redhat.com>
-.\" Date: Tue Sep 6 15:16:25 2005 -0700
-.\" FIXME. (Torvald) The glibc condvar implementation is currently being
-.\" revised (e.g., to not use an internal lock anymore).
-.\" It is probably more future-proof to remove this paragraph.
-.\" [Torvald, do you have an update here?]
-This operation was added to support some user-space use cases
-where more than one futex must be handled at the same time.
-The most notable example is the implementation of
-.BR pthread_cond_signal (3),
-which requires operations on two futexes,
-the one used to implement the mutex and the one used in the implementation
-of the wait queue associated with the condition variable.
-.B FUTEX_WAKE_OP
-allows such cases to be implemented without leading to
-high rates of contention and context switching.
-.IP
-The
-.B FUTEX_WAKE_OP
-operation is equivalent to executing the following code atomically
-and totally ordered with respect to other futex operations on
-either of the two supplied futex words:
-.IP
-.in +4n
-.EX
-uint32_t oldval = *(uint32_t *) uaddr2;
-*(uint32_t *) uaddr2 = oldval \fIop\fP \fIoparg\fP;
-futex(uaddr, FUTEX_WAKE, val, 0, 0, 0);
-if (oldval \fIcmp\fP \fIcmparg\fP)
-    futex(uaddr2, FUTEX_WAKE, val2, 0, 0, 0);
-.EE
-.in
-.IP
-In other words,
-.B FUTEX_WAKE_OP
-does the following:
-.RS
-.IP \[bu] 3
-saves the original value of the futex word at
-.I uaddr2
-and performs an operation to modify the value of the futex at
-.IR uaddr2 ;
-this is an atomic read-modify-write memory access (i.e., using atomic
-machine instructions of the respective architecture)
-.IP \[bu]
-wakes up a maximum of
-.I val
-waiters on the futex for the futex word at
-.IR uaddr ;
-and
-.IP \[bu]
-dependent on the results of a test of the original value of the
-futex word at
-.IR uaddr2 ,
-wakes up a maximum of
-.I val2
-waiters on the futex for the futex word at
-.IR uaddr2 .
-.RE
-.IP
-The operation and comparison that are to be performed are encoded
-in the bits of the argument
-.IR val3 .
-Pictorially, the encoding is:
-.IP
-.in +4n
-.EX
-+---+---+-----------+-----------+
-|op |cmp|   oparg   |  cmparg   |
-+---+---+-----------+-----------+
-  4   4       12          12    <== # of bits
-.EE
-.in
-.IP
-Expressed in code, the encoding is:
-.IP
-.in +4n
-.EX
-#define FUTEX_OP(op, oparg, cmp, cmparg) \e
-    (((op & 0xf) << 28) | \e
-     ((cmp & 0xf) << 24) | \e
-     ((oparg & 0xfff) << 12) | \e
-     (cmparg & 0xfff))
-.EE
-.in
-.IP
-In the above,
-.I op
-and
-.I cmp
-are each one of the codes listed below.
-The
-.I oparg
-and
-.I cmparg
-components are literal numeric values, except as noted below.
-.IP
-The
-.I op
-component has one of the following values:
-.IP
-.in +4n
-.EX
-FUTEX_OP_SET        0  /* *uaddr2 = oparg; */
-FUTEX_OP_ADD        1  /* *uaddr2 += oparg; */
-FUTEX_OP_OR         2  /* *uaddr2 |= oparg; */
-FUTEX_OP_ANDN       3  /* *uaddr2 &= \[ti]oparg; */
-FUTEX_OP_XOR        4  /* *uaddr2 \[ha]= oparg; */
-.EE
-.in
-.IP
-In addition, bitwise ORing the following value into
-.I op
-causes
-.I (1\~<<\~oparg)
-to be used as the operand:
-.IP
-.in +4n
-.EX
-FUTEX_OP_OPARG_SHIFT  8  /* Use (1 << oparg) as operand */
-.EE
-.in
-.IP
-The
-.I cmp
-field is one of the following:
-.IP
-.in +4n
-.EX
-FUTEX_OP_CMP_EQ     0  /* if (oldval == cmparg) wake */
-FUTEX_OP_CMP_NE     1  /* if (oldval != cmparg) wake */
-FUTEX_OP_CMP_LT     2  /* if (oldval < cmparg) wake */
-FUTEX_OP_CMP_LE     3  /* if (oldval <= cmparg) wake */
-FUTEX_OP_CMP_GT     4  /* if (oldval > cmparg) wake */
-FUTEX_OP_CMP_GE     5  /* if (oldval >= cmparg) wake */
-.EE
-.in
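To make the bit layout concrete, the following sketch packs and unpacks a val3 value for FUTEX_WAKE_OP. The wake_op_* names are hypothetical helpers written for this illustration; only the FUTEX_OP_* constants and the FUTEX_OP() macro come from <linux/futex.h>. The sketch treats oparg and cmparg as plain 12-bit fields, as described above, and does not model how the kernel later interprets them.

```c
#include <assert.h>
#include <linux/futex.h>
#include <stdint.h>

/* Hypothetical helpers that follow the 4/4/12/12 bit layout shown above.
   Unsigned arithmetic avoids shifting into the sign bit of an int. */
static uint32_t
wake_op_encode(uint32_t op, uint32_t oparg, uint32_t cmp, uint32_t cmparg)
{
    assert(oparg <= 0xfff && cmparg <= 0xfff);  /* 12-bit fields */

    return ((op & 0xf) << 28) | ((cmp & 0xf) << 24) |
           ((oparg & 0xfff) << 12) | (cmparg & 0xfff);
}

/* Field extractors, one per component of the encoding. */
static uint32_t wake_op_op(uint32_t val3)     { return val3 >> 28; }
static uint32_t wake_op_cmp(uint32_t val3)    { return (val3 >> 24) & 0xf; }
static uint32_t wake_op_oparg(uint32_t val3)  { return (val3 >> 12) & 0xfff; }
static uint32_t wake_op_cmparg(uint32_t val3) { return val3 & 0xfff; }
```

For example, encoding "add 5 to the word at uaddr2, and wake its waiters if the old value was greater than 7" uses FUTEX_OP_ADD with oparg 5 and FUTEX_OP_CMP_GT with cmparg 7, giving 0x14005007.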
-.IP
-The return value of
-.B FUTEX_WAKE_OP
-is the sum of the number of waiters woken on the futex
-.I uaddr
-plus the number of waiters woken on the futex
-.IR uaddr2 .
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_WAIT_BITSET " (since Linux 2.6.25)"
-.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
-This operation is like
-.B FUTEX_WAIT
-except that
-.I val3
-is used to provide a 32-bit bit mask to the kernel.
-This bit mask, in which at least one bit must be set,
-is stored in the kernel-internal state of the waiter.
-See the description of
-.B FUTEX_WAKE_BITSET
-for further details.
-.IP
-If
-.I timeout
-is not NULL, the structure it points to specifies
-an absolute timeout for the wait operation.
-If
-.I timeout
-is NULL, the operation can block indefinitely.
-.IP
-The
-.I uaddr2
-argument is ignored.
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_WAKE_BITSET " (since Linux 2.6.25)"
-.\" commit cd689985cf49f6ff5c8eddc48d98b9d581d9475d
-This operation is the same as
-.B FUTEX_WAKE
-except that the
-.I val3
-argument is used to provide a 32-bit bit mask to the kernel.
-This bit mask, in which at least one bit must be set,
-is used to select which waiters should be woken up.
-The selection is done by a bitwise AND of the "wake" bit mask
-(i.e., the value in
-.IR val3 )
-and the bit mask which is stored in the kernel-internal
-state of the waiter (the "wait" bit mask that is set using
-.BR FUTEX_WAIT_BITSET ).
-All of the waiters for which the result of the AND is nonzero are woken up;
-the remaining waiters are left sleeping.
-.IP
-The effect of
-.B FUTEX_WAIT_BITSET
-and
-.B FUTEX_WAKE_BITSET
-is to allow selective wake-ups among multiple waiters that are blocked
-on the same futex.
-However, note that, depending on the use case,
-employing this bit-mask multiplexing feature on a
-futex can be less efficient than simply using multiple futexes,
-because employing bit-mask multiplexing requires the kernel
-to check all waiters on a futex,
-including those that are not interested in being woken up
-(i.e., they do not have the relevant bit set in their "wait" bit mask).
-.\" According to http://locklessinc.com/articles/futex_cheat_sheet/:
-.\"
-.\" "The original reason for the addition of these extensions
-.\" was to improve the performance of pthread read-write locks
-.\" in glibc. However, the pthreads library no longer uses the
-.\" same locking algorithm, and these extensions are not used
-.\" without the bitset parameter being all ones.
-.\"
-.\" The page goes on to note that the FUTEX_WAIT_BITSET operation
-.\" is nevertheless used (with a bit mask of all ones) in order to
-.\" obtain the absolute timeout functionality that is useful
-.\" for efficiently implementing Pthreads APIs (which use absolute
-.\" timeouts); FUTEX_WAIT provides only relative timeouts.
-.IP
-The constant
-.BR FUTEX_BITSET_MATCH_ANY ,
-which corresponds to all 32 bits set in the bit mask, can be used as the
-.I val3
-argument for
-.B FUTEX_WAIT_BITSET
-and
-.BR FUTEX_WAKE_BITSET .
-Other than differences in the handling of the
-.I timeout
-argument, the
-.B FUTEX_WAIT
-operation is equivalent to
-.B FUTEX_WAIT_BITSET
-with
-.I val3
-specified as
-.BR FUTEX_BITSET_MATCH_ANY ;
-that is, allow a wake-up by any waker.
-The
-.B FUTEX_WAKE
-operation is equivalent to
-.B FUTEX_WAKE_BITSET
-with
-.I val3
-specified as
-.BR FUTEX_BITSET_MATCH_ANY ;
-that is, wake up any waiter(s).
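The selection rule can be modelled in user space as a simple mask test. The function below is a model for illustration only, not kernel code; model_wake_bitset and its parameters are hypothetical, and the order in which the array is scanned says nothing about which waiters the kernel would actually pick.

```c
#include <assert.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>

/* User-space model only: given the "wait" bit masks of the queued waiters,
   count how many a wake-up with wake_mask and limit val would select. */
static size_t
model_wake_bitset(const uint32_t *wait_masks, size_t nwaiters,
                  uint32_t wake_mask, size_t val)
{
    size_t woken = 0;

    assert(wake_mask != 0);     /* At least one bit must be set */

    for (size_t i = 0; i < nwaiters && woken < val; i++)
        if (wait_masks[i] & wake_mask)  /* Nonzero AND selects the waiter */
            woken++;

    return woken;
}
```

The loop visits every queued waiter, including those whose mask does not match, which mirrors the efficiency caveat noted above for bit-mask multiplexing.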
-.IP
-The
-.I uaddr2
-and
-.I timeout
-arguments are ignored.
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.SS Priority-inheritance futexes
-Linux supports priority-inheritance (PI) futexes in order to handle
-priority-inversion problems that can be encountered with
-normal futex locks.
-Priority inversion is the problem that occurs when a high-priority
-task is blocked waiting to acquire a lock held by a low-priority task,
-while tasks at an intermediate priority continuously preempt
-the low-priority task from the CPU.
-Consequently, the low-priority task makes no progress toward
-releasing the lock, and the high-priority task remains blocked.
-.P
-Priority inheritance is a mechanism for dealing with
-the priority-inversion problem.
-With this mechanism, when a high-priority task becomes blocked
-by a lock held by a low-priority task,
-the priority of the low-priority task is temporarily raised
-to that of the high-priority task,
-so that it is not preempted by any intermediate-priority tasks,
-and can thus make progress toward releasing the lock.
-To be effective, priority inheritance must be transitive,
-meaning that if a high-priority task blocks on a lock
-held by a lower-priority task that is itself blocked by a lock
-held by another intermediate-priority task
-(and so on, for chains of arbitrary length),
-then both of those tasks
-(or more generally, all of the tasks in a lock chain)
-have their priorities raised to be the same as the high-priority task.
-.P
-From a user-space perspective,
-what makes a futex PI-aware is a policy agreement (described below)
-between user space and the kernel about the value of the futex word,
-coupled with the use of the PI-futex operations described below.
-(Unlike the other futex operations described above,
-the PI-futex operations are designed
-for the implementation of very specific IPC mechanisms.)
-.\"
-.\" Quoting Darren Hart:
-.\" These opcodes paired with the PI futex value policy (described below)
-.\" defines a "futex" as PI aware. These were created very specifically
-.\" in support of PI pthread_mutexes, so it makes a lot more sense to
-.\" talk about a PI aware pthread_mutex, than a PI aware futex, since
-.\" there is a lot of policy and scaffolding that has to be built up
-.\" around it to use it properly (this is what a PI pthread_mutex is).
-.P
-.\" mtk: The following text is drawn from the Hart/Guniguntala paper
-.\" (listed in SEE ALSO), but I have reworded some pieces
-.\" significantly.
-.\"
-The PI-futex operations described below differ from the other
-futex operations in that they impose policy on the use of the value of the
-futex word:
-.IP \[bu] 3
-If the lock is not acquired, the futex word's value shall be 0.
-.IP \[bu]
-If the lock is acquired, the futex word's value shall
-be the thread ID (TID;
-see
-.BR gettid (2))
-of the owning thread.
-.IP \[bu]
-If the lock is owned and there are threads contending for the lock,
-then the
-.B FUTEX_WAITERS
-bit shall be set in the futex word's value; in other words, this value is:
-.IP
-.in +4n
-.EX
-FUTEX_WAITERS | TID
-.EE
-.in
-.IP
-(Note that it is invalid for a PI futex word to have no owner and
-.B FUTEX_WAITERS
-set.)
-.P
-With this policy in place,
-a user-space application can acquire an unacquired
-lock or release a lock using atomic instructions executed in user mode
-(e.g., a compare-and-swap operation such as
-.I cmpxchg
-on the x86 architecture).
-Acquiring a lock simply consists of using compare-and-swap to atomically
-set the futex word's value to the caller's TID if its previous value was 0.
-Releasing a lock requires using compare-and-swap to set the futex word's
-value to 0 if the previous value was the expected TID.
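The user-space fast paths described above can be sketched with C11 atomics. The names pi_try_acquire and pi_try_release are hypothetical, and the sketch covers only the uncontended case: when either function returns false, the caller is expected to fall back to the PI futex operations described in the rest of this section.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical fast-path helper: try to take the lock by atomically
   changing the futex word from 0 (not acquired) to the caller's TID. */
static bool
pi_try_acquire(_Atomic uint32_t *word)
{
    uint32_t expected = 0;
    uint32_t tid = (uint32_t) syscall(SYS_gettid);

    assert(word != NULL);
    return atomic_compare_exchange_strong(word, &expected, tid);
}

/* Hypothetical fast-path helper: try to release the lock by atomically
   changing the futex word from the caller's TID back to 0.  This fails
   if any other bit, such as FUTEX_WAITERS, is set alongside the TID. */
static bool
pi_try_release(_Atomic uint32_t *word)
{
    uint32_t expected = (uint32_t) syscall(SYS_gettid);

    assert(word != NULL);
    return atomic_compare_exchange_strong(word, &expected, 0);
}
```

Because the release compares against the bare TID, a word with the FUTEX_WAITERS bit set never matches, so the owner is pushed onto the kernel path exactly when waiters exist.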
-.P
-If a futex is already acquired (i.e., has a nonzero value),
-waiters must employ the
-.B FUTEX_LOCK_PI
-or
-.B FUTEX_LOCK_PI2
-operations to acquire the lock.
-If other threads are waiting for the lock, then the
-.B FUTEX_WAITERS
-bit is set in the futex value;
-in this case, the lock owner must employ the
-.B FUTEX_UNLOCK_PI
-operation to release the lock.
-.P
-In the cases where callers are forced into the kernel
-(i.e., required to perform a
-.BR futex ()
-call),
-they then deal directly with a so-called RT-mutex,
-a kernel locking mechanism which implements the required
-priority-inheritance semantics.
-After the RT-mutex is acquired, the futex value is updated accordingly,
-before the calling thread returns to user space.
-.P
-It is important to note
-.\" tglx (July 2015):
-.\" If there are multiple waiters on a pi futex then a wake pi operation
-.\" will wake the first waiter and hand over the lock to this waiter. This
-.\" includes handing over the rtmutex which represents the futex in the
-.\" kernel. The strict requirement is that the futex owner and the rtmutex
-.\" owner must be the same, except for the update period which is
-.\" serialized by the futex internal locking. That means the kernel must
-.\" update the user-space value prior to returning to user space
-that the kernel will update the futex word's value prior
-to returning to user space.
-(This prevents the possibility of the futex word's value ending
-up in an invalid state, such as having an owner but the value being 0,
-or having waiters but not having the
-.B FUTEX_WAITERS
-bit set.)
-.P
-If a futex has an associated RT-mutex in the kernel
-(i.e., there are blocked waiters)
-and the owner of the futex/RT-mutex dies unexpectedly,
-then the kernel cleans up the RT-mutex and hands it over to the next waiter.
-This in turn requires that the user-space value is updated accordingly.
-To indicate that this is required, the kernel sets the
-.B FUTEX_OWNER_DIED
-bit in the futex word along with the thread ID of the new owner.
-User space can detect this situation via the presence of the
-.B FUTEX_OWNER_DIED
-bit and is then responsible for cleaning up the stale state left over by
-the dead owner.
-.\" tglx (July 2015):
-.\" The FUTEX_OWNER_DIED bit can also be set on uncontended futexes, where
-.\" the kernel has no state associated. This happens via the robust futex
-.\" mechanism. In that case the futex value will be set to
-.\" FUTEX_OWNER_DIED. The robust futex mechanism is also available for non
-.\" PI futexes.
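As a rough illustration of the check described above, the sketch below decodes a PI futex word using the FUTEX_OWNER_DIED, FUTEX_WAITERS, and FUTEX_TID_MASK constants from <linux/futex.h>. The structure and function names are hypothetical and are not part of any kernel or libc API.

```c
#include <linux/futex.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of a PI futex word; not a kernel or libc type. */
struct pi_word_state {
    uint32_t owner_tid;    /* TID of the owner, or 0 if unowned */
    bool     has_waiters;  /* FUTEX_WAITERS bit is set */
    bool     owner_died;   /* FUTEX_OWNER_DIED bit is set */
};

/* Split a futex word into its TID and its two state bits. */
struct pi_word_state
decode_pi_word(uint32_t word)
{
    struct pi_word_state st;

    st.owner_tid   = word & FUTEX_TID_MASK;
    st.has_waiters = (word & FUTEX_WAITERS) != 0;
    st.owner_died  = (word & FUTEX_OWNER_DIED) != 0;
    return st;
}
```

A thread that acquires the lock and finds owner_died set would be expected to repair the data protected by the lock before relying on it.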
-.P
-PI futexes are operated on by specifying one of the values listed below in
-.IR futex_op .
-Note that the PI futex operations must be used as paired operations
-and are subject to some additional requirements:
-.IP \[bu] 3
-.BR FUTEX_LOCK_PI ,
-.BR FUTEX_LOCK_PI2 ,
-and
-.B FUTEX_TRYLOCK_PI
-pair with
-.BR FUTEX_UNLOCK_PI .
-.B FUTEX_UNLOCK_PI
-must be called only on a futex owned by the calling thread,
-as defined by the value policy, otherwise the error
-.B EPERM
-results.
-.IP \[bu]
-.B FUTEX_WAIT_REQUEUE_PI
-pairs with
-.BR FUTEX_CMP_REQUEUE_PI .
-This must be performed from a non-PI futex to a distinct PI futex
-(or the error
-.B EINVAL
-results).
-Additionally,
-.I val
-(the number of waiters to be woken) must be 1
-(or the error
-.B EINVAL
-results).
-.P
-The PI futex operations are as follows:
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_LOCK_PI " (since Linux 2.6.18)"
-.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
-This operation is used after an attempt to acquire
-the lock via an atomic user-mode instruction failed
-because the futex word has a nonzero value\[em]specifically,
-because it contained the (PID-namespace-specific) TID of the lock owner.
-.IP
-The operation checks the value of the futex word at the address
-.IR uaddr .
-If the value is 0, then the kernel tries to atomically set
-the futex value to the caller's TID.
-If the futex word's value is nonzero,
-the kernel atomically sets the
-.B FUTEX_WAITERS
-bit, which signals the futex owner that it cannot unlock the futex in
-user space atomically by setting the futex value to 0.
-.\" tglx (July 2015):
-.\" The operation here is similar to the FUTEX_WAIT logic. When the user
-.\" space atomic acquire does not succeed because the futex value was non
-.\" zero, then the waiter goes into the kernel, takes the kernel internal
-.\" lock and retries the acquisition under the lock. If the acquisition
-.\" does not succeed either, then it sets the FUTEX_WAITERS bit, to signal
-.\" the lock owner that it needs to go into the kernel. Here is the pseudo
-.\" code:
-.\"
-.\"     lock(kernel_lock);
-.\"     retry:
-.\"
-.\"     /*
-.\"      * Owner might have unlocked in user space before we
-.\"      * were able to set the waiter bit.
-.\"      */
-.\"     if (atomic_acquire(futex) == SUCCESS) {
-.\"         unlock(kernel_lock);
-.\"         return 0;
-.\"     }
-.\"
-.\"     /*
-.\"      * Owner might have unlocked after the above atomic_acquire()
-.\"      * attempt.
-.\"      */
-.\"     if (atomic_set_waiters_bit(futex) != SUCCESS)
-.\"         goto retry;
-.\"
-.\"     queue_waiter();
-.\"     unlock(kernel_lock);
-.\"     block();
-.\"
-After that, the kernel:
-.RS
-.IP (1) 5
-Tries to find the thread which is associated with the owner TID.
-.IP (2)
-Creates or reuses kernel state on behalf of the owner.
-(If this is the first waiter, there is no kernel state for this
-futex, so kernel state is created by locking the RT-mutex
-and the futex owner is made the owner of the RT-mutex.
-If there are existing waiters, then the existing state is reused.)
-.IP (3)
-Attaches the waiter to the futex
-(i.e., the waiter is enqueued on the RT-mutex waiter list).
-.RE
-.IP
-If more than one waiter exists,
-the enqueueing of the waiter is in descending priority order.
-(For information on priority ordering, see the discussion of the
-.BR SCHED_DEADLINE ,
-.BR SCHED_FIFO ,
-and
-.B SCHED_RR
-scheduling policies in
-.BR sched (7).)
-The owner inherits either the waiter's CPU bandwidth
-(if the waiter is scheduled under the
-.B SCHED_DEADLINE
-policy) or the waiter's priority (if the waiter is scheduled under the
-.B SCHED_RR
-or
-.B SCHED_FIFO
-policy).
-.\" August 2015:
-.\" mtk: If the realm is restricted purely to SCHED_OTHER (SCHED_NORMAL)
-.\" processes, does the nice value come into play also?
-.\"
-.\" tglx: No. SCHED_OTHER/NORMAL tasks are handled in FIFO order
-This inheritance follows the lock chain in the case of nested locking
-.\" (i.e., task 1 blocks on lock A, held by task 2,
-.\" while task 2 blocks on lock B, held by task 3)
-and performs deadlock detection.
-.IP
-The
-.I timeout
-argument provides a timeout for the lock attempt.
-If
-.I timeout
-is not NULL, the structure it points to specifies
-an absolute timeout, measured against the
-.B CLOCK_REALTIME
-clock.
-.\" 2016-07-07 response from Thomas Gleixner on LKML:
-.\" From: Thomas Gleixner <tglx@linutronix.de>
-.\" Date: 6 July 2016 at 20:57
-.\" Subject: Re: futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op
-.\"
-.\" On Thu, 23 Jun 2016, Michael Kerrisk (man-pages) wrote:
-.\" > On 06/23/2016 08:28 PM, Darren Hart wrote:
-.\" > > And as a follow-on, what is the reason for FUTEX_LOCK_PI only using
-.\" > > CLOCK_REALTIME? It seems reasonable to me that a user may want to wait a
-.\" > > specific amount of time, regardless of wall time.
-.\" >
-.\" > Yes, that's another weird inconsistency.
-.\"
-.\" The reason is that pthread_mutex_timedlock() uses absolute timeouts based on
-.\" CLOCK_REALTIME. glibc folks asked to make that the default behaviour back
-.\" then when we added LOCK_PI.
-If
-.I timeout
-is NULL, the operation will block indefinitely.
-.IP
-The
-.IR uaddr2 ,
-.IR val ,
-and
-.I val3
-arguments are ignored.
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_LOCK_PI2 " (since Linux 5.14)"
-.\" commit bf22a6976897977b0a3f1aeba6823c959fc4fdae
-This operation is the same as
-.BR FUTEX_LOCK_PI ,
-except that the clock against which
-.I timeout
-is measured is selectable.
-By default, the (absolute) timeout specified in
-.I timeout
-is measured against the
-.B CLOCK_MONOTONIC
-clock, but if the
-.B FUTEX_CLOCK_REALTIME
-flag is specified in
-.IR futex_op ,
-then the timeout is measured against the
-.B CLOCK_REALTIME
-clock.
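Because the timeout is absolute, a caller that wants to wait "at most N milliseconds" has to convert that interval into a deadline on the chosen clock. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the fallback definition of FUTEX_LOCK_PI2 (13, the value in the kernel UAPI header) is only there for older installed headers.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

#ifndef FUTEX_LOCK_PI2
#define FUTEX_LOCK_PI2 13   /* older headers lack this definition */
#endif

/* Hypothetical helper: absolute deadline 'ms' milliseconds from now,
   measured on clock 'clk'. */
struct timespec
deadline_after_ms(clockid_t clk, long ms)
{
    struct timespec ts;

    clock_gettime(clk, &ts);
    ts.tv_sec  += ms / 1000;
    ts.tv_nsec += (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {    /* normalize nanoseconds */
        ts.tv_sec++;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}

/* Hypothetical helper: kernel-side lock attempt bounded by a
   CLOCK_MONOTONIC deadline (no FUTEX_CLOCK_REALTIME flag). */
long
pi_lock_slow_timed(uint32_t *word, long ms)
{
    struct timespec ts = deadline_after_ms(CLOCK_MONOTONIC, ms);

    return syscall(SYS_futex, word, FUTEX_LOCK_PI2, 0, &ts, NULL, 0);
}
```

On kernels older than Linux 5.14 the call in pi_lock_slow_timed() is expected to fail with ENOSYS, so a caller may want to fall back to FUTEX_LOCK_PI with a CLOCK_REALTIME deadline.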
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_TRYLOCK_PI " (since Linux 2.6.18)"
-.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
-This operation tries to acquire the lock at
-.IR uaddr .
-It is invoked when a user-space atomic acquire did not
-succeed because the futex word was not 0.
-.IP
-Because the kernel has access to more state information than user space,
-acquisition of the lock might succeed if performed by the
-kernel in cases where the futex word
-(i.e., the state information accessible to user space) contains stale state
-.RB ( FUTEX_WAITERS
-and/or
-.BR FUTEX_OWNER_DIED ).
-This can happen when the owner of the futex died.
-User space cannot handle this condition in a race-free manner,
-but the kernel can fix this up and acquire the futex.
-.\" Paraphrasing a f2f conversation with Thomas Gleixner about the
-.\" above point (Aug 2015): ###
-.\" There is a rare possibility of a race condition involving an
-.\" uncontended futex with no owner, but with waiters. The
-.\" kernel-user-space contract is that if a futex is nonzero, you must
-.\" go into kernel. The futex was owned by a task, and that task dies
-.\" but there are no waiters, so the futex value is non zero.
-.\" Therefore, the next locker has to go into the kernel,
-.\" so that the kernel has a chance to clean up. (CMPXCHG on zero
-.\" in user space would fail, so kernel has to clean up.)
-.\" Darren Hart (Oct 2015):
-.\" The trylock in the kernel has more state, so it can independently
-.\" verify the flags that user space must trust implicitly.
-.IP
-The
-.IR uaddr2 ,
-.IR val ,
-.IR timeout ,
-and
-.I val3
-arguments are ignored.
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_UNLOCK_PI " (since Linux 2.6.18)"
-.\" commit c87e2837be82df479a6bae9f155c43516d2feebc
-This operation wakes the top priority waiter that is waiting in
-.B FUTEX_LOCK_PI
-or
-.B FUTEX_LOCK_PI2
-on the futex address provided by the
-.I uaddr
-argument.
-.IP
-This is called when the user-space value at
-.I uaddr
-cannot be changed atomically from a TID (of the owner) to 0.
-.IP
-The
-.IR uaddr2 ,
-.IR val ,
-.IR timeout ,
-and
-.I val3
-arguments are ignored.
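The pairing of the user-space fast path with FUTEX_LOCK_PI and FUTEX_UNLOCK_PI can be sketched as follows. This is an illustrative outline only: pi_lock() and pi_unlock() are hypothetical names, error handling is left to the caller, and the sketch assumes the futex word is private to the calling process's threads or otherwise suitably shared.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: acquire a PI futex. */
int
pi_lock(_Atomic uint32_t *word)
{
    uint32_t expected = 0;
    uint32_t tid = (uint32_t) syscall(SYS_gettid);

    /* Fast path: uncontended, change 0 to the caller's TID in user space. */
    if (atomic_compare_exchange_strong(word, &expected, tid))
        return 0;

    /* Slow path: the word is nonzero, so let the kernel sort it out. */
    return (int) syscall(SYS_futex, word, FUTEX_LOCK_PI, 0, NULL, NULL, 0);
}

/* Hypothetical helper: release a PI futex owned by the caller. */
int
pi_unlock(_Atomic uint32_t *word)
{
    uint32_t expected = (uint32_t) syscall(SYS_gettid);

    /* Fast path: no state bits set, change the TID to 0 in user space. */
    if (atomic_compare_exchange_strong(word, &expected, 0))
        return 0;

    /* Slow path: FUTEX_WAITERS (or other state) is set; ask the kernel. */
    return (int) syscall(SYS_futex, word, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
}
```

In the uncontended case neither helper enters the kernel, which is the point of the design: the system calls are needed only when the word cannot be updated atomically in user space.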
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_CMP_REQUEUE_PI " (since Linux 2.6.31)"
-.\" commit 52400ba946759af28442dee6265c5c0180ac7122
-This operation is a PI-aware variant of
-.BR FUTEX_CMP_REQUEUE .
-It requeues waiters that are blocked via
-.B FUTEX_WAIT_REQUEUE_PI
-on
-.I uaddr
-from a non-PI source futex
-.RI ( uaddr )
-to a PI target futex
-.RI ( uaddr2 ).
-.IP
-As with
-.BR FUTEX_CMP_REQUEUE ,
-this operation wakes up a maximum of
-.I val
-waiters that are waiting on the futex at
-.IR uaddr .
-However, for
-.BR FUTEX_CMP_REQUEUE_PI ,
-.I val
-is required to be 1
-(since the main point is to avoid a thundering herd).
-The remaining waiters are removed from the wait queue of the source futex at
-.I uaddr
-and added to the wait queue of the target futex at
-.IR uaddr2 .
-.IP
-The
-.I val2
-.\" val2 is the cap on the number of requeued waiters.
-.\" In the glibc pthread_cond_broadcast() implementation, this argument
-.\" is specified as INT_MAX, and for pthread_cond_signal() it is 0.
-and
-.I val3
-arguments serve the same purposes as for
-.BR FUTEX_CMP_REQUEUE .
-.\"
-.\" The page at http://locklessinc.com/articles/futex_cheat_sheet/
-.\" notes that "priority-inheritance Futex to priority-inheritance
-.\" Futex requeues are currently unsupported". However, probably
-.\" the page does not need to say anything about this, since
-.\" Thomas Gleixner commented (July 2015): "they never will be
-.\" supported because they make no sense at all"
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.TP
-.BR FUTEX_WAIT_REQUEUE_PI " (since Linux 2.6.31)"
-.\" commit 52400ba946759af28442dee6265c5c0180ac7122
-.\"
-Wait on a non-PI futex at
-.I uaddr
-and potentially be requeued (via a
-.B FUTEX_CMP_REQUEUE_PI
-operation in another task) onto a PI futex at
-.IR uaddr2 .
-The wait operation on
-.I uaddr
-is the same as for
-.BR FUTEX_WAIT .
-.IP
-The waiter can be removed from the wait on
-.I uaddr
-without requeueing on
-.I uaddr2
-via a
-.B FUTEX_WAKE
-operation in another task.
-In this case, the
-.B FUTEX_WAIT_REQUEUE_PI
-operation fails with the error
-.BR EAGAIN .
-.IP
-If
-.I timeout
-is not NULL, the structure it points to specifies
-an absolute timeout for the wait operation.
-If
-.I timeout
-is NULL, the operation can block indefinitely.
-.IP
-The
-.I val3
-argument is ignored.
-.IP
-The
-.B FUTEX_WAIT_REQUEUE_PI
-and
-.B FUTEX_CMP_REQUEUE_PI
-operations were added to support a fairly specific use case:
-priority-inheritance-aware POSIX threads condition variables.
-The idea is that these operations should always be paired,
-in order to ensure that user space and the kernel remain in sync.
-Thus, in the
-.B FUTEX_WAIT_REQUEUE_PI
-operation, the user-space application pre-specifies the target
-of the requeue that takes place in the
-.B FUTEX_CMP_REQUEUE_PI
-operation.
-.\"
-.\" Darren Hart notes that a patch to allow glibc to fully support
-.\" PI-aware pthreads condition variables has not yet been accepted into
-.\" glibc. The story is complex, and can be found at
-.\" https://sourceware.org/bugzilla/show_bug.cgi?id=11588
-.\" Darren notes that in the meantime, the patch is shipped with various
-.\" PREEMPT_RT-enabled Linux systems.
-.\"
-.\" Related to the preceding, Darren proposed that somewhere, man-pages
-.\" should document the following point:
-.\"
-.\" While the Linux kernel, since Linux 2.6.31, supports requeueing of
-.\" priority-inheritance (PI) aware mutexes via the
-.\" FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI futex operations,
-.\" the glibc implementation does not yet take full advantage of this.
-.\" Specifically, the condvar internal data lock remains a non-PI aware
-.\" mutex, regardless of the type of the pthread_mutex associated with
-.\" the condvar. This can lead to an unbounded priority inversion on
-.\" the internal data lock even when associating a PI aware
-.\" pthread_mutex with a condvar during a pthread_cond*_wait
-.\" operation. For this reason, it is not recommended to rely on
-.\" priority inheritance when using pthread condition variables.
-.\"
-.\" The problem is that the obvious location for this text is
-.\" the pthread_cond*wait(3) man page. However, such a man page
-.\" does not currently exist.
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
-.SH RETURN VALUE
-In the event of an error (and assuming that
-.BR futex ()
-was invoked via
-.BR syscall (2)),
-all operations return \-1 and set
-.I errno
-to indicate the error.
-.P
-The return value on success depends on the operation,
-as described in the following list:
-.TP
-.B FUTEX_WAIT
-Returns 0 if the caller was woken up.
-Note that a wake-up can also be caused by common futex usage patterns
-in unrelated code that happened to have previously used the futex word's
-memory location (e.g., typical futex-based implementations of
-Pthreads mutexes can cause this under some conditions).
-Therefore, callers should always conservatively assume that a return
-value of 0 can mean a spurious wake-up, and use the futex word's value
-(i.e., the user-space synchronization scheme)
-to decide whether to continue to block or not.
-.TP
-.B FUTEX_WAKE
-Returns the number of waiters that were woken up.
-.TP
-.B FUTEX_FD
-Returns the new file descriptor associated with the futex.
-.TP
-.B FUTEX_REQUEUE
-Returns the number of waiters that were woken up.
-.TP
-.B FUTEX_CMP_REQUEUE
-Returns the total number of waiters that were woken up or
-requeued to the futex for the futex word at
-.IR uaddr2 .
-If this value is greater than
-.IR val ,
-then the difference is the number of waiters requeued to the futex for the
-futex word at
-.IR uaddr2 .
-.TP
-.B FUTEX_WAKE_OP
-Returns the total number of waiters that were woken up.
-This is the sum of the woken waiters on the two futexes for
-the futex words at
-.I uaddr
-and
-.IR uaddr2 .
-.TP
-.B FUTEX_WAIT_BITSET
-Returns 0 if the caller was woken up.
-See
-.B FUTEX_WAIT
-for how to interpret this correctly in practice.
-.TP
-.B FUTEX_WAKE_BITSET
-Returns the number of waiters that were woken up.
-.TP
-.B FUTEX_LOCK_PI
-Returns 0 if the futex was successfully locked.
-.TP
-.B FUTEX_LOCK_PI2
-Returns 0 if the futex was successfully locked.
-.TP
-.B FUTEX_TRYLOCK_PI
-Returns 0 if the futex was successfully locked.
-.TP
-.B FUTEX_UNLOCK_PI
-Returns 0 if the futex was successfully unlocked.
-.TP
-.B FUTEX_CMP_REQUEUE_PI
-Returns the total number of waiters that were woken up or
-requeued to the futex for the futex word at
-.IR uaddr2 .
-If this value is greater than
-.IR val ,
-then the difference is the number of waiters requeued to the futex for
-the futex word at
-.IR uaddr2 .
-.TP
-.B FUTEX_WAIT_REQUEUE_PI
-Returns 0 if the caller was successfully requeued to the futex for
-the futex word at
-.IR uaddr2 .
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
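The rule given above for interpreting the return value of FUTEX_CMP_REQUEUE and FUTEX_CMP_REQUEUE_PI can be written out as a small helper. This is a sketch that follows the wording of the list literally; the structure and function names are hypothetical.

```c
/* Hypothetical result type; not part of any API. */
struct requeue_counts {
    long woken;     /* waiters woken at uaddr */
    long requeued;  /* waiters moved to uaddr2 */
};

/* Split the return value 'ret' of a requeue operation that was
   called with wake limit 'val'. */
struct requeue_counts
split_requeue_result(long ret, long val)
{
    struct requeue_counts c = { 0, 0 };

    if (ret < 0)
        return c;               /* error: caller should consult errno */

    if (ret > val) {
        c.woken = val;          /* the wake limit was reached */
        c.requeued = ret - val; /* the difference was requeued */
    } else {
        c.woken = ret;
    }
    return c;
}
```

For example, a return value of 5 from a call made with val equal to 1 would be read as one waiter woken and four requeued.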
-.SH ERRORS
-.TP
-.B EACCES
-No read access to the memory of a futex word.
-.TP
-.B EAGAIN
-.RB ( FUTEX_WAIT ,
-.BR FUTEX_WAIT_BITSET ,
-.BR FUTEX_WAIT_REQUEUE_PI )
-The value pointed to by
-.I uaddr
-was not equal to the expected value
-.I val
-at the time of the call.
-.IP
-.BR Note :
-on Linux, the symbolic names
-.B EAGAIN
-and
-.B EWOULDBLOCK
-(both of which appear in different parts of the kernel futex code)
-have the same value.
-.TP
-.B EAGAIN
-.RB ( FUTEX_CMP_REQUEUE ,
-.BR FUTEX_CMP_REQUEUE_PI )
-The value pointed to by
-.I uaddr
-is not equal to the expected value
-.IR val3 .
-.TP
-.B EAGAIN
-.RB ( FUTEX_LOCK_PI ,
-.BR FUTEX_LOCK_PI2 ,
-.BR FUTEX_TRYLOCK_PI ,
-.BR FUTEX_CMP_REQUEUE_PI )
-The thread that owns the futex at
-.I uaddr
-(for
-.BR FUTEX_CMP_REQUEUE_PI :
-.IR uaddr2 )
-is about to exit,
-but has not yet handled the internal state cleanup.
-Try again.
-.TP
-.B EDEADLK
-.RB ( FUTEX_LOCK_PI ,
-.BR FUTEX_LOCK_PI2 ,
-.BR FUTEX_TRYLOCK_PI ,
-.BR FUTEX_CMP_REQUEUE_PI )
-The futex word at
-.I uaddr
-is already locked by the caller.
-.TP
-.B EDEADLK
-.\" FIXME . I see that kernel/locking/rtmutex.c uses EDEADLK in some
-.\" places, and EDEADLOCK in others. On almost all architectures
-.\" these constants are synonymous. Is there a reason that both
-.\" names are used?
-.\"
-.\" tglx (July 2015): "No. We should probably fix that."
-.\"
-.RB ( FUTEX_CMP_REQUEUE_PI )
-While requeueing a waiter to the PI futex for the futex word at
-.IR uaddr2 ,
-the kernel detected a deadlock.
-.TP
-.B EFAULT
-A required pointer argument (i.e.,
-.IR uaddr ,
-.IR uaddr2 ,
-or
-.IR timeout )
-did not point to a valid user-space address.
-.TP
-.B EINTR
-A
-.B FUTEX_WAIT
-or
-.B FUTEX_WAIT_BITSET
-operation was interrupted by a signal (see
-.BR signal (7)).
-Before Linux 2.6.22, this error could also be returned for
-a spurious wakeup; since Linux 2.6.22, this no longer happens.
-.TP
-.B EINVAL
-The operation in
-.I futex_op
-is one of those that employs a timeout, but the supplied
-.I timeout
-argument was invalid
-.RI ( tv_sec
-was less than zero, or
-.I tv_nsec
-was not less than 1,000,000,000).
-.TP
-.B EINVAL
-The operation specified in
-.I futex_op
-employs one or both of the pointers
-.I uaddr
-and
-.IR uaddr2 ,
-but one of these does not point to a valid object\[em]that is,
-the address is not four-byte-aligned.
-.TP
-.B EINVAL
-.RB ( FUTEX_WAIT_BITSET ,
-.BR FUTEX_WAKE_BITSET )
-The bit mask supplied in
-.I val3
-is zero.
-.TP
-.B EINVAL
-.RB ( FUTEX_CMP_REQUEUE_PI )
-.I uaddr
-equals
-.I uaddr2
-(i.e., an attempt was made to requeue to the same futex).
-.TP
-.B EINVAL
-.RB ( FUTEX_FD )
-The signal number supplied in
-.I val
-is invalid.
-.TP
-.B EINVAL
-.RB ( FUTEX_WAKE ,
-.BR FUTEX_WAKE_OP ,
-.BR FUTEX_WAKE_BITSET ,
-.BR FUTEX_REQUEUE ,
-.BR FUTEX_CMP_REQUEUE )
-The kernel detected an inconsistency between the user-space state at
-.I uaddr
-and the kernel state\[em]that is, it detected a waiter which waits in
-.B FUTEX_LOCK_PI
-or
-.B FUTEX_LOCK_PI2
-on
-.IR uaddr .
-.TP
-.B EINVAL
-.RB ( FUTEX_LOCK_PI ,
-.BR FUTEX_LOCK_PI2 ,
-.BR FUTEX_TRYLOCK_PI ,
-.BR FUTEX_UNLOCK_PI )
-The kernel detected an inconsistency between the user-space state at
-.I uaddr
-and the kernel state.
-This indicates either state corruption
-or that the kernel found a waiter on
-.I uaddr
-which is waiting via
-.B FUTEX_WAIT
-or
-.BR FUTEX_WAIT_BITSET .
-.TP
-.B EINVAL
-.RB ( FUTEX_CMP_REQUEUE_PI )
-The kernel detected an inconsistency between the user-space state at
-.I uaddr2
-and the kernel state;
-.\" From a conversation with Thomas Gleixner (Aug 2015): ###
-.\" The kernel sees: I have non PI state for a futex you tried to
-.\" tell me was PI
-that is, the kernel detected a waiter which waits via
-.B FUTEX_WAIT
-or
-.B FUTEX_WAIT_BITSET
-on
-.IR uaddr2 .
-.TP
-.B EINVAL
-.RB ( FUTEX_CMP_REQUEUE_PI )
-The kernel detected an inconsistency between the user-space state at
-.I uaddr
-and the kernel state;
-that is, the kernel detected a waiter which waits via
-.B FUTEX_WAIT
-or
-.B FUTEX_WAIT_BITSET
-on
-.IR uaddr .
-.TP
-.B EINVAL
-.RB ( FUTEX_CMP_REQUEUE_PI )
-The kernel detected an inconsistency between the user-space state at
-.I uaddr
-and the kernel state;
-that is, the kernel detected a waiter which waits on
-.I uaddr
-via
-.B FUTEX_LOCK_PI
-or
-.B FUTEX_LOCK_PI2
-(instead of
-.BR FUTEX_WAIT_REQUEUE_PI ).
-.TP
-.B EINVAL
-.RB ( FUTEX_CMP_REQUEUE_PI )
-.\" This deals with the case:
-.\" wait_requeue_pi(A, B);
-.\" requeue_pi(A, C);
-An attempt was made to requeue a waiter to a futex other than that
-specified by the matching
-.B FUTEX_WAIT_REQUEUE_PI
-call for that waiter.
-.TP
-.B EINVAL
-.RB ( FUTEX_CMP_REQUEUE_PI )
-The
-.I val
-argument is not 1.
-.TP
-.B EINVAL
-Invalid argument.
-.TP
-.B ENFILE
-.RB ( FUTEX_FD )
-The system-wide limit on the total number of open files has been reached.
-.TP
-.B ENOMEM
-.RB ( FUTEX_LOCK_PI ,
-.BR FUTEX_LOCK_PI2 ,
-.BR FUTEX_TRYLOCK_PI ,
-.BR FUTEX_CMP_REQUEUE_PI )
-The kernel could not allocate memory to hold state information.
-.TP
-.B ENOSYS
-Invalid operation specified in
-.IR futex_op .
-.TP
-.B ENOSYS
-The
-.B FUTEX_CLOCK_REALTIME
-option was specified in
-.IR futex_op ,
-but the accompanying operation was neither
-.BR FUTEX_WAIT ,
-.BR FUTEX_WAIT_BITSET ,
-.BR FUTEX_WAIT_REQUEUE_PI ,
-nor
-.BR FUTEX_LOCK_PI2 .
-.TP
-.B ENOSYS
-.RB ( FUTEX_LOCK_PI ,
-.BR FUTEX_LOCK_PI2 ,
-.BR FUTEX_TRYLOCK_PI ,
-.BR FUTEX_UNLOCK_PI ,
-.BR FUTEX_CMP_REQUEUE_PI ,
-.BR FUTEX_WAIT_REQUEUE_PI )
-A run-time check determined that the operation is not available.
-The PI-futex operations are not implemented on all architectures and
-are not supported on some CPU variants.
-.TP
-.B EPERM
-.RB ( FUTEX_LOCK_PI ,
-.BR FUTEX_LOCK_PI2 ,
-.BR FUTEX_TRYLOCK_PI ,
-.BR FUTEX_CMP_REQUEUE_PI )
-The caller is not allowed to attach itself to the futex at
-.I uaddr
-(for
-.BR FUTEX_CMP_REQUEUE_PI :
-the futex at
-.IR uaddr2 ).
-(This may be caused by a state corruption in user space.)
-.TP
-.B EPERM
-.RB ( FUTEX_UNLOCK_PI )
-The caller does not own the lock represented by the futex word.
-.TP
-.B ESRCH
-.RB ( FUTEX_LOCK_PI ,
-.BR FUTEX_LOCK_PI2 ,
-.BR FUTEX_TRYLOCK_PI ,
-.BR FUTEX_CMP_REQUEUE_PI )
-The thread ID in the futex word at
-.I uaddr
-does not exist.
-.TP
-.B ESRCH
-.RB ( FUTEX_CMP_REQUEUE_PI )
-The thread ID in the futex word at
-.I uaddr2
-does not exist.
-.TP
-.B ETIMEDOUT
-The operation in
-.I futex_op
-employed the timeout specified in
-.IR timeout ,
-and the timeout expired before the operation completed.
-.\"
-.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
-.\"
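The timeout-related EINVAL case listed above can be avoided by validating the timespec structure before making the call. The sketch below mirrors the two documented conditions; rejecting a negative tv_nsec as well is a conservative assumption that the list above does not state. The function name is hypothetical.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical pre-check for the 'timeout' argument. */
bool
futex_timeout_is_valid(const struct timespec *ts)
{
    if (ts == NULL)
        return true;    /* NULL means "no timeout" */

    return ts->tv_sec >= 0 &&
           ts->tv_nsec >= 0 &&      /* assumption: not stated above */
           ts->tv_nsec < 1000000000L;
}
```

Such a check is most useful when the deadline is computed by adding an interval to the current time, where a missing normalization step can leave tv_nsec at or above 1,000,000,000.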
-.SH STANDARDS
-Linux.
-.SH HISTORY
-Linux 2.6.0.
-.P
-Initial futex support was merged in Linux 2.5.7 but with different
-semantics from what was described above.
-A four-argument system call with the semantics
-described in this page was introduced in Linux 2.5.40.
-A fifth argument was added in Linux 2.5.70,
-and a sixth argument was added in Linux 2.6.7.
-.SH EXAMPLES
-The program below demonstrates use of futexes in a program where a parent
-process and a child process use a pair of futexes located inside a
-shared anonymous mapping to synchronize access to a shared resource:
-the terminal.
-The two processes each write
-.I nloops
-(a command-line argument that defaults to 5 if omitted)
-messages to the terminal and employ a synchronization protocol
-that ensures that they alternate in writing messages.
-Upon running this program we see output such as the following:
-.P
-.in +4n
-.EX
-$ \fB./futex_demo\fP
-Parent (18534) 0
-Child (18535) 0
-Parent (18534) 1
-Child (18535) 1
-Parent (18534) 2
-Child (18535) 2
-Parent (18534) 3
-Child (18535) 3
-Parent (18534) 4
-Child (18535) 4
-.EE
-.in
-.SS Program source
-\&
-.\" SRC BEGIN (futex.c)
-.EX
-/* futex_demo.c
-\&
-   Usage: futex_demo [nloops]
-                    (Default: 5)
-\&
-   Demonstrate the use of futexes in a program where parent and child
-   use a pair of futexes located inside a shared anonymous mapping to
-   synchronize access to a shared resource: the terminal. The two
-   processes each write \[aq]nloops\[aq] messages to the terminal and employ
-   a synchronization protocol that ensures that they alternate in
-   writing messages.
-*/
-#define _GNU_SOURCE
-#include <err.h>
-#include <errno.h>
-#include <linux/futex.h>
-#include <stdatomic.h>
-#include <stdint.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <sys/mman.h>
-#include <sys/syscall.h>
-#include <sys/time.h>
-#include <sys/wait.h>
-#include <unistd.h>
-\&
-static uint32_t *futex1, *futex2, *iaddr;
-\&
-static int
-futex(uint32_t *uaddr, int futex_op, uint32_t val,
-      const struct timespec *timeout, uint32_t *uaddr2, uint32_t val3)
-{
-    return syscall(SYS_futex, uaddr, futex_op, val,
-                   timeout, uaddr2, val3);
-}
-\&
-/* Acquire the futex pointed to by \[aq]futexp\[aq]: wait for its value to
-   become 1, and then set the value to 0. */
-\&
-static void
-fwait(uint32_t *futexp)
-{
-    long      s;
-    uint32_t  one;
-\&
-    /* atomic_compare_exchange_strong(ptr, oldval, newval)
-       atomically performs the equivalent of:
-\&
-           if (*ptr == *oldval)
-               *ptr = newval;
-           else
-               *oldval = *ptr;
-\&
-       It returns true if the test yielded true and *ptr was updated. */
-\&
-    while (1) {
-\&
-        /* Is the futex available? Reset the expected value first,
-           since a failed compare-exchange overwrites it. */
-        one = 1;
-        if (atomic_compare_exchange_strong(futexp, &one, 0))
-            break;      /* Yes */
-\&
-        /* Futex is not available; wait. */
-\&
-        s = futex(futexp, FUTEX_WAIT, 0, NULL, NULL, 0);
-        if (s == \-1 && errno != EAGAIN)
-            err(EXIT_FAILURE, "futex\-FUTEX_WAIT");
-    }
-}
-\&
-/* Release the futex pointed to by \[aq]futexp\[aq]: if the futex currently
-   has the value 0, set its value to 1 and then wake any futex waiters,
-   so that if the peer is blocked in fwait(), it can proceed. */
-\&
-static void
-fpost(uint32_t *futexp)
-{
-    long      s;
-    uint32_t  zero = 0;     /* Not const: updated if the exchange fails */
-\&
-    /* atomic_compare_exchange_strong() was described
-       in comments above. */
-\&
-    if (atomic_compare_exchange_strong(futexp, &zero, 1)) {
-        s = futex(futexp, FUTEX_WAKE, 1, NULL, NULL, 0);
-        if (s == \-1)
-            err(EXIT_FAILURE, "futex\-FUTEX_WAKE");
-    }
-}
-\&
-int
-main(int argc, char *argv[])
-{
-    pid_t         childPid;
-    unsigned int  nloops;
-\&
-    setbuf(stdout, NULL);
-\&
-    nloops = (argc > 1) ? atoi(argv[1]) : 5;
-\&
-    /* Create a shared anonymous mapping that will hold the futexes.
-       Since the futexes are being shared between processes, we
-       subsequently use the "shared" futex operations (i.e., not the
-       ones suffixed "_PRIVATE"). */
-\&
-    iaddr = mmap(NULL, sizeof(*iaddr) * 2, PROT_READ | PROT_WRITE,
-                 MAP_ANONYMOUS | MAP_SHARED, \-1, 0);
-    if (iaddr == MAP_FAILED)
-        err(EXIT_FAILURE, "mmap");
-\&
-    futex1 = &iaddr[0];
-    futex2 = &iaddr[1];
-\&
-    *futex1 = 0;        /* State: unavailable */
-    *futex2 = 1;        /* State: available */
-\&
-    /* Create a child process that inherits the shared anonymous
-       mapping. */
-\&
-    childPid = fork();
-    if (childPid == \-1)
-        err(EXIT_FAILURE, "fork");
-\&
-    if (childPid == 0) {        /* Child */
-        for (unsigned int j = 0; j < nloops; j++) {
-            fwait(futex1);
-            printf("Child (%jd) %u\en", (intmax_t) getpid(), j);
-            fpost(futex2);
-        }
-\&
-        exit(EXIT_SUCCESS);
-    }
-\&
-    /* Parent falls through to here. */
-\&
-    for (unsigned int j = 0; j < nloops; j++) {
-        fwait(futex2);
-        printf("Parent (%jd) %u\en", (intmax_t) getpid(), j);
-        fpost(futex1);
-    }
-\&
-    wait(NULL);
-\&
-    exit(EXIT_SUCCESS);
-}
-.EE
-.\" SRC END
-.SH SEE ALSO
-.ad l
-.BR get_robust_list (2),
-.BR restart_syscall (2),
-.BR pthread_mutexattr_getprotocol (3),
-.BR futex (7),
-.BR sched (7)
-.P
-The following kernel source files:
-.IP \[bu] 3
-.I Documentation/pi\-futex.txt
-.IP \[bu]
-.I Documentation/futex\-requeue\-pi.txt
-.IP \[bu]
-.I Documentation/locking/rt\-mutex.txt
-.IP \[bu]
-.I Documentation/locking/rt\-mutex\-design.txt
-.IP \[bu]
-.I Documentation/robust\-futex\-ABI.txt
-.P
-Franke, H., Russell, R., and Kirkwood, M., 2002.
-\fIFuss, Futexes and Furwocks: Fast Userlevel Locking in Linux\fP
-(from proceedings of the Ottawa Linux Symposium 2002),
-.br
-.UR http://kernel.org\:/doc\:/ols\:/2002\:/ols2002\-pages\-479\-495.pdf
-.UE
-.P
-Hart, D., 2009. \fIA futex overview and update\fP,
-.UR http://lwn.net/Articles/360699/
-.UE
-.P
-Hart, D.\& and Guniguntala, D., 2009.
-\fIRequeue-PI: Making glibc Condvars PI-Aware\fP
-(from proceedings of the 2009 Real-Time Linux Workshop),
-.UR http://lwn.net/images/conf/rtlws11/papers/proc/p10.pdf
-.UE
-.P
-Drepper, U., 2011. \fIFutexes Are Tricky\fP,
-.UR http://www.akkadia.org/drepper/futex.pdf
-.UE
-.P
-Futex example library, futex\-*.tar.bz2 at
-.br
-.UR https://mirrors.kernel.org\:/pub\:/linux\:/kernel\:/people\:/rusty/
-.UE
-.\"
-.\" FIXME(Torvald) We should probably refer to the glibc code here, in
-.\" particular the glibc-internal futex wrapper functions that are
-.\" WIP, and the generic pthread_mutex_t and perhaps condvar
-.\" implementations.