diff options
Diffstat (limited to 'man2/set_mempolicy.2')
-rw-r--r-- | man2/set_mempolicy.2 | 325 |
1 files changed, 325 insertions, 0 deletions
diff --git a/man2/set_mempolicy.2 b/man2/set_mempolicy.2 new file mode 100644 index 0000000..a7f561d --- /dev/null +++ b/man2/set_mempolicy.2 @@ -0,0 +1,325 @@ +.\" SPDX-License-Identifier: Linux-man-pages-copyleft-var +.\" +.\" Copyright 2003,2004 Andi Kleen, SuSE Labs. +.\" and Copyright 2007 Lee Schermerhorn, Hewlett Packard +.\" +.\" 2006-02-03, mtk, substantial wording changes and other improvements +.\" 2007-08-27, Lee Schermerhorn <Lee.Schermerhorn@hp.com> +.\" more precise specification of behavior. +.\" +.TH set_mempolicy 2 2023-07-16 "Linux man-pages 6.05.01" +.SH NAME +set_mempolicy \- set default NUMA memory policy for a thread and its children +.SH LIBRARY +NUMA (Non-Uniform Memory Access) policy library +.RI ( libnuma ", " \-lnuma ) +.SH SYNOPSIS +.nf +.B "#include <numaif.h>" +.PP +.BI "long set_mempolicy(int " mode ", const unsigned long *" nodemask , +.BI " unsigned long " maxnode ); +.fi +.SH DESCRIPTION +.BR set_mempolicy () +sets the NUMA memory policy of the calling thread, +which consists of a policy mode and zero or more nodes, +to the values specified by the +.IR mode , +.IR nodemask , +and +.I maxnode +arguments. +.PP +A NUMA machine has different +memory controllers with different distances to specific CPUs. +The memory policy defines from which node memory is allocated for +the thread. +.PP +This system call defines the default policy for the thread. +The thread policy governs allocation of pages in the process's +address space outside of memory ranges +controlled by a more specific policy set by +.BR mbind (2). +The thread default policy also controls allocation of any pages for +memory-mapped files mapped using the +.BR mmap (2) +call with the +.B MAP_PRIVATE +flag and that are only read (loaded) from by the thread +and of memory-mapped files mapped using the +.BR mmap (2) +call with the +.B MAP_SHARED +flag, regardless of the access type. +The policy is applied only when a new page is allocated +for the thread. +For anonymous memory this is when the page is first +touched by the thread. +.PP +The +.I mode +argument must specify one of +.BR MPOL_DEFAULT , +.BR MPOL_BIND , +.BR MPOL_INTERLEAVE , +.BR MPOL_PREFERRED , +or +.B MPOL_LOCAL +(which are described in detail below). +All modes except +.B MPOL_DEFAULT +require the caller to specify the node or nodes to which the mode applies, +via the +.I nodemask +argument. +.PP +The +.I mode +argument may also include an optional +.IR "mode flag" . +The supported +.I "mode flags" +are: +.TP +.BR MPOL_F_NUMA_BALANCING " (since Linux 5.12)" +.\" commit bda420b985054a3badafef23807c4b4fa38a3dff +When +.I mode +is +.BR MPOL_BIND , +enable the kernel NUMA balancing for the task if it is supported by the kernel. +If the flag isn't supported by the kernel, or is used with +.I mode +other than +.BR MPOL_BIND , +\-1 is returned and +.I errno +is set to +.BR EINVAL . +.TP +.BR MPOL_F_RELATIVE_NODES " (since Linux 2.6.26)" +A nonempty +.I nodemask +specifies node IDs that are relative to the +set of node IDs allowed by the process's current cpuset. +.TP +.BR MPOL_F_STATIC_NODES " (since Linux 2.6.26)" +A nonempty +.I nodemask +specifies physical node IDs. +Linux will not remap the +.I nodemask +when the process moves to a different cpuset context, +nor when the set of nodes allowed by the process's +current cpuset context changes. +.PP +.I nodemask +points to a bit mask of node IDs that contains up to +.I maxnode +bits. +The bit mask size is rounded to the next multiple of +.IR "sizeof(unsigned long)" , +but the kernel will use bits only up to +.IR maxnode . +A NULL value of +.I nodemask +or a +.I maxnode +value of zero specifies the empty set of nodes. +If the value of +.I maxnode +is zero, +the +.I nodemask +argument is ignored. +.PP +Where a +.I nodemask +is required, it must contain at least one node that is on-line, +allowed by the process's current cpuset context, +(unless the +.B MPOL_F_STATIC_NODES +mode flag is specified), +and contains memory. +If the +.B MPOL_F_STATIC_NODES +is set in +.I mode +and a required +.I nodemask +contains no nodes that are allowed by the process's current cpuset context, +the memory policy reverts to +.IR "local allocation" . +This effectively overrides the specified policy until the process's +cpuset context includes one or more of the nodes specified by +.IR nodemask . +.PP +The +.I mode +argument must include one of the following values: +.TP +.B MPOL_DEFAULT +This mode specifies that any nondefault thread memory policy be removed, +so that the memory policy "falls back" to the system default policy. +The system default policy is "local allocation"\[em]that is, +allocate memory on the node of the CPU that triggered the allocation. +.I nodemask +must be specified as NULL. +If the "local node" contains no free memory, the system will +attempt to allocate memory from a "near by" node. +.TP +.B MPOL_BIND +This mode defines a strict policy that restricts memory allocation to the +nodes specified in +.IR nodemask . +If +.I nodemask +specifies more than one node, page allocations will come from +the node with the lowest numeric node ID first, until that node +contains no free memory. +Allocations will then come from the node with the next highest +node ID specified in +.I nodemask +and so forth, until none of the specified nodes contain free memory. +Pages will not be allocated from any node not specified in the +.IR nodemask . +.TP +.B MPOL_INTERLEAVE +This mode interleaves page allocations across the nodes specified in +.I nodemask +in numeric node ID order. +This optimizes for bandwidth instead of latency +by spreading out pages and memory accesses to those pages across +multiple nodes. +However, accesses to a single page will still be limited to +the memory bandwidth of a single node. +.\" NOTE: the following sentence doesn't make sense in the context +.\" of set_mempolicy() -- no memory area specified. +.\" To be effective the memory area should be fairly large, +.\" at least 1 MB or bigger. +.TP +.B MPOL_PREFERRED +This mode sets the preferred node for allocation. +The kernel will try to allocate pages from this node first +and fall back to "near by" nodes if the preferred node is low on free +memory. +If +.I nodemask +specifies more than one node ID, the first node in the +mask will be selected as the preferred node. +If the +.I nodemask +and +.I maxnode +arguments specify the empty set, then the policy +specifies "local allocation" +(like the system default policy discussed above). +.TP +.BR MPOL_LOCAL " (since Linux 3.8)" +.\" commit 479e2802d09f1e18a97262c4c6f8f17ae5884bd8 +.\" commit f2a07f40dbc603c15f8b06e6ec7f768af67b424f +This mode specifies "local allocation"; the memory is allocated on +the node of the CPU that triggered the allocation (the "local node"). +The +.I nodemask +and +.I maxnode +arguments must specify the empty set. +If the "local node" is low on free memory, +the kernel will try to allocate memory from other nodes. +The kernel will allocate memory from the "local node" +whenever memory for this node is available. +If the "local node" is not allowed by the process's current cpuset context, +the kernel will try to allocate memory from other nodes. +The kernel will allocate memory from the "local node" whenever +it becomes allowed by the process's current cpuset context. +.PP +The thread memory policy is preserved across an +.BR execve (2), +and is inherited by child threads created using +.BR fork (2) +or +.BR clone (2). +.SH RETURN VALUE +On success, +.BR set_mempolicy () +returns 0; +on error, \-1 is returned and +.I errno +is set to indicate the error. +.SH ERRORS +.TP +.B EFAULT +Part of all of the memory range specified by +.I nodemask +and +.I maxnode +points outside your accessible address space. +.TP +.B EINVAL +.I mode +is invalid. +Or, +.I mode +is +.B MPOL_DEFAULT +and +.I nodemask +is nonempty, +or +.I mode +is +.B MPOL_BIND +or +.B MPOL_INTERLEAVE +and +.I nodemask +is empty. +Or, +.I maxnode +specifies more than a page worth of bits. +Or, +.I nodemask +specifies one or more node IDs that are +greater than the maximum supported node ID. +Or, none of the node IDs specified by +.I nodemask +are on-line and allowed by the process's current cpuset context, +or none of the specified nodes contain memory. +Or, the +.I mode +argument specified both +.B MPOL_F_STATIC_NODES +and +.BR MPOL_F_RELATIVE_NODES . +Or, the +.B MPOL_F_NUMA_BALANCING +isn't supported by the kernel, or is used with +.I mode +other than +.BR MPOL_BIND . +.TP +.B ENOMEM +Insufficient kernel memory was available. +.SH STANDARDS +Linux. +.SH HISTORY +Linux 2.6.7. +.SH NOTES +Memory policy is not remembered if the page is swapped out. +When such a page is paged back in, it will use the policy of +the thread or memory range that is in effect at the time the +page is allocated. +.PP +For information on library support, see +.BR numa (7). +.SH SEE ALSO +.BR get_mempolicy (2), +.BR getcpu (2), +.BR mbind (2), +.BR mmap (2), +.BR numa (3), +.BR cpuset (7), +.BR numa (7), +.BR numactl (8) |