Diffstat (limited to 'man7/socket.7')
-rw-r--r-- | man7/socket.7 | 1274 |
1 files changed, 0 insertions, 1274 deletions
diff --git a/man7/socket.7 b/man7/socket.7 deleted file mode 100644 index 619139e..0000000 --- a/man7/socket.7 +++ /dev/null @@ -1,1274 +0,0 @@ -'\" t -.\" SPDX-License-Identifier: Linux-man-pages-1-para -.\" -.\" This man page is Copyright (C) 1999 Andi Kleen <ak@muc.de>. -.\" and copyright (c) 1999 Matthew Wilcox. -.\" -.\" 2002-10-30, Michael Kerrisk, <mtk.manpages@gmail.com> -.\" Added description of SO_ACCEPTCONN -.\" 2004-05-20, aeb, added SO_RCVTIMEO/SO_SNDTIMEO text. -.\" Modified, 27 May 2004, Michael Kerrisk <mtk.manpages@gmail.com> -.\" Added notes on capability requirements -.\" A few small grammar fixes -.\" 2010-06-13 Jan Engelhardt <jengelh@medozas.de> -.\" Documented SO_DOMAIN and SO_PROTOCOL. -.\" -.\" FIXME -.\" The following are not yet documented: -.\" -.\" SO_PEERNAME (2.4?) -.\" get only -.\" Seems to do something similar to getpeername(), but then -.\" why is it necessary / how does it differ? -.\" -.\" SO_TIMESTAMPING (2.6.30) -.\" Documentation/networking/timestamping.txt -.\" commit cb9eff097831007afb30d64373f29d99825d0068 -.\" Author: Patrick Ohly <patrick.ohly@intel.com> -.\" -.\" SO_WIFI_STATUS (3.3) -.\" commit 6e3e939f3b1bf8534b32ad09ff199d88800835a0 -.\" Author: Johannes Berg <johannes.berg@intel.com> -.\" Also: SCM_WIFI_STATUS -.\" -.\" SO_NOFCS (3.4) -.\" commit 3bdc0eba0b8b47797f4a76e377dd8360f317450f -.\" Author: Ben Greear <greearb@candelatech.com> -.\" -.\" SO_GET_FILTER (3.8) -.\" commit a8fc92778080c845eaadc369a0ecf5699a03bef0 -.\" Author: Pavel Emelyanov <xemul@parallels.com> -.\" -.\" SO_MAX_PACING_RATE (3.13) -.\" commit 62748f32d501f5d3712a7c372bbb92abc7c62bc7 -.\" Author: Eric Dumazet <edumazet@google.com> -.\" -.\" SO_BPF_EXTENSIONS (3.14) -.\" commit ea02f9411d9faa3553ed09ce0ec9f00ceae9885e -.\" Author: Michal Sekletar <msekleta@redhat.com> -.\" -.TH socket 7 2024-01-16 "Linux man-pages 6.7" -.SH NAME -socket \- Linux socket interface -.SH SYNOPSIS -.nf -.B #include <sys/socket.h> -.P -.IB sockfd " = socket(int " 
socket_family ", int " socket_type ", int " protocol ); -.fi -.SH DESCRIPTION -This manual page describes the Linux networking socket layer user -interface. -The BSD compatible sockets -are the uniform interface -between the user process and the network protocol stacks in the kernel. -The protocol modules are grouped into -.I protocol families -such as -.BR AF_INET ", " AF_IPX ", and " AF_PACKET , -and -.I socket types -such as -.B SOCK_STREAM -or -.BR SOCK_DGRAM . -See -.BR socket (2) -for more information on families and types. -.SS Socket-layer functions -These functions are used by the user process to send or receive packets -and to do other socket operations. -For more information, see their respective manual pages. -.P -.BR socket (2) -creates a socket, -.BR connect (2) -connects a socket to a remote socket address, -the -.BR bind (2) -function binds a socket to a local socket address, -.BR listen (2) -tells the socket that new connections shall be accepted, and -.BR accept (2) -is used to get a new socket with a new incoming connection. -.BR socketpair (2) -returns two connected anonymous sockets (implemented only for a few -local families like -.BR AF_UNIX ) -.P -.BR send (2), -.BR sendto (2), -and -.BR sendmsg (2) -send data over a socket, and -.BR recv (2), -.BR recvfrom (2), -.BR recvmsg (2) -receive data from a socket. -.BR poll (2) -and -.BR select (2) -wait for arriving data or a readiness to send data. -In addition, the standard I/O operations like -.BR write (2), -.BR writev (2), -.BR sendfile (2), -.BR read (2), -and -.BR readv (2) -can be used to read and write data. -.P -.BR getsockname (2) -returns the local socket address and -.BR getpeername (2) -returns the remote socket address. -.BR getsockopt (2) -and -.BR setsockopt (2) -are used to set or get socket layer or protocol options. -.BR ioctl (2) -can be used to set or read some other options. -.P -.BR close (2) -is used to close a socket. 
-.BR shutdown (2) -closes parts of a full-duplex socket connection. -.P -Seeking, or calling -.BR pread (2) -or -.BR pwrite (2) -with a nonzero position is not supported on sockets. -.P -It is possible to do nonblocking I/O on sockets by setting the -.B O_NONBLOCK -flag on a socket file descriptor using -.BR fcntl (2). -Then all operations that would block will (usually) -return with -.B EAGAIN -(operation should be retried later); -.BR connect (2) -will return -.B EINPROGRESS -error. -The user can then wait for various events via -.BR poll (2) -or -.BR select (2). -.TS -tab(:) allbox; -c s s -l l lx. -I/O events -Event:Poll flag:Occurrence -Read:POLLIN:T{ -New data arrived. -T} -Read:POLLIN:T{ -A connection setup has been completed -(for connection-oriented sockets) -T} -Read:POLLHUP:T{ -A disconnection request has been initiated by the other end. -T} -Read:POLLHUP:T{ -A connection is broken (only for connection-oriented protocols). -When the socket is written -.B SIGPIPE -is also sent. -T} -Write:POLLOUT:T{ -Socket has enough send buffer space for writing new data. -T} -Read/Write:T{ -POLLIN | -.br -POLLOUT -T}:T{ -An outgoing -.BR connect (2) -finished. -T} -Read/Write:POLLERR:T{ -An asynchronous error occurred. -T} -Read/Write:POLLHUP:T{ -The other end has shut down one direction. -T} -Exception:POLLPRI:T{ -Urgent data arrived. -.B SIGURG -is sent then. -T} -.\" FIXME . The following is not true currently: -.\" It is no I/O event when the connection -.\" is broken from the local end using -.\" .BR shutdown (2) -.\" or -.\" .BR close (2). -.TE -.P -An alternative to -.BR poll (2) -and -.BR select (2) -is to let the kernel inform the application about events -via a -.B SIGIO -signal. -For that the -.B O_ASYNC -flag must be set on a socket file descriptor via -.BR fcntl (2) -and a valid signal handler for -.B SIGIO -must be installed via -.BR sigaction (2). -See the -.I Signals -discussion below. 
-.SS Socket address structures -Each socket domain has its own format for socket addresses, -with a domain-specific address structure. -Each of these structures begins with an -integer "family" field (typed as -.IR sa_family_t ) -that indicates the type of the address structure. -This allows -the various system calls (e.g., -.BR connect (2), -.BR bind (2), -.BR accept (2), -.BR getsockname (2), -.BR getpeername (2)), -which are generic to all socket domains, -to determine the domain of a particular socket address. -.P -To allow any type of socket address to be passed to -interfaces in the sockets API, -the type -.I struct sockaddr -is defined. -The purpose of this type is purely to allow casting of -domain-specific socket address types to a "generic" type, -so as to avoid compiler warnings about type mismatches in -calls to the sockets API. -.P -In addition, the sockets API provides the data type -.IR "struct sockaddr_storage". -This type -is suitable to accommodate all supported domain-specific socket -address structures; it is large enough and is aligned properly. -(In particular, it is large enough to hold -IPv6 socket addresses.) -The structure includes the following field, which can be used to identify -the type of socket address actually stored in the structure: -.P -.in +4n -.EX - sa_family_t ss_family; -.EE -.in -.P -The -.I sockaddr_storage -structure is useful in programs that must handle socket addresses -in a generic way -(e.g., programs that must deal with both IPv4 and IPv6 socket addresses). -.SS Socket options -The socket options listed below can be set by using -.BR setsockopt (2) -and read with -.BR getsockopt (2) -with the socket level set to -.B SOL_SOCKET -for all sockets. -Unless otherwise noted, -.I optval -is a pointer to an -.IR int . -.\" FIXME . 
-.\" In the list below, the text used to describe argument types -.\" for each socket option should be more consistent -.\" -.\" SO_ACCEPTCONN is in POSIX.1-2001, and its origin is explained in -.\" W R Stevens, UNPv1 -.TP -.B SO_ACCEPTCONN -Returns a value indicating whether or not this socket has been marked -to accept connections with -.BR listen (2). -The value 0 indicates that this is not a listening socket, -the value 1 indicates that this is a listening socket. -This socket option is read-only. -.TP -.BR SO_ATTACH_FILTER " (since Linux 2.2)" -.TQ -.BR SO_ATTACH_BPF " (since Linux 3.19)" -Attach a classic BPF -.RB ( SO_ATTACH_FILTER ) -or an extended BPF -.RB ( SO_ATTACH_BPF ) -program to the socket for use as a filter of incoming packets. -A packet will be dropped if the filter program returns zero. -If the filter program returns a -nonzero value which is less than the packet's data length, -the packet will be truncated to the length returned. -If the value returned by the filter is greater than or equal to the -packet's data length, the packet is allowed to proceed unmodified. -.IP -The argument for -.B SO_ATTACH_FILTER -is a -.I sock_fprog -structure, defined in -.IR <linux/filter.h> : -.IP -.in +4n -.EX -struct sock_fprog { - unsigned short len; - struct sock_filter *filter; -}; -.EE -.in -.IP -The argument for -.B SO_ATTACH_BPF -is a file descriptor returned by the -.BR bpf (2) -system call and must refer to a program of type -.BR BPF_PROG_TYPE_SOCKET_FILTER . -.IP -These options may be set multiple times for a given socket, -each time replacing the previous filter program. -The classic and extended versions may be called on the same socket, -but the previous filter will always be replaced such that a socket -never has more than one filter defined. 
-.IP -Both classic and extended BPF are explained in the kernel source file -.I Documentation/networking/filter.txt -.TP -.B SO_ATTACH_REUSEPORT_CBPF -.TQ -.B SO_ATTACH_REUSEPORT_EBPF -For use with the -.B SO_REUSEPORT -option, these options allow the user to set a classic BPF -.RB ( SO_ATTACH_REUSEPORT_CBPF ) -or an extended BPF -.RB ( SO_ATTACH_REUSEPORT_EBPF ) -program which defines how packets are assigned to -the sockets in the reuseport group (that is, all sockets which have -.B SO_REUSEPORT -set and are using the same local address to receive packets). -.IP -The BPF program must return an index between 0 and N\-1 representing -the socket which should receive the packet -(where N is the number of sockets in the group). -If the BPF program returns an invalid index, -socket selection will fall back to the plain -.B SO_REUSEPORT -mechanism. -.IP -Sockets are numbered in the order in which they are added to the group -(that is, the order of -.BR bind (2) -calls for UDP sockets or the order of -.BR listen (2) -calls for TCP sockets). -New sockets added to a reuseport group will inherit the BPF program. -When a socket is removed from a reuseport group (via -.BR close (2)), -the last socket in the group will be moved into the closed socket's -position. -.IP -These options may be set repeatedly at any time on any socket in the group -to replace the current BPF program used by all sockets in the group. -.IP -.B SO_ATTACH_REUSEPORT_CBPF -takes the same argument type as -.B SO_ATTACH_FILTER -and -.B SO_ATTACH_REUSEPORT_EBPF -takes the same argument type as -.BR SO_ATTACH_BPF . -.IP -UDP support for this feature is available since Linux 4.5; -TCP support is available since Linux 4.6. -.TP -.B SO_BINDTODEVICE -Bind this socket to a particular device like \[lq]eth0\[rq], -as specified in the passed interface name. -If the -name is an empty string or the option length is zero, the socket device -binding is removed. 
-The passed option is a variable-length null-terminated -interface name string with the maximum size of -.BR IFNAMSIZ . -If a socket is bound to an interface, -only packets received from that particular interface are processed by the -socket. -Note that this works only for some socket types, particularly -.B AF_INET -sockets. -It is not supported for packet sockets (use normal -.BR bind (2) -there). -.IP -Before Linux 3.8, -this socket option could be set, but could not retrieved with -.BR getsockopt (2). -Since Linux 3.8, it is readable. -The -.I optlen -argument should contain the buffer size available -to receive the device name and is recommended to be -.B IFNAMSIZ -bytes. -The real device name length is reported back in the -.I optlen -argument. -.TP -.B SO_BROADCAST -Set or get the broadcast flag. -When enabled, datagram sockets are allowed to send -packets to a broadcast address. -This option has no effect on stream-oriented sockets. -.TP -.B SO_BSDCOMPAT -Enable BSD bug-to-bug compatibility. -This is used by the UDP protocol module in Linux 2.0 and 2.2. -If enabled, ICMP errors received for a UDP socket will not be passed -to the user program. -In later kernel versions, support for this option has been phased out: -Linux 2.4 silently ignores it, and Linux 2.6 generates a kernel warning -(printk()) if a program uses this option. -Linux 2.0 also enabled BSD bug-to-bug compatibility -options (random header changing, skipping of the broadcast flag) for raw -sockets with this option, but that was removed in Linux 2.2. -.TP -.B SO_DEBUG -Enable socket debugging. -Allowed only for processes with the -.B CAP_NET_ADMIN -capability or an effective user ID of 0. -.TP -.BR SO_DETACH_FILTER " (since Linux 2.2)" -.TQ -.BR SO_DETACH_BPF " (since Linux 3.19)" -These two options, which are synonyms, -may be used to remove the classic or extended BPF -program attached to a socket with either -.B SO_ATTACH_FILTER -or -.BR SO_ATTACH_BPF . -The option value is ignored. 
-.TP -.BR SO_DOMAIN " (since Linux 2.6.32)" -Retrieves the socket domain as an integer, returning a value such as -.BR AF_INET6 . -See -.BR socket (2) -for details. -This socket option is read-only. -.TP -.B SO_ERROR -Get and clear the pending socket error. -This socket option is read-only. -Expects an integer. -.TP -.B SO_DONTROUTE -Don't send via a gateway, send only to directly connected hosts. -The same effect can be achieved by setting the -.B MSG_DONTROUTE -flag on a socket -.BR send (2) -operation. -Expects an integer boolean flag. -.TP -.BR SO_INCOMING_CPU " (gettable since Linux 3.19, settable since Linux 4.4)" -.\" getsockopt 2c8c56e15df3d4c2af3d656e44feb18789f75837 -.\" setsockopt 70da268b569d32a9fddeea85dc18043de9d89f89 -Sets or gets the CPU affinity of a socket. -Expects an integer flag. -.IP -.in +4n -.EX -int cpu = 1; -setsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, - sizeof(cpu)); -.EE -.in -.IP -Because all of the packets for a single stream -(i.e., all packets for the same 4-tuple) -arrive on the single RX queue that is associated with a particular CPU, -the typical use case is to employ one listening process per RX queue, -with the incoming flow being handled by a listener -on the same CPU that is handling the RX queue. -This provides optimal NUMA behavior and keeps CPU caches hot. -.\" -.\" From an email conversation with Eric Dumazet: -.\" >> Note that setting the option is not supported if SO_REUSEPORT is used. -.\" > -.\" > Please define "not supported". Does this yield an API diagnostic? -.\" > If so, what is it? -.\" > -.\" >> Socket will be selected from an array, either by a hash or BPF program -.\" >> that has no access to this information. -.\" > -.\" > Sorry -- I'm lost here. How does this comment relate to the proposed -.\" > man page text above? 
-.\" -.\" Simply that : -.\" -.\" If an application uses both SO_INCOMING_CPU and SO_REUSEPORT, then -.\" SO_REUSEPORT logic, selecting the socket to receive the packet, ignores -.\" SO_INCOMING_CPU setting. -.TP -.BR SO_INCOMING_NAPI_ID " (gettable since Linux 4.12)" -.\" getsockopt 6d4339028b350efbf87c61e6d9e113e5373545c9 -Returns a system-level unique ID called NAPI ID that is associated -with a RX queue on which the last packet associated with that -socket is received. -.IP -This can be used by an application to split the incoming flows among worker -threads based on the RX queue on which the packets associated with the -flows are received. -It allows each worker thread to be associated with -a NIC HW receive queue and service all the connection -requests received on that RX queue. -This mapping between an app thread and -a HW NIC queue streamlines the -flow of data from the NIC to the application. -.TP -.B SO_KEEPALIVE -Enable sending of keep-alive messages on connection-oriented sockets. -Expects an integer boolean flag. -.TP -.B SO_LINGER -Sets or gets the -.B SO_LINGER -option. -The argument is a -.I linger -structure. -.IP -.in +4n -.EX -struct linger { - int l_onoff; /* linger active */ - int l_linger; /* how many seconds to linger for */ -}; -.EE -.in -.IP -When enabled, a -.BR close (2) -or -.BR shutdown (2) -will not return until all queued messages for the socket have been -successfully sent or the linger timeout has been reached. -Otherwise, -the call returns immediately and the closing is done in the background. -When the socket is closed as part of -.BR exit (2), -it always lingers in the background. -.TP -.B SO_LOCK_FILTER -.\" commit d59577b6ffd313d0ab3be39cb1ab47e29bdc9182 -When set, this option will prevent -changing the filters associated with the socket. -These filters include any set using the socket options -.BR SO_ATTACH_FILTER , -.BR SO_ATTACH_BPF , -.BR SO_ATTACH_REUSEPORT_CBPF , -and -.BR SO_ATTACH_REUSEPORT_EBPF . 
-.IP -The typical use case is for a privileged process to set up a raw socket -(an operation that requires the -.B CAP_NET_RAW -capability), apply a restrictive filter, set the -.B SO_LOCK_FILTER -option, -and then either drop its privileges or pass the socket file descriptor -to an unprivileged process via a UNIX domain socket. -.IP -Once the -.B SO_LOCK_FILTER -option has been enabled, attempts to change or remove the filter -attached to a socket, or to disable the -.B SO_LOCK_FILTER -option will fail with the error -.BR EPERM . -.TP -.BR SO_MARK " (since Linux 2.6.25)" -.\" commit 4a19ec5800fc3bb64e2d87c4d9fdd9e636086fe0 -.\" and 914a9ab386a288d0f22252fc268ecbc048cdcbd5 -Set the mark for each packet sent through this socket -(similar to the netfilter MARK target but socket-based). -Changing the mark can be used for mark-based -routing without netfilter or for packet filtering. -Setting this option requires the -.B CAP_NET_ADMIN -or -.B CAP_NET_RAW -(since Linux 5.17) capability. -.TP -.B SO_OOBINLINE -If this option is enabled, -out-of-band data is directly placed into the receive data stream. -Otherwise, out-of-band data is passed only when the -.B MSG_OOB -flag is set during receiving. -.\" don't document it because it can do too much harm. -.\".B SO_NO_CHECK -.\" The kernel has support for the SO_NO_CHECK socket -.\" option (boolean: 0 == default, calculate checksum on xmit, -.\" 1 == do not calculate checksum on xmit). -.\" Additional note from Andi Kleen on SO_NO_CHECK (2010-08-30) -.\" On Linux UDP checksums are essentially free and there's no reason -.\" to turn them off and it would disable another safety line. -.\" That is why I didn't document the option. -.TP -.B SO_PASSCRED -Enable or disable the receiving of the -.B SCM_CREDENTIALS -control message. -For more information, see -.BR unix (7). -.TP -.B SO_PASSSEC -Enable or disable the receiving of the -.B SCM_SECURITY -control message. -For more information, see -.BR unix (7). 
-.TP -.BR SO_PEEK_OFF " (since Linux 3.4)" -.\" commit ef64a54f6e558155b4f149bb10666b9e914b6c54 -This option, which is currently supported only for -.BR unix (7) -sockets, sets the value of the "peek offset" for the -.BR recv (2) -system call when used with -.B MSG_PEEK -flag. -.IP -When this option is set to a negative value -(it is set to \-1 for all new sockets), -traditional behavior is provided: -.BR recv (2) -with the -.B MSG_PEEK -flag will peek data from the front of the queue. -.IP -When the option is set to a value greater than or equal to zero, -then the next peek at data queued in the socket will occur at -the byte offset specified by the option value. -At the same time, the "peek offset" will be -incremented by the number of bytes that were peeked from the queue, -so that a subsequent peek will return the next data in the queue. -.IP -If data is removed from the front of the queue via a call to -.BR recv (2) -(or similar) without the -.B MSG_PEEK -flag, the "peek offset" will be decreased by the number of bytes removed. -In other words, receiving data without the -.B MSG_PEEK -flag will cause the "peek offset" to be adjusted to maintain -the correct relative position in the queued data, -so that a subsequent peek will retrieve the data that would have been -retrieved had the data not been removed. -.IP -For datagram sockets, if the "peek offset" points to the middle of a packet, -the data returned will be marked with the -.B MSG_TRUNC -flag. -.IP -The following example serves to illustrate the use of -.BR SO_PEEK_OFF . 
-Suppose a stream socket has the following queued input data: -.IP -.in +4n -.EX -aabbccddeeff -.EE -.in -.IP -The following sequence of -.BR recv (2) -calls would have the effect noted in the comments: -.IP -.in +4n -.EX -int ov = 4; // Set peek offset to 4 -setsockopt(fd, SOL_SOCKET, SO_PEEK_OFF, &ov, sizeof(ov)); -\& -recv(fd, buf, 2, MSG_PEEK); // Peeks "cc"; offset set to 6 -recv(fd, buf, 2, MSG_PEEK); // Peeks "dd"; offset set to 8 -recv(fd, buf, 2, 0); // Reads "aa"; offset set to 6 -recv(fd, buf, 2, MSG_PEEK); // Peeks "ee"; offset set to 8 -.EE -.in -.TP -.B SO_PEERCRED -Return the credentials of the peer process connected to this socket. -For further details, see -.BR unix (7). -.TP -.BR SO_PEERSEC " (since Linux 2.6.2)" -Return the security context of the peer socket connected to this socket. -For further details, see -.BR unix (7) -and -.BR ip (7). -.TP -.B SO_PRIORITY -Set the protocol-defined priority for all packets to be sent on -this socket. -Linux uses this value to order the networking queues: -packets with a higher priority may be processed first depending -on the selected device queueing discipline. -.\" For -.\" .BR ip (7), -.\" this also sets the IP type-of-service (TOS) field for outgoing packets. -Setting a priority outside the range 0 to 6 requires the -.B CAP_NET_ADMIN -capability. -.TP -.BR SO_PROTOCOL " (since Linux 2.6.32)" -Retrieves the socket protocol as an integer, returning a value such as -.BR IPPROTO_SCTP . -See -.BR socket (2) -for details. -This socket option is read-only. -.TP -.B SO_RCVBUF -Sets or gets the maximum socket receive buffer in bytes. -The kernel doubles this value (to allow space for bookkeeping overhead) -when it is set using -.\" Most (all?) other implementations do not do this -- MTK, Dec 05 -.BR setsockopt (2), -and this doubled value is returned by -.BR getsockopt (2). 
-.\" The following thread on LMKL is quite informative: -.\" getsockopt/setsockopt with SO_RCVBUF and SO_SNDBUF "non-standard" behavior -.\" 17 July 2012 -.\" http://thread.gmane.org/gmane.linux.kernel/1328935 -The default value is set by the -.I /proc/sys/net/core/rmem_default -file, and the maximum allowed value is set by the -.I /proc/sys/net/core/rmem_max -file. -The minimum (doubled) value for this option is 256. -.TP -.BR SO_RCVBUFFORCE " (since Linux 2.6.14)" -Using this socket option, a privileged -.RB ( CAP_NET_ADMIN ) -process can perform the same task as -.BR SO_RCVBUF , -but the -.I rmem_max -limit can be overridden. -.TP -.BR SO_RCVLOWAT " and " SO_SNDLOWAT -Specify the minimum number of bytes in the buffer until the socket layer -will pass the data to the protocol -.RB ( SO_SNDLOWAT ) -or the user on receiving -.RB ( SO_RCVLOWAT ). -These two values are initialized to 1. -.B SO_SNDLOWAT -is not changeable on Linux -.RB ( setsockopt (2) -fails with the error -.BR ENOPROTOOPT ). -.B SO_RCVLOWAT -is changeable -only since Linux 2.4. -.IP -Before Linux 2.6.28 -.\" Tested on kernel 2.6.14 -- mtk, 30 Nov 05 -.BR select (2), -.BR poll (2), -and -.BR epoll (7) -did not respect the -.B SO_RCVLOWAT -setting on Linux, -and indicated a socket as readable when even a single byte of data -was available. -A subsequent read from the socket would then block until -.B SO_RCVLOWAT -bytes are available. -Since Linux 2.6.28, -.\" commit c7004482e8dcb7c3c72666395cfa98a216a4fb70 -.BR select (2), -.BR poll (2), -and -.BR epoll (7) -indicate a socket as readable only if at least -.B SO_RCVLOWAT -bytes are available. -.TP -.BR SO_RCVTIMEO " and " SO_SNDTIMEO -.\" Not implemented in Linux 2.0. -.\" Implemented in Linux 2.1.11 for getsockopt: always return a zero struct. -.\" Implemented in Linux 2.3.41 for setsockopt, and actually used. -Specify the receiving or sending timeouts until reporting an error. -The argument is a -.IR "struct timeval" . 
-If an input or output function blocks for this period of time, and -data has been sent or received, the return value of that function -will be the amount of data transferred; if no data has been transferred -and the timeout has been reached, then \-1 is returned with -.I errno -set to -.B EAGAIN -or -.BR EWOULDBLOCK , -.\" in fact to EAGAIN -or -.B EINPROGRESS -(for -.BR connect (2)) -just as if the socket was specified to be nonblocking. -If the timeout is set to zero (the default), -then the operation will never timeout. -Timeouts only have effect for system calls that perform socket I/O (e.g., -.BR accept (2), -.BR connect (2), -.BR read (2), -.BR recvmsg (2), -.BR send (2), -.BR sendmsg (2)); -timeouts have no effect for -.BR select (2), -.BR poll (2), -.BR epoll_wait (2), -and so on. -.TP -.B SO_REUSEADDR -.\" commit c617f398edd4db2b8567a28e899a88f8f574798d -.\" https://lwn.net/Articles/542629/ -Indicates that the rules used in validating addresses supplied in a -.BR bind (2) -call should allow reuse of local addresses. -For -.B AF_INET -sockets this -means that a socket may bind, except when there -is an active listening socket bound to the address. -When the listening socket is bound to -.B INADDR_ANY -with a specific port then it is not possible -to bind to this port for any local address. -Argument is an integer boolean flag. -.TP -.BR SO_REUSEPORT " (since Linux 3.9)" -Permits multiple -.B AF_INET -or -.B AF_INET6 -sockets to be bound to an identical socket address. -This option must be set on each socket (including the first socket) -prior to calling -.BR bind (2) -on the socket. -To prevent port hijacking, -all of the processes binding to the same address must have the same -effective UID. -This option can be employed with both TCP and UDP sockets. -.IP -For TCP sockets, this option allows -.BR accept (2) -load distribution in a multi-threaded server to be improved by -using a distinct listener socket for each thread. 
-This provides improved load distribution as compared -to traditional techniques such using a single -.BR accept (2)ing -thread that distributes connections, -or having multiple threads that compete to -.BR accept (2) -from the same socket. -.IP -For UDP sockets, -the use of this option can provide better distribution -of incoming datagrams to multiple processes (or threads) as compared -to the traditional technique of having multiple processes -compete to receive datagrams on the same socket. -.TP -.BR SO_RXQ_OVFL " (since Linux 2.6.33)" -.\" commit 3b885787ea4112eaa80945999ea0901bf742707f -Indicates that an unsigned 32-bit value ancillary message (cmsg) -should be attached to received skbs indicating -the number of packets dropped by the socket since its creation. -.TP -.BR SO_SELECT_ERR_QUEUE " (since Linux 3.10)" -.\" commit 7d4c04fc170087119727119074e72445f2bb192b -.\" Author: Keller, Jacob E <jacob.e.keller@intel.com> -When this option is set on a socket, -an error condition on a socket causes notification not only via the -.I exceptfds -set of -.BR select (2). -Similarly, -.BR poll (2) -also returns a -.B POLLPRI -whenever an -.B POLLERR -event is returned. -.\" It does not affect wake up. -.IP -Background: this option was added when waking up on an error condition -occurred only via the -.I readfds -and -.I writefds -sets of -.BR select (2). -The option was added to allow monitoring for error conditions via the -.I exceptfds -argument without simultaneously having to receive notifications (via -.IR readfds ) -for regular data that can be read from the socket. -After changes in Linux 4.16, -.\" commit 6e5d58fdc9bedd0255a8 -.\" ("skbuff: Fix not waking applications when errors are enqueued") -the use of this flag to achieve the desired notifications -is no longer necessary. -This option is nevertheless retained for backwards compatibility. -.TP -.B SO_SNDBUF -Sets or gets the maximum socket send buffer in bytes. 
-The kernel doubles this value (to allow space for bookkeeping overhead) -when it is set using -.\" Most (all?) other implementations do not do this -- MTK, Dec 05 -.\" See also the comment to SO_RCVBUF (17 Jul 2012 LKML mail) -.BR setsockopt (2), -and this doubled value is returned by -.BR getsockopt (2). -The default value is set by the -.I /proc/sys/net/core/wmem_default -file and the maximum allowed value is set by the -.I /proc/sys/net/core/wmem_max -file. -The minimum (doubled) value for this option is 2048. -.TP -.BR SO_SNDBUFFORCE " (since Linux 2.6.14)" -Using this socket option, a privileged -.RB ( CAP_NET_ADMIN ) -process can perform the same task as -.BR SO_SNDBUF , -but the -.I wmem_max -limit can be overridden. -.TP -.B SO_TIMESTAMP -Enable or disable the receiving of the -.B SO_TIMESTAMP -control message. -The timestamp control message is sent with level -.B SOL_SOCKET -and a -.I cmsg_type -of -.BR SCM_TIMESTAMP . -The -.I cmsg_data -field is a -.I "struct timeval" -indicating the -reception time of the last packet passed to the user in this call. -See -.BR cmsg (3) -for details on control messages. -.TP -.BR SO_TIMESTAMPNS " (since Linux 2.6.22)" -.\" commit 92f37fd2ee805aa77925c1e64fd56088b46094fc -Enable or disable the receiving of the -.B SO_TIMESTAMPNS -control message. -The timestamp control message is sent with level -.B SOL_SOCKET -and a -.I cmsg_type -of -.BR SCM_TIMESTAMPNS . -The -.I cmsg_data -field is a -.I "struct timespec" -indicating the -reception time of the last packet passed to the user in this call. -The clock used for the timestamp is -.BR CLOCK_REALTIME . -See -.BR cmsg (3) -for details on control messages. -.IP -A socket cannot mix -.B SO_TIMESTAMP -and -.BR SO_TIMESTAMPNS : -the two modes are mutually exclusive. -.TP -.B SO_TYPE -Gets the socket type as an integer (e.g., -.BR SOCK_STREAM ). -This socket option is read-only. 
-.TP -.BR SO_BUSY_POLL " (since Linux 3.11)" -Sets the approximate time in microseconds to busy poll on a blocking receive -when there is no data. -Increasing this value requires -.BR CAP_NET_ADMIN . -The default for this option is controlled by the -.I /proc/sys/net/core/busy_read -file. -.IP -The value in the -.I /proc/sys/net/core/busy_poll -file determines how long -.BR select (2) -and -.BR poll (2) -will busy poll when they operate on sockets with -.B SO_BUSY_POLL -set and no events to report are found. -.IP -In both cases, -busy polling will only be done when the socket last received data -from a network device that supports this option. -.IP -While busy polling may improve latency of some applications, -care must be taken when using it since this will increase -both CPU utilization and power usage. -.SS Signals -When writing onto a connection-oriented socket that has been shut down -(by the local or the remote end) -.B SIGPIPE -is sent to the writing process and -.B EPIPE -is returned. -The signal is not sent when the write call -specified the -.B MSG_NOSIGNAL -flag. -.P -When requested with the -.B FIOSETOWN -.BR fcntl (2) -or -.B SIOCSPGRP -.BR ioctl (2), -.B SIGIO -is sent when an I/O event occurs. -It is possible to use -.BR poll (2) -or -.BR select (2) -in the signal handler to find out which socket the event occurred on. -An alternative (in Linux 2.2) is to set a real-time signal using the -.B F_SETSIG -.BR fcntl (2); -the handler of the real time signal will be called with -the file descriptor in the -.I si_fd -field of its -.IR siginfo_t . -See -.BR fcntl (2) -for more information. -.P -Under some circumstances (e.g., multiple processes accessing a -single socket), the condition that caused the -.B SIGIO -may have already disappeared when the process reacts to the signal. -If this happens, the process should wait again because Linux -will resend the signal later. 
-.\" .SS Ancillary messages
-.SS /proc interfaces
-The core socket networking parameters can be accessed
-via files in the directory
-.IR /proc/sys/net/core/ .
-.TP
-.I rmem_default
-contains the default setting in bytes of the socket receive buffer.
-.TP
-.I rmem_max
-contains the maximum socket receive buffer size in bytes which a user may
-set by using the
-.B SO_RCVBUF
-socket option.
-.TP
-.I wmem_default
-contains the default setting in bytes of the socket send buffer.
-.TP
-.I wmem_max
-contains the maximum socket send buffer size in bytes which a user may
-set by using the
-.B SO_SNDBUF
-socket option.
-.TP
-.IR message_cost " and " message_burst
-configure the token bucket filter used to load limit warning messages
-caused by external network events.
-.TP
-.I netdev_max_backlog
-Maximum number of packets in the global input queue.
-.TP
-.I optmem_max
-Maximum length of ancillary data and user control data like the iovecs
-per socket.
-.\" netdev_fastroute is not documented because it is experimental
-.SS Ioctls
-These operations can be accessed using
-.BR ioctl (2):
-.P
-.in +4n
-.EX
-.IB error " = ioctl(" ip_socket ", " ioctl_type ", " &value_result ");"
-.EE
-.in
-.TP
-.B SIOCGSTAMP
-Return a
-.I struct timeval
-with the receive timestamp of the last packet passed to the user.
-This is useful for accurate round trip time measurements.
-See
-.BR setitimer (2)
-for a description of
-.IR "struct timeval" .
-.\"
-This ioctl should be used only if the socket options
-.B SO_TIMESTAMP
-and
-.B SO_TIMESTAMPNS
-are not set on the socket.
-Otherwise, it returns the timestamp of the
-last packet that was received while
-.B SO_TIMESTAMP
-and
-.B SO_TIMESTAMPNS
-were not set, or it fails if no such packet has been received,
-(i.e.,
-.BR ioctl (2)
-returns \-1 with
-.I errno
-set to
-.BR ENOENT ).
-.TP
-.B SIOCSPGRP
-Set the process or process group that is to receive
-.B SIGIO
-or
-.B SIGURG
-signals when I/O becomes possible or urgent data is available.
-The argument is a pointer to a
-.IR pid_t .
-For further details, see the description of
-.B F_SETOWN
-in
-.BR fcntl (2).
-.TP
-.B FIOASYNC
-Change the
-.B O_ASYNC
-flag to enable or disable asynchronous I/O mode of the socket.
-Asynchronous I/O mode means that the
-.B SIGIO
-signal or the signal set with
-.B F_SETSIG
-is raised when a new I/O event occurs.
-.IP
-Argument is an integer boolean flag.
-(This operation is synonymous with the use of
-.BR fcntl (2)
-to set the
-.B O_ASYNC
-flag.)
-.\"
-.TP
-.B SIOCGPGRP
-Get the current process or process group that receives
-.B SIGIO
-or
-.B SIGURG
-signals,
-or 0
-when none is set.
-.P
-Valid
-.BR fcntl (2)
-operations:
-.TP
-.B FIOGETOWN
-The same as the
-.B SIOCGPGRP
-.BR ioctl (2).
-.TP
-.B FIOSETOWN
-The same as the
-.B SIOCSPGRP
-.BR ioctl (2).
-.SH VERSIONS
-.B SO_BINDTODEVICE
-was introduced in Linux 2.0.30.
-.B SO_PASSCRED
-is new in Linux 2.2.
-The
-.I /proc
-interfaces were introduced in Linux 2.2.
-.B SO_RCVTIMEO
-and
-.B SO_SNDTIMEO
-are supported since Linux 2.3.41.
-Earlier, timeouts were fixed to
-a protocol-specific setting, and could not be read or written.
-.SH NOTES
-Linux assumes that half of the send/receive buffer is used for internal
-kernel structures; thus the values in the corresponding
-.I /proc
-files are twice what can be observed on the wire.
-.P
-Linux will allow port reuse only with the
-.B SO_REUSEADDR
-option
-when this option was set both in the previous program that performed a
-.BR bind (2)
-to the port and in the program that wants to reuse the port.
-This differs from some implementations (e.g., FreeBSD)
-where only the later program needs to set the
-.B SO_REUSEADDR
-option.
-Typically this difference is invisible, since, for example, a server
-program is designed to always set this option.
-.\" .SH AUTHORS
-.\" This man page was written by Andi Kleen.
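Reviewer sketch (not part of the deleted file): the NOTES paragraph on SO_REUSEADDR implies that a server should set the option on every listening socket before calling bind(2), so that both the earlier and the later binder have it set. The following C fragment is one way this might look. The helper name `listen_reusable`, the IPv4 wildcard address, and the backlog of 16 are assumptions chosen for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP listener on the IPv4 wildcard
 * address with SO_REUSEADDR set before bind(2). The port is given in
 * host byte order; 0 lets the kernel pick a free port.
 * Returns the listening socket, or -1 on error. */
int listen_reusable(uint16_t port)
{
    struct sockaddr_in addr;
    int on = 1;
    int fd;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    /* Must happen before bind(2) to affect the address check. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == -1)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        goto fail;
    if (listen(fd, 16) == -1)
        goto fail;
    return fd;

fail:
    close(fd);
    return -1;
}
```

A caller would use `listen_reusable(8080)` (the port number is arbitrary) and then accept(2) on the returned descriptor. Setting the option unconditionally is the habit the NOTES text alludes to when it says the Linux/FreeBSD difference is typically invisible.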
-.SH SEE ALSO
-.BR wireshark (1),
-.BR bpf (2),
-.BR connect (2),
-.BR getsockopt (2),
-.BR setsockopt (2),
-.BR socket (2),
-.BR pcap (3),
-.BR address_families (7),
-.BR capabilities (7),
-.BR ddp (7),
-.BR ip (7),
-.BR ipv6 (7),
-.BR packet (7),
-.BR tcp (7),
-.BR udp (7),
-.BR unix (7),
-.BR tcpdump (8)