Diffstat (limited to 'man7/bpf-helpers.7')
-rw-r--r-- | man7/bpf-helpers.7 | 5171 |
1 files changed, 0 insertions, 5171 deletions
diff --git a/man7/bpf-helpers.7 b/man7/bpf-helpers.7 deleted file mode 100644 index b4236f1..0000000 --- a/man7/bpf-helpers.7 +++ /dev/null @@ -1,5171 +0,0 @@ -.\" Man page generated from reStructuredText. -. -. -.nr rst2man-indent-level 0 -. -.de1 rstReportMargin -\\$1 \\n[an-margin] -level \\n[rst2man-indent-level] -level margin: \\n[rst2man-indent\\n[rst2man-indent-level]] -- -\\n[rst2man-indent0] -\\n[rst2man-indent1] -\\n[rst2man-indent2] -.. -.de1 INDENT -.\" .rstReportMargin pre: -. RS \\$1 -. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin] -. nr rst2man-indent-level +1 -.\" .rstReportMargin post: -.. -.de UNINDENT -. RE -.\" indent \\n[an-margin] -.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]] -.nr rst2man-indent-level -1 -.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]] -.in \\n[rst2man-indent\\n[rst2man-indent-level]]u -.. -.TH "BPF-HELPERS" 7 "2023-11-10" "Linux v6.8" -.SH NAME -BPF-HELPERS \- list of eBPF helper functions -.\" Copyright (C) All BPF authors and contributors from 2014 to present. -. -.\" See git log include/uapi/linux/bpf.h in kernel tree for details. -. -.\" -. -.\" SPDX-License-Identifier: Linux-man-pages-copyleft -. -.\" -. -.\" Please do not edit this file. It was generated from the documentation -. -.\" located in file include/uapi/linux/bpf.h of the Linux kernel sources -. -.\" (helpers description), and from scripts/bpf_doc.py in the same -. -.\" repository (header and footer). -. -.SH DESCRIPTION -.sp -The extended Berkeley Packet Filter (eBPF) subsystem consists in programs -written in a pseudo\-assembly language, then attached to one of the several -kernel hooks and run in reaction of specific events. This framework differs -from the older, \(dqclassic\(dq BPF (or \(dqcBPF\(dq) in several aspects, one of them being -the ability to call special functions (or \(dqhelpers\(dq) from within a program. -These functions are restricted to a white\-list of helpers defined in the -kernel. 
-.sp -These helpers are used by eBPF programs to interact with the system, or with -the context in which they work. For instance, they can be used to print -debugging messages, to get the time since the system was booted, to interact -with eBPF maps, or to manipulate network packets. Since there are several eBPF -program types, and that they do not run in the same context, each program type -can only call a subset of those helpers. -.sp -Due to eBPF conventions, a helper can not have more than five arguments. -.sp -Internally, eBPF programs call directly into the compiled helper functions -without requiring any foreign\-function interface. As a result, calling helpers -introduces no overhead, thus offering excellent performance. -.sp -This document is an attempt to list and document the helpers available to eBPF -developers. They are sorted by chronological order (the oldest helpers in the -kernel at the top). -.SH HELPERS -.INDENT 0.0 -.TP -.B \fBvoid *bpf_map_lookup_elem(struct bpf_map *\fP\fImap\fP\fB, const void *\fP\fIkey\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Perform a lookup in \fImap\fP for an entry associated to \fIkey\fP\&. -.TP -.B Return -Map value associated to \fIkey\fP, or \fBNULL\fP if no entry was -found. -.UNINDENT -.TP -.B \fBlong bpf_map_update_elem(struct bpf_map *\fP\fImap\fP\fB, const void *\fP\fIkey\fP\fB, const void *\fP\fIvalue\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Add or update the value of the entry associated to \fIkey\fP in -\fImap\fP with \fIvalue\fP\&. \fIflags\fP is one of: -.INDENT 7.0 -.TP -.B \fBBPF_NOEXIST\fP -The entry for \fIkey\fP must not exist in the map. -.TP -.B \fBBPF_EXIST\fP -The entry for \fIkey\fP must already exist in the map. -.TP -.B \fBBPF_ANY\fP -No condition on the existence of the entry for \fIkey\fP\&. 
-.UNINDENT -.sp -Flag value \fBBPF_NOEXIST\fP cannot be used for maps of types -\fBBPF_MAP_TYPE_ARRAY\fP or \fBBPF_MAP_TYPE_PERCPU_ARRAY\fP (all -elements always exist), the helper would return an error. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_map_delete_elem(struct bpf_map *\fP\fImap\fP\fB, const void *\fP\fIkey\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Delete entry with \fIkey\fP from \fImap\fP\&. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_probe_read(void *\fP\fIdst\fP\fB, u32\fP \fIsize\fP\fB, const void *\fP\fIunsafe_ptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -For tracing programs, safely attempt to read \fIsize\fP bytes from -kernel space address \fIunsafe_ptr\fP and store the data in \fIdst\fP\&. -.sp -Generally, use \fBbpf_probe_read_user\fP() or -\fBbpf_probe_read_kernel\fP() instead. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBu64 bpf_ktime_get_ns(void)\fP -.INDENT 7.0 -.TP -.B Description -Return the time elapsed since system boot, in nanoseconds. -Does not include time the system was suspended. -See: \fBclock_gettime\fP(\fBCLOCK_MONOTONIC\fP) -.TP -.B Return -Current \fIktime\fP\&. -.UNINDENT -.TP -.B \fBlong bpf_trace_printk(const char *\fP\fIfmt\fP\fB, u32\fP \fIfmt_size\fP\fB, ...)\fP -.INDENT 7.0 -.TP -.B Description -This helper is a \(dqprintk()\-like\(dq facility for debugging. It -prints a message defined by format \fIfmt\fP (of size \fIfmt_size\fP) -to file \fI/sys/kernel/tracing/trace\fP from TraceFS, if -available. It can take up to three additional \fBu64\fP -arguments (as an eBPF helpers, the total number of arguments is -limited to five). -.sp -Each time the helper is called, it appends a line to the trace. -Lines are discarded while \fI/sys/kernel/tracing/trace\fP is -open, use \fI/sys/kernel/tracing/trace_pipe\fP to avoid this. 
-The format of the trace is customizable, and the exact output -one will get depends on the options set in -\fI/sys/kernel/tracing/trace_options\fP (see also the -\fIREADME\fP file under the same directory). However, it usually -defaults to something like: -.INDENT 7.0 -.INDENT 3.5 -.sp -.EX -telnet\-470 [001] .N.. 419421.045894: 0x00000001: <formatted msg> -.EE -.UNINDENT -.UNINDENT -.sp -In the above: -.INDENT 7.0 -.INDENT 3.5 -.INDENT 0.0 -.IP \(bu 2 -\fBtelnet\fP is the name of the current task. -.IP \(bu 2 -\fB470\fP is the PID of the current task. -.IP \(bu 2 -\fB001\fP is the CPU number on which the task is -running. -.IP \(bu 2 -In \fB\&.N..\fP, each character refers to a set of -options (whether irqs are enabled, scheduling -options, whether hard/softirqs are running, level of -preempt_disabled respectively). \fBN\fP means that -\fBTIF_NEED_RESCHED\fP and \fBPREEMPT_NEED_RESCHED\fP -are set. -.IP \(bu 2 -\fB419421.045894\fP is a timestamp. -.IP \(bu 2 -\fB0x00000001\fP is a fake value used by BPF for the -instruction pointer register. -.IP \(bu 2 -\fB<formatted msg>\fP is the message formatted with -\fIfmt\fP\&. -.UNINDENT -.UNINDENT -.UNINDENT -.sp -The conversion specifiers supported by \fIfmt\fP are similar, but -more limited than for printk(). They are \fB%d\fP, \fB%i\fP, -\fB%u\fP, \fB%x\fP, \fB%ld\fP, \fB%li\fP, \fB%lu\fP, \fB%lx\fP, \fB%lld\fP, -\fB%lli\fP, \fB%llu\fP, \fB%llx\fP, \fB%p\fP, \fB%s\fP\&. No modifier (size -of field, padding with zeroes, etc.) is available, and the -helper will return \fB\-EINVAL\fP (but print nothing) if it -encounters an unknown specifier. -.sp -Also, note that \fBbpf_trace_printk\fP() is slow, and should -only be used for debugging purposes. For this reason, a notice -block (spanning several lines) is printed to kernel logs and -states that the helper should not be used \(dqfor production use\(dq -the first time this helper is used (or more precisely, when -\fBtrace_printk\fP() buffers are allocated). 
For passing values -to user space, perf events should be preferred. -.TP -.B Return -The number of bytes written to the buffer, or a negative error -in case of failure. -.UNINDENT -.TP -.B \fBu32 bpf_get_prandom_u32(void)\fP -.INDENT 7.0 -.TP -.B Description -Get a pseudo\-random number. -.sp -From a security point of view, this helper uses its own -pseudo\-random internal state, and cannot be used to infer the -seed of other random functions in the kernel. However, it is -essential to note that the generator used by the helper is not -cryptographically secure. -.TP -.B Return -A random 32\-bit unsigned value. -.UNINDENT -.TP -.B \fBu32 bpf_get_smp_processor_id(void)\fP -.INDENT 7.0 -.TP -.B Description -Get the SMP (symmetric multiprocessing) processor id. Note that -all programs run with migration disabled, which means that the -SMP processor id is stable during all the execution of the -program. -.TP -.B Return -The SMP id of the processor running the program. -.UNINDENT -.TP -.B \fBlong bpf_skb_store_bytes(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIoffset\fP\fB, const void *\fP\fIfrom\fP\fB, u32\fP \fIlen\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Store \fIlen\fP bytes from address \fIfrom\fP into the packet -associated to \fIskb\fP, at \fIoffset\fP\&. \fIflags\fP are a combination of -\fBBPF_F_RECOMPUTE_CSUM\fP (automatically recompute the -checksum for the packet after storing the bytes) and -\fBBPF_F_INVALIDATE_HASH\fP (set \fIskb\fP\fB\->hash\fP, \fIskb\fP\fB\->swhash\fP and \fIskb\fP\fB\->l4hash\fP to 0). -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. 
-.UNINDENT -.TP -.B \fBlong bpf_l3_csum_replace(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIoffset\fP\fB, u64\fP \fIfrom\fP\fB, u64\fP \fIto\fP\fB, u64\fP \fIsize\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Recompute the layer 3 (e.g. IP) checksum for the packet -associated to \fIskb\fP\&. Computation is incremental, so the helper -must know the former value of the header field that was -modified (\fIfrom\fP), the new value of this field (\fIto\fP), and the -number of bytes (2 or 4) for this field, stored in \fIsize\fP\&. -Alternatively, it is possible to store the difference between -the previous and the new values of the header field in \fIto\fP, by -setting \fIfrom\fP and \fIsize\fP to 0. For both methods, \fIoffset\fP -indicates the location of the IP checksum within the packet. -.sp -This helper works in combination with \fBbpf_csum_diff\fP(), -which does not update the checksum in\-place, but offers more -flexibility and can handle sizes larger than 2 or 4 for the -checksum to update. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_l4_csum_replace(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIoffset\fP\fB, u64\fP \fIfrom\fP\fB, u64\fP \fIto\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Recompute the layer 4 (e.g. TCP, UDP or ICMP) checksum for the -packet associated to \fIskb\fP\&. Computation is incremental, so the -helper must know the former value of the header field that was -modified (\fIfrom\fP), the new value of this field (\fIto\fP), and the -number of bytes (2 or 4) for this field, stored on the lowest -four bits of \fIflags\fP\&. 
Alternatively, it is possible to store -the difference between the previous and the new values of the -header field in \fIto\fP, by setting \fIfrom\fP and the four lowest -bits of \fIflags\fP to 0. For both methods, \fIoffset\fP indicates the -location of the IP checksum within the packet. In addition to -the size of the field, \fIflags\fP can be added (bitwise OR) actual -flags. With \fBBPF_F_MARK_MANGLED_0\fP, a null checksum is left -untouched (unless \fBBPF_F_MARK_ENFORCE\fP is added as well), and -for updates resulting in a null checksum the value is set to -\fBCSUM_MANGLED_0\fP instead. Flag \fBBPF_F_PSEUDO_HDR\fP indicates -the checksum is to be computed against a pseudo\-header. -.sp -This helper works in combination with \fBbpf_csum_diff\fP(), -which does not update the checksum in\-place, but offers more -flexibility and can handle sizes larger than 2 or 4 for the -checksum to update. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_tail_call(void *\fP\fIctx\fP\fB, struct bpf_map *\fP\fIprog_array_map\fP\fB, u32\fP \fIindex\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This special helper is used to trigger a \(dqtail call\(dq, or in -other words, to jump into another eBPF program. The same stack -frame is used (but values on stack and in registers for the -caller are not accessible to the callee). This mechanism allows -for program chaining, either for raising the maximum number of -available eBPF instructions, or to execute given programs in -conditional blocks. For security reasons, there is an upper -limit to the number of successive tail calls that can be -performed. 
-.sp -Upon call of this helper, the program attempts to jump into a -program referenced at index \fIindex\fP in \fIprog_array_map\fP, a -special map of type \fBBPF_MAP_TYPE_PROG_ARRAY\fP, and passes -\fIctx\fP, a pointer to the context. -.sp -If the call succeeds, the kernel immediately runs the first -instruction of the new program. This is not a function call, -and it never returns to the previous program. If the call -fails, then the helper has no effect, and the caller continues -to run its subsequent instructions. A call can fail if the -destination program for the jump does not exist (i.e. \fIindex\fP -is superior to the number of entries in \fIprog_array_map\fP), or -if the maximum number of tail calls has been reached for this -chain of programs. This limit is defined in the kernel by the -macro \fBMAX_TAIL_CALL_CNT\fP (not accessible to user space), -which is currently set to 33. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_clone_redirect(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIifindex\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Clone and redirect the packet associated to \fIskb\fP to another -net device of index \fIifindex\fP\&. Both ingress and egress -interfaces can be used for redirection. The \fBBPF_F_INGRESS\fP -value in \fIflags\fP is used to make the distinction (ingress path -is selected if the flag is present, egress path otherwise). -This is the only flag supported for now. -.sp -In comparison with \fBbpf_redirect\fP() helper, -\fBbpf_clone_redirect\fP() has the associated cost of -duplicating the packet buffer, but this can be executed out of -the eBPF program. Conversely, \fBbpf_redirect\fP() is more -efficient, but it is handled through an action code where the -redirection happens only after the eBPF program has returned. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. 
Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. Positive -error indicates a potential drop or congestion in the target -device. The particular positive error codes are not defined. -.UNINDENT -.TP -.B \fBu64 bpf_get_current_pid_tgid(void)\fP -.INDENT 7.0 -.TP -.B Description -Get the current pid and tgid. -.TP -.B Return -A 64\-bit integer containing the current tgid and pid, and -created as such: -\fIcurrent_task\fP\fB\->tgid << 32 |\fP -\fIcurrent_task\fP\fB\->pid\fP\&. -.UNINDENT -.TP -.B \fBu64 bpf_get_current_uid_gid(void)\fP -.INDENT 7.0 -.TP -.B Description -Get the current uid and gid. -.TP -.B Return -A 64\-bit integer containing the current GID and UID, and -created as such: \fIcurrent_gid\fP \fB<< 32 |\fP \fIcurrent_uid\fP\&. -.UNINDENT -.TP -.B \fBlong bpf_get_current_comm(void *\fP\fIbuf\fP\fB, u32\fP \fIsize_of_buf\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Copy the \fBcomm\fP attribute of the current task into \fIbuf\fP of -\fIsize_of_buf\fP\&. The \fBcomm\fP attribute contains the name of -the executable (excluding the path) for the current task. The -\fIsize_of_buf\fP must be strictly positive. On success, the -helper makes sure that the \fIbuf\fP is NUL\-terminated. On failure, -it is filled with zeroes. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBu32 bpf_get_cgroup_classid(struct sk_buff *\fP\fIskb\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Retrieve the classid for the current task, i.e. for the net_cls -cgroup to which \fIskb\fP belongs. -.sp -This helper can be used on TC egress path, but not on ingress. -.sp -The net_cls cgroup provides an interface to tag network packets -based on a user\-provided identifier for all traffic coming from -the tasks belonging to the related cgroup. 
See also the related -kernel documentation, available from the Linux sources in file -\fIDocumentation/admin\-guide/cgroup\-v1/net_cls.rst\fP\&. -.sp -The Linux kernel has two versions for cgroups: there are -cgroups v1 and cgroups v2. Both are available to users, who can -use a mixture of them, but note that the net_cls cgroup is for -cgroup v1 only. This makes it incompatible with BPF programs -run on cgroups, which is a cgroup\-v2\-only feature (a socket can -only hold data for one version of cgroups at a time). -.sp -This helper is only available is the kernel was compiled with -the \fBCONFIG_CGROUP_NET_CLASSID\fP configuration option set to -\(dq\fBy\fP\(dq or to \(dq\fBm\fP\(dq. -.TP -.B Return -The classid, or 0 for the default unconfigured classid. -.UNINDENT -.TP -.B \fBlong bpf_skb_vlan_push(struct sk_buff *\fP\fIskb\fP\fB, __be16\fP \fIvlan_proto\fP\fB, u16\fP \fIvlan_tci\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Push a \fIvlan_tci\fP (VLAN tag control information) of protocol -\fIvlan_proto\fP to the packet associated to \fIskb\fP, then update -the checksum. Note that if \fIvlan_proto\fP is different from -\fBETH_P_8021Q\fP and \fBETH_P_8021AD\fP, it is considered to -be \fBETH_P_8021Q\fP\&. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_vlan_pop(struct sk_buff *\fP\fIskb\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Pop a VLAN header from the packet associated to \fIskb\fP\&. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. 
Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_get_tunnel_key(struct sk_buff *\fP\fIskb\fP\fB, struct bpf_tunnel_key *\fP\fIkey\fP\fB, u32\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get tunnel metadata. This helper takes a pointer \fIkey\fP to an -empty \fBstruct bpf_tunnel_key\fP of \fBsize\fP, that will be -filled with tunnel metadata for the packet associated to \fIskb\fP\&. -The \fIflags\fP can be set to \fBBPF_F_TUNINFO_IPV6\fP, which -indicates that the tunnel is based on IPv6 protocol instead of -IPv4. -.sp -The \fBstruct bpf_tunnel_key\fP is an object that generalizes the -principal parameters used by various tunneling protocols into a -single struct. This way, it can be used to easily make a -decision based on the contents of the encapsulation header, -\(dqsummarized\(dq in this struct. In particular, it holds the IP -address of the remote end (IPv4 or IPv6, depending on the case) -in \fIkey\fP\fB\->remote_ipv4\fP or \fIkey\fP\fB\->remote_ipv6\fP\&. Also, -this struct exposes the \fIkey\fP\fB\->tunnel_id\fP, which is -generally mapped to a VNI (Virtual Network Identifier), making -it programmable together with the \fBbpf_skb_set_tunnel_key\fP() helper. 
-.sp -Let\(aqs imagine that the following code is part of a program -attached to the TC ingress interface, on one end of a GRE -tunnel, and is supposed to filter out all messages coming from -remote ends with IPv4 address other than 10.0.0.1: -.INDENT 7.0 -.INDENT 3.5 -.sp -.EX -int ret; -struct bpf_tunnel_key key = {}; - -ret = bpf_skb_get_tunnel_key(skb, &key, sizeof(key), 0); -if (ret < 0) - return TC_ACT_SHOT; // drop packet - -if (key.remote_ipv4 != 0x0a000001) - return TC_ACT_SHOT; // drop packet - -return TC_ACT_OK; // accept packet -.EE -.UNINDENT -.UNINDENT -.sp -This interface can also be used with all encapsulation devices -that can operate in \(dqcollect metadata\(dq mode: instead of having -one network device per specific configuration, the \(dqcollect -metadata\(dq mode only requires a single device where the -configuration can be extracted from this helper. -.sp -This can be used together with various tunnels such as VXLan, -Geneve, GRE or IP in IP (IPIP). -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_set_tunnel_key(struct sk_buff *\fP\fIskb\fP\fB, struct bpf_tunnel_key *\fP\fIkey\fP\fB, u32\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Populate tunnel metadata for packet associated to \fIskb.\fP The -tunnel metadata is set to the contents of \fIkey\fP, of \fIsize\fP\&. The -\fIflags\fP can be set to a combination of the following values: -.INDENT 7.0 -.TP -.B \fBBPF_F_TUNINFO_IPV6\fP -Indicate that the tunnel is based on IPv6 protocol -instead of IPv4. -.TP -.B \fBBPF_F_ZERO_CSUM_TX\fP -For IPv4 packets, add a flag to tunnel metadata -indicating that checksum computation should be skipped -and checksum set to zeroes. -.TP -.B \fBBPF_F_DONT_FRAGMENT\fP -Add a flag to tunnel metadata indicating that the -packet should not be fragmented. 
-.TP -.B \fBBPF_F_SEQ_NUMBER\fP -Add a flag to tunnel metadata indicating that a -sequence number should be added to tunnel header before -sending the packet. This flag was added for GRE -encapsulation, but might be used with other protocols -as well in the future. -.TP -.B \fBBPF_F_NO_TUNNEL_KEY\fP -Add a flag to tunnel metadata indicating that no tunnel -key should be set in the resulting tunnel header. -.UNINDENT -.sp -Here is a typical usage on the transmit path: -.INDENT 7.0 -.INDENT 3.5 -.sp -.EX -struct bpf_tunnel_key key; - populate key ... -bpf_skb_set_tunnel_key(skb, &key, sizeof(key), 0); -bpf_clone_redirect(skb, vxlan_dev_ifindex, 0); -.EE -.UNINDENT -.UNINDENT -.sp -See also the description of the \fBbpf_skb_get_tunnel_key\fP() -helper for additional information. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBu64 bpf_perf_event_read(struct bpf_map *\fP\fImap\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Read the value of a perf event counter. This helper relies on a -\fImap\fP of type \fBBPF_MAP_TYPE_PERF_EVENT_ARRAY\fP\&. The nature of -the perf event counter is selected when \fImap\fP is updated with -perf event file descriptors. The \fImap\fP is an array whose size -is the number of available CPUs, and each cell contains a value -relative to one CPU. The value to retrieve is indicated by -\fIflags\fP, that contains the index of the CPU to look up, masked -with \fBBPF_F_INDEX_MASK\fP\&. Alternatively, \fIflags\fP can be set to -\fBBPF_F_CURRENT_CPU\fP to indicate that the value for the -current CPU should be retrieved. -.sp -Note that before Linux 4.13, only hardware perf event can be -retrieved. -.sp -Also, be aware that the newer helper -\fBbpf_perf_event_read_value\fP() is recommended over -\fBbpf_perf_event_read\fP() in general. The latter has some ABI -quirks where error and counter value are used as a return code -(which is wrong to do since ranges may overlap). 
This issue is -fixed with \fBbpf_perf_event_read_value\fP(), which at the same -time provides more features over the \fBbpf_perf_event_read\fP() interface. Please refer to the description of -\fBbpf_perf_event_read_value\fP() for details. -.TP -.B Return -The value of the perf event counter read from the map, or a -negative error code in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_redirect(u32\fP \fIifindex\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Redirect the packet to another net device of index \fIifindex\fP\&. -This helper is somewhat similar to \fBbpf_clone_redirect\fP(), except that the packet is not cloned, which provides -increased performance. -.sp -Except for XDP, both ingress and egress interfaces can be used -for redirection. The \fBBPF_F_INGRESS\fP value in \fIflags\fP is used -to make the distinction (ingress path is selected if the flag -is present, egress path otherwise). Currently, XDP only -supports redirection to the egress interface, and accepts no -flag at all. -.sp -The same effect can also be attained with the more generic -\fBbpf_redirect_map\fP(), which uses a BPF map to store the -redirect target instead of providing it directly to the helper. -.TP -.B Return -For XDP, the helper returns \fBXDP_REDIRECT\fP on success or -\fBXDP_ABORTED\fP on error. For other program types, the values -are \fBTC_ACT_REDIRECT\fP on success or \fBTC_ACT_SHOT\fP on -error. -.UNINDENT -.TP -.B \fBu32 bpf_get_route_realm(struct sk_buff *\fP\fIskb\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Retrieve the realm or the route, that is to say the -\fBtclassid\fP field of the destination for the \fIskb\fP\&. The -identifier retrieved is a user\-provided tag, similar to the -one used with the net_cls cgroup (see description for -\fBbpf_get_cgroup_classid\fP() helper), but here this tag is -held by a route (a destination entry), not by a task. 
-.sp -Retrieving this identifier works with the clsact TC egress hook -(see also \fBtc\-bpf(8)\fP), or alternatively on conventional -classful egress qdiscs, but not on TC ingress path. In case of -clsact TC egress hook, this has the advantage that, internally, -the destination entry has not been dropped yet in the transmit -path. Therefore, the destination entry does not need to be -artificially held via \fBnetif_keep_dst\fP() for a classful -qdisc until the \fIskb\fP is freed. -.sp -This helper is available only if the kernel was compiled with -\fBCONFIG_IP_ROUTE_CLASSID\fP configuration option. -.TP -.B Return -The realm of the route for the packet associated to \fIskb\fP, or 0 -if none was found. -.UNINDENT -.TP -.B \fBlong bpf_perf_event_output(void *\fP\fIctx\fP\fB, struct bpf_map *\fP\fImap\fP\fB, u64\fP \fIflags\fP\fB, void *\fP\fIdata\fP\fB, u64\fP \fIsize\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Write raw \fIdata\fP blob into a special BPF perf event held by -\fImap\fP of type \fBBPF_MAP_TYPE_PERF_EVENT_ARRAY\fP\&. This perf -event must have the following attributes: \fBPERF_SAMPLE_RAW\fP -as \fBsample_type\fP, \fBPERF_TYPE_SOFTWARE\fP as \fBtype\fP, and -\fBPERF_COUNT_SW_BPF_OUTPUT\fP as \fBconfig\fP\&. -.sp -The \fIflags\fP are used to indicate the index in \fImap\fP for which -the value must be put, masked with \fBBPF_F_INDEX_MASK\fP\&. -Alternatively, \fIflags\fP can be set to \fBBPF_F_CURRENT_CPU\fP -to indicate that the index of the current CPU core should be -used. -.sp -The value to write, of \fIsize\fP, is passed through eBPF stack and -pointed by \fIdata\fP\&. -.sp -The context of the program \fIctx\fP needs also be passed to the -helper. -.sp -On user space, a program willing to read the values needs to -call \fBperf_event_open\fP() on the perf event (either for -one or for all CPUs) and to store the file descriptor into the -\fImap\fP\&. This must be done before the eBPF program can send data -into it. 
An example is available in file -\fIsamples/bpf/trace_output_user.c\fP in the Linux kernel source -tree (the eBPF program counterpart is in -\fIsamples/bpf/trace_output_kern.c\fP). -.sp -\fBbpf_perf_event_output\fP() achieves better performance -than \fBbpf_trace_printk\fP() for sharing data with user -space, and is much better suitable for streaming data from eBPF -programs. -.sp -Note that this helper is not restricted to tracing use cases -and can be used with programs attached to TC or XDP as well, -where it allows for passing data to user space listeners. Data -can be: -.INDENT 7.0 -.IP \(bu 2 -Only custom structs, -.IP \(bu 2 -Only the packet payload, or -.IP \(bu 2 -A combination of both. -.UNINDENT -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_load_bytes(const void *\fP\fIskb\fP\fB, u32\fP \fIoffset\fP\fB, void *\fP\fIto\fP\fB, u32\fP \fIlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper was provided as an easy way to load data from a -packet. It can be used to load \fIlen\fP bytes from \fIoffset\fP from -the packet associated to \fIskb\fP, into the buffer pointed by -\fIto\fP\&. -.sp -Since Linux 4.7, usage of this helper has mostly been replaced -by \(dqdirect packet access\(dq, enabling packet data to be -manipulated with \fIskb\fP\fB\->data\fP and \fIskb\fP\fB\->data_end\fP -pointing respectively to the first byte of packet data and to -the byte after the last byte of packet data. However, it -remains useful if one wishes to read large quantities of data -at once from a packet into the eBPF stack. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_get_stackid(void *\fP\fIctx\fP\fB, struct bpf_map *\fP\fImap\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Walk a user or a kernel stack and return its id. 
To achieve -this, the helper needs \fIctx\fP, which is a pointer to the context -on which the tracing program is executed, and a pointer to a -\fImap\fP of type \fBBPF_MAP_TYPE_STACK_TRACE\fP\&. -.sp -The last argument, \fIflags\fP, holds the number of stack frames to -skip (from 0 to 255), masked with -\fBBPF_F_SKIP_FIELD_MASK\fP\&. The next bits can be used to set -a combination of the following flags: -.INDENT 7.0 -.TP -.B \fBBPF_F_USER_STACK\fP -Collect a user space stack instead of a kernel stack. -.TP -.B \fBBPF_F_FAST_STACK_CMP\fP -Compare stacks by hash only. -.TP -.B \fBBPF_F_REUSE_STACKID\fP -If two different stacks hash into the same \fIstackid\fP, -discard the old one. -.UNINDENT -.sp -The stack id retrieved is a 32 bit long integer handle which -can be further combined with other data (including other stack -ids) and used as a key into maps. This can be useful for -generating a variety of graphs (such as flame graphs or off\-cpu -graphs). -.sp -For walking a stack, this helper is an improvement over -\fBbpf_probe_read\fP(), which can be used with unrolled loops -but is not efficient and consumes a lot of eBPF instructions. -Instead, \fBbpf_get_stackid\fP() can collect up to -\fBPERF_MAX_STACK_DEPTH\fP both kernel and user frames. Note that -this limit can be controlled with the \fBsysctl\fP program, and -that it should be manually increased in order to profile long -user stacks (such as stacks for Java programs). To do so, use: -.INDENT 7.0 -.INDENT 3.5 -.sp -.EX -# sysctl kernel.perf_event_max_stack=<new value> -.EE -.UNINDENT -.UNINDENT -.TP -.B Return -The positive or null stack id on success, or a negative error -in case of failure. 
-.UNINDENT -.TP -.B \fBs64 bpf_csum_diff(__be32 *\fP\fIfrom\fP\fB, u32\fP \fIfrom_size\fP\fB, __be32 *\fP\fIto\fP\fB, u32\fP \fIto_size\fP\fB, __wsum\fP \fIseed\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Compute a checksum difference, from the raw buffer pointed by -\fIfrom\fP, of length \fIfrom_size\fP (that must be a multiple of 4), -towards the raw buffer pointed by \fIto\fP, of size \fIto_size\fP -(same remark). An optional \fIseed\fP can be added to the value -(this can be cascaded, the seed may come from a previous call -to the helper). -.sp -This is flexible enough to be used in several ways: -.INDENT 7.0 -.IP \(bu 2 -With \fIfrom_size\fP == 0, \fIto_size\fP > 0 and \fIseed\fP set to -checksum, it can be used when pushing new data. -.IP \(bu 2 -With \fIfrom_size\fP > 0, \fIto_size\fP == 0 and \fIseed\fP set to -checksum, it can be used when removing data from a packet. -.IP \(bu 2 -With \fIfrom_size\fP > 0, \fIto_size\fP > 0 and \fIseed\fP set to 0, it -can be used to compute a diff. Note that \fIfrom_size\fP and -\fIto_size\fP do not need to be equal. -.UNINDENT -.sp -This helper can be used in combination with -\fBbpf_l3_csum_replace\fP() and \fBbpf_l4_csum_replace\fP(), to -which one can feed in the difference computed with -\fBbpf_csum_diff\fP(). -.TP -.B Return -The checksum result, or a negative error code in case of -failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_get_tunnel_opt(struct sk_buff *\fP\fIskb\fP\fB, void *\fP\fIopt\fP\fB, u32\fP \fIsize\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Retrieve tunnel options metadata for the packet associated to -\fIskb\fP, and store the raw tunnel option data to the buffer \fIopt\fP -of \fIsize\fP\&. -.sp -This helper can be used with encapsulation devices that can -operate in \(dqcollect metadata\(dq mode (please refer to the related -note in the description of \fBbpf_skb_get_tunnel_key\fP() for -more details). 
A particular example where this can be used is -in combination with the Geneve encapsulation protocol, where it -allows for pushing (with \fBbpf_skb_set_tunnel_opt\fP() helper) -and retrieving arbitrary TLVs (Type\-Length\-Value headers) from -the eBPF program. This allows for full customization of these -headers. -.TP -.B Return -The size of the option data retrieved. -.UNINDENT -.TP -.B \fBlong bpf_skb_set_tunnel_opt(struct sk_buff *\fP\fIskb\fP\fB, void *\fP\fIopt\fP\fB, u32\fP \fIsize\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Set tunnel options metadata for the packet associated to \fIskb\fP -to the option data contained in the raw buffer \fIopt\fP of \fIsize\fP\&. -.sp -See also the description of the \fBbpf_skb_get_tunnel_opt\fP() -helper for additional information. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_change_proto(struct sk_buff *\fP\fIskb\fP\fB, __be16\fP \fIproto\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Change the protocol of the \fIskb\fP to \fIproto\fP\&. Currently -supported transitions are from IPv4 to IPv6, and from IPv6 to -IPv4. The helper takes care of the groundwork for the -transition, including resizing the socket buffer. The eBPF -program is expected to fill the new headers, if any, via -\fBbpf_skb_store_bytes\fP() and to recompute the checksums with -\fBbpf_l3_csum_replace\fP() and \fBbpf_l4_csum_replace\fP(). The main use case for this helper is to perform NAT64 -operations out of an eBPF program. -.sp -Internally, the GSO type is marked as dodgy so that headers are -checked and segments are recalculated by the GSO/GRO engine. -The size for GSO target is adapted as well. -.sp -All values for \fIflags\fP are reserved for future usage, and must -be left at zero. -.sp -A call to this helper is susceptible to change the underlying -packet buffer.
Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_change_type(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fItype\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Change the packet type for the packet associated to \fIskb\fP\&. This -comes down to setting \fIskb\fP\fB\->pkt_type\fP to \fItype\fP, except -the eBPF program does not have write access to \fIskb\fP\fB\->pkt_type\fP other than through this helper. Using a helper here allows -for graceful handling of errors. -.sp -The major use case is to change incoming \fIskb\fPs to -\fBPACKET_HOST\fP in a programmatic way instead of having to -recirculate via \fBbpf_redirect\fP(..., \fBBPF_F_INGRESS\fP), for -example. -.sp -Note that \fItype\fP only allows certain values. At this time, they -are: -.INDENT 7.0 -.TP -.B \fBPACKET_HOST\fP -Packet is for us. -.TP -.B \fBPACKET_BROADCAST\fP -Send packet to all. -.TP -.B \fBPACKET_MULTICAST\fP -Send packet to group. -.TP -.B \fBPACKET_OTHERHOST\fP -Send packet to someone else. -.UNINDENT -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_under_cgroup(struct sk_buff *\fP\fIskb\fP\fB, struct bpf_map *\fP\fImap\fP\fB, u32\fP \fIindex\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Check whether \fIskb\fP is a descendant of the cgroup2 held by -\fImap\fP of type \fBBPF_MAP_TYPE_CGROUP_ARRAY\fP, at \fIindex\fP\&. -.TP -.B Return -The return value depends on the result of the test, and can be: -.INDENT 7.0 -.IP \(bu 2 -0, if the \fIskb\fP failed the cgroup2 descendant test. -.IP \(bu 2 -1, if the \fIskb\fP passed the cgroup2 descendant test. -.IP \(bu 2 -A negative error code, if an error occurred.
-.UNINDENT -.UNINDENT -.TP -.B \fBu32 bpf_get_hash_recalc(struct sk_buff *\fP\fIskb\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Retrieve the hash of the packet, \fIskb\fP\fB\->hash\fP\&. If it is -not set, in particular if the hash was cleared due to mangling, -recompute this hash. Later accesses to the hash can be done -directly with \fIskb\fP\fB\->hash\fP\&. -.sp -Calling \fBbpf_set_hash_invalid\fP(), changing a packet -protocol with \fBbpf_skb_change_proto\fP(), or calling -\fBbpf_skb_store_bytes\fP() with the -\fBBPF_F_INVALIDATE_HASH\fP flag are actions that may clear -the hash and trigger a new computation on the next call to -\fBbpf_get_hash_recalc\fP(). -.TP -.B Return -The 32\-bit hash. -.UNINDENT -.TP -.B \fBu64 bpf_get_current_task(void)\fP -.INDENT 7.0 -.TP -.B Description -Get the current task. -.TP -.B Return -A pointer to the current task struct. -.UNINDENT -.TP -.B \fBlong bpf_probe_write_user(void *\fP\fIdst\fP\fB, const void *\fP\fIsrc\fP\fB, u32\fP \fIlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Attempt in a safe way to write \fIlen\fP bytes from the buffer -\fIsrc\fP to \fIdst\fP in memory. It only works for threads that are in -user context, and \fIdst\fP must be a valid user space address. -.sp -This helper should not be used to implement any kind of -security mechanism because of TOC\-TOU attacks, but rather to -debug, divert, and manipulate execution of semi\-cooperative -processes. -.sp -Keep in mind that this feature is meant for experiments, and it -has a risk of crashing the system and running programs. -Therefore, when an eBPF program using this helper is attached, -a warning including PID and process name is printed to kernel -logs. -.TP -.B Return -0 on success, or a negative error in case of failure.
-.UNINDENT -.TP -.B \fBlong bpf_current_task_under_cgroup(struct bpf_map *\fP\fImap\fP\fB, u32\fP \fIindex\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Check whether the probe is being run in the context of a given -subset of the cgroup2 hierarchy. The cgroup2 to test is held by -\fImap\fP of type \fBBPF_MAP_TYPE_CGROUP_ARRAY\fP, at \fIindex\fP\&. -.TP -.B Return -The return value depends on the result of the test, and can be: -.INDENT 7.0 -.IP \(bu 2 -1, if current task belongs to the cgroup2. -.IP \(bu 2 -0, if current task does not belong to the cgroup2. -.IP \(bu 2 -A negative error code, if an error occurred. -.UNINDENT -.UNINDENT -.TP -.B \fBlong bpf_skb_change_tail(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIlen\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Resize (trim or grow) the packet associated to \fIskb\fP to the -new \fIlen\fP\&. The \fIflags\fP are reserved for future usage, and must -be left at zero. -.sp -The basic idea is that the helper performs the needed work to -change the size of the packet, then the eBPF program rewrites -the rest via helpers like \fBbpf_skb_store_bytes\fP(), -\fBbpf_l3_csum_replace\fP(), \fBbpf_l4_csum_replace\fP() -and others. This helper is a slow path utility intended for -replies with control messages. And because it is targeted for -slow path, the helper itself can afford to be slow: it -implicitly linearizes, unclones and drops offloads from the -\fIskb\fP\&. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure.
-.UNINDENT -.TP -.B \fBlong bpf_skb_pull_data(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Pull in non\-linear data in case the \fIskb\fP is non\-linear and not -all of \fIlen\fP are part of the linear section. Make \fIlen\fP bytes -from \fIskb\fP readable and writable. If a zero value is passed for -\fIlen\fP, then all bytes in the linear part of \fIskb\fP will be made -readable and writable. -.sp -This helper is only needed for reading and writing with direct -packet access. -.sp -For direct packet access, testing that offsets to access -are within packet boundaries (test on \fIskb\fP\fB\->data_end\fP) is -susceptible to fail if offsets are invalid, or if the requested -data is in non\-linear parts of the \fIskb\fP\&. On failure the -program can just bail out, or in the case of a non\-linear -buffer, use a helper to make the data available. The -\fBbpf_skb_load_bytes\fP() helper is a first solution to access -the data. Another one consists in using \fBbpf_skb_pull_data\fP -to pull in once the non\-linear parts, then retesting and -eventually access the data. -.sp -At the same time, this also makes sure the \fIskb\fP is uncloned, -which is a necessary condition for direct write. As this needs -to be an invariant for the write part only, the verifier -detects writes and adds a prologue that is calling -\fBbpf_skb_pull_data()\fP to effectively unclone the \fIskb\fP from -the very beginning in case it is indeed cloned. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. 
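The boundary test that direct packet access relies on can be modelled in user space. The following is a minimal sketch under stated assumptions: struct pkt_view and in_linear_area() are hypothetical names standing in for the \fIskb\fP\fB\->data\fP and \fIskb\fP\fB\->data_end\fP pointers seen by an eBPF program, and the fragment does not call any kernel helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the two pointers an eBPF
 * program sees: first byte of linear data, and one past the last. */
struct pkt_view {
    const uint8_t *data;
    const uint8_t *data_end;
};

/* Return 1 if the byte range [off, off + len) lies entirely inside
 * the linear area, 0 otherwise. Written so that off + len cannot
 * overflow before the comparison. */
static int in_linear_area(const struct pkt_view *p, size_t off, size_t len)
{
    size_t avail = (size_t)(p->data_end - p->data);

    return off <= avail && len <= avail - off;
}
```

In a real program, a failed test of this kind is the point at which one would either bail out or call bpf_skb_pull_data() with the needed length, reload both pointers, and repeat the test before touching the data.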
-.UNINDENT -.TP -.B \fBs64 bpf_csum_update(struct sk_buff *\fP\fIskb\fP\fB, __wsum\fP \fIcsum\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Add the checksum \fIcsum\fP into \fIskb\fP\fB\->csum\fP in case the -driver has supplied a checksum for the entire packet into that -field. Return an error otherwise. This helper is intended to be -used in combination with \fBbpf_csum_diff\fP(), in particular -when the checksum needs to be updated after data has been -written into the packet through direct packet access. -.TP -.B Return -The checksum on success, or a negative error code in case of -failure. -.UNINDENT -.TP -.B \fBvoid bpf_set_hash_invalid(struct sk_buff *\fP\fIskb\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Invalidate the current \fIskb\fP\fB\->hash\fP\&. It can be used after -mangling on headers through direct packet access, in order to -indicate that the hash is outdated and to trigger a -recalculation the next time the kernel tries to access this -hash or when the \fBbpf_get_hash_recalc\fP() helper is called. -.TP -.B Return -void. -.UNINDENT -.TP -.B \fBlong bpf_get_numa_node_id(void)\fP -.INDENT 7.0 -.TP -.B Description -Return the id of the current NUMA node. The primary use case -for this helper is the selection of sockets for the local NUMA -node, when the program is attached to sockets using the -\fBSO_ATTACH_REUSEPORT_EBPF\fP option (see also \fBsocket(7)\fP), -but the helper is also available to other eBPF program types, -similarly to \fBbpf_get_smp_processor_id\fP(). -.TP -.B Return -The id of current NUMA node. -.UNINDENT -.TP -.B \fBlong bpf_skb_change_head(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIlen\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Grows headroom of packet associated to \fIskb\fP and adjusts the -offset of the MAC header accordingly, adding \fIlen\fP bytes of -space. It automatically extends and reallocates memory as -required. 
-.sp -This helper can be used on a layer 3 \fIskb\fP to push a MAC header -for redirection into a layer 2 device. -.sp -All values for \fIflags\fP are reserved for future usage, and must -be left at zero. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_xdp_adjust_head(struct xdp_buff *\fP\fIxdp_md\fP\fB, int\fP \fIdelta\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Adjust (move) \fIxdp_md\fP\fB\->data\fP by \fIdelta\fP bytes. Note that -it is possible to use a negative value for \fIdelta\fP\&. This helper -can be used to prepare the packet for pushing or popping -headers. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_probe_read_str(void *\fP\fIdst\fP\fB, u32\fP \fIsize\fP\fB, const void *\fP\fIunsafe_ptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Copy a NUL terminated string from an unsafe kernel address -\fIunsafe_ptr\fP to \fIdst\fP\&. See \fBbpf_probe_read_kernel_str\fP() for -more details. -.sp -Generally, use \fBbpf_probe_read_user_str\fP() or -\fBbpf_probe_read_kernel_str\fP() instead. -.TP -.B Return -On success, the strictly positive length of the string, -including the trailing NUL character. On error, a negative -value. 
-.UNINDENT -.TP -.B \fBu64 bpf_get_socket_cookie(struct sk_buff *\fP\fIskb\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -If the \fBstruct sk_buff\fP pointed to by \fIskb\fP has a known socket, -retrieve the cookie (generated by the kernel) of this socket. -If no cookie has been set yet, generate a new cookie. Once -generated, the socket cookie remains stable for the life of the -socket. This helper can be useful for monitoring per socket -networking traffic statistics as it provides a global socket -identifier that can be assumed unique. -.TP -.B Return -An 8\-byte long unique number on success, or 0 if the socket -field is missing inside \fIskb\fP\&. -.UNINDENT -.TP -.B \fBu64 bpf_get_socket_cookie(struct bpf_sock_addr *\fP\fIctx\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Equivalent to \fBbpf_get_socket_cookie\fP() helper that accepts -\fIskb\fP, but gets socket from \fBstruct bpf_sock_addr\fP context. -.TP -.B Return -An 8\-byte long unique number. -.UNINDENT -.TP -.B \fBu64 bpf_get_socket_cookie(struct bpf_sock_ops *\fP\fIctx\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Equivalent to \fBbpf_get_socket_cookie\fP() helper that accepts -\fIskb\fP, but gets socket from \fBstruct bpf_sock_ops\fP context. -.TP -.B Return -An 8\-byte long unique number. -.UNINDENT -.TP -.B \fBu64 bpf_get_socket_cookie(struct sock *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Equivalent to \fBbpf_get_socket_cookie\fP() helper that accepts -\fIskb\fP, but gets socket from a BTF \fBstruct sock\fP\&. This helper -also works for sleepable programs. -.TP -.B Return -An 8\-byte long unique number or 0 if \fIsk\fP is NULL. -.UNINDENT -.TP -.B \fBu32 bpf_get_socket_uid(struct sk_buff *\fP\fIskb\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get the owner UID of the socket associated to \fIskb\fP\&. -.TP -.B Return -The owner UID of the socket associated to \fIskb\fP\&. If the socket -is \fBNULL\fP, or if it is not a full socket (i.e.
if it is a -time\-wait or a request socket instead), \fBoverflowuid\fP value -is returned (note that \fBoverflowuid\fP might also be the actual -UID value for the socket). -.UNINDENT -.TP -.B \fBlong bpf_set_hash(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIhash\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Set the full hash for \fIskb\fP (set the field \fIskb\fP\fB\->hash\fP) -to value \fIhash\fP\&. -.TP -.B Return -0 -.UNINDENT -.TP -.B \fBlong bpf_setsockopt(void *\fP\fIbpf_socket\fP\fB, int\fP \fIlevel\fP\fB, int\fP \fIoptname\fP\fB, void *\fP\fIoptval\fP\fB, int\fP \fIoptlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Emulate a call to \fBsetsockopt()\fP on the socket associated to -\fIbpf_socket\fP, which must be a full socket. The \fIlevel\fP at -which the option resides and the name \fIoptname\fP of the option -must be specified, see \fBsetsockopt(2)\fP for more information. -The option value of length \fIoptlen\fP is pointed by \fIoptval\fP\&. -.sp -\fIbpf_socket\fP should be one of the following: -.INDENT 7.0 -.IP \(bu 2 -\fBstruct bpf_sock_ops\fP for \fBBPF_PROG_TYPE_SOCK_OPS\fP\&. -.IP \(bu 2 -\fBstruct bpf_sock_addr\fP for \fBBPF_CGROUP_INET4_CONNECT\fP, -\fBBPF_CGROUP_INET6_CONNECT\fP and \fBBPF_CGROUP_UNIX_CONNECT\fP\&. -.UNINDENT -.sp -This helper actually implements a subset of \fBsetsockopt()\fP\&. -It supports the following \fIlevel\fPs: -.INDENT 7.0 -.IP \(bu 2 -\fBSOL_SOCKET\fP, which supports the following \fIoptname\fPs: -\fBSO_RCVBUF\fP, \fBSO_SNDBUF\fP, \fBSO_MAX_PACING_RATE\fP, -\fBSO_PRIORITY\fP, \fBSO_RCVLOWAT\fP, \fBSO_MARK\fP, -\fBSO_BINDTODEVICE\fP, \fBSO_KEEPALIVE\fP, \fBSO_REUSEADDR\fP, -\fBSO_REUSEPORT\fP, \fBSO_BINDTOIFINDEX\fP, \fBSO_TXREHASH\fP\&. 
-.IP \(bu 2 -\fBIPPROTO_TCP\fP, which supports the following \fIoptname\fPs: -\fBTCP_CONGESTION\fP, \fBTCP_BPF_IW\fP, -\fBTCP_BPF_SNDCWND_CLAMP\fP, \fBTCP_SAVE_SYN\fP, -\fBTCP_KEEPIDLE\fP, \fBTCP_KEEPINTVL\fP, \fBTCP_KEEPCNT\fP, -\fBTCP_SYNCNT\fP, \fBTCP_USER_TIMEOUT\fP, \fBTCP_NOTSENT_LOWAT\fP, -\fBTCP_NODELAY\fP, \fBTCP_MAXSEG\fP, \fBTCP_WINDOW_CLAMP\fP, -\fBTCP_THIN_LINEAR_TIMEOUTS\fP, \fBTCP_BPF_DELACK_MAX\fP, -\fBTCP_BPF_RTO_MIN\fP\&. -.IP \(bu 2 -\fBIPPROTO_IP\fP, which supports \fIoptname\fP \fBIP_TOS\fP\&. -.IP \(bu 2 -\fBIPPROTO_IPV6\fP, which supports the following \fIoptname\fPs: -\fBIPV6_TCLASS\fP, \fBIPV6_AUTOFLOWLABEL\fP\&. -.UNINDENT -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_adjust_room(struct sk_buff *\fP\fIskb\fP\fB, s32\fP \fIlen_diff\fP\fB, u32\fP \fImode\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Grow or shrink the room for data in the packet associated to -\fIskb\fP by \fIlen_diff\fP, and according to the selected \fImode\fP\&. -.sp -By default, the helper will reset any offloaded checksum -indicator of the skb to CHECKSUM_NONE. This can be avoided -by the following flag: -.INDENT 7.0 -.IP \(bu 2 -\fBBPF_F_ADJ_ROOM_NO_CSUM_RESET\fP: Do not reset offloaded -checksum data of the skb to CHECKSUM_NONE. -.UNINDENT -.sp -There are two supported modes at this time: -.INDENT 7.0 -.IP \(bu 2 -\fBBPF_ADJ_ROOM_MAC\fP: Adjust room at the mac layer -(room space is added or removed between the layer 2 and -layer 3 headers). -.IP \(bu 2 -\fBBPF_ADJ_ROOM_NET\fP: Adjust room at the network layer -(room space is added or removed between the layer 3 and -layer 4 headers). -.UNINDENT -.sp -The following flags are supported at this time: -.INDENT 7.0 -.IP \(bu 2 -\fBBPF_F_ADJ_ROOM_FIXED_GSO\fP: Do not adjust gso_size. -Adjusting mss in this way is not allowed for datagrams. 
-.IP \(bu 2 -\fBBPF_F_ADJ_ROOM_ENCAP_L3_IPV4\fP, -\fBBPF_F_ADJ_ROOM_ENCAP_L3_IPV6\fP: -Any new space is reserved to hold a tunnel header. -Configure skb offsets and other fields accordingly. -.IP \(bu 2 -\fBBPF_F_ADJ_ROOM_ENCAP_L4_GRE\fP, -\fBBPF_F_ADJ_ROOM_ENCAP_L4_UDP\fP: -Use with ENCAP_L3 flags to further specify the tunnel type. -.IP \(bu 2 -\fBBPF_F_ADJ_ROOM_ENCAP_L2\fP(\fIlen\fP): -Use with ENCAP_L3/L4 flags to further specify the tunnel -type; \fIlen\fP is the length of the inner MAC header. -.IP \(bu 2 -\fBBPF_F_ADJ_ROOM_ENCAP_L2_ETH\fP: -Use with BPF_F_ADJ_ROOM_ENCAP_L2 flag to further specify the -L2 type as Ethernet. -.IP \(bu 2 -\fBBPF_F_ADJ_ROOM_DECAP_L3_IPV4\fP, -\fBBPF_F_ADJ_ROOM_DECAP_L3_IPV6\fP: -Indicate the new IP header version after decapsulating the outer -IP header. Used when the inner and outer IP versions are different. -.UNINDENT -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_redirect_map(struct bpf_map *\fP\fImap\fP\fB, u64\fP \fIkey\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Redirect the packet to the endpoint referenced by \fImap\fP at -index \fIkey\fP\&. Depending on its type, this \fImap\fP can contain -references to net devices (for forwarding packets through other -ports), or to CPUs (for redirecting XDP frames to another CPU; -but this is only implemented for native XDP (with driver -support) as of this writing). -.sp -The lower two bits of \fIflags\fP are used as the return code if -the map lookup fails. This is so that the return value can be -one of the XDP program return codes up to \fBXDP_TX\fP, as chosen -by the caller. 
The higher bits of \fIflags\fP can be set to -BPF_F_BROADCAST or BPF_F_EXCLUDE_INGRESS as defined below. -.sp -With BPF_F_BROADCAST the packet is broadcast to all the -interfaces in the map; with BPF_F_EXCLUDE_INGRESS the ingress -interface is excluded from the broadcast. -.sp -See also \fBbpf_redirect\fP(), which only supports redirecting -to an ifindex, but doesn\(aqt require a map to do so. -.TP -.B Return -\fBXDP_REDIRECT\fP on success, or the value of the two lower bits -of the \fIflags\fP argument on error. -.UNINDENT -.TP -.B \fBlong bpf_sk_redirect_map(struct sk_buff *\fP\fIskb\fP\fB, struct bpf_map *\fP\fImap\fP\fB, u32\fP \fIkey\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Redirect the packet to the socket referenced by \fImap\fP (of type -\fBBPF_MAP_TYPE_SOCKMAP\fP) at index \fIkey\fP\&. Both ingress and -egress interfaces can be used for redirection. The -\fBBPF_F_INGRESS\fP value in \fIflags\fP is used to make the -distinction (ingress path is selected if the flag is present, -egress path otherwise). This is the only flag supported for now. -.TP -.B Return -\fBSK_PASS\fP on success, or \fBSK_DROP\fP on error. -.UNINDENT -.TP -.B \fBlong bpf_sock_map_update(struct bpf_sock_ops *\fP\fIskops\fP\fB, struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIkey\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Add an entry to, or update a \fImap\fP referencing sockets. The -\fIskops\fP is used as a new value for the entry associated to -\fIkey\fP\&. \fIflags\fP is one of: -.INDENT 7.0 -.TP -.B \fBBPF_NOEXIST\fP -The entry for \fIkey\fP must not exist in the map. -.TP -.B \fBBPF_EXIST\fP -The entry for \fIkey\fP must already exist in the map. -.TP -.B \fBBPF_ANY\fP -No condition on the existence of the entry for \fIkey\fP\&. -.UNINDENT -.sp -If the \fImap\fP has eBPF programs (parser and verdict), those will -be inherited by the socket being added.
If the socket is -already attached to eBPF programs, this results in an error. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_xdp_adjust_meta(struct xdp_buff *\fP\fIxdp_md\fP\fB, int\fP \fIdelta\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Adjust the address pointed by \fIxdp_md\fP\fB\->data_meta\fP by -\fIdelta\fP (which can be positive or negative). Note that this -operation modifies the address stored in \fIxdp_md\fP\fB\->data\fP, -so the latter must be loaded only after the helper has been -called. -.sp -The use of \fIxdp_md\fP\fB\->data_meta\fP is optional and programs -are not required to use it. The rationale is that when the -packet is processed with XDP (e.g. as DoS filter), it is -possible to push further meta data along with it before passing -to the stack, and to give the guarantee that an ingress eBPF -program attached as a TC classifier on the same device can pick -this up for further post\-processing. Since TC works with socket -buffers, it remains possible to set from XDP the \fBmark\fP or -\fBpriority\fP pointers, or other pointers for the socket buffer. -Having this scratch space generic and programmable allows for -more flexibility as the user is free to store whatever meta -data they need. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_perf_event_read_value(struct bpf_map *\fP\fImap\fP\fB, u64\fP \fIflags\fP\fB, struct bpf_perf_event_value *\fP\fIbuf\fP\fB, u32\fP \fIbuf_size\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Read the value of a perf event counter, and store it into \fIbuf\fP -of size \fIbuf_size\fP\&. 
This helper relies on a \fImap\fP of type -\fBBPF_MAP_TYPE_PERF_EVENT_ARRAY\fP\&. The nature of the perf event -counter is selected when \fImap\fP is updated with perf event file -descriptors. The \fImap\fP is an array whose size is the number of -available CPUs, and each cell contains a value relative to one -CPU. The value to retrieve is indicated by \fIflags\fP, that -contains the index of the CPU to look up, masked with -\fBBPF_F_INDEX_MASK\fP\&. Alternatively, \fIflags\fP can be set to -\fBBPF_F_CURRENT_CPU\fP to indicate that the value for the -current CPU should be retrieved. -.sp -This helper behaves in a way close to -\fBbpf_perf_event_read\fP() helper, save that instead of -just returning the value observed, it fills the \fIbuf\fP -structure. This allows for additional data to be retrieved: in -particular, the enabled and running times (in \fIbuf\fP\fB\->enabled\fP and \fIbuf\fP\fB\->running\fP, respectively) are -copied. In general, \fBbpf_perf_event_read_value\fP() is -recommended over \fBbpf_perf_event_read\fP(), which has some -ABI issues and provides fewer functionalities. -.sp -These values are interesting, because hardware PMU (Performance -Monitoring Unit) counters are limited resources. When there are -more PMU based perf events opened than available counters, -kernel will multiplex these events so each event gets certain -percentage (but not all) of the PMU time. In case that -multiplexing happens, the number of samples or counter value -will not reflect the case compared to when no multiplexing -occurs. This makes comparison between different runs difficult. -Typically, the counter value should be normalized before -comparing to other experiments. The usual normalization is done -as follows. -.INDENT 7.0 -.INDENT 3.5 -.sp -.EX -normalized_counter = counter * t_enabled / t_running -.EE -.UNINDENT -.UNINDENT -.sp -Where t_enabled is the time enabled for event and t_running is -the time running for event since last normalization. 
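The normalization formula above can be sketched in plain C. This is an illustration only: struct perf_value mirrors the three u64 fields of struct bpf_perf_event_value, normalize_counter() is a hypothetical name, and returning 0 when the running time is zero is an assumption made here to avoid a division by zero, not behavior documented for the helper.

```c
#include <stdint.h>

/* Mirrors the fields of struct bpf_perf_event_value
 * (include/uapi/linux/bpf.h) for use in user space. */
struct perf_value {
    uint64_t counter;
    uint64_t enabled;
    uint64_t running;
};

/* Hypothetical helper: scale the raw counter by the ratio of enabled
 * time to running time. Floating point is used so that the product
 * counter * enabled cannot overflow 64 bits; the result is truncated. */
static uint64_t normalize_counter(const struct perf_value *v)
{
    if (v->running == 0)
        return 0; /* assumption: the event never ran, nothing to scale */

    return (uint64_t)((double)v->counter * (double)v->enabled /
                      (double)v->running);
}
```

An event that was scheduled on the PMU for only half of its enabled time therefore reports twice its raw count after normalization.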
The -enabled and running times are accumulated since the perf event -was opened. To obtain a scaling factor between two invocations of an -eBPF program, users can use the CPU id as the key (which is -typical for the perf array usage model) to remember the previous -value and do the calculation inside the eBPF program. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_perf_prog_read_value(struct bpf_perf_event_data *\fP\fIctx\fP\fB, struct bpf_perf_event_value *\fP\fIbuf\fP\fB, u32\fP \fIbuf_size\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -For an eBPF program attached to a perf event, retrieve the -value of the event counter associated to \fIctx\fP and store it in -the structure pointed to by \fIbuf\fP and of size \fIbuf_size\fP\&. Enabled -and running times are also stored in the structure (see -description of helper \fBbpf_perf_event_read_value\fP() for -more details). -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_getsockopt(void *\fP\fIbpf_socket\fP\fB, int\fP \fIlevel\fP\fB, int\fP \fIoptname\fP\fB, void *\fP\fIoptval\fP\fB, int\fP \fIoptlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Emulate a call to \fBgetsockopt()\fP on the socket associated to -\fIbpf_socket\fP, which must be a full socket. The \fIlevel\fP at -which the option resides and the name \fIoptname\fP of the option -must be specified, see \fBgetsockopt(2)\fP for more information. -The retrieved value is stored in the structure pointed to by -\fIoptval\fP and of length \fIoptlen\fP\&. -.sp -\fIbpf_socket\fP should be one of the following: -.INDENT 7.0 -.IP \(bu 2 -\fBstruct bpf_sock_ops\fP for \fBBPF_PROG_TYPE_SOCK_OPS\fP\&. -.IP \(bu 2 -\fBstruct bpf_sock_addr\fP for \fBBPF_CGROUP_INET4_CONNECT\fP, -\fBBPF_CGROUP_INET6_CONNECT\fP and \fBBPF_CGROUP_UNIX_CONNECT\fP\&. -.UNINDENT -.sp -This helper actually implements a subset of \fBgetsockopt()\fP\&.
-It supports the same set of \fIoptname\fPs that is supported by -the \fBbpf_setsockopt\fP() helper. The exceptions are -\fBTCP_BPF_*\fP is \fBbpf_setsockopt\fP() only and -\fBTCP_SAVED_SYN\fP is \fBbpf_getsockopt\fP() only. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_override_return(struct pt_regs *\fP\fIregs\fP\fB, u64\fP \fIrc\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Used for error injection, this helper uses kprobes to override -the return value of the probed function, and to set it to \fIrc\fP\&. -The first argument is the context \fIregs\fP on which the kprobe -works. -.sp -This helper works by setting the PC (program counter) -to an override function which is run in place of the original -probed function. This means the probed function is not run at -all. The replacement function just returns with the required -value. -.sp -This helper has security implications, and thus is subject to -restrictions. It is only available if the kernel was compiled -with the \fBCONFIG_BPF_KPROBE_OVERRIDE\fP configuration -option, and in this case it only works on functions tagged with -\fBALLOW_ERROR_INJECTION\fP in the kernel code. -.sp -Also, the helper is only available for the architectures having -the CONFIG_FUNCTION_ERROR_INJECTION option. As of this writing, -x86 architecture is the only one to support this feature. -.TP -.B Return -0 -.UNINDENT -.TP -.B \fBlong bpf_sock_ops_cb_flags_set(struct bpf_sock_ops *\fP\fIbpf_sock\fP\fB, int\fP \fIargval\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Attempt to set the value of the \fBbpf_sock_ops_cb_flags\fP field -for the full TCP socket associated to \fIbpf_sock_ops\fP to -\fIargval\fP\&. -.sp -The primary use of this field is to determine if there should -be calls to eBPF programs of type -\fBBPF_PROG_TYPE_SOCK_OPS\fP at various points in the TCP -code. 
A program of the same type can change its value, per -connection and as necessary, when the connection is -established. This field is directly accessible for reading, but -this helper must be used for updates in order to return an -error if an eBPF program tries to set a callback that is not -supported in the current kernel. -.sp -\fIargval\fP is a flag array which can combine these flags: -.INDENT 7.0 -.IP \(bu 2 -\fBBPF_SOCK_OPS_RTO_CB_FLAG\fP (retransmission time out) -.IP \(bu 2 -\fBBPF_SOCK_OPS_RETRANS_CB_FLAG\fP (retransmission) -.IP \(bu 2 -\fBBPF_SOCK_OPS_STATE_CB_FLAG\fP (TCP state change) -.IP \(bu 2 -\fBBPF_SOCK_OPS_RTT_CB_FLAG\fP (every RTT) -.UNINDENT -.sp -Therefore, this function can be used to clear a callback flag by -setting the appropriate bit to zero. e.g. to disable the RTO -callback: -.INDENT 7.0 -.TP -.B \fBbpf_sock_ops_cb_flags_set(bpf_sock,\fP -\fBbpf_sock\->bpf_sock_ops_cb_flags & ~BPF_SOCK_OPS_RTO_CB_FLAG)\fP -.UNINDENT -.sp -Here are some examples of where one could call such eBPF -program: -.INDENT 7.0 -.IP \(bu 2 -When RTO fires. -.IP \(bu 2 -When a packet is retransmitted. -.IP \(bu 2 -When the connection terminates. -.IP \(bu 2 -When a packet is sent. -.IP \(bu 2 -When a packet is received. -.UNINDENT -.TP -.B Return -Code \fB\-EINVAL\fP if the socket is not a full TCP socket; -otherwise, a positive number containing the bits that could not -be set is returned (which comes down to 0 if all bits were set -as required). -.UNINDENT -.TP -.B \fBlong bpf_msg_redirect_map(struct sk_msg_buff *\fP\fImsg\fP\fB, struct bpf_map *\fP\fImap\fP\fB, u32\fP \fIkey\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper is used in programs implementing policies at the -socket level. If the message \fImsg\fP is allowed to pass (i.e. if -the verdict eBPF program returns \fBSK_PASS\fP), redirect it to -the socket referenced by \fImap\fP (of type -\fBBPF_MAP_TYPE_SOCKMAP\fP) at index \fIkey\fP\&. 
Both ingress and -egress interfaces can be used for redirection. The -\fBBPF_F_INGRESS\fP value in \fIflags\fP is used to make the -distinction (ingress path is selected if the flag is present, -egress path otherwise). This is the only flag supported for now. -.TP -.B Return -\fBSK_PASS\fP on success, or \fBSK_DROP\fP on error. -.UNINDENT -.TP -.B \fBlong bpf_msg_apply_bytes(struct sk_msg_buff *\fP\fImsg\fP\fB, u32\fP \fIbytes\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -For socket policies, apply the verdict of the eBPF program to -the next \fIbytes\fP (number of bytes) of message \fImsg\fP\&. -.sp -For example, this helper can be used in the following cases: -.INDENT 7.0 -.IP \(bu 2 -A single \fBsendmsg\fP() or \fBsendfile\fP() system call -contains multiple logical messages that the eBPF program is -supposed to read and for which it should apply a verdict. -.IP \(bu 2 -An eBPF program only cares to read the first \fIbytes\fP of a -\fImsg\fP\&. If the message has a large payload, then setting up -and calling the eBPF program repeatedly for all bytes, even -though the verdict is already known, would create unnecessary -overhead. -.UNINDENT -.sp -When called from within an eBPF program, the helper sets a -counter internal to the BPF infrastructure, that is used to -apply the last verdict to the next \fIbytes\fP\&. If \fIbytes\fP is -smaller than the current data being processed from a -\fBsendmsg\fP() or \fBsendfile\fP() system call, the first -\fIbytes\fP will be sent and the eBPF program will be re\-run with -the pointer for start of data pointing to byte number \fIbytes\fP -\fB+ 1\fP\&. If \fIbytes\fP is larger than the current data being -processed, then the eBPF verdict will be applied to multiple -\fBsendmsg\fP() or \fBsendfile\fP() calls until \fIbytes\fP are -consumed. 
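The counter behaviour described above can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code: `struct verdict_state`, `run_prog()` and `feed()` are hypothetical names, and `want` stands for the value a verdict program would pass as bytes to bpf_msg_apply_bytes().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the internal counter, not a kernel structure. */
struct verdict_state {
    size_t apply_bytes;   /* bytes still covered by the last verdict */
    unsigned prog_runs;   /* how many times the modelled program ran */
};

/* Stand-in for the verdict program: it asks for `want` bytes to be covered. */
void run_prog(struct verdict_state *st, size_t want)
{
    assert(st != NULL);
    st->prog_runs++;
    st->apply_bytes = want;
}

/* Model one sendmsg()-sized chunk of `len` bytes passing through. */
void feed(struct verdict_state *st, size_t len, size_t want)
{
    assert(st != NULL);
    while (len > 0) {
        if (st->apply_bytes == 0)
            run_prog(st, want);          /* counter exhausted: program runs again */
        size_t n = len;
        if (st->apply_bytes != 0) {      /* a verdict covers only part of the data */
            if (st->apply_bytes < n)
                n = st->apply_bytes;
            st->apply_bytes -= n;
        }
        len -= n;                        /* these bytes are sent with the verdict */
    }
}
```

In this model, a 100-byte chunk with `want` set to 30 makes the program run four times and leaves 20 bytes of verdict that carry over to the next chunk.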
-.sp -Note that if a socket closes with the internal counter holding -a non\-zero value, this is not a problem because data is not -being buffered for \fIbytes\fP and is sent as it is received. -.TP -.B Return -0 -.UNINDENT -.TP -.B \fBlong bpf_msg_cork_bytes(struct sk_msg_buff *\fP\fImsg\fP\fB, u32\fP \fIbytes\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -For socket policies, prevent the execution of the verdict eBPF -program for message \fImsg\fP until \fIbytes\fP (number of bytes) have been -accumulated. -.sp -This can be used when one needs a specific number of bytes -before a verdict can be assigned, even if the data spans -multiple \fBsendmsg\fP() or \fBsendfile\fP() calls. The extreme -case would be a user calling \fBsendmsg\fP() repeatedly with -1\-byte long message segments. Obviously, this is bad for -performance, but it is still valid. If the eBPF program needs -\fIbytes\fP bytes to validate a header, this helper can be used to -prevent the eBPF program from being called again until \fIbytes\fP have -been accumulated. -.TP -.B Return -0 -.UNINDENT -.TP -.B \fBlong bpf_msg_pull_data(struct sk_msg_buff *\fP\fImsg\fP\fB, u32\fP \fIstart\fP\fB, u32\fP \fIend\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -For socket policies, pull in non\-linear data from user space -for \fImsg\fP and set pointers \fImsg\fP\fB\->data\fP and \fImsg\fP\fB\->data_end\fP to \fIstart\fP and \fIend\fP byte offsets into \fImsg\fP, -respectively. -.sp -If a program of type \fBBPF_PROG_TYPE_SK_MSG\fP is run on a -\fImsg\fP it can only parse data that the (\fBdata\fP, \fBdata_end\fP) -pointers have already consumed. For \fBsendmsg\fP() hooks this -is likely the first scatterlist element. But for calls relying -on the \fBsendpage\fP handler (e.g. \fBsendfile\fP()) this will -be the range (\fB0\fP, \fB0\fP) because the data is shared with -user space and by default the objective is to avoid allowing -user space to modify data while (or after) the eBPF verdict is -being decided. 
This helper can be used to pull in data and to -set the start and end pointers to given values. Data will be -copied if necessary (i.e. if data was not linear and if start -and end pointers do not point to the same chunk). -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.sp -All values for \fIflags\fP are reserved for future usage, and must -be left at zero. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_bind(struct bpf_sock_addr *\fP\fIctx\fP\fB, struct sockaddr *\fP\fIaddr\fP\fB, int\fP \fIaddr_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Bind the socket associated to \fIctx\fP to the address pointed to by -\fIaddr\fP, of length \fIaddr_len\fP\&. This allows for making outgoing -connections from the desired IP address, which can be useful for -example when all processes inside a cgroup should use one -single IP address on a host that has multiple IP addresses configured. -.sp -This helper works for IPv4 and IPv6, TCP and UDP sockets. The -domain (\fIaddr\fP\fB\->sa_family\fP) must be \fBAF_INET\fP (or -\fBAF_INET6\fP). It\(aqs advised to pass a zero port (\fBsin_port\fP -or \fBsin6_port\fP) which triggers IP_BIND_ADDRESS_NO_PORT\-like -behavior and lets the kernel efficiently pick an unused -port as long as the 4\-tuple is unique. Passing a non\-zero port might -lead to degraded performance. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_xdp_adjust_tail(struct xdp_buff *\fP\fIxdp_md\fP\fB, int\fP \fIdelta\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Adjust (move) \fIxdp_md\fP\fB\->data_end\fP by \fIdelta\fP bytes. It is -possible to both shrink and grow the packet tail. -Shrinking is done by passing a negative integer as \fIdelta\fP\&. 
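The tail adjustment can be pictured with a small user-space model that tracks offsets instead of pointers. This is a sketch under assumptions, not the kernel implementation: `struct pkt_model`, `adjust_tail_model()`, the `hard_end` limit and the `min_len` lower bound are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a packet buffer, using offsets into one frame. */
struct pkt_model {
    size_t data;      /* offset of the first packet byte */
    size_t data_end;  /* offset just past the last packet byte */
    size_t hard_end;  /* assumed upper limit of the frame */
};

/* Move data_end by delta: negative shrinks the tail, positive grows it.
 * min_len is an assumed minimum packet length that must remain. */
int adjust_tail_model(struct pkt_model *p, int delta, size_t min_len)
{
    assert(p != NULL);
    long long new_end = (long long)p->data_end + delta;

    if (new_end > (long long)p->hard_end)
        return -EINVAL;               /* would grow past the frame */
    if (new_end < (long long)(p->data + min_len))
        return -EINVAL;               /* would shrink below the minimum */
    p->data_end = (size_t)new_end;
    return 0;
}
```

After a successful call in a real program, any previously derived packet pointers need to be recomputed and bounds-checked again before use.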
-.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_get_xfrm_state(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIindex\fP\fB, struct bpf_xfrm_state *\fP\fIxfrm_state\fP\fB, u32\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Retrieve the XFRM state (IP transform framework, see also -\fBip\-xfrm(8)\fP) at \fIindex\fP in XFRM \(dqsecurity path\(dq for \fIskb\fP\&. -.sp -The retrieved value is stored in the \fBstruct bpf_xfrm_state\fP -pointed by \fIxfrm_state\fP and of length \fIsize\fP\&. -.sp -All values for \fIflags\fP are reserved for future usage, and must -be left at zero. -.sp -This helper is available only if the kernel was compiled with -\fBCONFIG_XFRM\fP configuration option. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_get_stack(void *\fP\fIctx\fP\fB, void *\fP\fIbuf\fP\fB, u32\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Return a user or a kernel stack in bpf program provided buffer. -To achieve this, the helper needs \fIctx\fP, which is a pointer -to the context on which the tracing program is executed. -To store the stacktrace, the bpf program provides \fIbuf\fP with -a nonnegative \fIsize\fP\&. -.sp -The last argument, \fIflags\fP, holds the number of stack frames to -skip (from 0 to 255), masked with -\fBBPF_F_SKIP_FIELD_MASK\fP\&. The next bits can be used to set -the following flags: -.INDENT 7.0 -.TP -.B \fBBPF_F_USER_STACK\fP -Collect a user space stack instead of a kernel stack. 
-.TP -.B \fBBPF_F_USER_BUILD_ID\fP -Collect (build_id, file_offset) instead of IPs for the user -stack; only valid if \fBBPF_F_USER_STACK\fP is also -specified. -.sp -\fIfile_offset\fP is an offset relative to the beginning -of the executable or shared object file backing the vma -which the \fIip\fP falls in. It is \fInot\fP an offset relative -to that object\(aqs base address. Accordingly, it must be -adjusted by adding (sh_addr \- sh_offset), where -sh_{addr,offset} correspond to the executable section -containing \fIfile_offset\fP in the object, for comparisons -to symbols\(aq st_value to be valid. -.UNINDENT -.sp -\fBbpf_get_stack\fP() can collect up to -\fBPERF_MAX_STACK_DEPTH\fP kernel and user frames, subject -to a sufficiently large buffer size. Note that -this limit can be controlled with the \fBsysctl\fP program, and -that it should be manually increased in order to profile long -user stacks (such as stacks for Java programs). To do so, use: -.INDENT 7.0 -.INDENT 3.5 -.sp -.EX -# sysctl kernel.perf_event_max_stack=<new value> -.EE -.UNINDENT -.UNINDENT -.TP -.B Return -The non\-negative copied \fIbuf\fP length equal to or less than -\fIsize\fP on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_load_bytes_relative(const void *\fP\fIskb\fP\fB, u32\fP \fIoffset\fP\fB, void *\fP\fIto\fP\fB, u32\fP \fIlen\fP\fB, u32\fP \fIstart_header\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper is similar to \fBbpf_skb_load_bytes\fP() in that -it provides an easy way to load \fIlen\fP bytes from \fIoffset\fP -from the packet associated to \fIskb\fP, into the buffer pointed -to by \fIto\fP\&. The difference from \fBbpf_skb_load_bytes\fP() is that -a fifth argument \fIstart_header\fP exists in order to select a -base offset to start from. \fIstart_header\fP can be one of: -.INDENT 7.0 -.TP -.B \fBBPF_HDR_START_MAC\fP -Base offset to load data from is \fIskb\fP\(aqs mac header. 
-.TP -.B \fBBPF_HDR_START_NET\fP -Base offset to load data from is \fIskb\fP\(aqs network header. -.UNINDENT -.sp -In general, \(dqdirect packet access\(dq is the preferred method to -access packet data; however, this helper is particularly useful -in socket filters where \fIskb\fP\fB\->data\fP does not always point -to the start of the mac header and where \(dqdirect packet access\(dq -is not available. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_fib_lookup(void *\fP\fIctx\fP\fB, struct bpf_fib_lookup *\fP\fIparams\fP\fB, int\fP \fIplen\fP\fB, u32\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Do a FIB lookup in kernel tables using parameters in \fIparams\fP\&. -If the lookup is successful and the result shows the packet is to be -forwarded, the neighbor tables are searched for the nexthop. -If successful (i.e., FIB lookup shows forwarding and nexthop -is resolved), the nexthop address is returned in ipv4_dst -or ipv6_dst based on family, smac is set to mac address of -egress device, dmac is set to nexthop mac address, rt_metric -is set to metric from route (IPv4/IPv6 only), and ifindex -is set to the device index of the nexthop from the FIB lookup. -.sp -\fIplen\fP argument is the size of the passed\-in struct. -\fIflags\fP argument can be a combination of one or more of the -following values: -.INDENT 7.0 -.TP -.B \fBBPF_FIB_LOOKUP_DIRECT\fP -Do a direct table lookup instead of a full lookup using FIB -rules. -.TP -.B \fBBPF_FIB_LOOKUP_TBID\fP -Used with BPF_FIB_LOOKUP_DIRECT. -Use the routing table ID present in \fIparams\fP\->tbid -for the fib lookup. -.TP -.B \fBBPF_FIB_LOOKUP_OUTPUT\fP -Perform lookup from an egress perspective (default is -ingress). -.TP -.B \fBBPF_FIB_LOOKUP_SKIP_NEIGH\fP -Skip the neighbour table lookup. \fIparams\fP\->dmac -and \fIparams\fP\->smac will not be set as output. A common -use case is to call \fBbpf_redirect_neigh\fP() after -doing \fBbpf_fib_lookup\fP(). 
-.TP -.B \fBBPF_FIB_LOOKUP_SRC\fP -Derive and set source IP address in \fIparams\fP\->ipv{4,6}_src -for the nexthop. If the source address cannot be derived, -\fBBPF_FIB_LKUP_RET_NO_SRC_ADDR\fP is returned. In this -case, \fIparams\fP\->dmac and \fIparams\fP\->smac are not set either. -.UNINDENT -.sp -\fIctx\fP is either \fBstruct xdp_md\fP for XDP programs or -\fBstruct sk_buff\fP for tc cls_act programs. -.TP -.B Return -.INDENT 7.0 -.IP \(bu 2 -< 0 if any input argument is invalid -.IP \(bu 2 -0 on success (packet is forwarded, nexthop neighbor exists) -.IP \(bu 2 -> 0 one of \fBBPF_FIB_LKUP_RET_\fP codes explaining why the -packet is not forwarded or needs assistance from the full stack -.UNINDENT -.sp -If the lookup fails with BPF_FIB_LKUP_RET_FRAG_NEEDED, then the MTU -was exceeded and output params\->mtu_result contains the MTU. -.UNINDENT -.TP -.B \fBlong bpf_sock_hash_update(struct bpf_sock_ops *\fP\fIskops\fP\fB, struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIkey\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Add an entry to, or update, a sockhash \fImap\fP referencing sockets. -The \fIskops\fP is used as a new value for the entry associated to -\fIkey\fP\&. \fIflags\fP is one of: -.INDENT 7.0 -.TP -.B \fBBPF_NOEXIST\fP -The entry for \fIkey\fP must not exist in the map. -.TP -.B \fBBPF_EXIST\fP -The entry for \fIkey\fP must already exist in the map. -.TP -.B \fBBPF_ANY\fP -No condition on the existence of the entry for \fIkey\fP\&. -.UNINDENT -.sp -If the \fImap\fP has eBPF programs (parser and verdict), those will -be inherited by the socket being added. If the socket is -already attached to eBPF programs, this results in an error. -.TP -.B Return -0 on success, or a negative error in case of failure. 
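The three existence conditions listed for flags can be summarised as a small decision function. This is a user-space sketch, not the kernel's code: the `MODEL_*` constants and `check_update_flags()` are hypothetical names, with values chosen to mirror BPF_ANY, BPF_NOEXIST and BPF_EXIST, and the specific error codes are an assumption based on common map update behaviour.

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-ins for BPF_ANY, BPF_NOEXIST and BPF_EXIST. */
enum {
    MODEL_ANY = 0,      /* no condition on the entry */
    MODEL_NOEXIST = 1,  /* entry must not exist yet */
    MODEL_EXIST = 2     /* entry must already exist */
};

/* Decide whether an update may proceed, given whether key is present. */
int check_update_flags(bool present, unsigned long long flags)
{
    if (flags > MODEL_EXIST)
        return -EINVAL;                /* unknown flag value */
    if (flags == MODEL_NOEXIST && present)
        return -EEXIST;                /* caller required a fresh entry */
    if (flags == MODEL_EXIST && !present)
        return -ENOENT;                /* caller required an existing entry */
    return 0;
}
```

A program that only wants to register a socket once would pass the "must not exist" flag and treat the resulting error as "already tracked".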
-.UNINDENT -.TP -.B \fBlong bpf_msg_redirect_hash(struct sk_msg_buff *\fP\fImsg\fP\fB, struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIkey\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper is used in programs implementing policies at the -socket level. If the message \fImsg\fP is allowed to pass (i.e. if -the verdict eBPF program returns \fBSK_PASS\fP), redirect it to -the socket referenced by \fImap\fP (of type -\fBBPF_MAP_TYPE_SOCKHASH\fP) using hash \fIkey\fP\&. Both ingress and -egress interfaces can be used for redirection. The -\fBBPF_F_INGRESS\fP value in \fIflags\fP is used to make the -distinction (ingress path is selected if the flag is present, -egress path otherwise). This is the only flag supported for now. -.TP -.B Return -\fBSK_PASS\fP on success, or \fBSK_DROP\fP on error. -.UNINDENT -.TP -.B \fBlong bpf_sk_redirect_hash(struct sk_buff *\fP\fIskb\fP\fB, struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIkey\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper is used in programs implementing policies at the -skb socket level. If the sk_buff \fIskb\fP is allowed to pass (i.e. -if the verdict eBPF program returns \fBSK_PASS\fP), redirect it -to the socket referenced by \fImap\fP (of type -\fBBPF_MAP_TYPE_SOCKHASH\fP) using hash \fIkey\fP\&. Both ingress and -egress interfaces can be used for redirection. The -\fBBPF_F_INGRESS\fP value in \fIflags\fP is used to make the -distinction (ingress path is selected if the flag is present, -egress otherwise). This is the only flag supported for now. -.TP -.B Return -\fBSK_PASS\fP on success, or \fBSK_DROP\fP on error. -.UNINDENT -.TP -.B \fBlong bpf_lwt_push_encap(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fItype\fP\fB, void *\fP\fIhdr\fP\fB, u32\fP \fIlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Encapsulate the packet associated to \fIskb\fP within a Layer 3 -protocol header. 
This header is provided in the buffer at -address \fIhdr\fP, with \fIlen\fP its size in bytes. \fItype\fP indicates -the protocol of the header and can be one of: -.INDENT 7.0 -.TP -.B \fBBPF_LWT_ENCAP_SEG6\fP -IPv6 encapsulation with Segment Routing Header -(\fBstruct ipv6_sr_hdr\fP). \fIhdr\fP only contains the SRH, -the IPv6 header is computed by the kernel. -.TP -.B \fBBPF_LWT_ENCAP_SEG6_INLINE\fP -Only works if \fIskb\fP contains an IPv6 packet. Insert a -Segment Routing Header (\fBstruct ipv6_sr_hdr\fP) inside -the IPv6 header. -.TP -.B \fBBPF_LWT_ENCAP_IP\fP -IP encapsulation (GRE/GUE/IPIP/etc). The outer header -must be IPv4 or IPv6, followed by zero or more -additional headers, up to \fBLWT_BPF_MAX_HEADROOM\fP -total bytes in all prepended headers. Please note that -if \fBskb_is_gso\fP(\fIskb\fP) is true, no more than two -headers can be prepended, and the inner header, if -present, should be either GRE or UDP/GUE. -.UNINDENT -.sp -\fBBPF_LWT_ENCAP_SEG6\fP* types can be called by BPF programs -of type \fBBPF_PROG_TYPE_LWT_IN\fP; \fBBPF_LWT_ENCAP_IP\fP type can -be called by bpf programs of types \fBBPF_PROG_TYPE_LWT_IN\fP and -\fBBPF_PROG_TYPE_LWT_XMIT\fP\&. -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_lwt_seg6_store_bytes(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIoffset\fP\fB, const void *\fP\fIfrom\fP\fB, u32\fP \fIlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Store \fIlen\fP bytes from address \fIfrom\fP into the packet -associated to \fIskb\fP, at \fIoffset\fP\&. Only the flags, tag and TLVs -inside the outermost IPv6 Segment Routing Header can be -modified through this helper. 
-.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_lwt_seg6_adjust_srh(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIoffset\fP\fB, s32\fP \fIdelta\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Adjust the size allocated to TLVs in the outermost IPv6 -Segment Routing Header contained in the packet associated to -\fIskb\fP, at position \fIoffset\fP by \fIdelta\fP bytes. Only offsets -after the segments are accepted. \fIdelta\fP can be either -positive (growing) or negative (shrinking). -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_lwt_seg6_action(struct sk_buff *\fP\fIskb\fP\fB, u32\fP \fIaction\fP\fB, void *\fP\fIparam\fP\fB, u32\fP \fIparam_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Apply an IPv6 Segment Routing action of type \fIaction\fP to the -packet associated to \fIskb\fP\&. Each action takes a parameter -contained at address \fIparam\fP, and of length \fIparam_len\fP bytes. -\fIaction\fP can be one of: -.INDENT 7.0 -.TP -.B \fBSEG6_LOCAL_ACTION_END_X\fP -End.X action: Endpoint with Layer\-3 cross\-connect. -Type of \fIparam\fP: \fBstruct in6_addr\fP\&. -.TP -.B \fBSEG6_LOCAL_ACTION_END_T\fP -End.T action: Endpoint with specific IPv6 table lookup. -Type of \fIparam\fP: \fBint\fP\&. -.TP -.B \fBSEG6_LOCAL_ACTION_END_B6\fP -End.B6 action: Endpoint bound to an SRv6 policy. 
-Type of \fIparam\fP: \fBstruct ipv6_sr_hdr\fP\&. -.TP -.B \fBSEG6_LOCAL_ACTION_END_B6_ENCAP\fP -End.B6.Encap action: Endpoint bound to an SRv6 -encapsulation policy. -Type of \fIparam\fP: \fBstruct ipv6_sr_hdr\fP\&. -.UNINDENT -.sp -A call to this helper is susceptible to change the underlying -packet buffer. Therefore, at load time, all checks on pointers -previously done by the verifier are invalidated and must be -performed again, if the helper is used in combination with -direct packet access. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_rc_repeat(void *\fP\fIctx\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper is used in programs implementing IR decoding, to -report a successfully decoded repeat key message. This delays -the generation of a key up event for the previously generated -key down event. -.sp -Some IR protocols like NEC have a special IR message for -repeating the last button, for when a button is held down. -.sp -The \fIctx\fP should point to the lirc sample as passed into -the program. -.sp -This helper is only available if the kernel was compiled with -the \fBCONFIG_BPF_LIRC_MODE2\fP configuration option set to -\(dq\fBy\fP\(dq. -.TP -.B Return -0 -.UNINDENT -.TP -.B \fBlong bpf_rc_keydown(void *\fP\fIctx\fP\fB, u32\fP \fIprotocol\fP\fB, u64\fP \fIscancode\fP\fB, u32\fP \fItoggle\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper is used in programs implementing IR decoding, to -report a successfully decoded key press with \fIscancode\fP, -\fItoggle\fP value in the given \fIprotocol\fP\&. The scancode will be -translated to a keycode using the rc keymap, and reported as -an input key down event. After a period a key up event is -generated. This period can be extended by calling either -\fBbpf_rc_keydown\fP() again with the same values, or calling -\fBbpf_rc_repeat\fP(). 
-.sp -Some protocols include a toggle bit, in case the button was -released and pressed again between consecutive scancodes. -.sp -The \fIctx\fP should point to the lirc sample as passed into -the program. -.sp -The \fIprotocol\fP is the decoded protocol number (see -\fBenum rc_proto\fP for some predefined values). -.sp -This helper is only available if the kernel was compiled with -the \fBCONFIG_BPF_LIRC_MODE2\fP configuration option set to -\(dq\fBy\fP\(dq. -.TP -.B Return -0 -.UNINDENT -.TP -.B \fBu64 bpf_skb_cgroup_id(struct sk_buff *\fP\fIskb\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Return the cgroup v2 id of the socket associated with the \fIskb\fP\&. -This is roughly similar to the \fBbpf_get_cgroup_classid\fP() -helper for cgroup v1 by providing a tag or identifier that -can be matched on or used for map lookups, e.g. to implement -policy. The cgroup v2 id of a given path in the hierarchy is -exposed in user space through the f_handle API in order to get -to the same 64\-bit id. -.sp -This helper can be used on the TC egress path, but not on ingress, -and is available only if the kernel was compiled with the -\fBCONFIG_SOCK_CGROUP_DATA\fP configuration option. -.TP -.B Return -The id is returned or 0 in case the id could not be retrieved. -.UNINDENT -.TP -.B \fBu64 bpf_get_current_cgroup_id(void)\fP -.INDENT 7.0 -.TP -.B Description -Get the current cgroup id based on the cgroup within which -the current task is running. -.TP -.B Return -A 64\-bit integer containing the current cgroup id based -on the cgroup within which the current task is running. -.UNINDENT -.TP -.B \fBvoid *bpf_get_local_storage(void *\fP\fImap\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get the pointer to the local storage area. -The type and the size of the local storage are defined -by the \fImap\fP argument. -The meaning of \fIflags\fP is specific to each map type, -and it has to be 0 for cgroup local storage. 
-.sp -Depending on the BPF program type, a local storage area -can be shared between multiple instances of the BPF program, -running simultaneously. -.sp -Users must take care of synchronization themselves, -for example by using the \fBBPF_ATOMIC\fP instructions to alter -the shared data. -.TP -.B Return -A pointer to the local storage area. -.UNINDENT -.TP -.B \fBlong bpf_sk_select_reuseport(struct sk_reuseport_md *\fP\fIreuse\fP\fB, struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIkey\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Select a \fBSO_REUSEPORT\fP socket from a -\fBBPF_MAP_TYPE_REUSEPORT_SOCKARRAY\fP \fImap\fP\&. -It checks that the selected socket matches the incoming -request in the socket buffer. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBu64 bpf_skb_ancestor_cgroup_id(struct sk_buff *\fP\fIskb\fP\fB, int\fP \fIancestor_level\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Return the id of the cgroup v2 that is the ancestor of the cgroup associated -with the \fIskb\fP at the \fIancestor_level\fP\&. The root cgroup is at -\fIancestor_level\fP zero and each step down the hierarchy -increments the level. If \fIancestor_level\fP == level of cgroup -associated with \fIskb\fP, then return value will be same as that -of \fBbpf_skb_cgroup_id\fP(). -.sp -The helper is useful to implement policies based on cgroups -that are higher in the hierarchy than the immediate cgroup associated -with \fIskb\fP\&. -.sp -The format of returned id and helper limitations are same as in -\fBbpf_skb_cgroup_id\fP(). -.TP -.B Return -The id is returned or 0 in case the id could not be retrieved. -.UNINDENT -.TP -.B \fBstruct bpf_sock *bpf_sk_lookup_tcp(void *\fP\fIctx\fP\fB, struct bpf_sock_tuple *\fP\fItuple\fP\fB, u32\fP \fItuple_size\fP\fB, u64\fP \fInetns\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Look for TCP socket matching \fItuple\fP, optionally in a child -network namespace \fInetns\fP\&. 
The return value must be checked, -and if non\-\fBNULL\fP, released via \fBbpf_sk_release\fP(). -.sp -The \fIctx\fP should point to the context of the program, such as -the skb or socket (depending on the hook in use). This is used -to determine the base network namespace for the lookup. -.sp -\fItuple_size\fP must be one of: -.INDENT 7.0 -.TP -.B \fBsizeof\fP(\fItuple\fP\fB\->ipv4\fP) -Look for an IPv4 socket. -.TP -.B \fBsizeof\fP(\fItuple\fP\fB\->ipv6\fP) -Look for an IPv6 socket. -.UNINDENT -.sp -If the \fInetns\fP is a negative signed 32\-bit integer, then the -socket lookup table in the netns associated with the \fIctx\fP -will be used. For the TC hooks, this is the netns of the device -in the skb. For socket hooks, this is the netns of the socket. -If \fInetns\fP is any other signed 32\-bit value greater than or -equal to zero then it specifies the ID of the netns relative to -the netns associated with the \fIctx\fP\&. \fInetns\fP values beyond the -range of 32\-bit integers are reserved for future use. -.sp -All values for \fIflags\fP are reserved for future usage, and must -be left at zero. -.sp -This helper is available only if the kernel was compiled with -\fBCONFIG_NET\fP configuration option. -.TP -.B Return -Pointer to \fBstruct bpf_sock\fP, or \fBNULL\fP in case of failure. -For sockets with reuseport option, the \fBstruct bpf_sock\fP -result is from \fIreuse\fP\fB\->socks\fP[] using the hash of the -tuple. -.UNINDENT -.TP -.B \fBstruct bpf_sock *bpf_sk_lookup_udp(void *\fP\fIctx\fP\fB, struct bpf_sock_tuple *\fP\fItuple\fP\fB, u32\fP \fItuple_size\fP\fB, u64\fP \fInetns\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Look for UDP socket matching \fItuple\fP, optionally in a child -network namespace \fInetns\fP\&. The return value must be checked, -and if non\-\fBNULL\fP, released via \fBbpf_sk_release\fP(). -.sp -The \fIctx\fP should point to the context of the program, such as -the skb or socket (depending on the hook in use). 
This is used -to determine the base network namespace for the lookup. -.sp -\fItuple_size\fP must be one of: -.INDENT 7.0 -.TP -.B \fBsizeof\fP(\fItuple\fP\fB\->ipv4\fP) -Look for an IPv4 socket. -.TP -.B \fBsizeof\fP(\fItuple\fP\fB\->ipv6\fP) -Look for an IPv6 socket. -.UNINDENT -.sp -If the \fInetns\fP is a negative signed 32\-bit integer, then the -socket lookup table in the netns associated with the \fIctx\fP -will be used. For the TC hooks, this is the netns of the device -in the skb. For socket hooks, this is the netns of the socket. -If \fInetns\fP is any other signed 32\-bit value greater than or -equal to zero then it specifies the ID of the netns relative to -the netns associated with the \fIctx\fP\&. \fInetns\fP values beyond the -range of 32\-bit integers are reserved for future use. -.sp -All values for \fIflags\fP are reserved for future usage, and must -be left at zero. -.sp -This helper is available only if the kernel was compiled with -\fBCONFIG_NET\fP configuration option. -.TP -.B Return -Pointer to \fBstruct bpf_sock\fP, or \fBNULL\fP in case of failure. -For sockets with reuseport option, the \fBstruct bpf_sock\fP -result is from \fIreuse\fP\fB\->socks\fP[] using the hash of the -tuple. -.UNINDENT -.TP -.B \fBlong bpf_sk_release(void *\fP\fIsock\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Release the reference held by \fIsock\fP\&. \fIsock\fP must be a -non\-\fBNULL\fP pointer that was returned from -\fBbpf_sk_lookup_xxx\fP(). -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_map_push_elem(struct bpf_map *\fP\fImap\fP\fB, const void *\fP\fIvalue\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Push an element \fIvalue\fP in \fImap\fP\&. \fIflags\fP is one of: -.INDENT 7.0 -.TP -.B \fBBPF_EXIST\fP -If the queue/stack is full, the oldest element is -removed to make room for this. -.UNINDENT -.TP -.B Return -0 on success, or a negative error in case of failure. 
-.UNINDENT -.TP -.B \fBlong bpf_map_pop_elem(struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIvalue\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Pop an element from \fImap\fP\&. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_map_peek_elem(struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIvalue\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get an element from \fImap\fP without removing it. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_msg_push_data(struct sk_msg_buff *\fP\fImsg\fP\fB, u32\fP \fIstart\fP\fB, u32\fP \fIlen\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -For socket policies, insert \fIlen\fP bytes into \fImsg\fP at offset -\fIstart\fP\&. -.sp -If a program of type \fBBPF_PROG_TYPE_SK_MSG\fP is run on a -\fImsg\fP it may want to insert metadata or options into the \fImsg\fP\&. -This can later be read and used by any of the lower layer BPF -hooks. -.sp -This helper may fail under memory pressure (when an allocation -fails); in these cases BPF programs will get an appropriate -error and will need to handle it. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_msg_pop_data(struct sk_msg_buff *\fP\fImsg\fP\fB, u32\fP \fIstart\fP\fB, u32\fP \fIlen\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Remove \fIlen\fP bytes from a \fImsg\fP starting at byte \fIstart\fP\&. -This may result in \fBENOMEM\fP errors under certain situations if -an allocation and copy are required due to a full ring buffer. -However, the helper will try to avoid doing the allocation -if possible. Other errors can occur if input parameters are -invalid, either due to the \fIstart\fP byte not being a valid part of the \fImsg\fP -payload and/or the \fIlen\fP value being too large. -.TP -.B Return -0 on success, or a negative error in case of failure. 
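The argument checks described for removing bytes from a message can be sketched on a flat byte buffer. This is a simplified user-space model under assumptions: the real helper works on scatterlist-backed messages, and `pop_data_model()` and its parameters are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Remove len bytes starting at offset start from buf, which currently
 * holds *size valid bytes. Rejects a start outside the payload and a
 * len that reaches past its end. */
int pop_data_model(unsigned char *buf, size_t *size, size_t start, size_t len)
{
    assert(buf != NULL && size != NULL);
    if (start >= *size || len > *size - start)
        return -EINVAL;
    /* Close the gap by moving the trailing bytes down. */
    memmove(buf + start, buf + start + len, *size - start - len);
    *size -= len;
    return 0;
}
```

For an 8-byte payload "abcdefgh", removing 3 bytes at offset 2 leaves the 5 bytes "abfgh" in this model.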
-.UNINDENT -.TP -.B \fBlong bpf_rc_pointer_rel(void *\fP\fIctx\fP\fB, s32\fP \fIrel_x\fP\fB, s32\fP \fIrel_y\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper is used in programs implementing IR decoding, to -report a successfully decoded pointer movement. -.sp -The \fIctx\fP should point to the lirc sample as passed into -the program. -.sp -This helper is only available if the kernel was compiled with -the \fBCONFIG_BPF_LIRC_MODE2\fP configuration option set to -\(dq\fBy\fP\(dq. -.TP -.B Return -0 -.UNINDENT -.TP -.B \fBlong bpf_spin_lock(struct bpf_spin_lock *\fP\fIlock\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Acquire a spinlock represented by the pointer \fIlock\fP, which is -stored as part of a value of a map. Taking the lock makes it possible to -safely update the rest of the fields in that value. The -spinlock can (and must) later be released with a call to -\fBbpf_spin_unlock\fP(\fIlock\fP). -.sp -Spinlocks in BPF programs come with a number of restrictions -and constraints: -.INDENT 7.0 -.IP \(bu 2 -\fBbpf_spin_lock\fP objects are only allowed inside maps of -types \fBBPF_MAP_TYPE_HASH\fP and \fBBPF_MAP_TYPE_ARRAY\fP (this -list could be extended in the future). -.IP \(bu 2 -BTF description of the map is mandatory. -.IP \(bu 2 -The BPF program can take ONE lock at a time, since taking two -or more could cause deadlocks. -.IP \(bu 2 -Only one \fBstruct bpf_spin_lock\fP is allowed per map element. -.IP \(bu 2 -When the lock is taken, calls (either BPF to BPF or helpers) -are not allowed. -.IP \(bu 2 -The \fBBPF_LD_ABS\fP and \fBBPF_LD_IND\fP instructions are not -allowed inside a spinlock\-ed region. -.IP \(bu 2 -The BPF program MUST call \fBbpf_spin_unlock\fP() to release -the lock, on all execution paths, before it returns. -.IP \(bu 2 -The BPF program can access \fBstruct bpf_spin_lock\fP only via -the \fBbpf_spin_lock\fP() and \fBbpf_spin_unlock\fP() -helpers. 
Loading or storing data into the \fBstruct -bpf_spin_lock\fP \fIlock\fP\fB;\fP field of a map is not allowed. -.IP \(bu 2 -To use the \fBbpf_spin_lock\fP() helper, the BTF description -of the map value must be a struct and have a \fBstruct -bpf_spin_lock\fP \fIanyname\fP\fB;\fP field at the top level. -A lock nested inside another struct is not allowed. -.IP \(bu 2 -The \fBstruct bpf_spin_lock\fP \fIlock\fP field in a map value must -be aligned on a multiple of 4 bytes in that value. -.IP \(bu 2 -Syscall with command \fBBPF_MAP_LOOKUP_ELEM\fP does not copy -the \fBbpf_spin_lock\fP field to user space. -.IP \(bu 2 -Syscall with command \fBBPF_MAP_UPDATE_ELEM\fP, or update from -a BPF program, does not update the \fBbpf_spin_lock\fP field. -.IP \(bu 2 -\fBbpf_spin_lock\fP cannot be on the stack or inside a -networking packet (it can only be inside a map value). -.IP \(bu 2 -\fBbpf_spin_lock\fP is available to root only. -.IP \(bu 2 -Tracing programs and socket filter programs cannot use -\fBbpf_spin_lock\fP() due to insufficient preemption checks -(but this may change in the future). -.IP \(bu 2 -\fBbpf_spin_lock\fP is not allowed in inner maps of map\-in\-map. -.UNINDENT -.TP -.B Return -0 -.UNINDENT -.TP -.B \fBlong bpf_spin_unlock(struct bpf_spin_lock *\fP\fIlock\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Release the \fIlock\fP previously locked by a call to -\fBbpf_spin_lock\fP(\fIlock\fP). -.TP -.B Return -0 -.UNINDENT -.TP -.B \fBstruct bpf_sock *bpf_sk_fullsock(struct bpf_sock *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper gets a \fBstruct bpf_sock\fP pointer such -that all the fields in this \fBbpf_sock\fP can be accessed. -.TP -.B Return -A \fBstruct bpf_sock\fP pointer on success, or \fBNULL\fP in -case of failure. -.UNINDENT -.TP -.B \fBstruct bpf_tcp_sock *bpf_tcp_sock(struct bpf_sock *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper gets a \fBstruct bpf_tcp_sock\fP pointer from a -\fBstruct bpf_sock\fP pointer.
-.TP -.B Return -A \fBstruct bpf_tcp_sock\fP pointer on success, or \fBNULL\fP in -case of failure. -.UNINDENT -.TP -.B \fBlong bpf_skb_ecn_set_ce(struct sk_buff *\fP\fIskb\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Set ECN (Explicit Congestion Notification) field of IP header -to \fBCE\fP (Congestion Encountered) if current value is \fBECT\fP -(ECN Capable Transport). Otherwise, do nothing. Works with IPv6 -and IPv4. -.TP -.B Return -1 if the \fBCE\fP flag is set (either by the current helper call -or because it was already present), 0 if it is not set. -.UNINDENT -.TP -.B \fBstruct bpf_sock *bpf_get_listener_sock(struct bpf_sock *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Return a \fBstruct bpf_sock\fP pointer in \fBTCP_LISTEN\fP state. -\fBbpf_sk_release\fP() is unnecessary and not allowed. -.TP -.B Return -A \fBstruct bpf_sock\fP pointer on success, or \fBNULL\fP in -case of failure. -.UNINDENT -.TP -.B \fBstruct bpf_sock *bpf_skc_lookup_tcp(void *\fP\fIctx\fP\fB, struct bpf_sock_tuple *\fP\fItuple\fP\fB, u32\fP \fItuple_size\fP\fB, u64\fP \fInetns\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Look for TCP socket matching \fItuple\fP, optionally in a child -network namespace \fInetns\fP\&. The return value must be checked, -and if non\-\fBNULL\fP, released via \fBbpf_sk_release\fP(). -.sp -This function is identical to \fBbpf_sk_lookup_tcp\fP(), except -that it also returns timewait or request sockets. Use -\fBbpf_sk_fullsock\fP() or \fBbpf_tcp_sock\fP() to access the -full structure. -.sp -This helper is available only if the kernel was compiled with -\fBCONFIG_NET\fP configuration option. -.TP -.B Return -Pointer to \fBstruct bpf_sock\fP, or \fBNULL\fP in case of failure. -For sockets with reuseport option, the \fBstruct bpf_sock\fP -result is from \fIreuse\fP\fB\->socks\fP[] using the hash of the -tuple. 
-.UNINDENT -.TP -.B \fBlong bpf_tcp_check_syncookie(void *\fP\fIsk\fP\fB, void *\fP\fIiph\fP\fB, u32\fP \fIiph_len\fP\fB, struct tcphdr *\fP\fIth\fP\fB, u32\fP \fIth_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Check whether \fIiph\fP and \fIth\fP contain a valid SYN cookie ACK for -the listening socket in \fIsk\fP\&. -.sp -\fIiph\fP points to the start of the IPv4 or IPv6 header, while -\fIiph_len\fP contains \fBsizeof\fP(\fBstruct iphdr\fP) or -\fBsizeof\fP(\fBstruct ipv6hdr\fP). -.sp -\fIth\fP points to the start of the TCP header, while \fIth_len\fP -contains the length of the TCP header (at least -\fBsizeof\fP(\fBstruct tcphdr\fP)). -.TP -.B Return -0 if \fIiph\fP and \fIth\fP are a valid SYN cookie ACK, or a negative -error otherwise. -.UNINDENT -.TP -.B \fBlong bpf_sysctl_get_name(struct bpf_sysctl *\fP\fIctx\fP\fB, char *\fP\fIbuf\fP\fB, size_t\fP \fIbuf_len\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get the name of the sysctl in /proc/sys/ and copy it into the -program\-provided buffer \fIbuf\fP of size \fIbuf_len\fP\&. -.sp -The buffer is always NUL terminated, unless it\(aqs zero\-sized. -.sp -If \fIflags\fP is zero, the full name (e.g. \(dqnet/ipv4/tcp_mem\(dq) is -copied. Use the \fBBPF_F_SYSCTL_BASE_NAME\fP flag to copy the base name -only (e.g. \(dqtcp_mem\(dq). -.TP -.B Return -Number of characters copied (not including the trailing NUL). -.sp -\fB\-E2BIG\fP if the buffer wasn\(aqt big enough (\fIbuf\fP will contain -the truncated name in this case). -.UNINDENT -.TP -.B \fBlong bpf_sysctl_get_current_value(struct bpf_sysctl *\fP\fIctx\fP\fB, char *\fP\fIbuf\fP\fB, size_t\fP \fIbuf_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get the current value of the sysctl as it is presented in /proc/sys -(incl. newline, etc), and copy it as a string into the -program\-provided buffer \fIbuf\fP of size \fIbuf_len\fP\&. -.sp -The whole value is copied, no matter what file position user -space issued e.g. sys_read at.
-.sp -The buffer is always NUL terminated, unless it\(aqs zero\-sized. -.TP -.B Return -Number of characters copied (not including the trailing NUL). -.sp -\fB\-E2BIG\fP if the buffer wasn\(aqt big enough (\fIbuf\fP will contain -the truncated value in this case). -.sp -\fB\-EINVAL\fP if the current value was unavailable, e.g. because -the sysctl is uninitialized and read returns \-EIO for it. -.UNINDENT -.TP -.B \fBlong bpf_sysctl_get_new_value(struct bpf_sysctl *\fP\fIctx\fP\fB, char *\fP\fIbuf\fP\fB, size_t\fP \fIbuf_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get the new value being written by user space to the sysctl (before -the actual write happens) and copy it as a string into the -program\-provided buffer \fIbuf\fP of size \fIbuf_len\fP\&. -.sp -User space may write the new value at file position > 0. -.sp -The buffer is always NUL terminated, unless it\(aqs zero\-sized. -.TP -.B Return -Number of characters copied (not including the trailing NUL). -.sp -\fB\-E2BIG\fP if the buffer wasn\(aqt big enough (\fIbuf\fP will contain -the truncated value in this case). -.sp -\fB\-EINVAL\fP if the sysctl is being read. -.UNINDENT -.TP -.B \fBlong bpf_sysctl_set_new_value(struct bpf_sysctl *\fP\fIctx\fP\fB, const char *\fP\fIbuf\fP\fB, size_t\fP \fIbuf_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Override the new value being written by user space to the sysctl with -the value provided by the program in buffer \fIbuf\fP of size \fIbuf_len\fP\&. -.sp -\fIbuf\fP should contain a string in the same form as provided by user -space on sysctl write. -.sp -User space may write the new value at file position > 0. To override -the whole sysctl value, the file position should be set to zero. -.TP -.B Return -0 on success. -.sp -\fB\-E2BIG\fP if the \fIbuf_len\fP is too big. -.sp -\fB\-EINVAL\fP if the sysctl is being read.
-.UNINDENT -.TP -.B \fBlong bpf_strtol(const char *\fP\fIbuf\fP\fB, size_t\fP \fIbuf_len\fP\fB, u64\fP \fIflags\fP\fB, long *\fP\fIres\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Convert the initial part of the string from buffer \fIbuf\fP of -size \fIbuf_len\fP to a long integer according to the given base -and save the result in \fIres\fP\&. -.sp -The string may begin with an arbitrary amount of white space -(as determined by \fBisspace\fP(3)) followed by a single -optional \(aq\fB\-\fP\(aq sign. -.sp -Five least significant bits of \fIflags\fP encode base, other bits -are currently unused. -.sp -Base must be either 8, 10, 16 or 0 to detect it automatically -similar to user space \fBstrtol\fP(3). -.TP -.B Return -Number of characters consumed on success. Must be positive but -no more than \fIbuf_len\fP\&. -.sp -\fB\-EINVAL\fP if no valid digits were found or unsupported base -was provided. -.sp -\fB\-ERANGE\fP if resulting value was out of range. -.UNINDENT -.TP -.B \fBlong bpf_strtoul(const char *\fP\fIbuf\fP\fB, size_t\fP \fIbuf_len\fP\fB, u64\fP \fIflags\fP\fB, unsigned long *\fP\fIres\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Convert the initial part of the string from buffer \fIbuf\fP of -size \fIbuf_len\fP to an unsigned long integer according to the -given base and save the result in \fIres\fP\&. -.sp -The string may begin with an arbitrary amount of white space -(as determined by \fBisspace\fP(3)). -.sp -Five least significant bits of \fIflags\fP encode base, other bits -are currently unused. -.sp -Base must be either 8, 10, 16 or 0 to detect it automatically -similar to user space \fBstrtoul\fP(3). -.TP -.B Return -Number of characters consumed on success. Must be positive but -no more than \fIbuf_len\fP\&. -.sp -\fB\-EINVAL\fP if no valid digits were found or unsupported base -was provided. -.sp -\fB\-ERANGE\fP if resulting value was out of range. 
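The way \fIflags\fP carries the base for \fBbpf_strtol\fP() and \fBbpf_strtoul\fP() can be illustrated in plain C. This is only a sketch under stated assumptions: the macro name STRTOX_BASE_MASK and the function strtox_base_from_flags() are hypothetical names chosen for the example, and treating any set upper bit as an error is a conservative assumption based on the text calling those bits currently unused.

```c
#include <stdint.h>

/* Hypothetical name: mask covering the five least significant bits of flags. */
#define STRTOX_BASE_MASK 0x1fULL

/*
 * Hypothetical helper: extract the base from flags.
 * Returns 0, 8, 10 or 16 for a supported base, or -1 otherwise.
 * Assumption of this sketch: bits outside the mask are rejected,
 * since the text describes them as currently unused.
 */
static int strtox_base_from_flags(uint64_t flags)
{
    int base;

    if (flags & ~STRTOX_BASE_MASK)
        return -1;

    base = (int)(flags & STRTOX_BASE_MASK);
    switch (base) {
    case 0:   /* detect the base automatically */
    case 8:
    case 10:
    case 16:
        return base;
    default:
        return -1;
    }
}
```

In a BPF program the base would simply be passed as the \fIflags\fP argument, for example 10 for decimal or 0 for automatic detection.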
-.UNINDENT -.TP -.B \fBvoid *bpf_sk_storage_get(struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIsk\fP\fB, void *\fP\fIvalue\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get a bpf\-local\-storage from a \fIsk\fP\&. -.sp -Logically, it could be thought of getting the value from -a \fImap\fP with \fIsk\fP as the \fBkey\fP\&. From this -perspective, the usage is not much different from -\fBbpf_map_lookup_elem\fP(\fImap\fP, \fB&\fP\fIsk\fP) except this -helper enforces the key must be a full socket and the map must -be a \fBBPF_MAP_TYPE_SK_STORAGE\fP also. -.sp -Underneath, the value is stored locally at \fIsk\fP instead of -the \fImap\fP\&. The \fImap\fP is used as the bpf\-local\-storage -\(dqtype\(dq. The bpf\-local\-storage \(dqtype\(dq (i.e. the \fImap\fP) is -searched against all bpf\-local\-storages residing at \fIsk\fP\&. -.sp -\fIsk\fP is a kernel \fBstruct sock\fP pointer for LSM program. -\fIsk\fP is a \fBstruct bpf_sock\fP pointer for other program types. -.sp -An optional \fIflags\fP (\fBBPF_SK_STORAGE_GET_F_CREATE\fP) can be -used such that a new bpf\-local\-storage will be -created if one does not exist. \fIvalue\fP can be used -together with \fBBPF_SK_STORAGE_GET_F_CREATE\fP to specify -the initial value of a bpf\-local\-storage. If \fIvalue\fP is -\fBNULL\fP, the new bpf\-local\-storage will be zero initialized. -.TP -.B Return -A bpf\-local\-storage pointer is returned on success. -.sp -\fBNULL\fP if not found or there was an error in adding -a new bpf\-local\-storage. -.UNINDENT -.TP -.B \fBlong bpf_sk_storage_delete(struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Delete a bpf\-local\-storage from a \fIsk\fP\&. -.TP -.B Return -0 on success. -.sp -\fB\-ENOENT\fP if the bpf\-local\-storage cannot be found. -\fB\-EINVAL\fP if sk is not a fullsock (e.g. a request_sock). 
-.UNINDENT -.TP -.B \fBlong bpf_send_signal(u32\fP \fIsig\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Send signal \fIsig\fP to the process of the current task. -The signal may be delivered to any of this process\(aqs threads. -.TP -.B Return -0 on success or successfully queued. -.sp -\fB\-EBUSY\fP if work queue under nmi is full. -.sp -\fB\-EINVAL\fP if \fIsig\fP is invalid. -.sp -\fB\-EPERM\fP if no permission to send the \fIsig\fP\&. -.sp -\fB\-EAGAIN\fP if bpf program can try again. -.UNINDENT -.TP -.B \fBs64 bpf_tcp_gen_syncookie(void *\fP\fIsk\fP\fB, void *\fP\fIiph\fP\fB, u32\fP \fIiph_len\fP\fB, struct tcphdr *\fP\fIth\fP\fB, u32\fP \fIth_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Try to issue a SYN cookie for the packet with corresponding -IP/TCP headers, \fIiph\fP and \fIth\fP, on the listening socket in \fIsk\fP\&. -.sp -\fIiph\fP points to the start of the IPv4 or IPv6 header, while -\fIiph_len\fP contains \fBsizeof\fP(\fBstruct iphdr\fP) or -\fBsizeof\fP(\fBstruct ipv6hdr\fP). -.sp -\fIth\fP points to the start of the TCP header, while \fIth_len\fP -contains the length of the TCP header with options (at least -\fBsizeof\fP(\fBstruct tcphdr\fP)). -.TP -.B Return -On success, the lower 32 bits hold the generated SYN cookie, -followed by 16 bits which hold the MSS value for that cookie; -the top 16 bits are unused.
-.sp -On failure, the returned value is one of the following: -.sp -\fB\-EINVAL\fP SYN cookie cannot be issued due to error -.sp -\fB\-ENOENT\fP SYN cookie should not be issued (no SYN flood) -.sp -\fB\-EOPNOTSUPP\fP kernel configuration does not enable SYN cookies -.sp -\fB\-EPROTONOSUPPORT\fP IP packet version is not 4 or 6 -.UNINDENT -.TP -.B \fBlong bpf_skb_output(void *\fP\fIctx\fP\fB, struct bpf_map *\fP\fImap\fP\fB, u64\fP \fIflags\fP\fB, void *\fP\fIdata\fP\fB, u64\fP \fIsize\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Write raw \fIdata\fP blob into a special BPF perf event held by -\fImap\fP of type \fBBPF_MAP_TYPE_PERF_EVENT_ARRAY\fP\&. This perf -event must have the following attributes: \fBPERF_SAMPLE_RAW\fP -as \fBsample_type\fP, \fBPERF_TYPE_SOFTWARE\fP as \fBtype\fP, and -\fBPERF_COUNT_SW_BPF_OUTPUT\fP as \fBconfig\fP\&. -.sp -The \fIflags\fP are used to indicate the index in \fImap\fP for which -the value must be put, masked with \fBBPF_F_INDEX_MASK\fP\&. -Alternatively, \fIflags\fP can be set to \fBBPF_F_CURRENT_CPU\fP -to indicate that the index of the current CPU core should be -used. -.sp -The value to write, of \fIsize\fP, is passed through eBPF stack and -pointed by \fIdata\fP\&. -.sp -\fIctx\fP is a pointer to in\-kernel struct sk_buff. -.sp -This helper is similar to \fBbpf_perf_event_output\fP() but -restricted to raw_tracepoint bpf programs. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_probe_read_user(void *\fP\fIdst\fP\fB, u32\fP \fIsize\fP\fB, const void *\fP\fIunsafe_ptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Safely attempt to read \fIsize\fP bytes from user space address -\fIunsafe_ptr\fP and store the data in \fIdst\fP\&. -.TP -.B Return -0 on success, or a negative error in case of failure. 
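The return layout described for \fBbpf_tcp_gen_syncookie\fP() earlier (cookie in the lower 32 bits, MSS in the next 16 bits, negative value on failure) can be unpacked as sketched here. The struct syncookie_result type and the decode_syncookie_ret() function are hypothetical names used only for illustration; the code is plain C so that it can be tried outside the kernel.

```c
#include <stdint.h>

/* Hypothetical container for the two fields packed into the return value. */
struct syncookie_result {
    uint32_t cookie; /* generated SYN cookie (lower 32 bits) */
    uint16_t mss;    /* MSS value for that cookie (next 16 bits) */
};

/*
 * Hypothetical decoder for the s64 returned by bpf_tcp_gen_syncookie().
 * Returns 0 and fills *out on success, or passes the negative error through.
 */
static int decode_syncookie_ret(int64_t ret, struct syncookie_result *out)
{
    if (ret < 0)
        return (int)ret;

    out->cookie = (uint32_t)(ret & 0xffffffff);
    out->mss = (uint16_t)((ret >> 32) & 0xffff);
    return 0;
}
```

Checking for a negative value first matters, because the error codes would otherwise be misread as a cookie and MSS pair.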
-.UNINDENT -.TP -.B \fBlong bpf_probe_read_kernel(void *\fP\fIdst\fP\fB, u32\fP \fIsize\fP\fB, const void *\fP\fIunsafe_ptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Safely attempt to read \fIsize\fP bytes from kernel space address -\fIunsafe_ptr\fP and store the data in \fIdst\fP\&. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_probe_read_user_str(void *\fP\fIdst\fP\fB, u32\fP \fIsize\fP\fB, const void *\fP\fIunsafe_ptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Copy a NUL terminated string from an unsafe user address -\fIunsafe_ptr\fP to \fIdst\fP\&. The \fIsize\fP should include the -terminating NUL byte. In case the string length is smaller than -\fIsize\fP, the target is not padded with further NUL bytes. If the -string length is larger than \fIsize\fP, just \fIsize\fP\-1 bytes are -copied and the last byte is set to NUL. -.sp -On success, returns the number of bytes that were written, -including the terminal NUL. This makes this helper useful in -tracing programs for reading strings, and more importantly to -get its length at runtime. See the following snippet: -.INDENT 7.0 -.INDENT 3.5 -.sp -.EX -SEC(\(dqkprobe/sys_open\(dq) -void bpf_sys_open(struct pt_regs *ctx) -{ - char buf[PATHLEN]; // PATHLEN is defined to 256 - int res = bpf_probe_read_user_str(buf, sizeof(buf), - ctx\->di); - - // Consume buf, for example push it to - // userspace via bpf_perf_event_output(); we - // can use res (the string length) as event - // size, after checking its boundaries. -} -.EE -.UNINDENT -.UNINDENT -.sp -In comparison, using \fBbpf_probe_read_user\fP() helper here -instead to read the string would require to estimate the length -at compile time, and would often result in copying more memory -than necessary. 
-.sp -Another useful use case is when parsing individual process -arguments or individual environment variables navigating -\fIcurrent\fP\fB\->mm\->arg_start\fP and \fIcurrent\fP\fB\->mm\->env_start\fP: using this helper and the return value, -one can quickly iterate at the right offset of the memory area. -.TP -.B Return -On success, the strictly positive length of the output string, -including the trailing NUL character. On error, a negative -value. -.UNINDENT -.TP -.B \fBlong bpf_probe_read_kernel_str(void *\fP\fIdst\fP\fB, u32\fP \fIsize\fP\fB, const void *\fP\fIunsafe_ptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Copy a NUL terminated string from an unsafe kernel address \fIunsafe_ptr\fP -to \fIdst\fP\&. Same semantics as with \fBbpf_probe_read_user_str\fP() apply. -.TP -.B Return -On success, the strictly positive length of the string, including -the trailing NUL character. On error, a negative value. -.UNINDENT -.TP -.B \fBlong bpf_tcp_send_ack(void *\fP\fItp\fP\fB, u32\fP \fIrcv_nxt\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Send out a tcp\-ack. \fItp\fP is the in\-kernel struct \fBtcp_sock\fP\&. -\fIrcv_nxt\fP is the ack_seq to be sent out. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_send_signal_thread(u32\fP \fIsig\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Send signal \fIsig\fP to the thread corresponding to the current task. -.TP -.B Return -0 on success or successfully queued. -.sp -\fB\-EBUSY\fP if work queue under nmi is full. -.sp -\fB\-EINVAL\fP if \fIsig\fP is invalid. -.sp -\fB\-EPERM\fP if no permission to send the \fIsig\fP\&. -.sp -\fB\-EAGAIN\fP if bpf program can try again. 
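The size and return\-value rules given for \fBbpf_probe_read_user_str\fP() and \fBbpf_probe_read_kernel_str\fP() can be modelled in ordinary user\-space C. The function model_read_str() is a hypothetical stand\-in written for this illustration, not the kernel implementation: it reads from a normal C string rather than an unsafe address, and its handling of a zero \fIsize\fP (returning \-1) is an assumption of the model.

```c
#include <stddef.h>

/*
 * Hypothetical user-space model of the string helpers' copy semantics.
 * Copies at most size-1 bytes from src, always NUL terminates dst, does not
 * pad the remainder of dst, and returns the number of bytes written
 * including the terminating NUL.
 * Assumption of this model: size == 0 is reported as -1.
 */
static long model_read_str(char *dst, unsigned int size, const char *src)
{
    unsigned int i;

    if (size == 0)
        return -1;

    for (i = 0; i + 1 < size && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';

    return (long)i + 1;
}
```

With a 4\-byte buffer, the model writes 3 bytes for the string "hi" and 4 bytes (a truncated "hel" plus NUL) for "hello", which is why the return value can be used directly as the event size.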
-.UNINDENT -.TP -.B \fBu64 bpf_jiffies64(void)\fP -.INDENT 7.0 -.TP -.B Description -Obtain the 64\-bit jiffies value. -.TP -.B Return -The 64\-bit jiffies value. -.UNINDENT -.TP -.B \fBlong bpf_read_branch_records(struct bpf_perf_event_data *\fP\fIctx\fP\fB, void *\fP\fIbuf\fP\fB, u32\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -For an eBPF program attached to a perf event, retrieve the -branch records (\fBstruct perf_branch_entry\fP) associated with \fIctx\fP -and store them in the buffer pointed to by \fIbuf\fP, up to size -\fIsize\fP bytes. -.TP -.B Return -On success, number of bytes written to \fIbuf\fP\&. On error, a -negative value. -.sp -The \fIflags\fP can be set to \fBBPF_F_GET_BRANCH_RECORDS_SIZE\fP to -instead return the number of bytes required to store all the -branch entries. If this flag is set, \fIbuf\fP may be NULL. -.sp -\fB\-EINVAL\fP if arguments are invalid or \fBsize\fP is not a multiple -of \fBsizeof\fP(\fBstruct perf_branch_entry\fP). -.sp -\fB\-ENOENT\fP if the architecture does not support branch records. -.UNINDENT -.TP -.B \fBlong bpf_get_ns_current_pid_tgid(u64\fP \fIdev\fP\fB, u64\fP \fIino\fP\fB, struct bpf_pidns_info *\fP\fInsdata\fP\fB, u32\fP \fIsize\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -On success, the values of \fIpid\fP and \fItgid\fP as seen from the current -\fInamespace\fP are returned in \fInsdata\fP\&. -.TP -.B Return -0 on success, or one of the following in case of failure: -.sp -\fB\-EINVAL\fP if the \fIdev\fP and \fIino\fP supplied don\(aqt match the dev_t and inode number -of the nsfs of the current task, or if the \fIdev\fP conversion to dev_t lost high bits. -.sp -\fB\-ENOENT\fP if pidns does not exist for the current task.
-.UNINDENT -.TP -.B \fBlong bpf_xdp_output(void *\fP\fIctx\fP\fB, struct bpf_map *\fP\fImap\fP\fB, u64\fP \fIflags\fP\fB, void *\fP\fIdata\fP\fB, u64\fP \fIsize\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Write raw \fIdata\fP blob into a special BPF perf event held by -\fImap\fP of type \fBBPF_MAP_TYPE_PERF_EVENT_ARRAY\fP\&. This perf -event must have the following attributes: \fBPERF_SAMPLE_RAW\fP -as \fBsample_type\fP, \fBPERF_TYPE_SOFTWARE\fP as \fBtype\fP, and -\fBPERF_COUNT_SW_BPF_OUTPUT\fP as \fBconfig\fP\&. -.sp -The \fIflags\fP are used to indicate the index in \fImap\fP for which -the value must be put, masked with \fBBPF_F_INDEX_MASK\fP\&. -Alternatively, \fIflags\fP can be set to \fBBPF_F_CURRENT_CPU\fP -to indicate that the index of the current CPU core should be -used. -.sp -The value to write, of \fIsize\fP, is passed through eBPF stack and -pointed by \fIdata\fP\&. -.sp -\fIctx\fP is a pointer to in\-kernel struct xdp_buff. -.sp -This helper is similar to \fBbpf_perf_event_output\fP() but -restricted to raw_tracepoint bpf programs. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBu64 bpf_get_netns_cookie(void *\fP\fIctx\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Retrieve the cookie (generated by the kernel) of the network -namespace the input \fIctx\fP is associated with. The network -namespace cookie remains stable for its lifetime and provides -a global identifier that can be assumed unique. If \fIctx\fP is -NULL, then the helper returns the cookie for the initial -network namespace. The cookie itself is very similar to that -of \fBbpf_get_socket_cookie\fP() helper, but for network -namespaces instead of sockets. -.TP -.B Return -An 8\-byte long opaque number.
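For \fBbpf_skb_output\fP() and \fBbpf_xdp_output\fP(), the \fIflags\fP argument either names an explicit index in \fImap\fP or requests the current CPU. The plain C sketch below shows how the two cases can be told apart. The two macros repeat the values from include/uapi/linux/bpf.h only so the sketch compiles on its own (a real BPF program would take them from that header), and the two function names are hypothetical.

```c
#include <stdint.h>

/* Repeated from include/uapi/linux/bpf.h so the sketch is standalone. */
#define BPF_F_INDEX_MASK  0xffffffffULL
#define BPF_F_CURRENT_CPU BPF_F_INDEX_MASK

/* Hypothetical helper: build flags that select an explicit map index. */
static uint64_t perf_output_flags_for_index(uint32_t index)
{
    return (uint64_t)index & BPF_F_INDEX_MASK;
}

/* Hypothetical helper: nonzero if flags ask for the current CPU's index. */
static int perf_output_uses_current_cpu(uint64_t flags)
{
    return (flags & BPF_F_INDEX_MASK) == BPF_F_CURRENT_CPU;
}
```

Because \fBBPF_F_CURRENT_CPU\fP occupies the whole index field, it cannot be combined with an explicit index.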
-.UNINDENT -.TP -.B \fBu64 bpf_get_current_ancestor_cgroup_id(int\fP \fIancestor_level\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Return id of cgroup v2 that is ancestor of the cgroup associated -with the current task at the \fIancestor_level\fP\&. The root cgroup -is at \fIancestor_level\fP zero and each step down the hierarchy -increments the level. If \fIancestor_level\fP == level of cgroup -associated with the current task, then return value will be the -same as that of \fBbpf_get_current_cgroup_id\fP(). -.sp -The helper is useful to implement policies based on cgroups -that are upper in hierarchy than immediate cgroup associated -with the current task. -.sp -The format of returned id and helper limitations are same as in -\fBbpf_get_current_cgroup_id\fP(). -.TP -.B Return -The id is returned or 0 in case the id could not be retrieved. -.UNINDENT -.TP -.B \fBlong bpf_sk_assign(struct sk_buff *\fP\fIskb\fP\fB, void *\fP\fIsk\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Helper is overloaded depending on BPF program type. This -description applies to \fBBPF_PROG_TYPE_SCHED_CLS\fP and -\fBBPF_PROG_TYPE_SCHED_ACT\fP programs. -.sp -Assign the \fIsk\fP to the \fIskb\fP\&. When combined with appropriate -routing configuration to receive the packet towards the socket, -will cause \fIskb\fP to be delivered to the specified socket. -Subsequent redirection of \fIskb\fP via \fBbpf_redirect\fP(), -\fBbpf_clone_redirect\fP() or other methods outside of BPF may -interfere with successful delivery to the socket. -.sp -This operation is only valid from TC ingress path. -.sp -The \fIflags\fP argument must be zero. -.TP -.B Return -0 on success, or a negative error in case of failure: -.sp -\fB\-EINVAL\fP if specified \fIflags\fP are not supported. -.sp -\fB\-ENOENT\fP if the socket is unavailable for assignment. -.sp -\fB\-ENETUNREACH\fP if the socket is unreachable (wrong netns). 
-.sp -\fB\-EOPNOTSUPP\fP if the operation is not supported, for example -a call from outside of TC ingress. -.UNINDENT -.TP -.B \fBlong bpf_sk_assign(struct bpf_sk_lookup *\fP\fIctx\fP\fB, struct bpf_sock *\fP\fIsk\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Helper is overloaded depending on BPF program type. This -description applies to \fBBPF_PROG_TYPE_SK_LOOKUP\fP programs. -.sp -Select the \fIsk\fP as a result of a socket lookup. -.sp -For the operation to succeed, the passed socket must be compatible -with the packet description provided by the \fIctx\fP object. -.sp -The L4 protocol (\fBIPPROTO_TCP\fP or \fBIPPROTO_UDP\fP) must -be an exact match, while the IP family (\fBAF_INET\fP or -\fBAF_INET6\fP) must be compatible, that is, IPv6 sockets -that are not v6\-only can be selected for IPv4 packets. -.sp -Only TCP listeners and UDP unconnected sockets can be -selected. \fIsk\fP can also be NULL to reset any previous -selection. -.sp -The \fIflags\fP argument can be a combination of the following values: -.INDENT 7.0 -.IP \(bu 2 -\fBBPF_SK_LOOKUP_F_REPLACE\fP to override the previous -socket selection, potentially done by a BPF program -that ran before us. -.IP \(bu 2 -\fBBPF_SK_LOOKUP_F_NO_REUSEPORT\fP to skip -load\-balancing within reuseport group for the socket -being selected. -.UNINDENT -.sp -On success \fIctx\->sk\fP will point to the selected socket. -.TP -.B Return -0 on success, or a negative errno in case of failure. -.INDENT 7.0 -.IP \(bu 2 -\fB\-EAFNOSUPPORT\fP if socket family (\fIsk\->family\fP) is -not compatible with packet family (\fIctx\->family\fP). -.IP \(bu 2 -\fB\-EEXIST\fP if socket has been already selected, -potentially by another program, and -\fBBPF_SK_LOOKUP_F_REPLACE\fP flag was not specified. -.IP \(bu 2 -\fB\-EINVAL\fP if unsupported flags were specified. -.IP \(bu 2 -\fB\-EPROTOTYPE\fP if socket L4 protocol -(\fIsk\->protocol\fP) doesn\(aqt match packet protocol -(\fIctx\->protocol\fP).
-.IP \(bu 2 -\fB\-ESOCKTNOSUPPORT\fP if socket is not in allowed -state (TCP listening or UDP unconnected). -.UNINDENT -.UNINDENT -.TP -.B \fBu64 bpf_ktime_get_boot_ns(void)\fP -.INDENT 7.0 -.TP -.B Description -Return the time elapsed since system boot, in nanoseconds. -Does include the time the system was suspended. -See: \fBclock_gettime\fP(\fBCLOCK_BOOTTIME\fP) -.TP -.B Return -Current \fIktime\fP\&. -.UNINDENT -.TP -.B \fBlong bpf_seq_printf(struct seq_file *\fP\fIm\fP\fB, const char *\fP\fIfmt\fP\fB, u32\fP \fIfmt_size\fP\fB, const void *\fP\fIdata\fP\fB, u32\fP \fIdata_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -\fBbpf_seq_printf\fP() uses seq_file \fBseq_printf\fP() to print -out the format string. -The \fIm\fP represents the seq_file. The \fIfmt\fP and \fIfmt_size\fP are for -the format string itself. The \fIdata\fP and \fIdata_len\fP are format string -arguments. The \fIdata\fP are a \fBu64\fP array and corresponding format string -values are stored in the array. For strings and pointers where pointees -are accessed, only the pointer values are stored in the \fIdata\fP array. -The \fIdata_len\fP is the size of \fIdata\fP in bytes \- must be a multiple of 8. -.sp -Formats \fB%s\fP, \fB%p{i,I}{4,6}\fP requires to read kernel memory. -Reading kernel memory may fail due to either invalid address or -valid address but requiring a major memory fault. If reading kernel memory -fails, the string for \fB%s\fP will be an empty string, and the ip -address for \fB%p{i,I}{4,6}\fP will be 0. Not returning error to -bpf program is consistent with what \fBbpf_trace_printk\fP() does for now. -.TP -.B Return -0 on success, or a negative error in case of failure: -.sp -\fB\-EBUSY\fP if per\-CPU memory copy buffer is busy, can try again -by returning 1 from bpf program. -.sp -\fB\-EINVAL\fP if arguments are invalid, or if \fIfmt\fP is invalid/unsupported. -.sp -\fB\-E2BIG\fP if \fIfmt\fP contains too many format specifiers. 
-.sp -\fB\-EOVERFLOW\fP if an overflow happened: The same object will be tried again. -.UNINDENT -.TP -.B \fBlong bpf_seq_write(struct seq_file *\fP\fIm\fP\fB, const void *\fP\fIdata\fP\fB, u32\fP \fIlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -\fBbpf_seq_write\fP() uses seq_file \fBseq_write\fP() to write the data. -The \fIm\fP represents the seq_file. The \fIdata\fP and \fIlen\fP represent the -data to write in bytes. -.TP -.B Return -0 on success, or a negative error in case of failure: -.sp -\fB\-EOVERFLOW\fP if an overflow happened: The same object will be tried again. -.UNINDENT -.TP -.B \fBu64 bpf_sk_cgroup_id(void *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Return the cgroup v2 id of the socket \fIsk\fP\&. -.sp -\fIsk\fP must be a non\-\fBNULL\fP pointer to a socket, e.g. one -returned from \fBbpf_sk_lookup_xxx\fP(), -\fBbpf_sk_fullsock\fP(), etc. The format of returned id is -same as in \fBbpf_skb_cgroup_id\fP(). -.sp -This helper is available only if the kernel was compiled with -the \fBCONFIG_SOCK_CGROUP_DATA\fP configuration option. -.TP -.B Return -The id is returned or 0 in case the id could not be retrieved. -.UNINDENT -.TP -.B \fBu64 bpf_sk_ancestor_cgroup_id(void *\fP\fIsk\fP\fB, int\fP \fIancestor_level\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Return id of cgroup v2 that is ancestor of cgroup associated -with the \fIsk\fP at the \fIancestor_level\fP\&. The root cgroup is at -\fIancestor_level\fP zero and each step down the hierarchy -increments the level. If \fIancestor_level\fP == level of cgroup -associated with \fIsk\fP, then return value will be same as that -of \fBbpf_sk_cgroup_id\fP(). -.sp -The helper is useful to implement policies based on cgroups -that are upper in hierarchy than immediate cgroup associated -with \fIsk\fP\&. -.sp -The format of returned id and helper limitations are same as in -\fBbpf_sk_cgroup_id\fP(). -.TP -.B Return -The id is returned or 0 in case the id could not be retrieved. 
-.UNINDENT -.TP -.B \fBlong bpf_ringbuf_output(void *\fP\fIringbuf\fP\fB, void *\fP\fIdata\fP\fB, u64\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Copy \fIsize\fP bytes from \fIdata\fP into a ring buffer \fIringbuf\fP\&. -If \fBBPF_RB_NO_WAKEUP\fP is specified in \fIflags\fP, no notification -of new data availability is sent. -If \fBBPF_RB_FORCE_WAKEUP\fP is specified in \fIflags\fP, notification -of new data availability is sent unconditionally. -If \fB0\fP is specified in \fIflags\fP, an adaptive notification -of new data availability is sent. -.sp -An adaptive notification is a notification sent whenever the user\-space -process has caught up and consumed all available payloads. In case the user\-space -process is still processing a previous payload, then no notification is needed -as it will process the newly added payload automatically. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBvoid *bpf_ringbuf_reserve(void *\fP\fIringbuf\fP\fB, u64\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Reserve \fIsize\fP bytes of payload in a ring buffer \fIringbuf\fP\&. -\fIflags\fP must be 0. -.TP -.B Return -Valid pointer with \fIsize\fP bytes of memory available; NULL, -otherwise. -.UNINDENT -.TP -.B \fBvoid bpf_ringbuf_submit(void *\fP\fIdata\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Submit reserved ring buffer sample, pointed to by \fIdata\fP\&. -If \fBBPF_RB_NO_WAKEUP\fP is specified in \fIflags\fP, no notification -of new data availability is sent. -If \fBBPF_RB_FORCE_WAKEUP\fP is specified in \fIflags\fP, notification -of new data availability is sent unconditionally. -If \fB0\fP is specified in \fIflags\fP, an adaptive notification -of new data availability is sent. -.sp -See \(aqbpf_ringbuf_output()\(aq for the definition of adaptive notification. -.TP -.B Return -Nothing. Always succeeds. 
-.UNINDENT -.TP -.B \fBvoid bpf_ringbuf_discard(void *\fP\fIdata\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Discard reserved ring buffer sample, pointed to by \fIdata\fP\&. -If \fBBPF_RB_NO_WAKEUP\fP is specified in \fIflags\fP, no notification -of new data availability is sent. -If \fBBPF_RB_FORCE_WAKEUP\fP is specified in \fIflags\fP, notification -of new data availability is sent unconditionally. -If \fB0\fP is specified in \fIflags\fP, an adaptive notification -of new data availability is sent. -.sp -See \(aqbpf_ringbuf_output()\(aq for the definition of adaptive notification. -.TP -.B Return -Nothing. Always succeeds. -.UNINDENT -.TP -.B \fBu64 bpf_ringbuf_query(void *\fP\fIringbuf\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Query various characteristics of provided ring buffer. What -exactly is queried is determined by \fIflags\fP: -.INDENT 7.0 -.IP \(bu 2 -\fBBPF_RB_AVAIL_DATA\fP: Amount of data not yet consumed. -.IP \(bu 2 -\fBBPF_RB_RING_SIZE\fP: The size of ring buffer. -.IP \(bu 2 -\fBBPF_RB_CONS_POS\fP: Consumer position (can wrap around). -.IP \(bu 2 -\fBBPF_RB_PROD_POS\fP: Producer(s) position (can wrap around). -.UNINDENT -.sp -Data returned is just a momentary snapshot of actual values -and could be inaccurate, so this facility should be used to -power heuristics and for reporting, not to make 100% correct -calculation. -.TP -.B Return -Requested value, or 0, if \fIflags\fP are not recognized. -.UNINDENT -.TP -.B \fBlong bpf_csum_level(struct sk_buff *\fP\fIskb\fP\fB, u64\fP \fIlevel\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Change the skb\(aqs checksum level by one layer up or down, or -reset it entirely to none in order to have the stack perform -checksum validation. The level is applicable to the following -protocols: TCP, UDP, GRE, SCTP, FCOE.
For example, a decap of -| ETH | IP | UDP | GUE | IP | TCP | into | ETH | IP | TCP | -through the \fBbpf_skb_adjust_room\fP() helper while passing in the -\fBBPF_F_ADJ_ROOM_NO_CSUM_RESET\fP flag would require one call -to \fBbpf_csum_level\fP() with \fBBPF_CSUM_LEVEL_DEC\fP since -the UDP header is removed. Similarly, an encap of the latter -into the former could be accompanied by a helper call to -\fBbpf_csum_level\fP() with \fBBPF_CSUM_LEVEL_INC\fP if the -skb is still intended to be processed in higher layers of the -stack instead of just egressing at tc. -.sp -There are four supported level settings at this time: -.INDENT 7.0 -.IP \(bu 2 -\fBBPF_CSUM_LEVEL_INC\fP: Increases skb\->csum_level for skbs -with CHECKSUM_UNNECESSARY. -.IP \(bu 2 -\fBBPF_CSUM_LEVEL_DEC\fP: Decreases skb\->csum_level for skbs -with CHECKSUM_UNNECESSARY. -.IP \(bu 2 -\fBBPF_CSUM_LEVEL_RESET\fP: Resets skb\->csum_level to 0 and -sets CHECKSUM_NONE to force checksum validation by the stack. -.IP \(bu 2 -\fBBPF_CSUM_LEVEL_QUERY\fP: No\-op, returns the current -skb\->csum_level. -.UNINDENT -.TP -.B Return -0 on success, or a negative error in case of failure. In the -case of \fBBPF_CSUM_LEVEL_QUERY\fP, the current skb\->csum_level -is returned or the error code \-EACCES in case the skb is not -subject to CHECKSUM_UNNECESSARY. -.UNINDENT -.TP -.B \fBstruct tcp6_sock *bpf_skc_to_tcp6_sock(void *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Dynamically cast a \fIsk\fP pointer to a \fItcp6_sock\fP pointer. -.TP -.B Return -\fIsk\fP if casting is valid, or \fBNULL\fP otherwise. -.UNINDENT -.TP -.B \fBstruct tcp_sock *bpf_skc_to_tcp_sock(void *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Dynamically cast a \fIsk\fP pointer to a \fItcp_sock\fP pointer. -.TP -.B Return -\fIsk\fP if casting is valid, or \fBNULL\fP otherwise.
-.UNINDENT -.TP -.B \fBstruct tcp_timewait_sock *bpf_skc_to_tcp_timewait_sock(void *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Dynamically cast a \fIsk\fP pointer to a \fItcp_timewait_sock\fP pointer. -.TP -.B Return -\fIsk\fP if casting is valid, or \fBNULL\fP otherwise. -.UNINDENT -.TP -.B \fBstruct tcp_request_sock *bpf_skc_to_tcp_request_sock(void *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Dynamically cast a \fIsk\fP pointer to a \fItcp_request_sock\fP pointer. -.TP -.B Return -\fIsk\fP if casting is valid, or \fBNULL\fP otherwise. -.UNINDENT -.TP -.B \fBstruct udp6_sock *bpf_skc_to_udp6_sock(void *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Dynamically cast a \fIsk\fP pointer to a \fIudp6_sock\fP pointer. -.TP -.B Return -\fIsk\fP if casting is valid, or \fBNULL\fP otherwise. -.UNINDENT -.TP -.B \fBlong bpf_get_task_stack(struct task_struct *\fP\fItask\fP\fB, void *\fP\fIbuf\fP\fB, u32\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Return a user or a kernel stack in bpf program provided buffer. -Note: the user stack will only be populated if the \fItask\fP is -the current task; all other tasks will return \-EOPNOTSUPP. -To achieve this, the helper needs \fItask\fP, which is a valid -pointer to \fBstruct task_struct\fP\&. To store the stacktrace, the -bpf program provides \fIbuf\fP with a nonnegative \fIsize\fP\&. -.sp -The last argument, \fIflags\fP, holds the number of stack frames to -skip (from 0 to 255), masked with -\fBBPF_F_SKIP_FIELD_MASK\fP\&. The next bits can be used to set -the following flags: -.INDENT 7.0 -.TP -.B \fBBPF_F_USER_STACK\fP -Collect a user space stack instead of a kernel stack. -The \fItask\fP must be the current task. -.TP -.B \fBBPF_F_USER_BUILD_ID\fP -Collect buildid+offset instead of ips for user stack, -only valid if \fBBPF_F_USER_STACK\fP is also specified. 
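The layout of the flags argument of bpf_get_task_stack() described above (a skip count in the low bits, option bits above it) can be sketched in plain C. This is only an illustration: the constants are local stand-ins whose values are believed to mirror those in <linux/bpf.h>, and task_stack_flags() is a hypothetical helper name that is not part of any kernel or libbpf API.

```c
#include <stdint.h>

/*
 * Local stand-ins for the UAPI constants named in the text. The values
 * are assumed to match <linux/bpf.h>; a real BPF program should include
 * that header instead of redefining them.
 */
#define BPF_F_SKIP_FIELD_MASK 0xffULL
#define BPF_F_USER_STACK      (1ULL << 8)
#define BPF_F_USER_BUILD_ID   (1ULL << 11)

/*
 * Hypothetical helper: build a flags word from a frame skip count and
 * two booleans. The skip count is masked to the 0..255 range, and the
 * build-id bit is only set together with the user-stack bit, since the
 * text says it is only valid in that combination.
 */
static uint64_t task_stack_flags(unsigned int skip, int user_stack, int build_id)
{
	uint64_t flags = (uint64_t)skip & BPF_F_SKIP_FIELD_MASK;

	if (user_stack) {
		flags |= BPF_F_USER_STACK;
		if (build_id)
			flags |= BPF_F_USER_BUILD_ID;
	}
	return flags;
}
```

A caller would pass the result as the last argument of the helper, for example task_stack_flags(3, 1, 0) to skip three frames of a user stack.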
-.UNINDENT -.sp -\fBbpf_get_task_stack\fP() can collect up to -\fBPERF_MAX_STACK_DEPTH\fP both kernel and user frames, subject -to a sufficiently large buffer size. Note that -this limit can be controlled with the \fBsysctl\fP program, and -that it should be manually increased in order to profile long -user stacks (such as stacks for Java programs). To do so, use: -.INDENT 7.0 -.INDENT 3.5 -.sp -.EX -# sysctl kernel.perf_event_max_stack=<new value> -.EE -.UNINDENT -.UNINDENT -.TP -.B Return -The non\-negative copied \fIbuf\fP length equal to or less than -\fIsize\fP on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_load_hdr_opt(struct bpf_sock_ops *\fP\fIskops\fP\fB, void *\fP\fIsearchby_res\fP\fB, u32\fP \fIlen\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Load header option. Supports reading a particular TCP header -option for a bpf program (\fBBPF_PROG_TYPE_SOCK_OPS\fP). -.sp -If \fIflags\fP is 0, it will search for the option in the -\fIskops\fP\fB\->skb_data\fP\&. The comment in \fBstruct bpf_sock_ops\fP -has details on what skb_data contains under different -\fIskops\fP\fB\->op\fP\&. -.sp -The first byte of the \fIsearchby_res\fP specifies the -kind to search for. -.sp -If the kind being searched for is an experimental kind -(i.e. 253 or 254 according to RFC6994), it also -needs to specify the \(dqmagic\(dq, which is either -2 bytes or 4 bytes. It then also needs to -specify the size of the magic by using -the 2nd byte, which is the \(dqkind\-length\(dq of a TCP -header option; the \(dqkind\-length\(dq also -includes the first 2 bytes, \(dqkind\(dq and \(dqkind\-length\(dq -itself, as in a normal TCP header option. -.sp -For example, to search experimental kind 254 with -2 byte magic 0xeB9F, the searchby_res should be -[ 254, 4, 0xeB, 0x9F, 0, 0, .... 0 ]. -.sp -To search for the standard window scale option (3), -the \fIsearchby_res\fP should be [ 3, 0, 0, .... 0 ].
-Note, kind\-length must be 0 for a regular option. -.sp -Searching for End\-of\-Option\-List (0) and No\-Op (1) is -not supported. -.sp -\fIlen\fP must be at least 2 bytes which is the minimal size -of a header option. -.sp -Supported flags: -.INDENT 7.0 -.IP \(bu 2 -\fBBPF_LOAD_HDR_OPT_TCP_SYN\fP to search from the -saved_syn packet or the just\-received syn packet. -.UNINDENT -.TP -.B Return -> 0 when found, the header option is copied to \fIsearchby_res\fP\&. -The return value is the total length copied. On failure, a -negative error code is returned: -.sp -\fB\-EINVAL\fP if a parameter is invalid. -.sp -\fB\-ENOMSG\fP if the option is not found. -.sp -\fB\-ENOENT\fP if no syn packet is available when -\fBBPF_LOAD_HDR_OPT_TCP_SYN\fP is used. -.sp -\fB\-ENOSPC\fP if there is not enough space. Only \fIlen\fP number of -bytes are copied. -.sp -\fB\-EFAULT\fP on failure to parse the header options in the -packet. -.sp -\fB\-EPERM\fP if the helper cannot be used under the current -\fIskops\fP\fB\->op\fP\&. -.UNINDENT -.TP -.B \fBlong bpf_store_hdr_opt(struct bpf_sock_ops *\fP\fIskops\fP\fB, const void *\fP\fIfrom\fP\fB, u32\fP \fIlen\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Store header option. The data will be copied -from buffer \fIfrom\fP with length \fIlen\fP to the TCP header. -.sp -The buffer \fIfrom\fP should have the whole option that -includes the kind, kind\-length, and the actual -option data. The \fIlen\fP must be at least kind\-length -long. The kind\-length does not have to be 4 byte -aligned. The kernel will take care of the padding -and setting the 4 bytes aligned value to th\->doff. -.sp -This helper will check for a duplicated option -by searching for the same option in the outgoing skb. -.sp -This helper can only be called during -\fBBPF_SOCK_OPS_WRITE_HDR_OPT_CB\fP\&. -.TP -.B Return -0 on success, or negative error in case of failure: -.sp -\fB\-EINVAL\fP if a parameter is invalid.
-.sp -\fB\-ENOSPC\fP if there is not enough space in the header. -Nothing has been written -.sp -\fB\-EEXIST\fP if the option already exists. -.sp -\fB\-EFAULT\fP on failure to parse the existing header options. -.sp -\fB\-EPERM\fP if the helper cannot be used under the current -\fIskops\fP\fB\->op\fP\&. -.UNINDENT -.TP -.B \fBlong bpf_reserve_hdr_opt(struct bpf_sock_ops *\fP\fIskops\fP\fB, u32\fP \fIlen\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Reserve \fIlen\fP bytes for the bpf header option. The -space will be used by \fBbpf_store_hdr_opt\fP() later in -\fBBPF_SOCK_OPS_WRITE_HDR_OPT_CB\fP\&. -.sp -If \fBbpf_reserve_hdr_opt\fP() is called multiple times, -the total number of bytes will be reserved. -.sp -This helper can only be called during -\fBBPF_SOCK_OPS_HDR_OPT_LEN_CB\fP\&. -.TP -.B Return -0 on success, or negative error in case of failure: -.sp -\fB\-EINVAL\fP if a parameter is invalid. -.sp -\fB\-ENOSPC\fP if there is not enough space in the header. -.sp -\fB\-EPERM\fP if the helper cannot be used under the current -\fIskops\fP\fB\->op\fP\&. -.UNINDENT -.TP -.B \fBvoid *bpf_inode_storage_get(struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIinode\fP\fB, void *\fP\fIvalue\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get a bpf_local_storage from an \fIinode\fP\&. -.sp -Logically, it could be thought of as getting the value from -a \fImap\fP with \fIinode\fP as the \fBkey\fP\&. From this -perspective, the usage is not much different from -\fBbpf_map_lookup_elem\fP(\fImap\fP, \fB&\fP\fIinode\fP) except this -helper enforces the key must be an inode and the map must also -be a \fBBPF_MAP_TYPE_INODE_STORAGE\fP\&. -.sp -Underneath, the value is stored locally at \fIinode\fP instead of -the \fImap\fP\&. The \fImap\fP is used as the bpf\-local\-storage -\(dqtype\(dq. The bpf\-local\-storage \(dqtype\(dq (i.e. the \fImap\fP) is -searched against all bpf_local_storage residing at \fIinode\fP\&. 
-.sp -An optional \fIflags\fP (\fBBPF_LOCAL_STORAGE_GET_F_CREATE\fP) can be -used such that a new bpf_local_storage will be -created if one does not exist. \fIvalue\fP can be used -together with \fBBPF_LOCAL_STORAGE_GET_F_CREATE\fP to specify -the initial value of a bpf_local_storage. If \fIvalue\fP is -\fBNULL\fP, the new bpf_local_storage will be zero initialized. -.TP -.B Return -A bpf_local_storage pointer is returned on success. -.sp -\fBNULL\fP if not found or there was an error in adding -a new bpf_local_storage. -.UNINDENT -.TP -.B \fBint bpf_inode_storage_delete(struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIinode\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Delete a bpf_local_storage from an \fIinode\fP\&. -.TP -.B Return -0 on success. -.sp -\fB\-ENOENT\fP if the bpf_local_storage cannot be found. -.UNINDENT -.TP -.B \fBlong bpf_d_path(struct path *\fP\fIpath\fP\fB, char *\fP\fIbuf\fP\fB, u32\fP \fIsz\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Return full path for given \fBstruct path\fP object, which -needs to be the kernel BTF \fIpath\fP object. The path is -returned in the provided buffer \fIbuf\fP of size \fIsz\fP and -is zero terminated. -.TP -.B Return -On success, the strictly positive length of the string, -including the trailing NUL character. On error, a negative -value. -.UNINDENT -.TP -.B \fBlong bpf_copy_from_user(void *\fP\fIdst\fP\fB, u32\fP \fIsize\fP\fB, const void *\fP\fIuser_ptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Read \fIsize\fP bytes from user space address \fIuser_ptr\fP and store -the data in \fIdst\fP\&. This is a wrapper of \fBcopy_from_user\fP(). -.TP -.B Return -0 on success, or a negative error in case of failure. 
-.UNINDENT -.TP -.B \fBlong bpf_snprintf_btf(char *\fP\fIstr\fP\fB, u32\fP \fIstr_size\fP\fB, struct btf_ptr *\fP\fIptr\fP\fB, u32\fP \fIbtf_ptr_size\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Use BTF to store a string representation of \fIptr\fP\->ptr in \fIstr\fP, -using \fIptr\fP\->type_id. This value should specify the type -that \fIptr\fP\->ptr points to. LLVM __builtin_btf_type_id(type, 1) -can be used to look up vmlinux BTF type ids. Traversing the -data structure using BTF, the type information and values are -stored in the first \fIstr_size\fP \- 1 bytes of \fIstr\fP\&. Safe copy of -the pointer data is carried out to avoid kernel crashes during -operation. Smaller types can use string space on the stack; -larger programs can use map data to store the string -representation. -.sp -The string can be subsequently shared with userspace via -bpf_perf_event_output() or ring buffer interfaces. -bpf_trace_printk() is to be avoided as it places too small -a limit on string size to be useful. -.sp -\fIflags\fP is a combination of -.INDENT 7.0 -.TP -.B \fBBTF_F_COMPACT\fP -no formatting around type information -.TP -.B \fBBTF_F_NONAME\fP -no struct/union member names/types -.TP -.B \fBBTF_F_PTR_RAW\fP -show raw (unobfuscated) pointer values; -equivalent to printk specifier %px. -.TP -.B \fBBTF_F_ZERO\fP -show zero\-valued struct/union members; they -are not displayed by default -.UNINDENT -.TP -.B Return -The number of bytes that were written (or would have been -written if output had to be truncated due to string size), -or a negative error in cases of failure. -.UNINDENT -.TP -.B \fBlong bpf_seq_printf_btf(struct seq_file *\fP\fIm\fP\fB, struct btf_ptr *\fP\fIptr\fP\fB, u32\fP \fIptr_size\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Use BTF to write to seq_write a string representation of -\fIptr\fP\->ptr, using \fIptr\fP\->type_id as per bpf_snprintf_btf(). 
-\fIflags\fP are identical to those used for bpf_snprintf_btf. -.TP -.B Return -0 on success or a negative error in case of failure. -.UNINDENT -.TP -.B \fBu64 bpf_skb_cgroup_classid(struct sk_buff *\fP\fIskb\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -See \fBbpf_get_cgroup_classid\fP() for the main description. -This helper differs from \fBbpf_get_cgroup_classid\fP() in that -the cgroup v1 net_cls class is retrieved only from the \fIskb\fP\(aqs -associated socket instead of the current process. -.TP -.B Return -The id is returned or 0 in case the id could not be retrieved. -.UNINDENT -.TP -.B \fBlong bpf_redirect_neigh(u32\fP \fIifindex\fP\fB, struct bpf_redir_neigh *\fP\fIparams\fP\fB, int\fP \fIplen\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Redirect the packet to another net device of index \fIifindex\fP -and fill in L2 addresses from neighboring subsystem. This helper -is somewhat similar to \fBbpf_redirect\fP(), except that it -populates L2 addresses as well, meaning, internally, the helper -relies on the neighbor lookup for the L2 address of the nexthop. -.sp -The helper will perform a FIB lookup based on the skb\(aqs -networking header to get the address of the next hop, unless -this is supplied by the caller in the \fIparams\fP argument. The -\fIplen\fP argument indicates the len of \fIparams\fP and should be set -to 0 if \fIparams\fP is NULL. -.sp -The \fIflags\fP argument is reserved and must be 0. The helper is -currently only supported for tc BPF program types, and enabled -for IPv4 and IPv6 protocols. -.TP -.B Return -The helper returns \fBTC_ACT_REDIRECT\fP on success or -\fBTC_ACT_SHOT\fP on error. -.UNINDENT -.TP -.B \fBvoid *bpf_per_cpu_ptr(const void *\fP\fIpercpu_ptr\fP\fB, u32\fP \fIcpu\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Take a pointer to a percpu ksym, \fIpercpu_ptr\fP, and return a -pointer to the percpu kernel variable on \fIcpu\fP\&. A ksym is an -extern variable decorated with \(aq__ksym\(aq. 
For ksym, there is a -global var (either static or global) defined of the same name -in the kernel. The ksym is percpu if the global var is percpu. -The returned pointer points to the global percpu var on \fIcpu\fP\&. -.sp -bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the -kernel, except that bpf_per_cpu_ptr() may return NULL. This -happens if \fIcpu\fP is larger than nr_cpu_ids. The caller of -bpf_per_cpu_ptr() must check the returned value. -.TP -.B Return -A pointer pointing to the kernel percpu variable on \fIcpu\fP, or -NULL, if \fIcpu\fP is invalid. -.UNINDENT -.TP -.B \fBvoid *bpf_this_cpu_ptr(const void *\fP\fIpercpu_ptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Take a pointer to a percpu ksym, \fIpercpu_ptr\fP, and return a -pointer to the percpu kernel variable on this cpu. See the -description of \(aqksym\(aq in \fBbpf_per_cpu_ptr\fP(). -.sp -bpf_this_cpu_ptr() has the same semantic as this_cpu_ptr() in -the kernel. Different from \fBbpf_per_cpu_ptr\fP(), it would -never return NULL. -.TP -.B Return -A pointer pointing to the kernel percpu variable on this cpu. -.UNINDENT -.TP -.B \fBlong bpf_redirect_peer(u32\fP \fIifindex\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Redirect the packet to another net device of index \fIifindex\fP\&. -This helper is somewhat similar to \fBbpf_redirect\fP(), except -that the redirection happens to the \fIifindex\fP\(aq peer device and -the netns switch takes place from ingress to ingress without -going through the CPU\(aqs backlog queue. -.sp -The \fIflags\fP argument is reserved and must be 0. The helper is -currently only supported for tc BPF program types at the ingress -hook and for veth device types. The peer device must reside in a -different network namespace. -.TP -.B Return -The helper returns \fBTC_ACT_REDIRECT\fP on success or -\fBTC_ACT_SHOT\fP on error. 
-.UNINDENT -.TP -.B \fBvoid *bpf_task_storage_get(struct bpf_map *\fP\fImap\fP\fB, struct task_struct *\fP\fItask\fP\fB, void *\fP\fIvalue\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get a bpf_local_storage from the \fItask\fP\&. -.sp -Logically, it could be thought of as getting the value from -a \fImap\fP with \fItask\fP as the \fBkey\fP\&. From this -perspective, the usage is not much different from -\fBbpf_map_lookup_elem\fP(\fImap\fP, \fB&\fP\fItask\fP) except this -helper enforces the key must be a task_struct and the map must also -be a \fBBPF_MAP_TYPE_TASK_STORAGE\fP\&. -.sp -Underneath, the value is stored locally at \fItask\fP instead of -the \fImap\fP\&. The \fImap\fP is used as the bpf\-local\-storage -\(dqtype\(dq. The bpf\-local\-storage \(dqtype\(dq (i.e. the \fImap\fP) is -searched against all bpf_local_storage residing at \fItask\fP\&. -.sp -An optional \fIflags\fP (\fBBPF_LOCAL_STORAGE_GET_F_CREATE\fP) can be -used such that a new bpf_local_storage will be -created if one does not exist. \fIvalue\fP can be used -together with \fBBPF_LOCAL_STORAGE_GET_F_CREATE\fP to specify -the initial value of a bpf_local_storage. If \fIvalue\fP is -\fBNULL\fP, the new bpf_local_storage will be zero initialized. -.TP -.B Return -A bpf_local_storage pointer is returned on success. -.sp -\fBNULL\fP if not found or there was an error in adding -a new bpf_local_storage. -.UNINDENT -.TP -.B \fBlong bpf_task_storage_delete(struct bpf_map *\fP\fImap\fP\fB, struct task_struct *\fP\fItask\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Delete a bpf_local_storage from a \fItask\fP\&. -.TP -.B Return -0 on success. -.sp -\fB\-ENOENT\fP if the bpf_local_storage cannot be found. -.UNINDENT -.TP -.B \fBstruct task_struct *bpf_get_current_task_btf(void)\fP -.INDENT 7.0 -.TP -.B Description -Return a BTF pointer to the \(dqcurrent\(dq task. -This pointer can also be used in helpers that accept an -\fIARG_PTR_TO_BTF_ID\fP of type \fItask_struct\fP\&. 
-.TP -.B Return -Pointer to the current task. -.UNINDENT -.TP -.B \fBlong bpf_bprm_opts_set(struct linux_binprm *\fP\fIbprm\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Set or clear certain options on \fIbprm\fP: -.sp -\fBBPF_F_BPRM_SECUREEXEC\fP: Set the secureexec bit, -which sets the \fBAT_SECURE\fP auxv for glibc. The bit -is cleared if the flag is not specified. -.TP -.B Return -\fB\-EINVAL\fP if invalid \fIflags\fP are passed, zero otherwise. -.UNINDENT -.TP -.B \fBu64 bpf_ktime_get_coarse_ns(void)\fP -.INDENT 7.0 -.TP -.B Description -Return a coarse\-grained version of the time elapsed since -system boot, in nanoseconds. Does not include time the system -was suspended. -.sp -See: \fBclock_gettime\fP(\fBCLOCK_MONOTONIC_COARSE\fP) -.TP -.B Return -Current \fIktime\fP\&. -.UNINDENT -.TP -.B \fBlong bpf_ima_inode_hash(struct inode *\fP\fIinode\fP\fB, void *\fP\fIdst\fP\fB, u32\fP \fIsize\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Returns the stored IMA hash of the \fIinode\fP (if it\(aqs available). -If the hash is larger than \fIsize\fP, then only \fIsize\fP -bytes will be copied to \fIdst\fP\&. -.TP -.B Return -The \fBhash_algo\fP is returned on success, -\fB\-EOPNOTSUPP\fP if IMA is disabled or \fB\-EINVAL\fP if -invalid arguments are passed. -.UNINDENT -.TP -.B \fBstruct socket *bpf_sock_from_file(struct file *\fP\fIfile\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -If the given file represents a socket, returns the associated -socket. -.TP -.B Return -A pointer to a struct socket on success or NULL if the file is -not a socket. -.UNINDENT -.TP -.B \fBlong bpf_check_mtu(void *\fP\fIctx\fP\fB, u32\fP \fIifindex\fP\fB, u32 *\fP\fImtu_len\fP\fB, s32\fP \fIlen_diff\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Check whether the packet size exceeds the MTU of a net device (based -on \fIifindex\fP). This helper will likely be used in combination -with helpers that adjust/change the packet size.
-.sp -The argument \fIlen_diff\fP can be used for querying with a planned -size change. This allows checking the MTU prior to changing the packet -ctx. Providing a \fIlen_diff\fP adjustment that is larger than the -actual packet size (resulting in negative packet size) will in -principle not exceed the MTU, which is why it is not considered -a failure. Other BPF helpers are needed for performing the -planned size change; therefore the responsibility for catching -a negative packet size belongs in those helpers. -.sp -Specifying \fIifindex\fP zero means the MTU check is performed -against the current net device. This is practical if this isn\(aqt -used prior to redirect. -.sp -On input \fImtu_len\fP must be a valid pointer, else the verifier will -reject the BPF program. If the value \fImtu_len\fP is initialized to -zero then the ctx packet size is used. When a value \fImtu_len\fP is -provided as input, it specifies the L3 length that the MTU check -is done against. Remember XDP and TC lengths operate at L2, but -this value is L3 as it correlates to MTU and IP\-header tot_len -values which are L3 (similar behavior as bpf_fib_lookup). -.sp -The Linux kernel route table can configure MTUs on a more -specific per route level, which is not provided by this helper. -For route level MTU checks use the \fBbpf_fib_lookup\fP() -helper. -.sp -\fIctx\fP is either \fBstruct xdp_md\fP for XDP programs or -\fBstruct sk_buff\fP for tc cls_act programs. -.sp -The \fIflags\fP argument can be a combination of one or more of the -following values: -.INDENT 7.0 -.TP -.B \fBBPF_MTU_CHK_SEGS\fP -This flag only works for \fIctx\fP \fBstruct sk_buff\fP\&. -If the packet context contains extra packet segment buffers -(often known as a GSO skb), then the MTU check is harder to -perform at this point, because in the transmit path it is -possible for the skb packet to get re\-segmented -(depending on net device features).
This could still be -an MTU violation, so this flag enables performing the MTU -check against segments, with a different violation -return code to tell it apart. The check cannot use len_diff. -.UNINDENT -.sp -On return the \fImtu_len\fP pointer contains the MTU value of the net -device. Remember the net device configured MTU is the L3 size, -which is returned here, while XDP and TC lengths operate at L2. -The helper takes this into account for you, but remember it when using -the MTU value in your BPF code. -.TP -.B Return -.INDENT 7.0 -.IP \(bu 2 -0 on success, and populates the MTU value in the \fImtu_len\fP pointer. -.IP \(bu 2 -< 0 if any input argument is invalid (\fImtu_len\fP not updated) -.UNINDENT -.sp -MTU violations return positive values, but also populate the MTU -value in the \fImtu_len\fP pointer, as this can be needed for -implementing PMTU handling: -.INDENT 7.0 -.IP \(bu 2 -\fBBPF_MTU_CHK_RET_FRAG_NEEDED\fP -.IP \(bu 2 -\fBBPF_MTU_CHK_RET_SEGS_TOOBIG\fP -.UNINDENT -.UNINDENT -.TP -.B \fBlong bpf_for_each_map_elem(struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIcallback_fn\fP\fB, void *\fP\fIcallback_ctx\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -For each element in \fBmap\fP, call \fBcallback_fn\fP function with -\fBmap\fP, \fBcallback_ctx\fP and other map\-specific parameters. -The \fBcallback_fn\fP should be a static function and -the \fBcallback_ctx\fP should be a pointer to the stack. -The \fBflags\fP is used to control certain aspects of the helper. -Currently, the \fBflags\fP must be 0. -.sp -The following is a list of supported map types and their -respective expected callback signatures: -.sp -BPF_MAP_TYPE_HASH, BPF_MAP_TYPE_PERCPU_HASH, -BPF_MAP_TYPE_LRU_HASH, BPF_MAP_TYPE_LRU_PERCPU_HASH, -BPF_MAP_TYPE_ARRAY, BPF_MAP_TYPE_PERCPU_ARRAY -.sp -long (*callback_fn)(struct bpf_map *map, const void *key, void *value, void *ctx); -.sp -For per_cpu maps, the map_value is the value on the cpu where the -bpf_prog is running.
-.sp -If \fBcallback_fn\fP returns 0, the helper will continue to the next -element. If the return value is 1, the helper will skip the rest of -the elements and return. Other return values are not used now. -.TP -.B Return -The number of traversed map elements for success, \fB\-EINVAL\fP for -invalid \fBflags\fP\&. -.UNINDENT -.TP -.B \fBlong bpf_snprintf(char *\fP\fIstr\fP\fB, u32\fP \fIstr_size\fP\fB, const char *\fP\fIfmt\fP\fB, u64 *\fP\fIdata\fP\fB, u32\fP \fIdata_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Outputs a string into the \fBstr\fP buffer of size \fBstr_size\fP -based on a format string stored in a read\-only map pointed to by -\fBfmt\fP\&. -.sp -Each format specifier in \fBfmt\fP corresponds to one u64 element -in the \fBdata\fP array. For strings and pointers where pointees -are accessed, only the pointer values are stored in the \fIdata\fP -array. The \fIdata_len\fP is the size of \fIdata\fP in bytes \- must be -a multiple of 8. -.sp -Formats \fB%s\fP and \fB%p{i,I}{4,6}\fP require reading kernel -memory. Reading kernel memory may fail due to either an invalid -address or a valid address that requires a major memory fault. If -reading kernel memory fails, the string for \fB%s\fP will be an -empty string, and the ip address for \fB%p{i,I}{4,6}\fP will be 0. -Not returning an error to the bpf program is consistent with what -\fBbpf_trace_printk\fP() does for now. -.TP -.B Return -The strictly positive length of the formatted string, including -the trailing zero character. If the return value is greater than -\fBstr_size\fP, \fBstr\fP contains a truncated string, guaranteed to -be zero\-terminated except when \fBstr_size\fP is 0. -.sp -Or \fB\-EBUSY\fP if the per\-CPU memory copy buffer is busy. -.UNINDENT -.TP -.B \fBlong bpf_sys_bpf(u32\fP \fIcmd\fP\fB, void *\fP\fIattr\fP\fB, u32\fP \fIattr_size\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Execute bpf syscall with given arguments. -.TP -.B Return -A syscall result.
-.UNINDENT -.TP -.B \fBlong bpf_btf_find_by_name_kind(char *\fP\fIname\fP\fB, int\fP \fIname_sz\fP\fB, u32\fP \fIkind\fP\fB, int\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Find BTF type with given name and kind in vmlinux BTF or in module\(aqs BTFs. -.TP -.B Return -Returns btf_id and btf_obj_fd in lower and upper 32 bits. -.UNINDENT -.TP -.B \fBlong bpf_sys_close(u32\fP \fIfd\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Execute close syscall for given FD. -.TP -.B Return -A syscall result. -.UNINDENT -.TP -.B \fBlong bpf_timer_init(struct bpf_timer *\fP\fItimer\fP\fB, struct bpf_map *\fP\fImap\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Initialize the timer. -First 4 bits of \fIflags\fP specify clockid. -Only CLOCK_MONOTONIC, CLOCK_REALTIME, CLOCK_BOOTTIME are allowed. -All other bits of \fIflags\fP are reserved. -The verifier will reject the program if \fItimer\fP is not from -the same \fImap\fP\&. -.TP -.B Return -0 on success. -\fB\-EBUSY\fP if \fItimer\fP is already initialized. -\fB\-EINVAL\fP if invalid \fIflags\fP are passed. -\fB\-EPERM\fP if \fItimer\fP is in a map that doesn\(aqt have any user references. -The user space should either hold a file descriptor to a map with timers -or pin such map in bpffs. When map is unpinned or file descriptor is -closed all timers in the map will be cancelled and freed. -.UNINDENT -.TP -.B \fBlong bpf_timer_set_callback(struct bpf_timer *\fP\fItimer\fP\fB, void *\fP\fIcallback_fn\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Configure the timer to call \fIcallback_fn\fP static function. -.TP -.B Return -0 on success. -\fB\-EINVAL\fP if \fItimer\fP was not initialized with bpf_timer_init() earlier. -\fB\-EPERM\fP if \fItimer\fP is in a map that doesn\(aqt have any user references. -The user space should either hold a file descriptor to a map with timers -or pin such map in bpffs. When map is unpinned or file descriptor is -closed all timers in the map will be cancelled and freed. 
-.UNINDENT -.TP -.B \fBlong bpf_timer_start(struct bpf_timer *\fP\fItimer\fP\fB, u64\fP \fInsecs\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Set timer expiration N nanoseconds from the current time. The -configured callback will be invoked in soft irq context on some cpu -and will not repeat unless another bpf_timer_start() is made. -In such case the next invocation can migrate to a different cpu. -Since struct bpf_timer is a field inside map element the map -owns the timer. The bpf_timer_set_callback() will increment refcnt -of BPF program to make sure that callback_fn code stays valid. -When user space reference to a map reaches zero all timers -in a map are cancelled and corresponding program\(aqs refcnts are -decremented. This is done to make sure that Ctrl\-C of a user -process doesn\(aqt leave any timers running. If map is pinned in -bpffs the callback_fn can re\-arm itself indefinitely. -bpf_map_update/delete_elem() helpers and user space sys_bpf commands -cancel and free the timer in the given map element. -The map can contain timers that invoke callback_fn\-s from different -programs. The same callback_fn can serve different timers from -different maps if key/value layout matches across maps. -Every bpf_timer_set_callback() can have different callback_fn. -.sp -\fIflags\fP can be one of: -.INDENT 7.0 -.TP -.B \fBBPF_F_TIMER_ABS\fP -Start the timer in absolute expire value instead of the -default relative one. -.TP -.B \fBBPF_F_TIMER_CPU_PIN\fP -Timer will be pinned to the CPU of the caller. -.UNINDENT -.TP -.B Return -0 on success. -\fB\-EINVAL\fP if \fItimer\fP was not initialized with bpf_timer_init() earlier -or invalid \fIflags\fP are passed. -.UNINDENT -.TP -.B \fBlong bpf_timer_cancel(struct bpf_timer *\fP\fItimer\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Cancel the timer and wait for callback_fn to finish if it was running. -.TP -.B Return -0 if the timer was not active. -1 if the timer was active. 
-\fB\-EINVAL\fP if \fItimer\fP was not initialized with bpf_timer_init() earlier. -\fB\-EDEADLK\fP if callback_fn tried to call bpf_timer_cancel() on its -own timer which would have led to a deadlock otherwise. -.UNINDENT -.TP -.B \fBu64 bpf_get_func_ip(void *\fP\fIctx\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get address of the traced function (for tracing and kprobe programs). -.sp -When called for a kprobe program attached as a uprobe it returns the -probe address for both entry and return uprobe. -.TP -.B Return -Address of the traced function for kprobe. -0 for kprobes placed within the function (not at the entry). -Address of the probe for uprobe and return uprobe. -.UNINDENT -.TP -.B \fBu64 bpf_get_attach_cookie(void *\fP\fIctx\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get bpf_cookie value provided (optionally) during the program -attachment. It might be different for each individual -attachment, even if the BPF program itself is the same. -Expects BPF program context \fIctx\fP as a first argument. -.INDENT 7.0 -.TP -.B Supported for the following program types: -.INDENT 7.0 -.IP \(bu 2 -kprobe/uprobe; -.IP \(bu 2 -tracepoint; -.IP \(bu 2 -perf_event. -.UNINDENT -.UNINDENT -.TP -.B Return -Value specified by user at BPF link creation/attachment time -or 0, if it was not specified. -.UNINDENT -.TP -.B \fBlong bpf_task_pt_regs(struct task_struct *\fP\fItask\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get the struct pt_regs associated with \fBtask\fP\&. -.TP -.B Return -A pointer to struct pt_regs. -.UNINDENT -.TP -.B \fBlong bpf_get_branch_snapshot(void *\fP\fIentries\fP\fB, u32\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get branch trace from hardware engines like Intel LBR. The -hardware engine is stopped shortly after the helper is -called. Therefore, the user needs to filter branch entries -based on the actual use case.
To capture branch trace -before the trigger point of the BPF program, the helper -should be called at the beginning of the BPF program. -.sp -The data is stored as struct perf_branch_entry in the output -buffer \fIentries\fP\&. \fIsize\fP is the size of \fIentries\fP in bytes. -\fIflags\fP is reserved for now and must be zero. -.TP -.B Return -On success, number of bytes written to \fIentries\fP\&. On error, a -negative value. -.sp -\fB\-EINVAL\fP if \fIflags\fP is not zero. -.sp -\fB\-ENOENT\fP if the architecture does not support branch records. -.UNINDENT -.TP -.B \fBlong bpf_trace_vprintk(const char *\fP\fIfmt\fP\fB, u32\fP \fIfmt_size\fP\fB, const void *\fP\fIdata\fP\fB, u32\fP \fIdata_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Behaves like \fBbpf_trace_printk\fP() helper, but takes an array of u64 -to format and can handle more format args as a result. -.sp -Arguments are to be used as in \fBbpf_seq_printf\fP() helper. -.TP -.B Return -The number of bytes written to the buffer, or a negative error -in case of failure. -.UNINDENT -.TP -.B \fBstruct unix_sock *bpf_skc_to_unix_sock(void *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Dynamically cast a \fIsk\fP pointer to a \fIunix_sock\fP pointer. -.TP -.B Return -\fIsk\fP if casting is valid, or \fBNULL\fP otherwise. -.UNINDENT -.TP -.B \fBlong bpf_kallsyms_lookup_name(const char *\fP\fIname\fP\fB, int\fP \fIname_sz\fP\fB, int\fP \fIflags\fP\fB, u64 *\fP\fIres\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get the address of a kernel symbol, returned in \fIres\fP\&. \fIres\fP is -set to 0 if the symbol is not found. -.TP -.B Return -On success, zero. On error, a negative value. -.sp -\fB\-EINVAL\fP if \fIflags\fP is not zero. -.sp -\fB\-EINVAL\fP if string \fIname\fP is not the same size as \fIname_sz\fP\&. -.sp -\fB\-ENOENT\fP if the symbol is not found. -.sp -\fB\-EPERM\fP if the caller does not have permission to obtain the kernel address. 
-.UNINDENT -.TP -.B \fBlong bpf_find_vma(struct task_struct *\fP\fItask\fP\fB, u64\fP \fIaddr\fP\fB, void *\fP\fIcallback_fn\fP\fB, void *\fP\fIcallback_ctx\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Find the vma of \fItask\fP that contains \fIaddr\fP and call the \fIcallback_fn\fP -function with \fItask\fP, \fIvma\fP, and \fIcallback_ctx\fP\&. -The \fIcallback_fn\fP should be a static function and -the \fIcallback_ctx\fP should be a pointer to the stack. -The \fIflags\fP is used to control certain aspects of the helper. -Currently, the \fIflags\fP must be 0. -.sp -The expected callback signature is -.sp -long (*callback_fn)(struct task_struct *task, struct vm_area_struct *vma, void *callback_ctx); -.TP -.B Return -0 on success. -\fB\-ENOENT\fP if \fItask\->mm\fP is NULL, or no vma contains \fIaddr\fP\&. -\fB\-EBUSY\fP if the attempt to trylock mmap_lock failed. -\fB\-EINVAL\fP for invalid \fBflags\fP\&. -.UNINDENT -.TP -.B \fBlong bpf_loop(u32\fP \fInr_loops\fP\fB, void *\fP\fIcallback_fn\fP\fB, void *\fP\fIcallback_ctx\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -For \fBnr_loops\fP iterations, call the \fBcallback_fn\fP function -with \fBcallback_ctx\fP as the context parameter. -The \fBcallback_fn\fP should be a static function and -the \fBcallback_ctx\fP should be a pointer to the stack. -The \fBflags\fP is used to control certain aspects of the helper. -Currently, the \fBflags\fP must be 0. Currently, nr_loops is -limited to 1 << 23 (~8 million) loops. -.sp -The expected callback signature is -.sp -long (*callback_fn)(u32 index, void *ctx); -.sp -where \fBindex\fP is the current index in the loop. The index -is zero\-based. -.sp -If \fBcallback_fn\fP returns 0, the helper will continue to the next -loop. If the return value is 1, the helper will skip the rest of -the loops and return. Other return values are not used now, -and will be rejected by the verifier. 
-.TP -.B Return -The number of loops performed, \fB\-EINVAL\fP for invalid \fBflags\fP, or -\fB\-E2BIG\fP if \fBnr_loops\fP exceeds the maximum number of loops. -.UNINDENT -.TP -.B \fBlong bpf_strncmp(const char *\fP\fIs1\fP\fB, u32\fP \fIs1_sz\fP\fB, const char *\fP\fIs2\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Do strncmp() between \fBs1\fP and \fBs2\fP\&. \fBs1\fP doesn\(aqt need -to be null\-terminated and \fBs1_sz\fP is the maximum storage -size of \fBs1\fP\&. \fBs2\fP must be a read\-only string. -.TP -.B Return -An integer less than, equal to, or greater than zero -if the first \fBs1_sz\fP bytes of \fBs1\fP are found to be -less than, to match, or be greater than \fBs2\fP\&. -.UNINDENT -.TP -.B \fBlong bpf_get_func_arg(void *\fP\fIctx\fP\fB, u32\fP \fIn\fP\fB, u64 *\fP\fIvalue\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get the \fBn\fP\-th argument register (zero based) of the traced function (for tracing programs), -returned in \fBvalue\fP\&. -.TP -.B Return -0 on success. -\fB\-EINVAL\fP if n >= argument register count of traced function. -.UNINDENT -.TP -.B \fBlong bpf_get_func_ret(void *\fP\fIctx\fP\fB, u64 *\fP\fIvalue\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get return value of the traced function (for tracing programs) -in \fBvalue\fP\&. -.TP -.B Return -0 on success. -\fB\-EOPNOTSUPP\fP for tracing programs other than BPF_TRACE_FEXIT or BPF_MODIFY_RETURN. -.UNINDENT -.TP -.B \fBlong bpf_get_func_arg_cnt(void *\fP\fIctx\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get the number of registers of the traced function (for tracing programs) in which -function arguments are stored. -.TP -.B Return -The number of argument registers of the traced function. -.UNINDENT -.TP -.B \fBint bpf_get_retval(void)\fP -.INDENT 7.0 -.TP -.B Description -Get the BPF program\(aqs return value that will be returned to the upper layers. 
-.sp -This helper is currently supported by cgroup programs and only by the hooks -where the BPF program\(aqs return value is returned to user space via errno. -.TP -.B Return -The BPF program\(aqs return value. -.UNINDENT -.TP -.B \fBint bpf_set_retval(int\fP \fIretval\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Set the BPF program\(aqs return value that will be returned to the upper layers. -.sp -This helper is currently supported by cgroup programs and only by the hooks -where the BPF program\(aqs return value is returned to user space via errno. -.sp -Note that there is the following corner case where the program exports an error -via bpf_set_retval but signals success via \(aqreturn 1\(aq: -.INDENT 7.0 -.INDENT 3.5 -bpf_set_retval(\-EPERM); -return 1; -.UNINDENT -.UNINDENT -.sp -In this case, the BPF program\(aqs return value will be the helper\(aqs \-EPERM. This -still holds true for cgroup/bind{4,6}, which supports an extra \(aqreturn 3\(aq success case. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBu64 bpf_xdp_get_buff_len(struct xdp_buff *\fP\fIxdp_md\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get the total size of a given xdp buff (linear and paged area). -.TP -.B Return -The total size of a given xdp buffer. -.UNINDENT -.TP -.B \fBlong bpf_xdp_load_bytes(struct xdp_buff *\fP\fIxdp_md\fP\fB, u32\fP \fIoffset\fP\fB, void *\fP\fIbuf\fP\fB, u32\fP \fIlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -This helper is provided as an easy way to load data from an -xdp buffer. It can be used to load \fIlen\fP bytes from \fIoffset\fP from -the frame associated with \fIxdp_md\fP, into the buffer pointed to by -\fIbuf\fP\&. -.TP -.B Return -0 on success, or a negative error in case of failure. 
-.UNINDENT -.TP -.B \fBlong bpf_xdp_store_bytes(struct xdp_buff *\fP\fIxdp_md\fP\fB, u32\fP \fIoffset\fP\fB, void *\fP\fIbuf\fP\fB, u32\fP \fIlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Store \fIlen\fP bytes from buffer \fIbuf\fP into the frame -associated with \fIxdp_md\fP, at \fIoffset\fP\&. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBlong bpf_copy_from_user_task(void *\fP\fIdst\fP\fB, u32\fP \fIsize\fP\fB, const void *\fP\fIuser_ptr\fP\fB, struct task_struct *\fP\fItsk\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Read \fIsize\fP bytes from user space address \fIuser_ptr\fP in \fItsk\fP\(aqs -address space, and store the data in \fIdst\fP\&. \fIflags\fP is not -used yet and is provided for future extensibility. This helper -can only be used by sleepable programs. -.TP -.B Return -0 on success, or a negative error in case of failure. On error -\fIdst\fP buffer is zeroed out. -.UNINDENT -.TP -.B \fBlong bpf_skb_set_tstamp(struct sk_buff *\fP\fIskb\fP\fB, u64\fP \fItstamp\fP\fB, u32\fP \fItstamp_type\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Set __sk_buff\->tstamp_type to \fItstamp_type\fP -and __sk_buff\->tstamp to \fItstamp\fP together. -.sp -If there is no need to change the __sk_buff\->tstamp_type, -the tstamp value can be directly written to __sk_buff\->tstamp -instead. -.sp -BPF_SKB_TSTAMP_DELIVERY_MONO is the only tstamp that -will be kept during bpf_redirect_*(). A non\-zero -\fItstamp\fP must be used with the BPF_SKB_TSTAMP_DELIVERY_MONO -\fItstamp_type\fP\&. -.sp -A BPF_SKB_TSTAMP_UNSPEC \fItstamp_type\fP can only be used -with a zero \fItstamp\fP\&. -.sp -Only IPv4 and IPv6 skb\->protocol are supported. -.sp -This helper is most useful when a program needs to set a -mono delivery time in __sk_buff\->tstamp and then -bpf_redirect_*() to the egress of an iface. 
For example, -changing the (rcv) timestamp in __sk_buff\->tstamp at -ingress to a mono delivery time and then bpf_redirect_*() -to \fI\%sch_fq@phy\-dev\fP\&. -.TP -.B Return -0 on success. -\fB\-EINVAL\fP for invalid input. -\fB\-EOPNOTSUPP\fP for unsupported protocol. -.UNINDENT -.TP -.B \fBlong bpf_ima_file_hash(struct file *\fP\fIfile\fP\fB, void *\fP\fIdst\fP\fB, u32\fP \fIsize\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Returns a calculated IMA hash of the \fIfile\fP\&. -If the hash is larger than \fIsize\fP, then only \fIsize\fP -bytes will be copied to \fIdst\fP\&. -.TP -.B Return -The \fBhash_algo\fP is returned on success, -\fB\-EOPNOTSUPP\fP if the hash calculation failed or \fB\-EINVAL\fP if -invalid arguments are passed. -.UNINDENT -.TP -.B \fBvoid *bpf_kptr_xchg(void *\fP\fImap_value\fP\fB, void *\fP\fIptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Exchange kptr at pointer \fImap_value\fP with \fIptr\fP, and return the -old value. \fIptr\fP can be NULL, otherwise it must be a referenced -pointer which will be released when this helper is called. -.TP -.B Return -The old value of kptr (which can be NULL). The returned pointer, -if not NULL, is a reference which must be released using its -corresponding release function, or moved into a BPF map before -program exit. -.UNINDENT -.TP -.B \fBvoid *bpf_map_lookup_percpu_elem(struct bpf_map *\fP\fImap\fP\fB, const void *\fP\fIkey\fP\fB, u32\fP \fIcpu\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Perform a lookup in \fIpercpu map\fP for an entry associated with -\fIkey\fP on \fIcpu\fP\&. -.TP -.B Return -Map value associated with \fIkey\fP on \fIcpu\fP, or \fBNULL\fP if no entry -was found or \fIcpu\fP is invalid. -.UNINDENT -.TP -.B \fBstruct mptcp_sock *bpf_skc_to_mptcp_sock(void *\fP\fIsk\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Dynamically cast a \fIsk\fP pointer to a \fImptcp_sock\fP pointer. -.TP -.B Return -\fIsk\fP if casting is valid, or \fBNULL\fP otherwise. 
-.UNINDENT -.TP -.B \fBlong bpf_dynptr_from_mem(void *\fP\fIdata\fP\fB, u32\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB, struct bpf_dynptr *\fP\fIptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get a dynptr to local memory \fIdata\fP\&. -.sp -\fIdata\fP must be a pointer to a map value. -The maximum \fIsize\fP supported is DYNPTR_MAX_SIZE. -\fIflags\fP is currently unused. -.TP -.B Return -0 on success, \-E2BIG if the size exceeds DYNPTR_MAX_SIZE, -\-EINVAL if flags is not 0. -.UNINDENT -.TP -.B \fBlong bpf_ringbuf_reserve_dynptr(void *\fP\fIringbuf\fP\fB, u32\fP \fIsize\fP\fB, u64\fP \fIflags\fP\fB, struct bpf_dynptr *\fP\fIptr\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Reserve \fIsize\fP bytes of payload in a ring buffer \fIringbuf\fP -through the dynptr interface. \fIflags\fP must be 0. -.sp -Please note that a corresponding bpf_ringbuf_submit_dynptr or -bpf_ringbuf_discard_dynptr must be called on \fIptr\fP, even if the -reservation fails. This is enforced by the verifier. -.TP -.B Return -0 on success, or a negative error in case of failure. -.UNINDENT -.TP -.B \fBvoid bpf_ringbuf_submit_dynptr(struct bpf_dynptr *\fP\fIptr\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Submit reserved ring buffer sample, pointed to by \fIptr\fP, -through the dynptr interface. This is a no\-op if the dynptr is -invalid/null. -.sp -For more information on \fIflags\fP, please see -\(aqbpf_ringbuf_submit\(aq. -.TP -.B Return -Nothing. Always succeeds. -.UNINDENT -.TP -.B \fBvoid bpf_ringbuf_discard_dynptr(struct bpf_dynptr *\fP\fIptr\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Discard reserved ring buffer sample through the dynptr -interface. This is a no\-op if the dynptr is invalid/null. -.sp -For more information on \fIflags\fP, please see -\(aqbpf_ringbuf_discard\(aq. -.TP -.B Return -Nothing. Always succeeds. 
-.UNINDENT -.TP -.B \fBlong bpf_dynptr_read(void *\fP\fIdst\fP\fB, u32\fP \fIlen\fP\fB, const struct bpf_dynptr *\fP\fIsrc\fP\fB, u32\fP \fIoffset\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Read \fIlen\fP bytes from \fIsrc\fP into \fIdst\fP, starting from \fIoffset\fP -into \fIsrc\fP\&. -\fIflags\fP is currently unused. -.TP -.B Return -0 on success, \-E2BIG if \fIoffset\fP + \fIlen\fP exceeds the length -of \fIsrc\fP\(aqs data, \-EINVAL if \fIsrc\fP is an invalid dynptr or if -\fIflags\fP is not 0. -.UNINDENT -.TP -.B \fBlong bpf_dynptr_write(const struct bpf_dynptr *\fP\fIdst\fP\fB, u32\fP \fIoffset\fP\fB, void *\fP\fIsrc\fP\fB, u32\fP \fIlen\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Write \fIlen\fP bytes from \fIsrc\fP into \fIdst\fP, starting from \fIoffset\fP -into \fIdst\fP\&. -.sp -\fIflags\fP must be 0 except for skb\-type dynptrs. -.INDENT 7.0 -.TP -.B For skb\-type dynptrs: -.INDENT 7.0 -.IP \(bu 2 -All data slices of the dynptr are automatically -invalidated after \fBbpf_dynptr_write\fP(). This is -because writing may pull the skb and change the -underlying packet buffer. -.IP \(bu 2 -For \fIflags\fP, please see the flags accepted by -\fBbpf_skb_store_bytes\fP(). -.UNINDENT -.UNINDENT -.TP -.B Return -0 on success, \-E2BIG if \fIoffset\fP + \fIlen\fP exceeds the length -of \fIdst\fP\(aqs data, \-EINVAL if \fIdst\fP is an invalid dynptr or if \fIdst\fP -is a read\-only dynptr or if \fIflags\fP is not correct. For skb\-type dynptrs, -other errors correspond to errors returned by \fBbpf_skb_store_bytes\fP(). -.UNINDENT -.TP -.B \fBvoid *bpf_dynptr_data(const struct bpf_dynptr *\fP\fIptr\fP\fB, u32\fP \fIoffset\fP\fB, u32\fP \fIlen\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get a pointer to the underlying dynptr data. -.sp -\fIlen\fP must be a statically known value. The returned data slice -is invalidated whenever the dynptr is invalidated. -.sp -skb and xdp type dynptrs may not use bpf_dynptr_data. 
They should -instead use bpf_dynptr_slice and bpf_dynptr_slice_rdwr. -.TP -.B Return -Pointer to the underlying dynptr data, NULL if the dynptr is -read\-only, if the dynptr is invalid, or if the offset and length -are out of bounds. -.UNINDENT -.TP -.B \fBs64 bpf_tcp_raw_gen_syncookie_ipv4(struct iphdr *\fP\fIiph\fP\fB, struct tcphdr *\fP\fIth\fP\fB, u32\fP \fIth_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Try to issue a SYN cookie for the packet with corresponding -IPv4/TCP headers, \fIiph\fP and \fIth\fP, without depending on a -listening socket. -.sp -\fIiph\fP points to the IPv4 header. -.sp -\fIth\fP points to the start of the TCP header, while \fIth_len\fP -contains the length of the TCP header (at least -\fBsizeof\fP(\fBstruct tcphdr\fP)). -.TP -.B Return -On success, lower 32 bits hold the generated SYN cookie, -followed by 16 bits which hold the MSS value for that cookie, -and the top 16 bits are unused. -.sp -On failure, the returned value is one of the following: -.sp -\fB\-EINVAL\fP if \fIth_len\fP is invalid. -.UNINDENT -.TP -.B \fBs64 bpf_tcp_raw_gen_syncookie_ipv6(struct ipv6hdr *\fP\fIiph\fP\fB, struct tcphdr *\fP\fIth\fP\fB, u32\fP \fIth_len\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Try to issue a SYN cookie for the packet with corresponding -IPv6/TCP headers, \fIiph\fP and \fIth\fP, without depending on a -listening socket. -.sp -\fIiph\fP points to the IPv6 header. -.sp -\fIth\fP points to the start of the TCP header, while \fIth_len\fP -contains the length of the TCP header (at least -\fBsizeof\fP(\fBstruct tcphdr\fP)). -.TP -.B Return -On success, lower 32 bits hold the generated SYN cookie, -followed by 16 bits which hold the MSS value for that cookie, -and the top 16 bits are unused. -.sp -On failure, the returned value is one of the following: -.sp -\fB\-EINVAL\fP if \fIth_len\fP is invalid. -.sp -\fB\-EPROTONOSUPPORT\fP if CONFIG_IPV6 is not builtin. 
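The s64 value returned by the bpf_tcp_raw_gen_syncookie_ipv4() and bpf_tcp_raw_gen_syncookie_ipv6() helpers packs two fields, so a caller has to split it before use. The following is a minimal sketch of that decoding in plain C, based only on the bit layout stated in the Return sections above. The function names syncookie_failed(), syncookie_value() and syncookie_mss() are hypothetical and are not part of any kernel or libbpf API.

```c
#include <assert.h>
#include <stdint.h>

/* A negative return value is an error code such as -EINVAL. */
static inline int syncookie_failed(int64_t ret)
{
	return ret < 0;
}

/* Lower 32 bits: the generated SYN cookie. */
static inline uint32_t syncookie_value(int64_t ret)
{
	assert(!syncookie_failed(ret));
	return (uint32_t)((uint64_t)ret & 0xffffffffu);
}

/* Next 16 bits (bits 32 to 47): the MSS value for that cookie. */
static inline uint16_t syncookie_mss(int64_t ret)
{
	assert(!syncookie_failed(ret));
	return (uint16_t)(((uint64_t)ret >> 32) & 0xffffu);
}
```

Checking for a negative value first matters because an error code reinterpreted as an unsigned 64-bit value would otherwise decode into a meaningless cookie and MSS.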
-.UNINDENT -.TP -.B \fBlong bpf_tcp_raw_check_syncookie_ipv4(struct iphdr *\fP\fIiph\fP\fB, struct tcphdr *\fP\fIth\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Check whether \fIiph\fP and \fIth\fP contain a valid SYN cookie ACK -without depending on a listening socket. -.sp -\fIiph\fP points to the IPv4 header. -.sp -\fIth\fP points to the TCP header. -.TP -.B Return -0 if \fIiph\fP and \fIth\fP are a valid SYN cookie ACK. -.sp -On failure, the returned value is one of the following: -.sp -\fB\-EACCES\fP if the SYN cookie is not valid. -.UNINDENT -.TP -.B \fBlong bpf_tcp_raw_check_syncookie_ipv6(struct ipv6hdr *\fP\fIiph\fP\fB, struct tcphdr *\fP\fIth\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Check whether \fIiph\fP and \fIth\fP contain a valid SYN cookie ACK -without depending on a listening socket. -.sp -\fIiph\fP points to the IPv6 header. -.sp -\fIth\fP points to the TCP header. -.TP -.B Return -0 if \fIiph\fP and \fIth\fP are a valid SYN cookie ACK. -.sp -On failure, the returned value is one of the following: -.sp -\fB\-EACCES\fP if the SYN cookie is not valid. -.sp -\fB\-EPROTONOSUPPORT\fP if CONFIG_IPV6 is not builtin. -.UNINDENT -.TP -.B \fBu64 bpf_ktime_get_tai_ns(void)\fP -.INDENT 7.0 -.TP -.B Description -A nonsettable system\-wide clock derived from wall\-clock time but -ignoring leap seconds. This clock does not experience -discontinuities and backwards jumps caused by NTP inserting leap -seconds as CLOCK_REALTIME does. -.sp -See: \fBclock_gettime\fP(\fBCLOCK_TAI\fP) -.TP -.B Return -Current \fIktime\fP\&. 
-.UNINDENT -.TP -.B \fBlong bpf_user_ringbuf_drain(struct bpf_map *\fP\fImap\fP\fB, void *\fP\fIcallback_fn\fP\fB, void *\fP\fIctx\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Drain samples from the specified user ring buffer, and invoke -the provided callback for each such sample: -.sp -long (*callback_fn)(const struct bpf_dynptr *dynptr, void *ctx); -.sp -If \fBcallback_fn\fP returns 0, the helper will continue to try -and drain the next sample, up to a maximum of -BPF_MAX_USER_RINGBUF_SAMPLES samples. If the return value is 1, -the helper will skip the rest of the samples and return. Other -return values are not used now, and will be rejected by the -verifier. -.TP -.B Return -The number of drained samples if no error was encountered while -draining samples, or 0 if no samples were present in the ring -buffer. If a user\-space producer was epoll\-waiting on this map, -and at least one sample was drained, they will receive an event -notification notifying them of available space in the ring -buffer. If the BPF_RB_NO_WAKEUP flag is passed to this -function, no wakeup notification will be sent. If the -BPF_RB_FORCE_WAKEUP flag is passed, a wakeup notification will -be sent even if no sample was drained. -.sp -On failure, the returned value is one of the following: -.sp -\fB\-EBUSY\fP if the ring buffer is contended, and another calling -context was concurrently draining the ring buffer. -.sp -\fB\-EINVAL\fP if user\-space is not properly tracking the ring -buffer due to the producer position not being aligned to 8 -bytes, a sample not being aligned to 8 bytes, or the producer -position not matching the advertised length of a sample. -.sp -\fB\-E2BIG\fP if user\-space has tried to publish a sample which is -larger than the size of the ring buffer, or which cannot fit -within a struct bpf_dynptr. 
-.UNINDENT -.TP -.B \fBvoid *bpf_cgrp_storage_get(struct bpf_map *\fP\fImap\fP\fB, struct cgroup *\fP\fIcgroup\fP\fB, void *\fP\fIvalue\fP\fB, u64\fP \fIflags\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Get a bpf_local_storage from the \fIcgroup\fP\&. -.sp -Logically, it could be thought of as getting the value from -a \fImap\fP with \fIcgroup\fP as the \fBkey\fP\&. From this -perspective, the usage is not much different from -\fBbpf_map_lookup_elem\fP(\fImap\fP, \fB&\fP\fIcgroup\fP) except this -helper enforces the key must be a cgroup struct and the map must also -be a \fBBPF_MAP_TYPE_CGRP_STORAGE\fP\&. -.sp -In reality, the local\-storage value is embedded directly inside of the -\fIcgroup\fP object itself, rather than being located in the -\fBBPF_MAP_TYPE_CGRP_STORAGE\fP map. When the local\-storage value is -queried for some \fImap\fP on a \fIcgroup\fP object, the kernel will perform an -O(n) iteration over all of the live local\-storage values for that -\fIcgroup\fP object until the local\-storage value for the \fImap\fP is found. -.sp -An optional \fIflags\fP (\fBBPF_LOCAL_STORAGE_GET_F_CREATE\fP) can be -used such that a new bpf_local_storage will be -created if one does not exist. \fIvalue\fP can be used -together with \fBBPF_LOCAL_STORAGE_GET_F_CREATE\fP to specify -the initial value of a bpf_local_storage. If \fIvalue\fP is -\fBNULL\fP, the new bpf_local_storage will be zero initialized. -.TP -.B Return -A bpf_local_storage pointer is returned on success. -.sp -\fBNULL\fP if not found or there was an error in adding -a new bpf_local_storage. -.UNINDENT -.TP -.B \fBlong bpf_cgrp_storage_delete(struct bpf_map *\fP\fImap\fP\fB, struct cgroup *\fP\fIcgroup\fP\fB)\fP -.INDENT 7.0 -.TP -.B Description -Delete a bpf_local_storage from a \fIcgroup\fP\&. -.TP -.B Return -0 on success. -.sp -\fB\-ENOENT\fP if the bpf_local_storage cannot be found. 
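The lookup, create and delete semantics described for bpf_cgrp_storage_get() and bpf_cgrp_storage_delete() can be pictured with a toy user-space model. This is a sketch under stated assumptions and bears no relation to the actual kernel data structures: every name below (model_cgroup, model_storage, model_storage_get(), model_storage_delete(), MODEL_F_CREATE) is hypothetical, MODEL_F_CREATE merely stands in for \fBBPF_LOCAL_STORAGE_GET_F_CREATE\fP, and the value size is passed explicitly because the model has no map object to take it from.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for BPF_LOCAL_STORAGE_GET_F_CREATE (value is an assumption). */
#define MODEL_F_CREATE 1ULL

struct model_storage {
	const void *map;		/* map this value belongs to */
	struct model_storage *next;	/* next value on the same cgroup */
	unsigned char data[];		/* the stored value itself */
};

/* Values hang off the cgroup object, not off the map. */
struct model_cgroup {
	struct model_storage *storages;
};

static void *model_storage_get(const void *map, size_t value_size,
			       struct model_cgroup *cg, const void *init,
			       unsigned long long flags)
{
	struct model_storage *s;

	/* O(n) walk over all live values of this cgroup. */
	for (s = cg->storages; s; s = s->next)
		if (s->map == map)
			return s->data;

	if (!(flags & MODEL_F_CREATE))
		return NULL;		/* not found, creation not requested */

	s = calloc(1, sizeof(*s) + value_size);	/* zero initialized */
	if (!s)
		return NULL;		/* error while adding a new value */
	s->map = map;
	if (init)
		memcpy(s->data, init, value_size);
	s->next = cg->storages;
	cg->storages = s;
	return s->data;
}

static long model_storage_delete(const void *map, struct model_cgroup *cg)
{
	struct model_storage **pp, *s;

	for (pp = &cg->storages; (s = *pp); pp = &s->next) {
		if (s->map == map) {
			*pp = s->next;	/* unlink and release the value */
			free(s);
			return 0;
		}
	}
	return -ENOENT;
}
```

The model keeps one value per (cgroup, map) pair, which is why a second get for the same map returns the same pointer, and why a second delete reports that nothing was found.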
-.UNINDENT -.UNINDENT -.SH EXAMPLES -.sp -Examples of usage for most of the eBPF helpers listed in this manual page are -available within the Linux kernel sources, at the following locations: -.INDENT 0.0 -.IP \(bu 2 -\fIsamples/bpf/\fP -.IP \(bu 2 -\fItools/testing/selftests/bpf/\fP -.UNINDENT -.SH LICENSE -.sp -eBPF programs can have an associated license, passed along with the bytecode -instructions to the kernel when the programs are loaded. The format for that -string is identical to the one in use for kernel modules (dual licenses, such -as \(dqDual BSD/GPL\(dq, may be used). Some helper functions are only accessible to -programs that are compatible with the GNU General Public License (GNU GPL). -.sp -In order to use such helpers, the eBPF program must be loaded with the correct -license string passed (via \fBattr\fP) to the \fBbpf\fP() system call, and this -generally translates into the C source code of the program containing a line -similar to the following: -.INDENT 0.0 -.INDENT 3.5 -.sp -.EX -char ____license[] __attribute__((section(\(dqlicense\(dq), used)) = \(dqGPL\(dq; -.EE -.UNINDENT -.UNINDENT -.SH IMPLEMENTATION -.sp -This manual page is an effort to document the existing eBPF helper functions. -However, as of this writing, the BPF sub\-system is under heavy development. New eBPF -program or map types are added, along with new helper functions. Some helpers -are occasionally made available for additional program types. So in spite of -the efforts of the community, this page might not be up\-to\-date. If you want to -check for yourself what helper functions exist in your kernel, or what types of -programs they can support, here are some files in the kernel tree that you -may be interested in: -.INDENT 0.0 -.IP \(bu 2 -\fIinclude/uapi/linux/bpf.h\fP is the main BPF header. It contains the full list -of all helper functions, as well as many other BPF definitions including most -of the flags, structs or constants used by the helpers. 
-.IP \(bu 2 -\fInet/core/filter.c\fP contains the definition of most network\-related helper -functions, and the list of program types from which they can be used. -.IP \(bu 2 -\fIkernel/trace/bpf_trace.c\fP is the equivalent for most tracing program\-related -helpers. -.IP \(bu 2 -\fIkernel/bpf/verifier.c\fP contains the functions used to check that valid types -of eBPF maps are used with a given helper function. -.IP \(bu 2 -\fIkernel/bpf/\fP directory contains other files in which additional helpers are -defined (for cgroups, sockmaps, etc.). -.IP \(bu 2 -The bpftool utility can be used to probe the availability of helper functions -on the system (as well as supported program and map types, and a number of -other parameters). To do so, run \fBbpftool feature probe\fP (see -\fBbpftool\-feature\fP(8) for details). Add the \fBunprivileged\fP keyword to -list features available to unprivileged users. -.UNINDENT -.sp -Compatibility between helper functions and program types can generally be found -in the files where helper functions are defined. Look for the \fBstruct -bpf_func_proto\fP objects and for functions returning them: these functions -contain a list of helpers that a given program type can call. Note that the -\fBdefault:\fP label of the \fBswitch ... case\fP used to filter helpers can call -other functions, themselves allowing access to additional helpers. The -requirement for GPL license is also in those \fBstruct bpf_func_proto\fP\&. -.sp -Compatibility between helper functions and map types can be found in the -\fBcheck_map_func_compatibility\fP() function in file \fIkernel/bpf/verifier.c\fP\&. -.sp -Helper functions that invalidate the checks on \fBdata\fP and \fBdata_end\fP -pointers for network processing are listed in function -\fBbpf_helper_changes_pkt_data\fP() in file \fInet/core/filter.c\fP\&. 
-.SH SEE ALSO -.sp -\fBbpf\fP(2), -\fBbpftool\fP(8), -\fBcgroups\fP(7), -\fBip\fP(8), -\fBperf_event_open\fP(2), -\fBsendmsg\fP(2), -\fBsocket\fP(7), -\fBtc\-bpf\fP(8) -.\" Generated by docutils manpage writer. -. |