summaryrefslogtreecommitdiffstats
path: root/man7/vdso.7
diff options
context:
space:
mode:
Diffstat (limited to 'man7/vdso.7')
-rw-r--r--man7/vdso.7612
1 files changed, 0 insertions, 612 deletions
diff --git a/man7/vdso.7 b/man7/vdso.7
deleted file mode 100644
index d37360c..0000000
--- a/man7/vdso.7
+++ /dev/null
@@ -1,612 +0,0 @@
-'\" t
-.\" Written by Mike Frysinger <vapier@gentoo.org>
-.\"
-.\" %%%LICENSE_START(PUBLIC_DOMAIN)
-.\" This page is in the public domain.
-.\" %%%LICENSE_END
-.\"
-.\" Useful background:
-.\" http://articles.manugarg.com/systemcallinlinux2_6.html
-.\" https://lwn.net/Articles/446528/
-.\" http://www.linuxjournal.com/content/creating-vdso-colonels-other-chicken
-.\" http://www.trilithium.com/johan/2005/08/linux-gate/
-.\"
-.TH vDSO 7 2024-02-18 "Linux man-pages 6.7"
-.SH NAME
-vdso \- overview of the virtual ELF dynamic shared object
-.SH SYNOPSIS
-.nf
-.B #include <sys/auxv.h>
-.P
-.B void *vdso = (uintptr_t) getauxval(AT_SYSINFO_EHDR);
-.fi
-.SH DESCRIPTION
-The "vDSO" (virtual dynamic shared object) is a small shared library that
-the kernel automatically maps into the
-address space of all user-space applications.
-Applications usually do not need to concern themselves with these details
-as the vDSO is most commonly called by the C library.
-This way you can code in the normal way using standard functions
-and the C library will take care
-of using any functionality that is available via the vDSO.
-.P
-Why does the vDSO exist at all?
-There are some system calls the kernel provides that
-user-space code ends up using frequently,
-to the point that such calls can dominate overall performance.
-This is due both to the frequency of the call as well as the
-context-switch overhead that results
-from exiting user space and entering the kernel.
-.P
-The rest of this documentation is geared toward the curious and/or
-C library writers rather than general developers.
-If you're trying to call the vDSO in your own application rather than using
-the C library, you're most likely doing it wrong.
-.SS Example background
-Making system calls can be slow.
-In x86 32-bit systems, you can trigger a software interrupt
-.RI ( "int $0x80" )
-to tell the kernel you wish to make a system call.
-However, this instruction is expensive: it goes through
-the full interrupt-handling paths
-in the processor's microcode as well as in the kernel.
-Newer processors have faster (but backward incompatible) instructions to
-initiate system calls.
-Rather than require the C library to figure out if this functionality is
-available at run time,
-the C library can use functions provided by the kernel in
-the vDSO.
-.P
-Note that the terminology can be confusing.
-On x86 systems, the vDSO function
-used to determine the preferred method of making a system call is
-named "__kernel_vsyscall", but on x86-64,
-the term "vsyscall" also refers to an obsolete way to ask the kernel
-what time it is or what CPU the caller is on.
-.P
-One frequently used system call is
-.BR gettimeofday (2).
-This system call is called both directly by user-space applications
-as well as indirectly by
-the C library.
-Think timestamps or timing loops or polling\[em]all of these
-frequently need to know what time it is right now.
-This information is also not secret\[em]any application in any
-privilege mode (root or any unprivileged user) will get the same answer.
-Thus the kernel arranges for the information required to answer
-this question to be placed in memory the process can access.
-Now a call to
-.BR gettimeofday (2)
-changes from a system call to a normal function
-call and a few memory accesses.
-.SS Finding the vDSO
-The base address of the vDSO (if one exists) is passed by the kernel to
-each program in the initial auxiliary vector (see
-.BR getauxval (3)),
-via the
-.B AT_SYSINFO_EHDR
-tag.
-.P
-You must not assume the vDSO is mapped at any particular location in the
-user's memory map.
-The base address will usually be randomized at run time every time a new
-process image is created (at
-.BR execve (2)
-time).
-This is done for security reasons,
-to prevent "return-to-libc" attacks.
-.P
-For some architectures, there is also an
-.B AT_SYSINFO
-tag.
-This is used only for locating the vsyscall entry point and is frequently
-omitted or set to 0 (meaning it's not available).
-This tag is a throwback to the initial vDSO work (see
-.I History
-below) and its use should be avoided.
-.SS File format
-Since the vDSO is a fully formed ELF image, you can do symbol lookups on it.
-This allows new symbols to be added with newer kernel releases,
-and allows the C library to detect available functionality at
-run time when running under different kernel versions.
-Oftentimes the C library will do detection with the first call and then
-cache the result for subsequent calls.
-.P
-All symbols are also versioned (using the GNU version format).
-This allows the kernel to update the function signature without breaking
-backward compatibility.
-This means changing the arguments that the function accepts as well as the
-return value.
-Thus, when looking up a symbol in the vDSO,
-you must always include the version
-to match the ABI you expect.
-.P
-Typically the vDSO follows the naming convention of prefixing
-all symbols with "__vdso_" or "__kernel_"
-so as to distinguish them from other standard symbols.
-For example, the "gettimeofday" function is named "__vdso_gettimeofday".
-.P
-You use the standard C calling conventions when calling
-any of these functions.
-No need to worry about weird register or stack behavior.
-.SH NOTES
-.SS Source
-When you compile the kernel,
-it will automatically compile and link the vDSO code for you.
-You will frequently find it under the architecture-specific directory:
-.P
-.in +4n
-.EX
-find arch/$ARCH/ \-name \[aq]*vdso*.so*\[aq] \-o \-name \[aq]*gate*.so*\[aq]
-.EE
-.in
-.\"
-.SS vDSO names
-The name of the vDSO varies across architectures.
-It will often show up in things like glibc's
-.BR ldd (1)
-output.
-The exact name should not matter to any code, so do not hardcode it.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-user ABI vDSO name
-_
-aarch64 linux\-vdso.so.1
-arm linux\-vdso.so.1
-ia64 linux\-gate.so.1
-mips linux\-vdso.so.1
-ppc/32 linux\-vdso32.so.1
-ppc/64 linux\-vdso64.so.1
-riscv linux\-vdso.so.1
-s390 linux\-vdso32.so.1
-s390x linux\-vdso64.so.1
-sh linux\-gate.so.1
-i386 linux\-gate.so.1
-x86-64 linux\-vdso.so.1
-x86/x32 linux\-vdso.so.1
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS strace(1), seccomp(2), and the vDSO
-When tracing system calls with
-.BR strace (1),
-symbols (system calls) that are exported by the vDSO will
-.I not
-appear in the trace output.
-Those system calls will likewise not be visible to
-.BR seccomp (2)
-filters.
-.SH ARCHITECTURE-SPECIFIC NOTES
-The subsections below provide architecture-specific notes
-on the vDSO.
-.P
-Note that the vDSO that is used is based on the ABI of your user-space code
-and not the ABI of the kernel.
-Thus, for example,
-when you run an i386 32-bit ELF binary,
-you'll get the same vDSO regardless of whether you run it under
-an i386 32-bit kernel or under an x86-64 64-bit kernel.
-Therefore, the name of the user-space ABI should be used to determine
-which of the sections below is relevant.
-.SS ARM functions
-.\" See linux/arch/arm/vdso/vdso.lds.S
-.\" Commit: 8512287a8165592466cb9cb347ba94892e9c56a5
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__vdso_gettimeofday LINUX_2.6 (exported since Linux 4.1)
-__vdso_clock_gettime LINUX_2.6 (exported since Linux 4.1)
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.P
-.\" See linux/arch/arm/kernel/entry-armv.S
-.\" See linux/Documentation/arm/kernel_user_helpers.rst
-Additionally, the ARM port has a code page full of utility functions.
-Since it's just a raw page of code, there is no ELF information for doing
-symbol lookups or versioning.
-It does provide support for different versions though.
-.P
-For information on this code page,
-it's best to refer to the kernel documentation
-as it's extremely detailed and covers everything you need to know:
-.IR Documentation/arm/kernel_user_helpers.rst .
-.SS aarch64 functions
-.\" See linux/arch/arm64/kernel/vdso/vdso.lds.S
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__kernel_rt_sigreturn LINUX_2.6.39
-__kernel_gettimeofday LINUX_2.6.39
-__kernel_clock_gettime LINUX_2.6.39
-__kernel_clock_getres LINUX_2.6.39
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS bfin (Blackfin) functions (port removed in Linux 4.17)
-.\" See linux/arch/blackfin/kernel/fixed_code.S
-.\" See http://docs.blackfin.uclinux.org/doku.php?id=linux-kernel:fixed-code
-As this CPU lacks a memory management unit (MMU),
-it doesn't set up a vDSO in the normal sense.
-Instead, it maps at boot time a few raw functions into
-a fixed location in memory.
-User-space applications then call directly into that region.
-There is no provision for backward compatibility
-beyond sniffing raw opcodes,
-but as this is an embedded CPU, it can get away with things\[em]some of the
-object formats it runs aren't even ELF based (they're bFLT/FLAT).
-.P
-For information on this code page,
-it's best to refer to the public documentation:
-.br
-http://docs.blackfin.uclinux.org/doku.php?id=linux\-kernel:fixed\-code
-.SS mips functions
-.\" See linux/arch/mips/vdso/vdso.ld.S
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__kernel_gettimeofday LINUX_2.6 (exported since Linux 4.4)
-__kernel_clock_gettime LINUX_2.6 (exported since Linux 4.4)
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS ia64 (Itanium) functions
-.\" See linux/arch/ia64/kernel/gate.lds.S
-.\" Also linux/arch/ia64/kernel/fsys.S and linux/Documentation/ia64/fsys.rst
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__kernel_sigtramp LINUX_2.5
-__kernel_syscall_via_break LINUX_2.5
-__kernel_syscall_via_epc LINUX_2.5
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.P
-The Itanium port is somewhat tricky.
-In addition to the vDSO above, it also has "light-weight system calls"
-(also known as "fast syscalls" or "fsys").
-You can invoke these via the
-.I __kernel_syscall_via_epc
-vDSO helper.
-The system calls listed here have the same semantics as if you called them
-directly via
-.BR syscall (2),
-so refer to the relevant
-documentation for each.
-The table below lists the functions available via this mechanism.
-.if t \{\
-.ft CW
-\}
-.TS
-l.
-function
-_
-clock_gettime
-getcpu
-getpid
-getppid
-gettimeofday
-set_tid_address
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS parisc (hppa) functions
-.\" See linux/arch/parisc/kernel/syscall.S
-.\" See linux/Documentation/parisc/registers.rst
-The parisc port has a code page with utility functions
-called a gateway page.
-Rather than use the normal ELF auxiliary vector approach,
-it passes the address of
-the page to the process via the SR2 register.
-The permissions on the page are such that merely executing those addresses
-automatically executes with kernel privileges and not in user space.
-This is done to match the way HP-UX works.
-.P
-Since it's just a raw page of code, there is no ELF information for doing
-symbol lookups or versioning.
-Simply call into the appropriate offset via the branch instruction,
-for example:
-.P
-.in +4n
-.EX
-ble <offset>(%sr2, %r0)
-.EE
-.in
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-offset function
-_
-00b0 lws_entry (CAS operations)
-00e0 set_thread_pointer (used by glibc)
-0100 linux_gateway_entry (syscall)
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS ppc/32 functions
-.\" See linux/arch/powerpc/kernel/vdso32/vdso32.lds.S
-The table below lists the symbols exported by the vDSO.
-The functions marked with a
-.I *
-are available only when the kernel is
-a PowerPC64 (64-bit) kernel.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__kernel_clock_getres LINUX_2.6.15
-__kernel_clock_gettime LINUX_2.6.15
-__kernel_clock_gettime64 LINUX_5.11
-__kernel_datapage_offset LINUX_2.6.15
-__kernel_get_syscall_map LINUX_2.6.15
-__kernel_get_tbfreq LINUX_2.6.15
-__kernel_getcpu \fI*\fR LINUX_2.6.15
-__kernel_gettimeofday LINUX_2.6.15
-__kernel_sigtramp_rt32 LINUX_2.6.15
-__kernel_sigtramp32 LINUX_2.6.15
-__kernel_sync_dicache LINUX_2.6.15
-__kernel_sync_dicache_p5 LINUX_2.6.15
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.P
-Before Linux 5.6,
-.\" commit 654abc69ef2e69712e6d4e8a6cb9292b97a4aa39
-the
-.B CLOCK_REALTIME_COARSE
-and
-.B CLOCK_MONOTONIC_COARSE
-clocks are
-.I not
-supported by the
-.I __kernel_clock_getres
-and
-.I __kernel_clock_gettime
-interfaces;
-the kernel falls back to the real system call.
-.SS ppc/64 functions
-.\" See linux/arch/powerpc/kernel/vdso64/vdso64.lds.S
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__kernel_clock_getres LINUX_2.6.15
-__kernel_clock_gettime LINUX_2.6.15
-__kernel_datapage_offset LINUX_2.6.15
-__kernel_get_syscall_map LINUX_2.6.15
-__kernel_get_tbfreq LINUX_2.6.15
-__kernel_getcpu LINUX_2.6.15
-__kernel_gettimeofday LINUX_2.6.15
-__kernel_sigtramp_rt64 LINUX_2.6.15
-__kernel_sync_dicache LINUX_2.6.15
-__kernel_sync_dicache_p5 LINUX_2.6.15
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.P
-Before Linux 4.16,
-.\" commit 5c929885f1bb4b77f85b1769c49405a0e0f154a1
-the
-.B CLOCK_REALTIME_COARSE
-and
-.B CLOCK_MONOTONIC_COARSE
-clocks are
-.I not
-supported by the
-.I __kernel_clock_getres
-and
-.I __kernel_clock_gettime
-interfaces;
-the kernel falls back to the real system call.
-.SS riscv functions
-.\" See linux/arch/riscv/kernel/vdso/vdso.lds.S
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__vdso_rt_sigreturn LINUX_4.15
-__vdso_gettimeofday LINUX_4.15
-__vdso_clock_gettime LINUX_4.15
-__vdso_clock_getres LINUX_4.15
-__vdso_getcpu LINUX_4.15
-__vdso_flush_icache LINUX_4.15
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS s390 functions
-.\" See linux/arch/s390/kernel/vdso32/vdso32.lds.S
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__kernel_clock_getres LINUX_2.6.29
-__kernel_clock_gettime LINUX_2.6.29
-__kernel_gettimeofday LINUX_2.6.29
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS s390x functions
-.\" See linux/arch/s390/kernel/vdso64/vdso64.lds.S
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__kernel_clock_getres LINUX_2.6.29
-__kernel_clock_gettime LINUX_2.6.29
-__kernel_gettimeofday LINUX_2.6.29
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS sh (SuperH) functions
-.\" See linux/arch/sh/kernel/vsyscall/vsyscall.lds.S
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__kernel_rt_sigreturn LINUX_2.6
-__kernel_sigreturn LINUX_2.6
-__kernel_vsyscall LINUX_2.6
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS i386 functions
-.\" See linux/arch/x86/vdso/vdso32/vdso32.lds.S
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__kernel_sigreturn LINUX_2.5
-__kernel_rt_sigreturn LINUX_2.5
-__kernel_vsyscall LINUX_2.5
-.\" Added in 7a59ed415f5b57469e22e41fc4188d5399e0b194 and updated
-.\" in 37c975545ec63320789962bf307f000f08fabd48.
-__vdso_clock_gettime LINUX_2.6 (exported since Linux 3.15)
-__vdso_gettimeofday LINUX_2.6 (exported since Linux 3.15)
-__vdso_time LINUX_2.6 (exported since Linux 3.15)
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS x86-64 functions
-.\" See linux/arch/x86/vdso/vdso.lds.S
-The table below lists the symbols exported by the vDSO.
-All of these symbols are also available without the "__vdso_" prefix, but
-you should ignore those and stick to the names below.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__vdso_clock_gettime LINUX_2.6
-__vdso_getcpu LINUX_2.6
-__vdso_gettimeofday LINUX_2.6
-__vdso_time LINUX_2.6
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS x86/x32 functions
-.\" See linux/arch/x86/vdso/vdso32.lds.S
-The table below lists the symbols exported by the vDSO.
-.if t \{\
-.ft CW
-\}
-.TS
-l l.
-symbol version
-_
-__vdso_clock_gettime LINUX_2.6
-__vdso_getcpu LINUX_2.6
-__vdso_gettimeofday LINUX_2.6
-__vdso_time LINUX_2.6
-.TE
-.if t \{\
-.in
-.ft P
-\}
-.SS History
-The vDSO was originally just a single function\[em]the vsyscall.
-In older kernels, you might see that name
-in a process's memory map rather than "vdso".
-Over time, people realized that this mechanism
-was a great way to pass more functionality
-to user space, so it was reconceived as a vDSO in the current format.
-.SH SEE ALSO
-.BR syscalls (2),
-.BR getauxval (3),
-.BR proc (5)
-.P
-The documents, examples, and source code in the Linux source code tree:
-.P
-.in +4n
-.EX
-Documentation/ABI/stable/vdso
-Documentation/ia64/fsys.rst
-Documentation/vDSO/* (includes examples of using the vDSO)
-.P
-find arch/ \-iname \[aq]*vdso*\[aq] \-o \-iname \[aq]*gate*\[aq]
-.EE
-.in