Diffstat (limited to 'man2/ptrace.2')
-rw-r--r-- man2/ptrace.2 2974
1 files changed, 2974 insertions, 0 deletions
diff --git a/man2/ptrace.2 b/man2/ptrace.2
new file mode 100644
index 0000000..4149a32
--- /dev/null
+++ b/man2/ptrace.2
@@ -0,0 +1,2974 @@
+.\" Copyright (c) 1993 Michael Haardt <michael@moria.de>
+.\" Fri Apr 2 11:32:09 MET DST 1993
+.\"
+.\" and changes Copyright (C) 1999 Mike Coleman (mkc@acm.org)
+.\" -- major revision to fully document ptrace semantics per recent Linux
+.\" kernel (2.2.10) and glibc (2.1.2)
+.\" Sun Nov 7 03:18:35 CST 1999
+.\"
+.\" and Copyright (c) 2011, Denys Vlasenko <vda.linux@googlemail.com>
+.\" and Copyright (c) 2015, 2016, Michael Kerrisk <mtk.manpages@gmail.com>
+.\"
+.\" SPDX-License-Identifier: GPL-2.0-or-later
+.\"
+.\" Modified Fri Jul 23 23:47:18 1993 by Rik Faith <faith@cs.unc.edu>
+.\" Modified Fri Jan 31 16:46:30 1997 by Eric S. Raymond <esr@thyrsus.com>
+.\" Modified Thu Oct 7 17:28:49 1999 by Andries Brouwer <aeb@cwi.nl>
+.\" Modified, 27 May 2004, Michael Kerrisk <mtk.manpages@gmail.com>
+.\" Added notes on capability requirements
+.\"
+.\" 2006-03-24, Chuck Ebbert <76306.1226@compuserve.com>
+.\" Added PTRACE_SETOPTIONS, PTRACE_GETEVENTMSG, PTRACE_GETSIGINFO,
+.\" PTRACE_SETSIGINFO, PTRACE_SYSEMU, PTRACE_SYSEMU_SINGLESTEP
+.\" (Thanks to Blaisorblade, Daniel Jacobowitz and others who helped.)
+.\" 2011-09, major update by Denys Vlasenko <vda.linux@googlemail.com>
+.\" 2015-01, Kees Cook <keescook@chromium.org>
+.\" Added PTRACE_O_TRACESECCOMP, PTRACE_EVENT_SECCOMP
+.\"
+.\" FIXME The following are undocumented:
+.\"
+.\" PTRACE_GETWMMXREGS
+.\" PTRACE_SETWMMXREGS
+.\" ARM
+.\" Linux 2.6.12
+.\"
+.\" PTRACE_SET_SYSCALL
+.\" ARM and ARM64
+.\" Linux 2.6.16
+.\" commit 3f471126ee53feb5e9b210ea2f525ed3bb9b7a7f
+.\" Author: Nicolas Pitre <nico@cam.org>
+.\" Date: Sat Jan 14 19:30:04 2006 +0000
+.\"
+.\" PTRACE_GETCRUNCHREGS
+.\" PTRACE_SETCRUNCHREGS
+.\" ARM
+.\" Linux 2.6.18
+.\" commit 3bec6ded282b331552587267d67a06ed7fd95ddd
+.\" Author: Lennert Buytenhek <buytenh@wantstofly.org>
+.\" Date: Tue Jun 27 22:56:18 2006 +0100
+.\"
+.\" PTRACE_GETVFPREGS
+.\" PTRACE_SETVFPREGS
+.\" ARM and ARM64
+.\" Linux 2.6.30
+.\" commit 3d1228ead618b88e8606015cbabc49019981805d
+.\" Author: Catalin Marinas <catalin.marinas@arm.com>
+.\" Date: Wed Feb 11 13:12:56 2009 +0100
+.\"
+.\" PTRACE_GETHBPREGS
+.\" PTRACE_SETHBPREGS
+.\" ARM and ARM64
+.\" Linux 2.6.37
+.\" commit 864232fa1a2f8dfe003438ef0851a56722740f3e
+.\" Author: Will Deacon <will.deacon@arm.com>
+.\" Date: Fri Sep 3 10:42:55 2010 +0100
+.\"
+.\" PTRACE_SINGLEBLOCK
+.\" Since at least Linux 2.4.0 on various architectures
+.\" Since Linux 2.6.25 on x86 (and others?)
+.\" commit 5b88abbf770a0e1975c668743100f42934f385e8
+.\" Author: Roland McGrath <roland@redhat.com>
+.\" Date: Wed Jan 30 13:30:53 2008 +0100
+.\" ptrace: generic PTRACE_SINGLEBLOCK
+.\"
+.\" PTRACE_GETFPXREGS
+.\" PTRACE_SETFPXREGS
+.\" Since at least Linux 2.4.0 on various architectures
+.\"
+.\" PTRACE_GETFDPIC
+.\" PTRACE_GETFDPIC_EXEC
+.\" PTRACE_GETFDPIC_INTERP
+.\" blackfin, c6x, frv, sh
+.\" First appearance in Linux 2.6.11 on frv
+.\"
+.\" and others that can be found in the arch/*/include/uapi/asm/ptrace files
+.\"
+.TH ptrace 2 2023-03-30 "Linux man-pages 6.05.01"
+.SH NAME
+ptrace \- process trace
+.SH LIBRARY
+Standard C library
+.RI ( libc ", " \-lc )
+.SH SYNOPSIS
+.nf
+.B #include <sys/ptrace.h>
+.PP
+.BI "long ptrace(enum __ptrace_request " request ", pid_t " pid ,
+.BI " void *" addr ", void *" data );
+.fi
+.SH DESCRIPTION
+The
+.BR ptrace ()
+system call provides a means by which one process (the "tracer")
+may observe and control the execution of another process (the "tracee"),
+and examine and change the tracee's memory and registers.
+It is primarily used to implement breakpoint debugging and system
+call tracing.
+.PP
+A tracee first needs to be attached to the tracer.
+Attachment and subsequent commands are per thread:
+in a multithreaded process,
+every thread can be individually attached to a
+(potentially different) tracer,
+or left not attached and thus not debugged.
+Therefore, "tracee" always means "(one) thread",
+never "a (possibly multithreaded) process".
+Ptrace commands are always sent to
+a specific tracee using a call of the form
+.PP
+.in +4n
+.EX
+ptrace(PTRACE_foo, pid, ...)
+.EE
+.in
+.PP
+where
+.I pid
+is the thread ID of the corresponding Linux thread.
+.PP
+(Note that in this page, a "multithreaded process"
+means a thread group consisting of threads created using the
+.BR clone (2)
+.B CLONE_THREAD
+flag.)
+.PP
+A process can initiate a trace by calling
+.BR fork (2)
+and having the resulting child do a
+.BR PTRACE_TRACEME ,
+followed (typically) by an
+.BR execve (2).
+Alternatively, one process may commence tracing another process using
+.B PTRACE_ATTACH
+or
+.BR PTRACE_SEIZE .
+.PP
+While being traced, the tracee will stop each time a signal is delivered,
+even if the signal is being ignored.
+(An exception is
+.BR SIGKILL ,
+which has its usual effect.)
+The tracer will be notified at its next call to
+.BR waitpid (2)
+(or one of the related "wait" system calls); that call will return a
+.I status
+value containing information that indicates
+the cause of the stop in the tracee.
+While the tracee is stopped,
+the tracer can use various ptrace requests to inspect and modify the tracee.
+The tracer then causes the tracee to continue,
+optionally ignoring the delivered signal
+(or even delivering a different signal instead).
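+.PP
+The sequence just described can be sketched as follows.
+This is a minimal illustration rather than a complete tracer:
+the function name
+.I trace_child_once
+and the exit code 127 used to report a failed
+.B PTRACE_TRACEME
+are assumptions made for the example,
+and the tracee calls
+.BR raise (3)
+with
+.B SIGSTOP
+instead of
+.BR execve (2)
+so that the sketch stays self-contained.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that asks to be traced, stops itself, and then exits
   with exit_code.  Returns the exit status seen by the tracer,
   or -1 on error.  (Hypothetical helper, for illustration only.) */
int trace_child_once(int exit_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {                     /* Tracee */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);                 /* Assumed "tracing refused" code */
        raise(SIGSTOP);                 /* Causes a signal-delivery-stop */
        _exit(exit_code);
    }

    int status;                         /* Tracer */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    if (!WIFSTOPPED(status))            /* Child exited without stopping */
        return -1;

    /* The tracee is stopped here and could be inspected or modified.
       Resuming with data == NULL suppresses the pending SIGSTOP. */
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1)
        return -1;

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

+.PP
+Where tracing is permitted, a call such as
+.I trace_child_once(7)
+is expected to return 7,
+because the suppressed
+.B SIGSTOP
+never reaches the tracee.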
+.PP
+If the
+.B PTRACE_O_TRACEEXEC
+option is not in effect, all successful calls to
+.BR execve (2)
+by the traced process will cause it to be sent a
+.B SIGTRAP
+signal,
+giving the parent a chance to gain control before the new program
+begins execution.
+.PP
+When the tracer is finished tracing, it can cause the tracee to continue
+executing in a normal, untraced mode via
+.BR PTRACE_DETACH .
+.PP
+The value of
+.I request
+determines the action to be performed:
+.TP
+.B PTRACE_TRACEME
+Indicate that this process is to be traced by its parent.
+A process probably shouldn't make this request if its parent
+isn't expecting to trace it.
+.RI ( pid ,
+.IR addr ,
+and
+.I data
+are ignored.)
+.IP
+The
+.B PTRACE_TRACEME
+request is used only by the tracee;
+the remaining requests are used only by the tracer.
+In the following requests,
+.I pid
+specifies the thread ID of the tracee to be acted on.
+For requests other than
+.BR PTRACE_ATTACH ,
+.BR PTRACE_SEIZE ,
+.BR PTRACE_INTERRUPT ,
+and
+.BR PTRACE_KILL ,
+the tracee must be stopped.
+.TP
+.BR PTRACE_PEEKTEXT ", " PTRACE_PEEKDATA
+Read a word at the address
+.I addr
+in the tracee's memory, returning the word as the result of the
+.BR ptrace ()
+call.
+Linux does not have separate text and data address spaces,
+so these two requests are currently equivalent.
+.RI ( data
+is ignored; but see NOTES.)
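+.IP
+Because the word is returned as the function result,
+a return value of \-1 is ambiguous.
+The following sketch clears
+.I errno
+before the call and tests it afterward,
+which is the usual way to tell the two cases apart.
+The names
+.IR peek_word ,
+.IR peek_demo ,
+and
+.I marker
+are hypothetical and exist only for this example.

```c
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read one word at addr in the tracee.  A word equal to -1 is valid,
   so errno is cleared first and checked afterward.
   Returns 0 and stores the word in *out, or -1 with errno set. */
int peek_word(pid_t pid, void *addr, long *out)
{
    errno = 0;
    long word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    if (word == -1 && errno != 0)
        return -1;
    *out = word;
    return 0;
}

static long marker = 0x1234;            /* Hypothetical variable to inspect */

/* Demonstration: the child changes its own copy of marker and stops;
   the tracer then reads the child's value.  Returns 0 on success. */
int peek_demo(long *out)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {                     /* Tracee */
        marker = 0x5eed;                /* Changes the tracee's copy only */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        _exit(0);
    }

    int status, rc = -1;                /* Tracer */
    if (waitpid(pid, &status, 0) == pid && WIFSTOPPED(status)) {
        rc = peek_word(pid, &marker, out);
        kill(pid, SIGKILL);             /* The demonstration is over */
        waitpid(pid, &status, 0);
    }
    return rc;
}
```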
+.TP
+.B PTRACE_PEEKUSER
+.\" PTRACE_PEEKUSR in kernel source, but glibc uses PTRACE_PEEKUSER,
+.\" and that is the name that seems common on other systems.
+Read a word at offset
+.I addr
+in the tracee's USER area,
+which holds the registers and other information about the process
+(see
+.IR <sys/user.h> ).
+The word is returned as the result of the
+.BR ptrace ()
+call.
+Typically, the offset must be word-aligned, though this might vary by
+architecture.
+See NOTES.
+.RI ( data
+is ignored; but see NOTES.)
+.TP
+.BR PTRACE_POKETEXT ", " PTRACE_POKEDATA
+Copy the word
+.I data
+to the address
+.I addr
+in the tracee's memory.
+As for
+.B PTRACE_PEEKTEXT
+and
+.BR PTRACE_PEEKDATA ,
+these two requests are currently equivalent.
+.TP
+.B PTRACE_POKEUSER
+.\" PTRACE_POKEUSR in kernel source, but glibc uses PTRACE_POKEUSER,
+.\" and that is the name that seems common on other systems.
+Copy the word
+.I data
+to offset
+.I addr
+in the tracee's USER area.
+As for
+.BR PTRACE_PEEKUSER ,
+the offset must typically be word-aligned.
+In order to maintain the integrity of the kernel,
+some modifications to the USER area are disallowed.
+.\" FIXME In the preceding sentence, which modifications are disallowed,
+.\" and when they are disallowed, how does user space discover that fact?
+.TP
+.BR PTRACE_GETREGS ", " PTRACE_GETFPREGS
+Copy the tracee's general-purpose or floating-point registers,
+respectively, to the address
+.I data
+in the tracer.
+See
+.I <sys/user.h>
+for information on the format of this data.
+.RI ( addr
+is ignored.)
+Note that SPARC systems have the meaning of
+.I data
+and
+.I addr
+reversed; that is,
+.I data
+is ignored and the registers are copied to the address
+.IR addr .
+.B PTRACE_GETREGS
+and
+.B PTRACE_GETFPREGS
+are not present on all architectures.
+.TP
+.BR PTRACE_GETREGSET " (since Linux 2.6.34)"
+Read the tracee's registers.
+.I addr
+specifies, in an architecture-dependent way, the type of registers to be read.
+.B NT_PRSTATUS
+(with numerical value 1)
+usually results in reading of general-purpose registers.
+If the CPU has, for example,
+floating-point and/or vector registers, they can be retrieved by setting
+.I addr
+to the corresponding
+.B NT_foo
+constant.
+.I data
+points to a
+.BR "struct iovec" ,
+which describes the destination buffer's location and length.
+On return, the kernel modifies
+.I iov_len
+to indicate the actual number of bytes returned.
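+.IP
+A minimal sketch of such a call follows.
+It assumes an architecture on which
+.I <sys/user.h>
+defines
+.I struct user_regs_struct
+(for example, x86-64);
+the helper name
+.I read_gp_regs
+is hypothetical.

```c
#include <elf.h>            /* NT_PRSTATUS */
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>        /* struct iovec */
#include <sys/user.h>       /* struct user_regs_struct (assumed present) */

/* Fetch the general-purpose registers of a stopped tracee.
   On success, returns 0 and stores in *len the number of bytes
   the kernel wrote; on failure, returns -1 with errno set
   and leaves *len untouched. */
int read_gp_regs(pid_t pid, struct user_regs_struct *regs, size_t *len)
{
    struct iovec iov = {
        .iov_base = regs,
        .iov_len  = sizeof(*regs),
    };

    if (ptrace(PTRACE_GETREGSET, pid, (void *)(long)NT_PRSTATUS, &iov) == -1)
        return -1;

    *len = iov.iov_len;     /* Updated by the kernel on return */
    return 0;
}
```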
+.TP
+.BR PTRACE_SETREGS ", " PTRACE_SETFPREGS
+Modify the tracee's general-purpose or floating-point registers,
+respectively, from the address
+.I data
+in the tracer.
+As for
+.BR PTRACE_POKEUSER ,
+some general-purpose register modifications may be disallowed.
+.\" FIXME . In the preceding sentence, which modifications are disallowed,
+.\" and when they are disallowed, how does user space discover that fact?
+.RI ( addr
+is ignored.)
+Note that SPARC systems have the meaning of
+.I data
+and
+.I addr
+reversed; that is,
+.I data
+is ignored and the registers are copied from the address
+.IR addr .
+.B PTRACE_SETREGS
+and
+.B PTRACE_SETFPREGS
+are not present on all architectures.
+.TP
+.BR PTRACE_SETREGSET " (since Linux 2.6.34)"
+Modify the tracee's registers.
+The meaning of
+.I addr
+and
+.I data
+is analogous to
+.BR PTRACE_GETREGSET .
+.TP
+.BR PTRACE_GETSIGINFO " (since Linux 2.3.99-pre6)"
+Retrieve information about the signal that caused the stop.
+Copy a
+.I siginfo_t
+structure (see
+.BR sigaction (2))
+from the tracee to the address
+.I data
+in the tracer.
+.RI ( addr
+is ignored.)
+.TP
+.BR PTRACE_SETSIGINFO " (since Linux 2.3.99-pre6)"
+Set signal information:
+copy a
+.I siginfo_t
+structure from the address
+.I data
+in the tracer to the tracee.
+This will affect only signals that would normally be delivered to
+the tracee and were caught by the tracer.
+It may be difficult to tell
+these normal signals from synthetic signals generated by
+.BR ptrace ()
+itself.
+.RI ( addr
+is ignored.)
+.TP
+.BR PTRACE_PEEKSIGINFO " (since Linux 3.10)"
+.\" commit 84c751bd4aebbaae995fe32279d3dba48327bad4
+Retrieve
+.I siginfo_t
+structures without removing signals from a queue.
+.I addr
+points to a
+.I ptrace_peeksiginfo_args
+structure that specifies the ordinal position from which
+copying of signals should start,
+and the number of signals to copy.
+.I siginfo_t
+structures are copied into the buffer pointed to by
+.IR data .
+The return value contains the number of copied signals (zero indicates
+that there is no signal corresponding to the specified ordinal position).
+Within the returned
+.I siginfo
+structures,
+the
+.I si_code
+field includes information
+.RB ( __SI_CHLD ,
+.BR __SI_FAULT ,
+etc.) that is not otherwise exposed to user space.
+.PP
+.in +4n
+.EX
+struct ptrace_peeksiginfo_args {
+ u64 off; /* Ordinal position in queue at which
+ to start copying signals */
+ u32 flags; /* PTRACE_PEEKSIGINFO_SHARED or 0 */
+ s32 nr; /* Number of signals to copy */
+};
+.EE
+.in
+.IP
+Currently, there is only one flag,
+.BR PTRACE_PEEKSIGINFO_SHARED ,
+for dumping signals from the process-wide signal queue.
+If this flag is not set,
+signals are read from the per-thread queue of the specified thread.
+.in
+.TP
+.BR PTRACE_GETSIGMASK " (since Linux 3.11)"
+.\" commit 29000caecbe87b6b66f144f72111f0d02fbbf0c1
+Place a copy of the mask of blocked signals (see
+.BR sigprocmask (2))
+in the buffer pointed to by
+.IR data ,
+which should be a pointer to a buffer of type
+.IR sigset_t .
+The
+.I addr
+argument contains the size of the buffer pointed to by
+.I data
+(i.e.,
+.IR sizeof(sigset_t) ).
+.TP
+.BR PTRACE_SETSIGMASK " (since Linux 3.11)"
+Change the mask of blocked signals (see
+.BR sigprocmask (2))
+to the value specified in the buffer pointed to by
+.IR data ,
+which should be a pointer to a buffer of type
+.IR sigset_t .
+The
+.I addr
+argument contains the size of the buffer pointed to by
+.I data
+(i.e.,
+.IR sizeof(sigset_t) ).
+.TP
+.BR PTRACE_SETOPTIONS " (since Linux 2.4.6; see BUGS for caveats)"
+Set ptrace options from
+.IR data .
+.RI ( addr
+is ignored.)
+.I data
+is interpreted as a bit mask of options,
+which are specified by the following flags:
+.RS
+.TP
+.BR PTRACE_O_EXITKILL " (since Linux 3.8)"
+.\" commit 992fb6e170639b0849bace8e49bf31bd37c4123
+Send a
+.B SIGKILL
+signal to the tracee if the tracer exits.
+This option is useful for ptrace jailers that
+want to ensure that tracees can never escape the tracer's control.
+.TP
+.BR PTRACE_O_TRACECLONE " (since Linux 2.5.46)"
+Stop the tracee at the next
+.BR clone (2)
+and automatically start tracing the newly cloned process,
+which will start with a
+.BR SIGSTOP ,
+or
+.B PTRACE_EVENT_STOP
+if
+.B PTRACE_SEIZE
+was used.
+A
+.BR waitpid (2)
+by the tracer will return a
+.I status
+value such that
+.IP
+.nf
+ status>>8 == (SIGTRAP | (PTRACE_EVENT_CLONE<<8))
+.fi
+.IP
+The PID of the new process can be retrieved with
+.BR PTRACE_GETEVENTMSG .
+.IP
+This option may not catch
+.BR clone (2)
+calls in all cases.
+If the tracee calls
+.BR clone (2)
+with the
+.B CLONE_VFORK
+flag,
+.B PTRACE_EVENT_VFORK
+will be delivered instead
+if
+.B PTRACE_O_TRACEVFORK
+is set; otherwise if the tracee calls
+.BR clone (2)
+with the exit signal set to
+.BR SIGCHLD ,
+.B PTRACE_EVENT_FORK
+will be delivered if
+.B PTRACE_O_TRACEFORK
+is set.
+.TP
+.BR PTRACE_O_TRACEEXEC " (since Linux 2.5.46)"
+Stop the tracee at the next
+.BR execve (2).
+A
+.BR waitpid (2)
+by the tracer will return a
+.I status
+value such that
+.IP
+.nf
+ status>>8 == (SIGTRAP | (PTRACE_EVENT_EXEC<<8))
+.fi
+.IP
+If the execing thread is not a thread group leader,
+the thread ID is reset to the thread group leader's ID before this stop.
+Since Linux 3.0, the former thread ID can be retrieved with
+.BR PTRACE_GETEVENTMSG .
+.TP
+.BR PTRACE_O_TRACEEXIT " (since Linux 2.5.60)"
+Stop the tracee at exit.
+A
+.BR waitpid (2)
+by the tracer will return a
+.I status
+value such that
+.IP
+.nf
+ status>>8 == (SIGTRAP | (PTRACE_EVENT_EXIT<<8))
+.fi
+.IP
+The tracee's exit status can be retrieved with
+.BR PTRACE_GETEVENTMSG .
+.IP
+The tracee is stopped early during process exit,
+when registers are still available,
+allowing the tracer to see where the exit occurred,
+whereas the normal exit notification is done after the process
+is finished exiting.
+Even though context is available,
+the tracer cannot prevent the exit from happening at this point.
+.TP
+.BR PTRACE_O_TRACEFORK " (since Linux 2.5.46)"
+Stop the tracee at the next
+.BR fork (2)
+and automatically start tracing the newly forked process,
+which will start with a
+.BR SIGSTOP ,
+or
+.B PTRACE_EVENT_STOP
+if
+.B PTRACE_SEIZE
+was used.
+A
+.BR waitpid (2)
+by the tracer will return a
+.I status
+value such that
+.IP
+.nf
+ status>>8 == (SIGTRAP | (PTRACE_EVENT_FORK<<8))
+.fi
+.IP
+The PID of the new process can be retrieved with
+.BR PTRACE_GETEVENTMSG .
+.TP
+.BR PTRACE_O_TRACESYSGOOD " (since Linux 2.4.6)"
+When delivering system call traps, set bit 7 in the signal number
+(i.e., deliver
+.IR "SIGTRAP|0x80" ).
+This makes it easy for the tracer to distinguish
+normal traps from those caused by a system call.
+.TP
+.BR PTRACE_O_TRACEVFORK " (since Linux 2.5.46)"
+Stop the tracee at the next
+.BR vfork (2)
+and automatically start tracing the newly vforked process,
+which will start with a
+.BR SIGSTOP ,
+or
+.B PTRACE_EVENT_STOP
+if
+.B PTRACE_SEIZE
+was used.
+A
+.BR waitpid (2)
+by the tracer will return a
+.I status
+value such that
+.IP
+.nf
+ status>>8 == (SIGTRAP | (PTRACE_EVENT_VFORK<<8))
+.fi
+.IP
+The PID of the new process can be retrieved with
+.BR PTRACE_GETEVENTMSG .
+.TP
+.BR PTRACE_O_TRACEVFORKDONE " (since Linux 2.5.60)"
+Stop the tracee at the completion of the next
+.BR vfork (2).
+A
+.BR waitpid (2)
+by the tracer will return a
+.I status
+value such that
+.IP
+.nf
+ status>>8 == (SIGTRAP | (PTRACE_EVENT_VFORK_DONE<<8))
+.fi
+.IP
+The PID of the new process can (since Linux 2.6.18) be retrieved with
+.BR PTRACE_GETEVENTMSG .
+.TP
+.BR PTRACE_O_TRACESECCOMP " (since Linux 3.5)"
+Stop the tracee when a
+.BR seccomp (2)
+.B SECCOMP_RET_TRACE
+rule is triggered.
+A
+.BR waitpid (2)
+by the tracer will return a
+.I status
+value such that
+.IP
+.nf
+ status>>8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP<<8))
+.fi
+.IP
+While this triggers a
+.B PTRACE_EVENT
+stop, it is similar to a syscall-enter-stop.
+For details, see the note on
+.B PTRACE_EVENT_SECCOMP
+below.
+The seccomp event message data (from the
+.B SECCOMP_RET_DATA
+portion of the seccomp filter rule) can be retrieved with
+.BR PTRACE_GETEVENTMSG .
+.TP
+.BR PTRACE_O_SUSPEND_SECCOMP " (since Linux 4.3)"
+.\" commit 13c4a90119d28cfcb6b5bdd820c233b86c2b0237
+Suspend the tracee's seccomp protections.
+This applies regardless of mode, and
+can be used when the tracee has not yet installed seccomp filters.
+That is, a valid use case is to suspend a tracee's seccomp protections
+before they are installed by the tracee,
+let the tracee install the filters,
+and then clear this flag when the filters should be resumed.
+Setting this option requires that the tracer have the
+.B CAP_SYS_ADMIN
+capability,
+not have any seccomp protections installed, and not have
+.B PTRACE_O_SUSPEND_SECCOMP
+set on itself.
+.RE
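+.IP
+The status tests shown for the options above can be collected
+into small helpers, as in the following sketch.
+The names
+.I ptrace_event_of
+and
+.I is_syscall_stop
+are hypothetical;
+the first helper covers only the
+.BR SIGTRAP -based
+event stops listed above.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Return the PTRACE_EVENT_* code carried by a waitpid() status,
   that is, the event in status>>8 == (SIGTRAP | (event<<8)).
   Returns 0 if the status is not a SIGTRAP-based stop. */
int ptrace_event_of(int status)
{
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
        return 0;
    return (status >> 16) & 0xffff;
}

/* Return nonzero if the status reports a system call stop marked
   by PTRACE_O_TRACESYSGOOD (signal number SIGTRAP|0x80). */
int is_syscall_stop(int status)
{
    return WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80);
}
```

+.IP
+After a nonzero result from
+.IR ptrace_event_of ,
+the tracer would typically call
+.B PTRACE_GETEVENTMSG
+to fetch the associated value.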
+.TP
+.BR PTRACE_GETEVENTMSG " (since Linux 2.5.46)"
+Retrieve a message (as an
+.IR "unsigned long" )
+about the ptrace event
+that just happened, placing it at the address
+.I data
+in the tracer.
+For
+.BR PTRACE_EVENT_EXIT ,
+this is the tracee's exit status.
+For
+.BR PTRACE_EVENT_FORK ,
+.BR PTRACE_EVENT_VFORK ,
+.BR PTRACE_EVENT_VFORK_DONE ,
+and
+.BR PTRACE_EVENT_CLONE ,
+this is the PID of the new process.
+For
+.BR PTRACE_EVENT_SECCOMP ,
+this is the
+.BR seccomp (2)
+filter's
+.B SECCOMP_RET_DATA
+associated with the triggered rule.
+.RI ( addr
+is ignored.)
+.TP
+.B PTRACE_CONT
+Restart the stopped tracee process.
+If
+.I data
+is nonzero,
+it is interpreted as the number of a signal to be delivered to the tracee;
+otherwise, no signal is delivered.
+Thus, for example, the tracer can control
+whether a signal sent to the tracee is delivered or not.
+.RI ( addr
+is ignored.)
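+.IP
+The following sketch illustrates that choice.
+The tracee raises
+.BR SIGUSR1 ,
+and the tracer either passes the signal on or suppresses it.
+The function name
+.I run_tracee
+and its
+.I forward
+parameter are assumptions made for the example.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a tracee that raises SIGUSR1 (default action: terminate).
   If forward is nonzero, the tracer injects each intercepted signal
   with PTRACE_CONT; otherwise it passes 0, suppressing the signal.
   Returns the tracee's final wait status, or -1 on error. */
int run_tracee(int forward)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {                     /* Tracee */
        signal(SIGUSR1, SIG_DFL);       /* Make the default action explicit */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGUSR1);                 /* Causes a signal-delivery-stop */
        _exit(0);
    }

    for (;;) {                          /* Tracer */
        int status;
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (!WIFSTOPPED(status))
            return status;              /* Tracee exited or was killed */

        long sig = forward ? WSTOPSIG(status) : 0;
        if (ptrace(PTRACE_CONT, pid, NULL, (void *)sig) == -1)
            return -1;
    }
}
```

+.IP
+With
+.I forward
+set to 0, the tracee is expected to exit normally;
+with a nonzero value, it is expected to be terminated by
+.BR SIGUSR1 .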
+.TP
+.BR PTRACE_SYSCALL ", " PTRACE_SINGLESTEP
+Restart the stopped tracee as for
+.BR PTRACE_CONT ,
+but arrange for the tracee to be stopped at
+the next entry to or exit from a system call,
+or after execution of a single instruction, respectively.
+(The tracee will also, as usual, be stopped upon receipt of a signal.)
+From the tracer's perspective, the tracee will appear to have been
+stopped by receipt of a
+.BR SIGTRAP .
+So, for
+.BR PTRACE_SYSCALL ,
+for example, the idea is to inspect
+the arguments to the system call at the first stop,
+then do another
+.B PTRACE_SYSCALL
+and inspect the return value of the system call at the second stop.
+The
+.I data
+argument is treated as for
+.BR PTRACE_CONT .
+.RI ( addr
+is ignored.)
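+.IP
+A minimal sketch of such a loop follows.
+It sets
+.B PTRACE_O_TRACESYSGOOD
+so that system call stops can be told apart from other stops,
+and it simply counts them rather than decoding arguments.
+The function name
+.I count_syscall_stops
+is hypothetical, and the exact count depends on the C library.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Trace a child with PTRACE_SYSCALL and return the number of system
   call stops (entries plus exits) observed, or -1 on error. */
long count_syscall_stops(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {                     /* Tracee */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);                 /* Let the tracer set options */
        syscall(SYS_getpid);            /* At least one traced call */
        _exit(0);
    }

    int status;                         /* Tracer */
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_SETOPTIONS, pid, NULL,
               (void *)(long)PTRACE_O_TRACESYSGOOD) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    long stops = 0;
    long sig = 0;                       /* Suppress the initial SIGSTOP */
    for (;;) {
        if (ptrace(PTRACE_SYSCALL, pid, NULL, (void *)sig) == -1)
            return -1;
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        if (!WIFSTOPPED(status))
            return stops;               /* Tracee has terminated */

        if (WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            stops++;                    /* Syscall entry or exit */
            sig = 0;
        } else {
            sig = WSTOPSIG(status);     /* Pass other signals through */
        }
    }
}
```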
+.TP
+.BR PTRACE_SET_SYSCALL " (since Linux 2.6.16)"
+.\" commit 3f471126ee53feb5e9b210ea2f525ed3bb9b7a7f
+When in syscall-enter-stop,
+change the number of the system call that is about to
+be executed to the number specified in the
+.I data
+argument.
+The
+.I addr
+argument is ignored.
+This request is currently
+.\" As of 4.19-rc2
+supported only on arm (and arm64, though only for backwards compatibility),
+.\" commit 27aa55c5e5123fa8b8ad0156559d34d7edff58ca
+but most other architectures have other means of accomplishing this
+(usually by changing the register that the userland code passed the
+system call number in).
+.\" see change_syscall in tools/testing/selftests/seccomp/seccomp_bpf.c
+.\" and also strace's linux/*/set_scno.c files.
+.TP
+.BR PTRACE_SYSEMU ", " PTRACE_SYSEMU_SINGLESTEP " (since Linux 2.6.14)"
+For
+.BR PTRACE_SYSEMU ,
+continue and stop on entry to the next system call,
+which will not be executed.
+See the documentation on syscall-stops below.
+For
+.BR PTRACE_SYSEMU_SINGLESTEP ,
+do the same, but also single-step if the instruction is not a system call.
+This call is used by programs like
+User Mode Linux that want to emulate all the tracee's system calls.
+The
+.I data
+argument is treated as for
+.BR PTRACE_CONT .
+The
+.I addr
+argument is ignored.
+These requests are currently
+.\" As at 3.7
+supported only on x86.
+.TP
+.BR PTRACE_LISTEN " (since Linux 3.4)"
+Restart the stopped tracee, but prevent it from executing.
+The resulting state of the tracee is similar to a process which
+has been stopped by a
+.B SIGSTOP
+(or other stopping signal).
+See the "group-stop" subsection for additional information.
+.B PTRACE_LISTEN
+works only on tracees attached by
+.BR PTRACE_SEIZE .
+.TP
+.B PTRACE_KILL
+Send the tracee a
+.B SIGKILL
+to terminate it.
+.RI ( addr
+and
+.I data
+are ignored.)
+.IP
+.I This operation is deprecated; do not use it!
+Instead, send a
+.B SIGKILL
+directly using
+.BR kill (2)
+or
+.BR tgkill (2).
+The problem with
+.B PTRACE_KILL
+is that it requires the tracee to be in signal-delivery-stop,
+otherwise it may not work
+(i.e., may complete successfully but won't kill the tracee).
+By contrast, sending a
+.B SIGKILL
+directly has no such limitation.
+.\" [Note from Denys Vlasenko:
+.\" deprecation suggested by Oleg Nesterov. He prefers to deprecate it
+.\" instead of describing (and needing to support) PTRACE_KILL's quirks.]
+.TP
+.BR PTRACE_INTERRUPT " (since Linux 3.4)"
+Stop a tracee.
+If the tracee is running or sleeping in kernel space and
+.B PTRACE_SYSCALL
+is in effect,
+the system call is interrupted and syscall-exit-stop is reported.
+(The interrupted system call is restarted when the tracee is restarted.)
+If the tracee was already stopped by a signal and
+.B PTRACE_LISTEN
+was sent to it,
+the tracee stops with
+.B PTRACE_EVENT_STOP
+and
+.I WSTOPSIG(status)
+returns the stop signal.
+If any other ptrace-stop is generated at the same time (for example,
+if a signal is sent to the tracee), this ptrace-stop happens.
+If none of the above applies (for example, if the tracee is running in user
+space), it stops with
+.B PTRACE_EVENT_STOP
+with
+.I WSTOPSIG(status)
+==
+.BR SIGTRAP .
+.B PTRACE_INTERRUPT
+only works on tracees attached by
+.BR PTRACE_SEIZE .
+.TP
+.B PTRACE_ATTACH
+Attach to the process specified in
+.IR pid ,
+making it a tracee of the calling process.
+.\" No longer true (removed by Denys Vlasenko, 2011, who remarks:
+.\" "I think it isn't true in non-ancient 2.4 and in Linux 2.6/3.x.
+.\" Basically, it's not true for any Linux in practical use.
+.\" ; the behavior of the tracee is as if it had done a
+.\" .BR PTRACE_TRACEME .
+.\" The calling process actually becomes the parent of the tracee
+.\" process for most purposes (e.g., it will receive
+.\" notification of tracee events and appears in
+.\" .BR ps (1)
+.\" output as the tracee's parent), but a
+.\" .BR getppid (2)
+.\" by the tracee will still return the PID of the original parent.
+The tracee is sent a
+.BR SIGSTOP ,
+but will not necessarily have stopped
+by the completion of this call; use
+.BR waitpid (2)
+to wait for the tracee to stop.
+See the "Attaching and detaching" subsection for additional information.
+.RI ( addr
+and
+.I data
+are ignored.)
+.IP
+Permission to perform a
+.B PTRACE_ATTACH
+is governed by a ptrace access mode
+.B PTRACE_MODE_ATTACH_REALCREDS
+check; see below.
+.TP
+.BR PTRACE_SEIZE " (since Linux 3.4)"
+.\"
+.\" Noted by Dmitry Levin:
+.\"
+.\" PTRACE_SEIZE was introduced by commit v3.1-rc1~308^2~28, but
+.\" it had to be used along with a temporary flag PTRACE_SEIZE_DEVEL,
+.\" which was removed later by commit v3.4-rc1~109^2~20.
+.\"
+.\" That is, [before] v3.4 we had a test mode of PTRACE_SEIZE API,
+.\" which was not compatible with the current PTRACE_SEIZE API introduced
+.\" in Linux 3.4.
+.\"
+Attach to the process specified in
+.IR pid ,
+making it a tracee of the calling process.
+Unlike
+.BR PTRACE_ATTACH ,
+.B PTRACE_SEIZE
+does not stop the process.
+Group-stops are reported as
+.B PTRACE_EVENT_STOP
+and
+.I WSTOPSIG(status)
+returns the stop signal.
+Automatically attached children stop with
+.B PTRACE_EVENT_STOP
+and
+.I WSTOPSIG(status)
+returns
+.B SIGTRAP
+instead of having a
+.B SIGSTOP
+signal delivered to them.
+.BR execve (2)
+does not deliver an extra
+.BR SIGTRAP .
+Only a
+.BR PTRACE_SEIZE d
+process can accept
+.B PTRACE_INTERRUPT
+and
+.B PTRACE_LISTEN
+commands.
+The "seized" behavior just described is inherited by
+children that are automatically attached using
+.BR PTRACE_O_TRACEFORK ,
+.BR PTRACE_O_TRACEVFORK ,
+and
+.BR PTRACE_O_TRACECLONE .
+.I addr
+must be zero.
+.I data
+contains a bit mask of ptrace options to activate immediately.
+.IP
+Permission to perform a
+.B PTRACE_SEIZE
+is governed by a ptrace access mode
+.B PTRACE_MODE_ATTACH_REALCREDS
+check; see below.
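+.IP
+The following sketch seizes a child without stopping it and then uses
+.B PTRACE_INTERRUPT
+to stop it on demand.
+The function name
+.I seize_and_interrupt
+is hypothetical,
+and the example assumes that the caller is permitted
+to attach to its own child.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Seize a freshly forked child, interrupt it, and return the ptrace
   event code reported for the resulting stop, or -1 on error. */
int seize_and_interrupt(void)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {                     /* Tracee: just wait for signals */
        for (;;)
            pause();
    }

    int status, event = -1;             /* Tracer */
    if (ptrace(PTRACE_SEIZE, pid, NULL, NULL) == 0 &&       /* No options */
        ptrace(PTRACE_INTERRUPT, pid, NULL, NULL) == 0 &&
        waitpid(pid, &status, 0) == pid &&
        WIFSTOPPED(status))
        event = status >> 16;           /* Event code sits above the signal */

    kill(pid, SIGKILL);                 /* Clean up in every case */
    waitpid(pid, &status, 0);
    return event;
}
```

+.IP
+Because
+.B PTRACE_SYSCALL
+is not in effect in this sketch,
+the stop is expected to be reported as
+.BR PTRACE_EVENT_STOP .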
+.\"
+.TP
+.BR PTRACE_SECCOMP_GET_FILTER " (since Linux 4.4)"
+.\" commit f8e529ed941ba2bbcbf310b575d968159ce7e895
+This operation allows the tracer to dump the tracee's
+classic BPF filters.
+.IP
+.I addr
+is an integer specifying the index of the filter to be dumped.
+The most recently installed filter has the index 0.
+If
+.I addr
+is greater than the number of installed filters,
+the operation fails with the error
+.BR ENOENT .
+.IP
+.I data
+is either a pointer to a
+.I struct sock_filter
+array that is large enough to store the BPF program,
+or NULL if the program is not to be stored.
+.IP
+Upon success,
+the return value is the number of instructions in the BPF program.
+If
+.I data
+was NULL, then this return value can be used to correctly size the
+.I struct sock_filter
+array passed in a subsequent call.
+.IP
+This operation fails with the error
+.B EACCES
+if the caller does not have the
+.B CAP_SYS_ADMIN
+capability or if the caller is in strict or filter seccomp mode.
+If the filter referred to by
+.I addr
+is not a classic BPF filter, the operation fails with the error
+.BR EMEDIUMTYPE .
+.IP
+This operation is available if the kernel was configured with both the
+.B CONFIG_SECCOMP_FILTER
+and the
+.B CONFIG_CHECKPOINT_RESTORE
+options.
+.TP
+.B PTRACE_DETACH
+Restart the stopped tracee as for
+.BR PTRACE_CONT ,
+but first detach from it.
+Under Linux, a tracee can be detached in this way regardless
+of which method was used to initiate tracing.
+.RI ( addr
+is ignored.)
+.\"
+.TP
+.BR PTRACE_GET_THREAD_AREA " (since Linux 2.6.0)"
+This operation performs a similar task to
+.BR get_thread_area (2).
+It reads the TLS entry in the GDT whose index is given in
+.IR addr ,
+placing a copy of the entry into the
+.I struct user_desc
+pointed to by
+.IR data .
+(By contrast with
+.BR get_thread_area (2),
+the
+.I entry_number
+of the
+.I struct user_desc
+is ignored.)
+.TP
+.BR PTRACE_SET_THREAD_AREA " (since Linux 2.6.0)"
+This operation performs a similar task to
+.BR set_thread_area (2).
+It sets the TLS entry in the GDT whose index is given in
+.IR addr ,
+assigning it the data supplied in the
+.I struct user_desc
+pointed to by
+.IR data .
+(By contrast with
+.BR set_thread_area (2),
+the
+.I entry_number
+of the
+.I struct user_desc
+is ignored; in other words,
+this ptrace operation can't be used to allocate a free TLS entry.)
+.TP
+.BR PTRACE_GET_SYSCALL_INFO " (since Linux 5.3)"
+.\" commit 201766a20e30f982ccfe36bebfad9602c3ff574a
+Retrieve information about the system call that caused the stop.
+The information is placed into the buffer pointed to by the
+.I data
+argument, which should be a pointer to a buffer of type
+.IR "struct ptrace_syscall_info" .
+The
+.I addr
+argument contains the size of the buffer pointed to
+by the
+.I data
+argument (i.e.,
+.IR "sizeof(struct ptrace_syscall_info)" ).
+The return value contains the number of bytes available
+to be written by the kernel.
+If the size of the data to be written by the kernel exceeds the size
+specified by the
+.I addr
+argument, the output data is truncated.
+.IP
+The
+.I ptrace_syscall_info
+structure contains the following fields:
+.IP
+.in +4n
+.EX
+struct ptrace_syscall_info {
+ __u8 op; /* Type of system call stop */
+ __u32 arch; /* AUDIT_ARCH_* value; see seccomp(2) */
+ __u64 instruction_pointer; /* CPU instruction pointer */
+ __u64 stack_pointer; /* CPU stack pointer */
+ union {
+ struct { /* op == PTRACE_SYSCALL_INFO_ENTRY */
+ __u64 nr; /* System call number */
+ __u64 args[6]; /* System call arguments */
+ } entry;
+ struct { /* op == PTRACE_SYSCALL_INFO_EXIT */
+ __s64 rval; /* System call return value */
+ __u8 is_error; /* System call error flag;
+ Boolean: does rval contain
+ an error value (\-ERRCODE) or
+ a nonerror return value? */
+ } exit;
+ struct { /* op == PTRACE_SYSCALL_INFO_SECCOMP */
+ __u64 nr; /* System call number */
+ __u64 args[6]; /* System call arguments */
+ __u32 ret_data; /* SECCOMP_RET_DATA portion
+ of SECCOMP_RET_TRACE
+ return value */
+ } seccomp;
+ };
+};
+.EE
+.in
+.IP
+The
+.IR op ,
+.IR arch ,
+.IR instruction_pointer ,
+and
+.I stack_pointer
+fields are defined for all kinds of ptrace system call stops.
+The rest of the structure is a union; one should read only those fields
+that are meaningful for the kind of system call stop specified by the
+.I op
+field.
+.IP
+The
+.I op
+field has one of the following values (defined in
+.IR <linux/ptrace.h> )
+indicating what type of stop occurred and
+which part of the union is filled:
+.RS
+.TP
+.B PTRACE_SYSCALL_INFO_ENTRY
+The
+.I entry
+component of the union contains information relating to a
+system call entry stop.
+.TP
+.B PTRACE_SYSCALL_INFO_EXIT
+The
+.I exit
+component of the union contains information relating to a
+system call exit stop.
+.TP
+.B PTRACE_SYSCALL_INFO_SECCOMP
+The
+.I seccomp
+component of the union contains information relating to a
+.B PTRACE_EVENT_SECCOMP
+stop.
+.TP
+.B PTRACE_SYSCALL_INFO_NONE
+No component of the union contains relevant information.
+.RE
+.IP
+In case of system call entry or exit stops,
+the data returned by
+.B PTRACE_GET_SYSCALL_INFO
+is limited to type
+.B PTRACE_SYSCALL_INFO_NONE
+unless the
+.B PTRACE_O_TRACESYSGOOD
+option is set before the corresponding system call stop has occurred.
+.\"
+.SS Death under ptrace
+When a (possibly multithreaded) process receives a killing signal
+(one whose disposition is set to
+.B SIG_DFL
+and whose default action is to kill the process),
+all threads exit.
+Tracees report their death to their tracer(s).
+Notification of this event is delivered via
+.BR waitpid (2).
+.PP
+Note that the killing signal will first cause signal-delivery-stop
+(on one tracee only),
+and only after it is injected by the tracer
+(or after it was dispatched to a thread which isn't traced),
+will death from the signal happen on
+.I all
+tracees within a multithreaded process.
+(The term "signal-delivery-stop" is explained below.)
+.PP
+.B SIGKILL
+does not generate signal-delivery-stop and
+therefore the tracer can't suppress it.
+.B SIGKILL
+kills even within system calls
+(syscall-exit-stop is not generated prior to death by
+.BR SIGKILL ).
+The net effect is that
+.B SIGKILL
+always kills the process (all its threads),
+even if some threads of the process are ptraced.
+.PP
+When the tracee calls
+.BR _exit (2),
+it reports its death to its tracer.
+Other threads are not affected.
+.PP
+When any thread executes
+.BR exit_group (2),
+every tracee in its thread group reports its death to its tracer.
+.PP
+If the
+.B PTRACE_O_TRACEEXIT
+option is on,
+.B PTRACE_EVENT_EXIT
+will happen before actual death.
+This applies to exits via
+.BR _exit (2),
+.BR exit_group (2),
+and signal deaths (except
+.BR SIGKILL ,
+depending on the kernel version; see BUGS below),
+and when threads are torn down on
+.BR execve (2)
+in a multithreaded process.
+.PP
+The tracer cannot assume that the ptrace-stopped tracee exists.
+There are many scenarios when the tracee may die while stopped (such as
+.BR SIGKILL ).
+Therefore, the tracer must be prepared to handle an
+.B ESRCH
+error on any ptrace operation.
+Unfortunately, the same error is returned if the tracee
+exists but is not ptrace-stopped
+(for commands which require a stopped tracee),
+or if it is not traced by the process which issued the ptrace call.
+The tracer needs to keep track of the stopped/running state of the tracee,
+and interpret
+.B ESRCH
+as "tracee died unexpectedly" only if it knows that the tracee has
+been observed to enter ptrace-stop.
+Note that there is no guarantee that
+.I waitpid(WNOHANG)
+will reliably report the tracee's death status if a
+ptrace operation returned
+.BR ESRCH .
+.I waitpid(WNOHANG)
+may return 0 instead.
+In other words, the tracee may be "not yet fully dead",
+but already refusing ptrace requests.
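+.PP
+The bookkeeping described above can be sketched as follows.
+This is only one possible arrangement:
+.I struct tracee
+and
+.I interpret_ptrace_errno
+are hypothetical names, and the tracer is assumed to set the flag when
+.BR waitpid (2)
+reports a stop and to clear it whenever it restarts the tracee.
+.PP

```c
#include <errno.h>
#include <stdbool.h>

enum esrch_meaning {
    NOT_ESRCH,          /* some other error, or no error at all */
    TRACEE_DIED,        /* tracee was known to be stopped: it has died */
    TRACEE_NOT_STOPPED  /* ambiguous: running, or not traced by us */
};

struct tracee {             /* hypothetical per-tracee record */
    bool in_ptrace_stop;    /* true between a stop report and a restart */
};

/* Interpret the errno value left behind by a failed ptrace() call. */
static enum esrch_meaning
interpret_ptrace_errno(const struct tracee *t, int err)
{
    if (err != ESRCH)
        return NOT_ESRCH;
    return t->in_ptrace_stop ? TRACEE_DIED : TRACEE_NOT_STOPPED;
}
```

+.PP
+Even when the result is
+.IR TRACEE_DIED ,
+the death status may not yet be available from
+.BR waitpid (2).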
+.PP
+The tracer can't assume that the tracee
+.I always
+ends its life by reporting
+.I WIFEXITED(status)
+or
+.IR WIFSIGNALED(status) ;
+there are cases where this does not occur.
+For example, if a thread other than the thread group leader does an
+.BR execve (2),
+it disappears;
+its PID will never be seen again,
+and any subsequent ptrace stops will be reported under
+the thread group leader's PID.
+.SS Stopped states
+A tracee can be in one of two states: running or stopped.
+For the purposes of ptrace, a tracee which is blocked in a system call
+(such as
+.BR read (2),
+.BR pause (2),
+etc.)
+is nevertheless considered to be running, even if the tracee is blocked
+for a long time.
+The state of the tracee after
+.B PTRACE_LISTEN
+is somewhat of a gray area: it is not in any ptrace-stop (ptrace commands
+won't work on it, and it will deliver
+.BR waitpid (2)
+notifications),
+but it also may be considered "stopped" because
+it is not executing instructions (is not scheduled), and if it was
+in group-stop before
+.BR PTRACE_LISTEN ,
+it will not respond to signals until
+.B SIGCONT
+is received.
+.PP
+There are many kinds of stopped states, and in ptrace
+discussions they are often conflated.
+Therefore, it is important to use precise terms.
+.PP
+In this manual page, any stopped state in which the tracee is ready
+to accept ptrace commands from the tracer is called
+.IR ptrace-stop .
+Ptrace-stops can
+be further subdivided into
+.IR signal-delivery-stop ,
+.IR group-stop ,
+.IR syscall-stop ,
+.IR "PTRACE_EVENT stops" ,
+and so on.
+These stopped states are described in detail below.
+.PP
+When the running tracee enters ptrace-stop, it notifies its tracer using
+.BR waitpid (2)
+(or one of the other "wait" system calls).
+Most of this manual page assumes that the tracer waits with:
+.PP
+.in +4n
+.EX
+pid = waitpid(pid_or_minus_1, &status, __WALL);
+.EE
+.in
+.PP
+Ptrace-stopped tracees are reported as returns with
+.I pid
+greater than 0 and
+.I WIFSTOPPED(status)
+true.
+.\" Denys Vlasenko:
+.\" Do we require __WALL usage, or will just using 0 be ok? (With 0,
+.\" I am not 100% sure there aren't ugly corner cases.) Are the
+.\" rules different if user wants to use waitid? Will waitid require
+.\" WEXITED?
+.\"
+.PP
+The
+.B __WALL
+flag does not include the
+.B WSTOPPED
+and
+.B WEXITED
+flags, but implies their functionality.
+.PP
+Setting the
+.B WCONTINUED
+flag when calling
+.BR waitpid (2)
+is not recommended: the "continued" state is per-process and
+consuming it can confuse the real parent of the tracee.
+.PP
+Use of the
+.B WNOHANG
+flag may cause
+.BR waitpid (2)
+to return 0 ("no wait results available yet")
+even if the tracer knows there should be a notification.
+Example:
+.PP
+.in +4n
+.EX
+errno = 0;
+ptrace(PTRACE_CONT, pid, 0L, 0L);
+if (errno == ESRCH) {
+    /* tracee is dead */
+    r = waitpid(pid, &status, __WALL | WNOHANG);
+    /* r can still be 0 here! */
+}
+.EE
+.in
+.\" FIXME .
+.\" waitid usage? WNOWAIT?
+.\" describe how wait notifications queue (or not queue)
+.PP
+The following kinds of ptrace-stops exist: signal-delivery-stops,
+group-stops,
+.B PTRACE_EVENT
+stops, syscall-stops.
+They all are reported by
+.BR waitpid (2)
+with
+.I WIFSTOPPED(status)
+true.
+They may be differentiated by examining the value
+.IR status>>8 ,
+and if there is ambiguity in that value, by querying
+.BR PTRACE_GETSIGINFO .
+(Note: the
+.I WSTOPSIG(status)
+macro can't be used to perform this examination,
+because it returns the value
+.IR "(status>>8)\ &\ 0xff" .)
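+.PP
+A first-pass classification along these lines might look like the
+following sketch.
+The names
+.I enum stop_kind
+and
+.I classify_stop
+are hypothetical, and the sketch assumes that the tracer has set
+.BR PTRACE_O_TRACESYSGOOD ,
+so that syscall-stops are marked with bit 0x80.
+.PP

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

enum stop_kind {
    NOT_A_STOP,            /* exit, signal death, or "continued" */
    SYSCALL_STOP,          /* relies on PTRACE_O_TRACESYSGOOD */
    EVENT_STOP,            /* a PTRACE_EVENT_* code is in status>>16 */
    MAYBE_GROUP_STOP,      /* stopping signal: ask PTRACE_GETSIGINFO */
    SIGNAL_DELIVERY_STOP
};

/* Classify a wait status using only the bits above the low byte. */
static enum stop_kind
classify_stop(int status)
{
    unsigned int event;
    int sig;

    if (!WIFSTOPPED(status))
        return NOT_A_STOP;

    event = (unsigned int) status >> 16;    /* bits hidden by WSTOPSIG */
    sig = WSTOPSIG(status);                 /* (status >> 8) & 0xff */

    if (event != 0)
        return EVENT_STOP;
    if (sig == (SIGTRAP | 0x80))
        return SYSCALL_STOP;
    if (sig == SIGSTOP || sig == SIGTSTP ||
        sig == SIGTTIN || sig == SIGTTOU)
        return MAYBE_GROUP_STOP;
    return SIGNAL_DELIVERY_STOP;
}
```

+.PP
+A result of
+.I MAYBE_GROUP_STOP
+is deliberately inconclusive: the sketch does not issue any ptrace
+request, so that case still needs a
+.B PTRACE_GETSIGINFO
+query.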
+.SS Signal-delivery-stop
+When a (possibly multithreaded) process receives any signal except
+.BR SIGKILL ,
+the kernel selects an arbitrary thread which handles the signal.
+(If the signal is generated with
+.BR tgkill (2),
+the target thread can be explicitly selected by the caller.)
+If the selected thread is traced, it enters signal-delivery-stop.
+At this point, the signal is not yet delivered to the process,
+and can be suppressed by the tracer.
+If the tracer doesn't suppress the signal,
+it passes the signal to the tracee in the next ptrace restart request.
+This second step of signal delivery is called
+.I "signal injection"
+in this manual page.
+Note that if the signal is blocked,
+signal-delivery-stop doesn't happen until the signal is unblocked,
+with the usual exception that
+.B SIGSTOP
+can't be blocked.
+.PP
+Signal-delivery-stop is observed by the tracer as
+.BR waitpid (2)
+returning with
+.I WIFSTOPPED(status)
+true, with the signal returned by
+.IR WSTOPSIG(status) .
+If the signal is
+.BR SIGTRAP ,
+this may be a different kind of ptrace-stop;
+see the "Syscall-stops" and "execve" sections below for details.
+If
+.I WSTOPSIG(status)
+returns a stopping signal, this may be a group-stop; see below.
+.SS Signal injection and suppression
+After signal-delivery-stop is observed by the tracer,
+the tracer should restart the tracee with the call
+.PP
+.in +4n
+.EX
+ptrace(PTRACE_restart, pid, 0, sig)
+.EE
+.in
+.PP
+where
+.B PTRACE_restart
+is one of the restarting ptrace requests.
+If
+.I sig
+is 0, then a signal is not delivered.
+Otherwise, the signal
+.I sig
+is delivered.
+This operation is called
+.I "signal injection"
+in this manual page, to distinguish it from signal-delivery-stop.
+.PP
+The
+.I sig
+value may be different from the
+.I WSTOPSIG(status)
+value: the tracer can cause a different signal to be injected.
+.PP
+Note that a suppressed signal still causes system calls to return
+prematurely.
+In this case, system calls will be restarted: the tracer will
+observe the tracee to reexecute the interrupted system call (or
+.BR restart_syscall (2)
+system call for a few system calls which use a different mechanism
+for restarting) if the tracer uses
+.BR PTRACE_SYSCALL .
+Even system calls (such as
+.BR poll (2))
+which are not restartable after a signal are restarted after
+the signal is suppressed;
+however, kernel bugs exist which cause some system calls to fail with
+.B EINTR
+even though no observable signal is injected to the tracee.
+.PP
+Restarting ptrace commands issued in ptrace-stops other than
+signal-delivery-stop are not guaranteed to inject a signal, even if
+.I sig
+is nonzero.
+No error is reported; a nonzero
+.I sig
+may simply be ignored.
+Ptrace users should not try to "create a new signal" this way: use
+.BR tgkill (2)
+instead.
+.PP
+The fact that signal injection requests may be ignored
+when restarting the tracee after
+ptrace stops that are not signal-delivery-stops
+is a cause of confusion among ptrace users.
+One typical scenario is that the tracer observes group-stop,
+mistakes it for signal-delivery-stop, restarts the tracee with
+.PP
+.in +4n
+.EX
+ptrace(PTRACE_restart, pid, 0, stopsig)
+.EE
+.in
+.PP
+with the intention of injecting
+.IR stopsig ,
+but
+.I stopsig
+gets ignored and the tracee continues to run.
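+.PP
+One way to avoid this mistake is to compute the
+.I sig
+argument in a single place.
+The following is a minimal sketch;
+.I sig_for_restart
+is a hypothetical helper, and the caller is assumed to already know
+which kind of ptrace-stop it is in.
+.PP

```c
#include <signal.h>
#include <stdbool.h>

/*
 * Hypothetical helper: choose the sig argument for a restarting
 * ptrace request.  Outside signal-delivery-stop the value may be
 * ignored by the kernel, so 0 is passed there.
 */
static int
sig_for_restart(bool in_signal_delivery_stop, int stopsig, bool suppress)
{
    if (!in_signal_delivery_stop)
        return 0;
    return suppress ? 0 : stopsig;
}
```

+.PP
+The returned value would then be passed as the last argument of the
+restarting request.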
+.PP
+The
+.B SIGCONT
+signal has a side effect of waking up (all threads of)
+a group-stopped process.
+This side effect happens before signal-delivery-stop.
+The tracer can't suppress this side effect (it can
+only suppress signal injection, which only causes the
+.B SIGCONT
+handler to not be executed in the tracee, if such a handler is installed).
+In fact, waking up from group-stop may be followed by
+signal-delivery-stop for signal(s)
+.I other than
+.BR SIGCONT ,
+if they were pending when
+.B SIGCONT
+was delivered.
+In other words,
+.B SIGCONT
+may not be the first signal observed by the tracee after it was sent.
+.PP
+Stopping signals cause (all threads of) a process to enter group-stop.
+This side effect happens after signal injection, and therefore can be
+suppressed by the tracer.
+.PP
+In Linux 2.4 and earlier, the
+.B SIGSTOP
+signal can't be injected.
+.\" In the Linux 2.4 sources, in arch/i386/kernel/signal.c::do_signal(),
+.\" there is:
+.\"
+.\" /* The debugger continued. Ignore SIGSTOP. */
+.\" if (signr == SIGSTOP)
+.\" continue;
+.PP
+.B PTRACE_GETSIGINFO
+can be used to retrieve a
+.I siginfo_t
+structure which corresponds to the delivered signal.
+.B PTRACE_SETSIGINFO
+may be used to modify it.
+If
+.B PTRACE_SETSIGINFO
+has been used to alter
+.IR siginfo_t ,
+the
+.I si_signo
+field and the
+.I sig
+parameter in the restarting command must match,
+otherwise the result is undefined.
+.SS Group-stop
+When a (possibly multithreaded) process receives a stopping signal,
+all threads stop.
+If some threads are traced, they enter a group-stop.
+Note that the stopping signal will first cause signal-delivery-stop
+(on one tracee only), and only after it is injected by the tracer
+(or after it was dispatched to a thread which isn't traced),
+will group-stop be initiated on
+.I all
+tracees within the multithreaded process.
+As usual, every tracee reports its group-stop separately
+to the corresponding tracer.
+.PP
+Group-stop is observed by the tracer as
+.BR waitpid (2)
+returning with
+.I WIFSTOPPED(status)
+true, with the stopping signal available via
+.IR WSTOPSIG(status) .
+The same result is returned by some other classes of ptrace-stops,
+therefore the recommended practice is to perform the call
+.PP
+.in +4n
+.EX
+ptrace(PTRACE_GETSIGINFO, pid, 0, &siginfo)
+.EE
+.in
+.PP
+The call can be avoided if the signal is not
+.BR SIGSTOP ,
+.BR SIGTSTP ,
+.BR SIGTTIN ,
+or
+.BR SIGTTOU ;
+only these four signals are stopping signals.
+If the tracer sees something else, it can't be a group-stop.
+Otherwise, the tracer needs to call
+.BR PTRACE_GETSIGINFO .
+If
+.B PTRACE_GETSIGINFO
+fails with
+.BR EINVAL ,
+then it is definitely a group-stop.
+(Other failure codes are possible, such as
+.B ESRCH
+("no such process") if a
+.B SIGKILL
+killed the tracee.)
+.PP
+If the tracee was attached using
+.BR PTRACE_SEIZE ,
+group-stop is indicated by
+.BR PTRACE_EVENT_STOP :
+.IR "status>>16 == PTRACE_EVENT_STOP" .
+This allows detection of group-stops
+without requiring an extra
+.B PTRACE_GETSIGINFO
+call.
+.PP
+As of Linux 2.6.38,
+after the tracer sees the tracee ptrace-stop and until it
+restarts or kills it, the tracee will not run,
+and will not send notifications (except
+.B SIGKILL
+death) to the tracer, even if the tracer enters into another
+.BR waitpid (2)
+call.
+.PP
+The kernel behavior described in the previous paragraph
+causes a problem with transparent handling of stopping signals.
+If the tracer restarts the tracee after group-stop,
+the stopping signal
+is effectively ignored\[em]the tracee doesn't remain stopped, it runs.
+If the tracer doesn't restart the tracee before entering into the next
+.BR waitpid (2),
+future
+.B SIGCONT
+signals will not be reported to the tracer;
+this would cause the
+.B SIGCONT
+signals to have no effect on the tracee.
+.PP
+Since Linux 3.4, there is a method to overcome this problem: instead of
+.BR PTRACE_CONT ,
+a
+.B PTRACE_LISTEN
+command can be used to restart a tracee in a way where it does not execute,
+but waits for a new event which it can report via
+.BR waitpid (2)
+(such as when
+it is restarted by a
+.BR SIGCONT ).
+.SS PTRACE_EVENT stops
+If the tracer sets
+.B PTRACE_O_TRACE*
+options, the tracee will enter ptrace-stops called
+.B PTRACE_EVENT
+stops.
+.PP
+.B PTRACE_EVENT
+stops are observed by the tracer as
+.BR waitpid (2)
+returning with
+.I WIFSTOPPED(status)
+true, and
+.I WSTOPSIG(status)
+returns
+.B SIGTRAP
+(or for
+.BR PTRACE_EVENT_STOP ,
+returns the stopping signal if the tracee is in a group-stop).
+An additional bit is set in the higher byte of the status word:
+the value
+.I status>>8
+will be
+.PP
+.in +4n
+.EX
+((PTRACE_EVENT_foo<<8) | SIGTRAP).
+.EE
+.in
+.PP
+The following events exist:
+.TP
+.B PTRACE_EVENT_VFORK
+Stop before return from
+.BR vfork (2)
+or
+.BR clone (2)
+with the
+.B CLONE_VFORK
+flag.
+When the tracee is continued after this stop, it will wait for the child to
+exit/exec before continuing its execution
+(in other words, the usual behavior on
+.BR vfork (2)).
+.TP
+.B PTRACE_EVENT_FORK
+Stop before return from
+.BR fork (2)
+or
+.BR clone (2)
+with the exit signal set to
+.BR SIGCHLD .
+.TP
+.B PTRACE_EVENT_CLONE
+Stop before return from
+.BR clone (2).
+.TP
+.B PTRACE_EVENT_VFORK_DONE
+Stop before return from
+.BR vfork (2)
+or
+.BR clone (2)
+with the
+.B CLONE_VFORK
+flag,
+but after the child unblocked this tracee by exiting or execing.
+.PP
+For all four stops described above,
+the stop occurs in the parent (i.e., the tracee),
+not in the newly created thread.
+.B PTRACE_GETEVENTMSG
+can be used to retrieve the new thread's ID.
+.TP
+.B PTRACE_EVENT_EXEC
+Stop before return from
+.BR execve (2).
+Since Linux 3.0,
+.B PTRACE_GETEVENTMSG
+returns the former thread ID.
+.TP
+.B PTRACE_EVENT_EXIT
+Stop before exit (including death from
+.BR exit_group (2)),
+signal death, or exit caused by
+.BR execve (2)
+in a multithreaded process.
+.B PTRACE_GETEVENTMSG
+returns the exit status.
+Registers can be examined
+(unlike when "real" exit happens).
+The tracee is still alive; it needs to be
+.BR PTRACE_CONT ed
+or
+.BR PTRACE_DETACH ed
+to finish exiting.
+.TP
+.B PTRACE_EVENT_STOP
+Stop induced by
+.B PTRACE_INTERRUPT
+command, or group-stop, or initial ptrace-stop when a new child is attached
+(only if attached using
+.BR PTRACE_SEIZE ).
+.TP
+.B PTRACE_EVENT_SECCOMP
+Stop triggered by a
+.BR seccomp (2)
+rule on tracee syscall entry when
+.B PTRACE_O_TRACESECCOMP
+has been set by the tracer.
+The seccomp event message data (from the
+.B SECCOMP_RET_DATA
+portion of the seccomp filter rule) can be retrieved with
+.BR PTRACE_GETEVENTMSG .
+The semantics of this stop are described in
+detail in a separate section below.
+.PP
+.B PTRACE_GETSIGINFO
+on
+.B PTRACE_EVENT
+stops returns
+.B SIGTRAP
+in
+.IR si_signo ,
+with
+.I si_code
+set to
+.IR "(event<<8)\ |\ SIGTRAP" .
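+.PP
+As a minimal sketch, the event code can be extracted from the wait
+status and mapped to a name as shown below.
+The function name
+.I ptrace_event_name
+is hypothetical, and the fallback definition of
+.B PTRACE_EVENT_STOP
+is an assumption made for C libraries whose headers lack it.
+.PP

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#ifndef PTRACE_EVENT_STOP
#define PTRACE_EVENT_STOP 128   /* assumed value for older headers */
#endif

/*
 * Return the name of the PTRACE_EVENT code carried in status>>16,
 * or NULL if the status does not describe a PTRACE_EVENT stop.
 */
static const char *
ptrace_event_name(int status)
{
    if (!WIFSTOPPED(status))
        return NULL;

    switch ((unsigned int) status >> 16) {
    case PTRACE_EVENT_FORK:       return "PTRACE_EVENT_FORK";
    case PTRACE_EVENT_VFORK:      return "PTRACE_EVENT_VFORK";
    case PTRACE_EVENT_CLONE:      return "PTRACE_EVENT_CLONE";
    case PTRACE_EVENT_EXEC:       return "PTRACE_EVENT_EXEC";
    case PTRACE_EVENT_VFORK_DONE: return "PTRACE_EVENT_VFORK_DONE";
    case PTRACE_EVENT_EXIT:       return "PTRACE_EVENT_EXIT";
    case PTRACE_EVENT_SECCOMP:    return "PTRACE_EVENT_SECCOMP";
    case PTRACE_EVENT_STOP:       return "PTRACE_EVENT_STOP";
    default:                      return NULL;
    }
}
```

+.PP
+For the events that carry a message, the tracer would follow up with a
+.B PTRACE_GETEVENTMSG
+request.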
+.SS Syscall-stops
+If the tracee was restarted by
+.B PTRACE_SYSCALL
+or
+.BR PTRACE_SYSEMU ,
+the tracee enters
+syscall-enter-stop just prior to entering any system call (which
+will not be executed if the restart was using
+.BR PTRACE_SYSEMU ,
+regardless of any change made to registers at this point or how the
+tracee is restarted after this stop).
+No matter which method caused the syscall-enter-stop,
+if the tracer restarts the tracee with
+.BR PTRACE_SYSCALL ,
+the tracee enters syscall-exit-stop when the system call is finished,
+or if it is interrupted by a signal.
+(That is, signal-delivery-stop never happens between syscall-enter-stop
+and syscall-exit-stop; it happens
+.I after
+syscall-exit-stop.)
+If the tracee is continued using any other method (including
+.BR PTRACE_SYSEMU ),
+no syscall-exit-stop occurs.
+Note that all mentions of
+.B PTRACE_SYSEMU
+apply equally to
+.BR PTRACE_SYSEMU_SINGLESTEP .
+.PP
+However, even if the tracee was continued using
+.BR PTRACE_SYSCALL ,
+it is not guaranteed that the next stop will be a syscall-exit-stop.
+Other possibilities are that the tracee may stop in a
+.B PTRACE_EVENT
+stop (including seccomp stops), exit (if it entered
+.BR _exit (2)
+or
+.BR exit_group (2)),
+be killed by
+.BR SIGKILL ,
+or die silently (if it is a thread group leader, the
+.BR execve (2)
+happened in another thread,
+and that thread is not traced by the same tracer;
+this situation is discussed later).
+.PP
+Syscall-enter-stop and syscall-exit-stop are observed by the tracer as
+.BR waitpid (2)
+returning with
+.I WIFSTOPPED(status)
+true, and
+.I WSTOPSIG(status)
+giving
+.BR SIGTRAP .
+If the
+.B PTRACE_O_TRACESYSGOOD
+option was set by the tracer, then
+.I WSTOPSIG(status)
+will give the value
+.IR "(SIGTRAP\ |\ 0x80)" .
+.PP
+Syscall-stops can be distinguished from signal-delivery-stop with
+.B SIGTRAP
+by querying
+.B PTRACE_GETSIGINFO
+for the following cases:
+.TP
+.IR si_code " <= 0"
+.B SIGTRAP
+was delivered as a result of a user-space action,
+for example, a system call
+.RB ( tgkill (2),
+.BR kill (2),
+.BR sigqueue (3),
+etc.),
+expiration of a POSIX timer,
+change of state on a POSIX message queue,
+or completion of an asynchronous I/O request.
+.TP
+.IR si_code " == SI_KERNEL (0x80)"
+.B SIGTRAP
+was sent by the kernel.
+.TP
+.IR si_code " == SIGTRAP or " si_code " == (SIGTRAP|0x80)"
+This is a syscall-stop.
+.PP
+However, syscall-stops happen very often (twice per system call),
+and performing
+.B PTRACE_GETSIGINFO
+for every syscall-stop may be somewhat expensive.
+.PP
+Some architectures allow the cases to be distinguished
+by examining registers.
+For example, on x86,
+.I rax
+==
+.RB \- ENOSYS
+in syscall-enter-stop.
+Since
+.B SIGTRAP
+(like any other signal) always happens
+.I after
+syscall-exit-stop,
+and at this point
+.I rax
+almost never contains
+.RB \- ENOSYS ,
+the
+.B SIGTRAP
+looks like "syscall-stop which is not syscall-enter-stop";
+in other words, it looks like a
+"stray syscall-exit-stop" and can be detected this way.
+But such detection is fragile and is best avoided.
+.PP
+Using the
+.B PTRACE_O_TRACESYSGOOD
+option is the recommended method to distinguish syscall-stops
+from other kinds of ptrace-stops,
+since it is reliable and does not incur a performance penalty.
+.PP
+Syscall-enter-stop and syscall-exit-stop are
+indistinguishable from each other by the tracer.
+The tracer needs to keep track of the sequence of
+ptrace-stops in order to not misinterpret syscall-enter-stop as
+syscall-exit-stop or vice versa.
+In general, a syscall-enter-stop is
+always followed by syscall-exit-stop,
+.B PTRACE_EVENT
+stop, or the tracee's death;
+no other kinds of ptrace-stop can occur in between.
+However, note that seccomp stops (see below) can cause syscall-exit-stops,
+without preceding syscall-enter-stops.
+If seccomp is in use, care needs
+to be taken not to misinterpret such stops as syscall-enter-stops.
+.PP
+If after syscall-enter-stop,
+the tracer uses a restarting command other than
+.BR PTRACE_SYSCALL ,
+syscall-exit-stop is not generated.
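+.PP
+The bookkeeping this requires can be as small as one flag per tracee.
+The following sketch uses hypothetical names
+.RI ( "struct syscall_tracker" ,
+.IR on_syscall_stop ,
+.IR on_restart )
+and deliberately ignores the seccomp complication mentioned above.
+.PP

```c
#include <stdbool.h>
#include <sys/ptrace.h>

struct syscall_tracker {    /* hypothetical per-tracee record */
    bool in_syscall;        /* true between enter-stop and exit-stop */
};

/*
 * Call on every syscall-stop.  Returns true if the stop is a
 * syscall-enter-stop, false if it is a syscall-exit-stop.
 */
static bool
on_syscall_stop(struct syscall_tracker *t)
{
    t->in_syscall = !t->in_syscall;
    return t->in_syscall;
}

/*
 * Call before every restarting request.  Any request other than
 * PTRACE_SYSCALL means no syscall-exit-stop will be generated, so
 * the next syscall-stop seen must be an enter-stop.
 */
static void
on_restart(struct syscall_tracker *t, long request)
{
    if (request != PTRACE_SYSCALL)
        t->in_syscall = false;
}
```

+.PP
+A tracer using this scheme would also reset the flag when the tracee
+dies or when it receives a
+.B PTRACE_EVENT_EXEC
+stop.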
+.PP
+.B PTRACE_GETSIGINFO
+on syscall-stops returns
+.B SIGTRAP
+in
+.IR si_signo ,
+with
+.I si_code
+set to
+.B SIGTRAP
+or
+.IR (SIGTRAP|0x80) .
+.\"
+.SS PTRACE_EVENT_SECCOMP stops (Linux 3.5 to Linux 4.7)
+The behavior of
+.B PTRACE_EVENT_SECCOMP
+stops and their interaction with other kinds
+of ptrace stops has changed between kernel versions.
+This documents the behavior
+from their introduction until Linux 4.7 (inclusive).
+The behavior in later kernel versions is documented in the next section.
+.PP
+A
+.B PTRACE_EVENT_SECCOMP
+stop occurs whenever a
+.B SECCOMP_RET_TRACE
+rule is triggered.
+This is independent of which method was used to restart the system call.
+Notably, seccomp still runs even if the tracee was restarted using
+.B PTRACE_SYSEMU
+and this system call is unconditionally skipped.
+.PP
+Restarts from this stop will behave as if the stop had occurred right
+before the system call in question.
+In particular, both
+.B PTRACE_SYSCALL
+and
+.B PTRACE_SYSEMU
+will normally cause a subsequent syscall-entry-stop.
+However, if after the
+.B PTRACE_EVENT_SECCOMP
+the system call number is negative,
+both the syscall-entry-stop and the system call itself will be skipped.
+This means that if the system call number is negative after a
+.B PTRACE_EVENT_SECCOMP
+and the tracee is restarted using
+.BR PTRACE_SYSCALL ,
+the next observed stop will be a syscall-exit-stop,
+rather than the syscall-entry-stop that might have been expected.
+.\"
+.SS PTRACE_EVENT_SECCOMP stops (since Linux 4.8)
+Starting with Linux 4.8,
+.\" commit 93e35efb8de45393cf61ed07f7b407629bf698ea
+the
+.B PTRACE_EVENT_SECCOMP
+stop was reordered to occur between syscall-entry-stop and
+syscall-exit-stop.
+Note that seccomp no longer runs (and no
+.B PTRACE_EVENT_SECCOMP
+will be reported) if the system call is skipped due to
+.BR PTRACE_SYSEMU .
+.PP
+A
+.B PTRACE_EVENT_SECCOMP
+stop functions comparably
+to a syscall-entry-stop (i.e., continuations using
+.B PTRACE_SYSCALL
+will cause syscall-exit-stops,
+the system call number may be changed and any other modified registers
+are visible to the to-be-executed system call as well).
+Note that there may have been,
+but need not have been, a preceding syscall-entry-stop.
+.PP
+After a
+.B PTRACE_EVENT_SECCOMP
+stop, seccomp will be rerun, with a
+.B SECCOMP_RET_TRACE
+rule now functioning the same as a
+.BR SECCOMP_RET_ALLOW .
+Specifically, this means that if registers are not modified during the
+.B PTRACE_EVENT_SECCOMP
+stop, the system call will then be allowed.
+.\"
+.SS PTRACE_SINGLESTEP stops
+[Details of these kinds of stops are yet to be documented.]
+.\"
+.\" FIXME .
+.\" document stops occurring with PTRACE_SINGLESTEP
+.\"
+.SS Informational and restarting ptrace commands
+Most ptrace commands (all except
+.BR PTRACE_ATTACH ,
+.BR PTRACE_SEIZE ,
+.BR PTRACE_TRACEME ,
+.BR PTRACE_INTERRUPT ,
+and
+.BR PTRACE_KILL )
+require the tracee to be in a ptrace-stop, otherwise they fail with
+.BR ESRCH .
+.PP
+When the tracee is in ptrace-stop,
+the tracer can read and write data to
+the tracee using informational commands.
+These commands leave the tracee in ptrace-stopped state:
+.PP
+.in +4n
+.EX
+ptrace(PTRACE_PEEKTEXT/PEEKDATA/PEEKUSER, pid, addr, 0);
+ptrace(PTRACE_POKETEXT/POKEDATA/POKEUSER, pid, addr, long_val);
+ptrace(PTRACE_GETREGS/GETFPREGS, pid, 0, &struct);
+ptrace(PTRACE_SETREGS/SETFPREGS, pid, 0, &struct);
+ptrace(PTRACE_GETREGSET, pid, NT_foo, &iov);
+ptrace(PTRACE_SETREGSET, pid, NT_foo, &iov);
+ptrace(PTRACE_GETSIGINFO, pid, 0, &siginfo);
+ptrace(PTRACE_SETSIGINFO, pid, 0, &siginfo);
+ptrace(PTRACE_GETEVENTMSG, pid, 0, &long_var);
+ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_flags);
+.EE
+.in
+.PP
+Note that some errors are not reported.
+For example, setting signal information
+.RI ( siginfo )
+may have no effect in some ptrace-stops, yet the call may succeed
+(return 0 and not set
+.IR errno );
+querying
+.B PTRACE_GETEVENTMSG
+may succeed and return some random value if current ptrace-stop
+is not documented as returning a meaningful event message.
+.PP
+The call
+.PP
+.in +4n
+.EX
+ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_flags);
+.EE
+.in
+.PP
+affects one tracee.
+The tracee's current flags are replaced.
+Flags are inherited by new tracees created and "auto-attached" via active
+.BR PTRACE_O_TRACEFORK ,
+.BR PTRACE_O_TRACEVFORK ,
+or
+.B PTRACE_O_TRACECLONE
+options.
+.PP
+Another group of commands makes the ptrace-stopped tracee run.
+They have the form:
+.PP
+.in +4n
+.EX
+ptrace(cmd, pid, 0, sig);
+.EE
+.in
+.PP
+where
+.I cmd
+is
+.BR PTRACE_CONT ,
+.BR PTRACE_LISTEN ,
+.BR PTRACE_DETACH ,
+.BR PTRACE_SYSCALL ,
+.BR PTRACE_SINGLESTEP ,
+.BR PTRACE_SYSEMU ,
+or
+.BR PTRACE_SYSEMU_SINGLESTEP .
+If the tracee is in signal-delivery-stop,
+.I sig
+is the signal to be injected (if it is nonzero).
+Otherwise,
+.I sig
+may be ignored.
+(When restarting a tracee from a ptrace-stop other than signal-delivery-stop,
+recommended practice is to always pass 0 in
+.IR sig .)
+.SS Attaching and detaching
+A thread can be attached to the tracer using the call
+.PP
+.in +4n
+.EX
+ptrace(PTRACE_ATTACH, pid, 0, 0);
+.EE
+.in
+.PP
+or
+.PP
+.in +4n
+.EX
+ptrace(PTRACE_SEIZE, pid, 0, PTRACE_O_flags);
+.EE
+.in
+.PP
+.B PTRACE_ATTACH
+sends
+.B SIGSTOP
+to this thread.
+If the tracer wants this
+.B SIGSTOP
+to have no effect, it needs to suppress it.
+Note that if other signals are concurrently sent to
+this thread during attach,
+the tracer may see the tracee enter signal-delivery-stop
+with other signal(s) first!
+The usual practice is to reinject these signals until
+.B SIGSTOP
+is seen, then suppress
+.B SIGSTOP
+injection.
+The design bug here is that a ptrace attach and a concurrently delivered
+.B SIGSTOP
+may race and the concurrent
+.B SIGSTOP
+may be lost.
+.\"
+.\" FIXME Describe how to attach to a thread which is already group-stopped.
+.PP
+Since attaching sends
+.B SIGSTOP
+and the tracer usually suppresses it, this may cause a stray
+.B EINTR
+return from the currently executing system call in the tracee,
+as described in the "Signal injection and suppression" section.
+.PP
+Since Linux 3.4,
+.B PTRACE_SEIZE
+can be used instead of
+.BR PTRACE_ATTACH .
+.B PTRACE_SEIZE
+does not stop the attached process.
+If you need to stop
+it after attach (or at any other time) without sending it any signals,
+use the
+.B PTRACE_INTERRUPT
+command.
+.PP
+The request
+.PP
+.in +4n
+.EX
+ptrace(PTRACE_TRACEME, 0, 0, 0);
+.EE
+.in
+.PP
+turns the calling thread into a tracee.
+The thread continues to run (doesn't enter ptrace-stop).
+A common practice is to follow the
+.B PTRACE_TRACEME
+with
+.PP
+.in +4n
+.EX
+raise(SIGSTOP);
+.EE
+.in
+.PP
+and allow the parent (which is our tracer now) to observe our
+signal-delivery-stop.
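+.PP
+Put together, the sequence might look like the following sketch.
+The function name
+.I trace_child_once
+and the exit codes 7 and 126 are arbitrary choices made for the
+example; the sketch also assumes that the environment may forbid
+tracing, and reports that case separately.
+.PP

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 0 if the expected sequence was observed, 1 if the child
 * could not make itself a tracee, and -1 on any other outcome.
 */
static int
trace_child_once(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid == 0) {                     /* child: becomes the tracee */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(126);
        raise(SIGSTOP);                 /* let the parent observe us */
        _exit(7);
    }

    /* Parent: wait for the child's first report. */
    if (waitpid(pid, &status, __WALL) != pid)
        return -1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 126)
        return 1;                       /* PTRACE_TRACEME failed */
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, __WALL);
        return -1;
    }

    /* Restart with sig == 0, so that SIGSTOP is suppressed. */
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, __WALL);
        return -1;
    }

    if (waitpid(pid, &status, __WALL) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 7) ? 0 : -1;
}
```

+.PP
+A real tracer would normally call
+.B PTRACE_SETOPTIONS
+during the first stop, before restarting the tracee.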
+.PP
+If the
+.BR PTRACE_O_TRACEVFORK ,
+.BR PTRACE_O_TRACEFORK ,
+or
+.B PTRACE_O_TRACECLONE
+options are in effect, then children created by, respectively,
+.BR vfork (2)
+or
+.BR clone (2)
+with the
+.B CLONE_VFORK
+flag,
+.BR fork (2)
+or
+.BR clone (2)
+with the exit signal set to
+.BR SIGCHLD ,
+and other kinds of
+.BR clone (2),
+are automatically attached to the same tracer which traced their parent.
+.B SIGSTOP
+is delivered to the children, causing them to enter
+signal-delivery-stop after they exit the system call which created them.
+.PP
+Detaching of the tracee is performed by:
+.PP
+.in +4n
+.EX
+ptrace(PTRACE_DETACH, pid, 0, sig);
+.EE
+.in
+.PP
+.B PTRACE_DETACH
+is a restarting operation;
+therefore it requires the tracee to be in ptrace-stop.
+If the tracee is in signal-delivery-stop, a signal can be injected.
+Otherwise, the
+.I sig
+parameter may be silently ignored.
+.PP
+If the tracee is running when the tracer wants to detach it,
+the usual solution is to send
+.B SIGSTOP
+(using
+.BR tgkill (2),
+to make sure it goes to the correct thread),
+wait for the tracee to stop in signal-delivery-stop for
+.B SIGSTOP
+and then detach it (suppressing
+.B SIGSTOP
+injection).
+A design bug is that this can race with concurrent
+.BR SIGSTOP s.
+Another complication is that the tracee may enter other ptrace-stops
+and needs to be restarted and waited for again, until
+.B SIGSTOP
+is seen.
+Yet another complication is to be sure that
+the tracee is not already ptrace-stopped,
+because no signal delivery happens while it is\[em]not even
+.BR SIGSTOP .
+.\" FIXME Describe how to detach from a group-stopped tracee so that it
+.\" doesn't run, but continues to wait for SIGCONT.
+.PP
+If the tracer dies, all tracees are automatically detached and restarted,
+unless they were in group-stop.
+Handling of restart from group-stop is currently buggy,
+but the "as planned" behavior is to leave the tracee stopped and waiting for
+.BR SIGCONT .
+If the tracee is restarted from signal-delivery-stop,
+the pending signal is injected.
+.SS execve(2) under ptrace
+.\" clone(2) CLONE_THREAD says:
+.\" If any of the threads in a thread group performs an execve(2),
+.\" then all threads other than the thread group leader are terminated,
+.\" and the new program is executed in the thread group leader.
+.\"
+When one thread in a multithreaded process calls
+.BR execve (2),
+the kernel destroys all other threads in the process,
+.\" In Linux 3.1 sources, see fs/exec.c::de_thread()
+and resets the thread ID of the execing thread to the
+thread group ID (process ID).
+(Or, to put things another way, when a multithreaded process does an
+.BR execve (2),
+at completion of the call, it appears as though the
+.BR execve (2)
+occurred in the thread group leader, regardless of which thread did the
+.BR execve (2).)
+This resetting of the thread ID looks very confusing to tracers:
+.IP \[bu] 3
+All other threads stop in
+.B PTRACE_EVENT_EXIT
+stop, if the
+.B PTRACE_O_TRACEEXIT
+option was turned on.
+Then all other threads except the thread group leader report
+death as if they exited via
+.BR _exit (2)
+with exit code 0.
+.IP \[bu]
+The execing tracee changes its thread ID while it is in the
+.BR execve (2).
+(Remember, under ptrace, the "pid" returned from
+.BR waitpid (2),
+or fed into ptrace calls, is the tracee's thread ID.)
+That is, the tracee's thread ID is reset to be the same as its process ID,
+which is the same as the thread group leader's thread ID.
+.IP \[bu]
+Then a
+.B PTRACE_EVENT_EXEC
+stop happens, if the
+.B PTRACE_O_TRACEEXEC
+option was turned on.
+.IP \[bu]
+If the thread group leader has reported its
+.B PTRACE_EVENT_EXIT
+stop by this time,
+it appears to the tracer that
+the dead thread leader "reappears from nowhere".
+(Note: the thread group leader does not report death via
+.I WIFEXITED(status)
+while there is at least one other live thread.
+This eliminates the possibility that the tracer will see
+it dying and then reappearing.)
+If the thread group leader was still alive,
+for the tracer this may look as if the thread group leader
+returns from a different system call than it entered,
+or even "returned from a system call even though
+it was not in any system call".
+If the thread group leader was not traced
+(or was traced by a different tracer), then during
+.BR execve (2)
+it will appear as if it has become a tracee of
+the tracer of the execing tracee.
+.PP
+All of the above effects are the artifacts of
+the thread ID change in the tracee.
+.PP
+The
+.B PTRACE_O_TRACEEXEC
+option is the recommended tool for dealing with this situation.
+First, it enables
+.B PTRACE_EVENT_EXEC
+stop,
+which occurs before
+.BR execve (2)
+returns.
+In this stop, the tracer can use
+.B PTRACE_GETEVENTMSG
+to retrieve the tracee's former thread ID.
+(This feature was introduced in Linux 3.0.)
+Second, the
+.B PTRACE_O_TRACEEXEC
+option disables legacy
+.B SIGTRAP
+generation on
+.BR execve (2).
+.PP
+When the tracer receives
+.B PTRACE_EVENT_EXEC
+stop notification,
+it is guaranteed that, except for this tracee and the thread group leader,
+no other threads from the process are alive.
+.PP
+On receiving the
+.B PTRACE_EVENT_EXEC
+stop notification,
+the tracer should clean up all its internal
+data structures describing the threads of this process,
+and retain only one data structure\[em]one which
+describes the single still running tracee, with
+.PP
+.in +4n
+.EX
+thread ID == thread group ID == process ID.
+.EE
+.in
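+.PP
+How this cleanup looks depends entirely on the tracer's own data
+structures.
+As a minimal sketch, assuming a hypothetical
+.I struct traced_process
+that keeps thread IDs in a fixed-size array:
+.PP

```c
#include <stddef.h>
#include <sys/types.h>

#define MAX_THREADS 64              /* arbitrary limit for the sketch */

struct traced_process {             /* hypothetical tracer bookkeeping */
    pid_t  tgid;                    /* thread group ID (process ID) */
    pid_t  tids[MAX_THREADS];       /* thread IDs currently tracked */
    size_t nthreads;
};

/*
 * Called on a PTRACE_EVENT_EXEC stop: forget every tracked thread
 * and keep a single entry whose thread ID equals the process ID.
 */
static void
collapse_on_exec(struct traced_process *p)
{
    p->tids[0] = p->tgid;
    p->nthreads = 1;
}
```

+.PP
+If per-thread state must survive the exec, the former thread ID
+obtained with
+.B PTRACE_GETEVENTMSG
+can be used to find the old record before it is discarded.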
+.PP
+Example: two threads call
+.BR execve (2)
+at the same time:
+.PP
+.nf
+*** we get syscall-enter-stop in thread 1: ***
+PID1 execve("/bin/foo", "foo" <unfinished ...>
+*** we issue PTRACE_SYSCALL for thread 1 ***
+*** we get syscall-enter-stop in thread 2: ***
+PID2 execve("/bin/bar", "bar" <unfinished ...>
+*** we issue PTRACE_SYSCALL for thread 2 ***
+*** we get PTRACE_EVENT_EXEC for PID0, we issue PTRACE_SYSCALL ***
+*** we get syscall-exit-stop for PID0: ***
+PID0 <... execve resumed> ) = 0
+.fi
+.PP
+If the
+.B PTRACE_O_TRACEEXEC
+option is
+.I not
+in effect for the execing tracee,
+and if the tracee was
+.BR PTRACE_ATTACH ed
+rather than
+.BR PTRACE_SEIZE d,
+the kernel delivers an extra
+.B SIGTRAP
+to the tracee after
+.BR execve (2)
+returns.
+This is an ordinary signal (similar to one which can be
+generated by
+.IR "kill \-TRAP" ),
+not a special kind of ptrace-stop.
+Employing
+.B PTRACE_GETSIGINFO
+for this signal returns
+.I si_code
+set to 0
+.RI ( SI_USER ).
+This signal may be blocked by the signal mask,
+and thus may be delivered (much) later.
+.PP
+Usually, the tracer (for example,
+.BR strace (1))
+would not want to show this extra post-execve
+.B SIGTRAP
+signal to the user, and would suppress its delivery to the tracee (if
+.B SIGTRAP
+is set to
+.BR SIG_DFL ,
+it is a killing signal).
+However, determining
+.I which
+.B SIGTRAP
+to suppress is not easy.
+Setting the
+.B PTRACE_O_TRACEEXEC
+option or using
+.B PTRACE_SEIZE
+and thus suppressing this extra
+.B SIGTRAP
+is the recommended approach.
+.SS Real parent
+The ptrace API (ab)uses the standard UNIX parent/child signaling over
+.BR waitpid (2).
+This used to cause the real parent of the process to stop receiving
+several kinds of
+.BR waitpid (2)
+notifications when the child process was traced by some other process.
+.PP
+Many of these bugs have been fixed, but as of Linux 2.6.38 several still
+exist; see BUGS below.
+.PP
+As of Linux 2.6.38, the following is believed to work correctly:
+.IP \[bu] 3
+exit/death by signal is reported first to the tracer, then,
+when the tracer consumes the
+.BR waitpid (2)
+result, to the real parent (to the real parent only when the
+whole multithreaded process exits).
+If the tracer and the real parent are the same process,
+the report is sent only once.
+.SH RETURN VALUE
+On success, the
+.B PTRACE_PEEK*
+requests return the requested data (but see NOTES),
+the
+.B PTRACE_SECCOMP_GET_FILTER
+request returns the number of instructions in the BPF program,
+the
+.B PTRACE_GET_SYSCALL_INFO
+request returns the number of bytes available to be written by the kernel,
+and other requests return zero.
+.PP
+On error, all requests return \-1, and
+.I errno
+is set to indicate the error.
+Since the value returned by a successful
+.B PTRACE_PEEK*
+request may be \-1, the caller must clear
+.I errno
+before the call, and then check it afterward
+to determine whether or not an error occurred.
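+.PP
+The following is a minimal sketch of that pattern,
+not a definitive implementation.
+The helper name
+.IR peek_word ()
+is hypothetical;
+it reports success or failure separately from the word that was read.
+.PP

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/*
 * Hypothetical helper: read one word from the tracee's memory.
 * Because -1 is a valid word value, errno is cleared before the call
 * and examined afterward.
 * Returns 0 on success and stores the word in *out;
 * returns -1 on failure with errno set by ptrace().
 */
static int
peek_word(pid_t pid, void *addr, long *out)
{
    long word;

    errno = 0;                      /* clear before the call */
    word = ptrace(PTRACE_PEEKDATA, pid, addr, NULL);
    if (word == -1 && errno != 0)   /* -1 with errno set means error */
        return -1;

    *out = word;
    return 0;
}
```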
+.SH ERRORS
+.TP
+.B EBUSY
+(i386 only) There was an error with allocating or freeing a debug register.
+.TP
+.B EFAULT
+There was an attempt to read from or write to an invalid area in
+the tracer's or the tracee's memory,
+probably because the area wasn't mapped or accessible.
+Unfortunately, under Linux, different variations of this fault
+will return
+.B EIO
+or
+.B EFAULT
+more or less arbitrarily.
+.TP
+.B EINVAL
+An attempt was made to set an invalid option.
+.TP
+.B EIO
+.I request
+is invalid, or an attempt was made to read from or
+write to an invalid area in the tracer's or the tracee's memory,
+or there was a word-alignment violation,
+or an invalid signal was specified during a restart request.
+.TP
+.B EPERM
+The specified process cannot be traced.
+This could be because the
+tracer has insufficient privileges (the required capability is
+.BR CAP_SYS_PTRACE );
+unprivileged processes cannot trace processes that they
+cannot send signals to or those running
+set-user-ID/set-group-ID programs, for obvious reasons.
+Alternatively, the process may already be being traced,
+or (before Linux 2.6.26) be
+.BR init (1)
+(PID 1).
+.TP
+.B ESRCH
+The specified process does not exist, or is not currently being traced
+by the caller, or is not stopped
+(for requests that require a stopped tracee).
+.SH STANDARDS
+None.
+.SH HISTORY
+SVr4, 4.3BSD.
+.PP
+Before Linux 2.6.26,
+.\" See commit 00cd5c37afd5f431ac186dd131705048c0a11fdb
+.BR init (1),
+the process with PID 1, could not be traced.
+.SH NOTES
+Although arguments to
+.BR ptrace ()
+are interpreted according to the prototype given,
+glibc currently declares
+.BR ptrace ()
+as a variadic function with only the
+.I request
+argument fixed.
+It is recommended to always supply four arguments,
+even if the requested operation does not use them,
+setting unused/ignored arguments to
+.I 0L
+or
+.IR "(void\ *)\ 0" .
+.PP
+A tracee's parent continues to be the tracer even if that tracer calls
+.BR execve (2).
+.PP
+The layout of the contents of memory and the USER area are
+quite operating-system- and architecture-specific.
+The offset supplied, and the data returned,
+might not entirely match with the definition of
+.IR "struct user" .
+.\" See http://lkml.org/lkml/2008/5/8/375
+.PP
+The size of a "word" is determined by the operating-system variant
+(e.g., for 32-bit Linux it is 32 bits).
+.PP
+This page documents the way the
+.BR ptrace ()
+call works currently in Linux.
+Its behavior differs significantly on other flavors of UNIX.
+In any case, use of
+.BR ptrace ()
+is highly specific to the operating system and architecture.
+.\"
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"
+.SS Ptrace access mode checking
+Various parts of the kernel-user-space API (not just
+.BR ptrace ()
+operations) require so-called "ptrace access mode" checks,
+whose outcome determines whether an operation is permitted
+(or, in a few cases, causes a "read" operation to return sanitized data).
+These checks are performed in cases where one process can
+inspect sensitive information about,
+or in some cases modify the state of, another process.
+The checks are based on factors such as the credentials and capabilities
+of the two processes,
+whether or not the "target" process is dumpable,
+and the results of checks performed by any enabled Linux Security Module
+(LSM)\[em]for example, SELinux, Yama, or Smack\[em]and by the commoncap LSM
+(which is always invoked).
+.PP
+Prior to Linux 2.6.27, all access checks were of a single type.
+Since Linux 2.6.27,
+.\" commit 006ebb40d3d65338bd74abb03b945f8d60e362bd
+two access mode levels are distinguished:
+.TP
+.B PTRACE_MODE_READ
+For "read" operations or other operations that are less dangerous,
+such as:
+.BR get_robust_list (2);
+.BR kcmp (2);
+reading
+.IR /proc/ pid /auxv ,
+.IR /proc/ pid /environ ,
+or
+.IR /proc/ pid /stat ;
+or
+.BR readlink (2)
+of a
+.IR /proc/ pid /ns/*
+file.
+.TP
+.B PTRACE_MODE_ATTACH
+For "write" operations, or other operations that are more dangerous,
+such as: ptrace attaching
+.RB ( PTRACE_ATTACH )
+to another process
+or calling
+.BR process_vm_writev (2).
+.RB ( PTRACE_MODE_ATTACH
+was effectively the default before Linux 2.6.27.)
+.\"
+.\" Regarding the above description of the distinction between
+.\" PTRACE_MODE_READ and PTRACE_MODE_ATTACH, Stephen Smalley notes:
+.\"
+.\" That was the intent when the distinction was introduced, but it doesn't
+.\" appear to have been properly maintained, e.g. there is now a common
+.\" helper lock_trace() that is used for
+.\" /proc/pid/{stack,syscall,personality} but checks PTRACE_MODE_ATTACH, and
+.\" PTRACE_MODE_ATTACH is also used in timerslack_ns_write/show(). Likely
+.\" should review and make them consistent. There was also some debate
+.\" about proper handling of /proc/pid/fd. Arguably that one might belong
+.\" back in the _ATTACH camp.
+.\"
+.PP
+Since Linux 4.5,
+.\" commit caaee6234d05a58c5b4d05e7bf766131b810a657
+the above access mode checks are combined (ORed) with
+one of the following modifiers:
+.TP
+.B PTRACE_MODE_FSCREDS
+Use the caller's filesystem UID and GID (see
+.BR credentials (7))
+or effective capabilities for LSM checks.
+.TP
+.B PTRACE_MODE_REALCREDS
+Use the caller's real UID and GID or permitted capabilities for LSM checks.
+This was effectively the default before Linux 4.5.
+.PP
+Because combining one of the credential modifiers with one of
+the aforementioned access modes is typical,
+some macros are defined in the kernel sources for the combinations:
+.TP
+.B PTRACE_MODE_READ_FSCREDS
+Defined as
+.BR "PTRACE_MODE_READ | PTRACE_MODE_FSCREDS" .
+.TP
+.B PTRACE_MODE_READ_REALCREDS
+Defined as
+.BR "PTRACE_MODE_READ | PTRACE_MODE_REALCREDS" .
+.TP
+.B PTRACE_MODE_ATTACH_FSCREDS
+Defined as
+.BR "PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS" .
+.TP
+.B PTRACE_MODE_ATTACH_REALCREDS
+Defined as
+.BR "PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS" .
+.PP
+One further modifier can be ORed with the access mode:
+.TP
+.BR PTRACE_MODE_NOAUDIT " (since Linux 3.3)"
+.\" commit 69f594a38967f4540ce7a29b3fd214e68a8330bd
+.\" Just for /proc/pid/stat
+Don't audit this access mode check.
+This modifier is employed for ptrace access mode checks
+(such as checks when reading
+.IR /proc/ pid /stat )
+that merely cause the output to be filtered or sanitized,
+rather than causing an error to be returned to the caller.
+In these cases, accessing the file is not a security violation and
+there is no reason to generate a security audit record.
+This modifier suppresses the generation of
+such an audit record for the particular access check.
+.PP
+Note that all of the
+.B PTRACE_MODE_*
+constants described in this subsection are kernel-internal,
+and not visible to user space.
+The constant names are mentioned here in order to label the various kinds of
+ptrace access mode checks that are performed for various system calls
+and accesses to various pseudofiles (e.g., under
+.IR /proc ).
+These names are used in other manual pages to provide a simple
+shorthand for labeling the different kernel checks.
+.PP
+The algorithm employed for ptrace access mode checking determines whether
+the calling process is allowed to perform the corresponding action
+on the target process.
+(In the case of opening
+.IR /proc/ pid
+files, the "calling process" is the one opening the file,
+and the process with the corresponding PID is the "target process".)
+The algorithm is as follows:
+.IP (1) 5
+If the calling thread and the target thread are in the same
+thread group, access is always allowed.
+.IP (2)
+If the access mode specifies
+.BR PTRACE_MODE_FSCREDS ,
+then, for the check in the next step,
+employ the caller's filesystem UID and GID.
+(As noted in
+.BR credentials (7),
+the filesystem UID and GID almost always have the same values
+as the corresponding effective IDs.)
+.IP
+Otherwise, the access mode specifies
+.BR PTRACE_MODE_REALCREDS ,
+so use the caller's real UID and GID for the checks in the next step.
+(Most APIs that check the caller's UID and GID use the effective IDs.
+For historical reasons, the
+.B PTRACE_MODE_REALCREDS
+check uses the real IDs instead.)
+.IP (3)
+Deny access if
+.I neither
+of the following is true:
+.RS
+.IP \[bu] 3
+The real, effective, and saved-set user IDs of the target
+match the caller's user ID,
+.I and
+the real, effective, and saved-set group IDs of the target
+match the caller's group ID.
+.IP \[bu]
+The caller has the
+.B CAP_SYS_PTRACE
+capability in the user namespace of the target.
+.RE
+.IP (4)
+Deny access if the target process "dumpable" attribute has a value other than 1
+.RB ( SUID_DUMP_USER ;
+see the discussion of
+.B PR_SET_DUMPABLE
+in
+.BR prctl (2)),
+and the caller does not have the
+.B CAP_SYS_PTRACE
+capability in the user namespace of the target process.
+.IP (5)
+The kernel LSM
+.IR security_ptrace_access_check ()
+interface is invoked to see if ptrace access is permitted.
+The results depend on the LSM(s).
+The implementation of this interface in the commoncap LSM performs
+the following steps:
+.\" (in cap_ptrace_access_check()):
+.RS
+.IP (5.1) 7
+If the access mode includes
+.BR PTRACE_MODE_FSCREDS ,
+then use the caller's
+.I effective
+capability set
+in the following check;
+otherwise (the access mode specifies
+.BR PTRACE_MODE_REALCREDS ,
+so) use the caller's
+.I permitted
+capability set.
+.IP (5.2)
+Deny access if
+.I neither
+of the following is true:
+.RS
+.IP \[bu] 3
+The caller and the target process are in the same user namespace,
+and the caller's capabilities are a superset of the target process's
+.I permitted
+capabilities.
+.IP \[bu]
+The caller has the
+.B CAP_SYS_PTRACE
+capability in the target process's user namespace.
+.RE
+.IP
+Note that the commoncap LSM does not distinguish between
+.B PTRACE_MODE_READ
+and
+.BR PTRACE_MODE_ATTACH .
+.RE
+.IP (6)
+If access has not been denied by any of the preceding steps,
+then access is allowed.
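+.PP
+The steps above can be sketched as a small user-space model.
+This is an illustration only, not the kernel's code:
+the structures and the function
+.IR model_ptrace_may_access ()
+are hypothetical,
+the caller's UID and GID are assumed to have been chosen already
+according to step (2),
+and the LSM checks of step (5) are reduced to a single flag
+supplied by the caller.
+.PP

```c
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical, simplified view of the calling process. */
struct caller_view {
    pid_t tgid;            /* thread group ID */
    uid_t uid;             /* filesystem or real UID, per step (2) */
    gid_t gid;             /* filesystem or real GID, per step (2) */
    bool  cap_sys_ptrace;  /* CAP_SYS_PTRACE in target's user namespace */
};

/* Hypothetical, simplified view of the target process. */
struct target_view {
    pid_t tgid;
    uid_t ruid, euid, suid;
    gid_t rgid, egid, sgid;
    int   dumpable;        /* 1 corresponds to SUID_DUMP_USER */
};

static bool
model_ptrace_may_access(const struct caller_view *c,
                        const struct target_view *t,
                        bool lsm_allows)
{
    /* Step (1): same thread group, always allowed. */
    if (c->tgid == t->tgid)
        return true;

    /* Step (3): all target IDs must match, or caller needs the capability. */
    bool ids_match =
        t->ruid == c->uid && t->euid == c->uid && t->suid == c->uid &&
        t->rgid == c->gid && t->egid == c->gid && t->sgid == c->gid;
    if (!ids_match && !c->cap_sys_ptrace)
        return false;

    /* Step (4): a non-dumpable target requires the capability. */
    if (t->dumpable != 1 && !c->cap_sys_ptrace)
        return false;

    /* Step (5): LSM verdict, modeled here as one flag (an assumption). */
    if (!lsm_allows)
        return false;

    /* Step (6): nothing denied access. */
    return true;
}
```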
+.\"
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"
+.SS /proc/sys/kernel/yama/ptrace_scope
+On systems with the Yama Linux Security Module (LSM) installed
+(i.e., the kernel was configured with
+.BR CONFIG_SECURITY_YAMA ),
+the
+.I /proc/sys/kernel/yama/ptrace_scope
+file (available since Linux 3.4)
+.\" commit 2d514487faf188938a4ee4fb3464eeecfbdcf8eb
+can be used to restrict the ability to trace a process with
+.BR ptrace ()
+(and thus also the ability to use tools such as
+.BR strace (1)
+and
+.BR gdb (1)).
+The goal of such restrictions is to prevent attack escalation whereby
+a compromised process can ptrace-attach to other sensitive processes
+(e.g., a GPG agent or an SSH session) owned by the user in order
+to gain additional credentials that may exist in memory
+and thus expand the scope of the attack.
+.PP
+More precisely, the Yama LSM limits two types of operations:
+.IP \[bu] 3
+Any operation that performs a ptrace access mode
+.B PTRACE_MODE_ATTACH
+check\[em]for example,
+.BR ptrace ()
+.BR PTRACE_ATTACH .
+(See the "Ptrace access mode checking" discussion above.)
+.IP \[bu]
+.BR ptrace ()
+.BR PTRACE_TRACEME .
+.PP
+A process that has the
+.B CAP_SYS_PTRACE
+capability can update the
+.I /proc/sys/kernel/yama/ptrace_scope
+file with one of the following values:
+.TP
+0 ("classic ptrace permissions")
+No additional restrictions on operations that perform
+.B PTRACE_MODE_ATTACH
+checks (beyond those imposed by the commoncap and other LSMs).
+.IP
+The use of
+.B PTRACE_TRACEME
+is unchanged.
+.TP
+1 ("restricted ptrace") [default value]
+When performing an operation that requires a
+.B PTRACE_MODE_ATTACH
+check, the calling process must either have the
+.B CAP_SYS_PTRACE
+capability in the user namespace of the target process or
+it must have a predefined relationship with the target process.
+By default,
+the predefined relationship is that the target process
+must be a descendant of the caller.
+.IP
+A target process can employ the
+.BR prctl (2)
+.B PR_SET_PTRACER
+operation to declare an additional PID that is allowed to perform
+.B PTRACE_MODE_ATTACH
+operations on the target.
+See the kernel source file
+.I Documentation/admin\-guide/LSM/Yama.rst
+.\" commit 90bb766440f2147486a2acc3e793d7b8348b0c22
+(or
+.I Documentation/security/Yama.txt
+before Linux 4.13)
+for further details.
+.IP
+The use of
+.B PTRACE_TRACEME
+is unchanged.
+.TP
+2 ("admin-only attach")
+Only processes with the
+.B CAP_SYS_PTRACE
+capability in the user namespace of the target process may perform
+.B PTRACE_MODE_ATTACH
+operations or trace children that employ
+.BR PTRACE_TRACEME .
+.TP
+3 ("no attach")
+No process may perform
+.B PTRACE_MODE_ATTACH
+operations or trace children that employ
+.BR PTRACE_TRACEME .
+.IP
+Once this value has been written to the file, it cannot be changed.
+.PP
+With respect to values 1 and 2,
+note that creating a new user namespace effectively removes the
+protection offered by Yama.
+This is because a process in the parent user namespace whose effective
+UID matches the UID of the creator of a child namespace
+has all capabilities (including
+.BR CAP_SYS_PTRACE )
+when performing operations within the child user namespace
+(and further-removed descendants of that namespace).
+Consequently, when a process tries to use user namespaces to sandbox itself,
+it inadvertently weakens the protections offered by the Yama LSM.
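+.PP
+As a small sketch, a program could inspect the current setting as follows.
+The function names
+.IR read_ptrace_scope ()
+and
+.IR ptrace_scope_name ()
+are hypothetical;
+the labels are the ones used in the list above,
+and a return value of \-1 is this example's own convention for
+"file missing or unreadable" (for instance, when Yama is not enabled).
+.PP

```c
#include <stdio.h>

/*
 * Hypothetical helper: return the value stored in the Yama
 * ptrace_scope file, or -1 if the file cannot be opened or parsed.
 */
static int
read_ptrace_scope(void)
{
    FILE *fp = fopen("/proc/sys/kernel/yama/ptrace_scope", "r");
    int value = -1;

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", &value) != 1)
        value = -1;
    fclose(fp);
    return value;
}

/* Hypothetical helper: map a ptrace_scope value to its label. */
static const char *
ptrace_scope_name(int value)
{
    switch (value) {
    case 0:  return "classic ptrace permissions";
    case 1:  return "restricted ptrace";
    case 2:  return "admin-only attach";
    case 3:  return "no attach";
    default: return "unknown or Yama not available";
    }
}
```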
+.\"
+.\"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+.\"
+.SS C library/kernel differences
+At the system call level, the
+.BR PTRACE_PEEKTEXT ,
+.BR PTRACE_PEEKDATA ,
+and
+.B PTRACE_PEEKUSER
+requests have a different API: they store the result
+at the address specified by the
+.I data
+parameter, and the return value is the error flag.
+The glibc wrapper function provides the API given in DESCRIPTION above,
+with the result being returned via the function return value.
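+.PP
+To illustrate the difference, the sketch below bypasses the glibc wrapper
+and invokes the system call through
+.BR syscall (2).
+The helper name
+.IR raw_peekdata ()
+is hypothetical,
+and the example assumes that
+.I SYS_ptrace
+is provided by
+.I <sys/syscall.h>
+on the target architecture.
+.PP

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue PTRACE_PEEKDATA as a raw system call.
 * At this level the word read from the tracee is stored at the address
 * passed as the "data" argument (here, out), and the return value is
 * only an error indication: 0 on success, or -1 with errno set.
 */
static long
raw_peekdata(pid_t pid, void *addr, long *out)
{
    return syscall(SYS_ptrace, (long) PTRACE_PEEKDATA, (long) pid,
                   addr, out);
}
```

+.PP
+With this calling convention there is no ambiguity between
+a word whose value is \-1 and an error return.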
+.SH BUGS
+On hosts with Linux 2.6 kernel headers,
+.B PTRACE_SETOPTIONS
+is declared with a different value than the one for Linux 2.4.
+This leads to applications compiled with Linux 2.6 kernel
+headers failing when run on Linux 2.4.
+This can be worked around by redefining
+.B PTRACE_SETOPTIONS
+to
+.BR PTRACE_OLDSETOPTIONS ,
+if that is defined.
+.PP
+Group-stop notifications are sent to the tracer, but not to the real parent.
+Last confirmed on 2.6.38.6.
+.PP
+If a thread group leader is traced and exits by calling
+.BR _exit (2),
+.\" Note from Denys Vlasenko:
+.\" Here "exits" means any kind of death - _exit, exit_group,
+.\" signal death. Signal death and exit_group cases are trivial,
+.\" though: since signal death and exit_group kill all other threads
+.\" too, "until all other threads exit" thing happens rather soon
+.\" in these cases. Therefore, only _exit presents observably
+.\" puzzling behavior to ptrace users: thread leader _exit's,
+.\" but WIFEXITED isn't reported! We are trying to explain here
+.\" why it is so.
+a
+.B PTRACE_EVENT_EXIT
+stop will happen for it (if requested), but the subsequent
+.B WIFEXITED
+notification will not be delivered until all other threads exit.
+As explained above, if one of the other threads calls
+.BR execve (2),
+the death of the thread group leader will
+.I never
+be reported.
+If the execed thread is not traced by this tracer,
+the tracer will never know that
+.BR execve (2)
+happened.
+One possible workaround is to
+.B PTRACE_DETACH
+the thread group leader instead of restarting it in this case.
+Last confirmed on 2.6.38.6.
+.\" FIXME . need to test/verify this scenario
+.PP
+A
+.B SIGKILL
+signal may still cause a
+.B PTRACE_EVENT_EXIT
+stop before actual signal death.
+This may be changed in the future;
+.B SIGKILL
+is meant to always immediately kill tasks even under ptrace.
+Last confirmed on Linux 3.13.
+.PP
+Some system calls return with
+.B EINTR
+if a signal was sent to a tracee, but delivery was suppressed by the tracer.
+(This is a very typical operation: it is usually
+done by debuggers on every attach, in order to not introduce
+a bogus
+.BR SIGSTOP ).
+As of Linux 3.2.9, the following system calls are affected
+(this list is likely incomplete):
+.BR epoll_wait (2),
+and
+.BR read (2)
+from an
+.BR inotify (7)
+file descriptor.
+The usual symptom of this bug is that when you attach to
+a quiescent process with the command
+.PP
+.in +4n
+.EX
+strace \-p <process\-ID>
+.EE
+.in
+.PP
+then, instead of the usual
+and expected one-line output such as
+.PP
+.in +4n
+.EX
+restart_syscall(<... resuming interrupted call ...>_
+.EE
+.in
+.PP
+or
+.PP
+.in +4n
+.EX
+select(6, [5], NULL, [5], NULL_
+.EE
+.in
+.PP
+('_' denotes the cursor position), you observe more than one line.
+For example:
+.PP
+.in +4n
+.EX
+ clock_gettime(CLOCK_MONOTONIC, {15370, 690928118}) = 0
+ epoll_wait(4,_
+.EE
+.in
+.PP
+What is not visible here is that the process was blocked in
+.BR epoll_wait (2)
+before
+.BR strace (1)
+attached to it.
+Attaching caused
+.BR epoll_wait (2)
+to return to user space with the error
+.BR EINTR .
+In this particular case, the program reacted to
+.B EINTR
+by checking the current time, and then executing
+.BR epoll_wait (2)
+again.
+(Programs which do not expect such "stray"
+.B EINTR
+errors may behave in an unintended way upon an
+.BR strace (1)
+attach.)
+.PP
+Contrary to the normal rules, the glibc wrapper for
+.BR ptrace ()
+can set
+.I errno
+to zero.
+.SH SEE ALSO
+.BR gdb (1),
+.BR ltrace (1),
+.BR strace (1),
+.BR clone (2),
+.BR execve (2),
+.BR fork (2),
+.BR gettid (2),
+.BR prctl (2),
+.BR seccomp (2),
+.BR sigaction (2),
+.BR tgkill (2),
+.BR vfork (2),
+.BR waitpid (2),
+.BR exec (3),
+.BR capabilities (7),
+.BR signal (7)