summaryrefslogtreecommitdiffstats
path: root/tools/perf/Documentation/perf-trace.txt
diff options
context:
space:
mode:
Diffstat (limited to 'tools/perf/Documentation/perf-trace.txt')
-rw-r--r--tools/perf/Documentation/perf-trace.txt243
1 files changed, 243 insertions, 0 deletions
diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
new file mode 100644
index 000000000..115db9e06
--- /dev/null
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -0,0 +1,243 @@
+perf-trace(1)
+=============
+
+NAME
+----
+perf-trace - strace inspired tool
+
+SYNOPSIS
+--------
+[verse]
+'perf trace'
+'perf trace record'
+
+DESCRIPTION
+-----------
+This command will show the events associated with the target, initially
+syscalls, but other system events like pagefaults, task lifetime events,
+scheduling events, etc.
+
+This is a live mode tool in addition to working with perf.data files like
+the other perf tools. Files can be generated using the 'perf record' command
+but the session needs to include the raw_syscalls events (-e 'raw_syscalls:*').
+Alternatively, 'perf trace record' can be used as a shortcut to
+automatically include the raw_syscalls events when writing events to a file.
+
+The following options apply to perf trace; options to perf trace record are
+found in the perf record man page.
+
+OPTIONS
+-------
+
+-a::
+--all-cpus::
+ System-wide collection from all CPUs.
+
+-e::
+--expr::
+--event::
+ List of syscalls and other perf events (tracepoints, HW cache events,
+ etc) to show. Globbing is supported, e.g.: "epoll_*", "*msg*", etc.
+ See 'perf list' for a complete list of events.
+ Prefixing with ! shows all syscalls but the ones specified. You may
+ need to escape it.
+
+-D msecs::
+--delay msecs::
+After starting the program, wait msecs before measuring. This is useful to
+filter out the startup phase of the program, which is often very different.
+
+-o::
+--output=::
+ Output file name.
+
+-p::
+--pid=::
+ Record events on existing process ID (comma separated list).
+
+-t::
+--tid=::
+ Record events on existing thread ID (comma separated list).
+
+-u::
+--uid=::
+ Record events in threads owned by uid. Name or number.
+
+-G::
+--cgroup::
+ Record events in threads in a cgroup.
+
+ Look for cgroups to set at the /sys/fs/cgroup/perf_event directory, then
+ remove the /sys/fs/cgroup/perf_event/ part and try:
+
+ perf trace -G A -e sched:*switch
+
+ Will set all raw_syscalls:sys_{enter,exit}, pgfault, vfs_getname, etc
+ _and_ sched:sched_switch to the 'A' cgroup, while:
+
+ perf trace -e sched:*switch -G A
+
+ will only set the sched:sched_switch event to the 'A' cgroup, all the
+ other events (raw_syscalls:sys_{enter,exit}, etc are left "without"
+ a cgroup (on the root cgroup, sys wide, etc).
+
+ Multiple cgroups:
+
+ perf trace -G A -e sched:*switch -G B
+
+ the syscall ones go to the 'A' cgroup, the sched:sched_switch goes
+ to the 'B' cgroup.
+
+--filter-pids=::
+ Filter out events for these pids and for 'trace' itself (comma separated list).
+
+-v::
+--verbose=::
+ Verbosity level.
+
+--no-inherit::
+ Child tasks do not inherit counters.
+
+-m::
+--mmap-pages=::
+ Number of mmap data pages (must be a power of two) or size
+ specification with appended unit character - B/K/M/G. The
+ size is rounded up to have nearest pages power of two value.
+
+-C::
+--cpu::
+Collect samples only on the list of CPUs provided. Multiple CPUs can be provided as a
+comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2.
+In per-thread mode with inheritance mode on (default), Events are captured only when
+the thread executes on the designated CPUs. Default is to monitor all CPUs.
+
+--duration::
+ Show only events that had a duration greater than N.M ms.
+
+--sched::
+ Accrue thread runtime and provide a summary at the end of the session.
+
+--failure::
+ Show only syscalls that failed, i.e. that returned < 0.
+
+-i::
+--input::
+ Process events from a given perf data file.
+
+-T::
+--time::
+ Print full timestamp rather time relative to first sample.
+
+--comm::
+ Show process COMM right beside its ID, on by default, disable with --no-comm.
+
+-s::
+--summary::
+ Show only a summary of syscalls by thread with min, max, and average times
+ (in msec) and relative stddev.
+
+-S::
+--with-summary::
+ Show all syscalls followed by a summary by thread with min, max, and
+ average times (in msec) and relative stddev.
+
+--tool_stats::
+ Show tool stats such as number of times fd->pathname was discovered thru
+ hooking the open syscall return + vfs_getname or via reading /proc/pid/fd, etc.
+
+-f::
+--force::
+ Don't complain, do it.
+
+-F=[all|min|maj]::
+--pf=[all|min|maj]::
+ Trace pagefaults. Optionally, you can specify whether you want minor,
+ major or all pagefaults. Default value is maj.
+
+--syscalls::
+ Trace system calls. This options is enabled by default, disable with
+ --no-syscalls.
+
+--call-graph [mode,type,min[,limit],order[,key][,branch]]::
+ Setup and enable call-graph (stack chain/backtrace) recording.
+ See `--call-graph` section in perf-record and perf-report
+ man pages for details. The ones that are most useful in 'perf trace'
+ are 'dwarf' and 'lbr', where available, try: 'perf trace --call-graph dwarf'.
+
+ Using this will, for the root user, bump the value of --mmap-pages to 4
+ times the maximum for non-root users, based on the kernel.perf_event_mlock_kb
+ sysctl. This is done only if the user doesn't specify a --mmap-pages value.
+
+--kernel-syscall-graph::
+ Show the kernel callchains on the syscall exit path.
+
+--max-stack::
+ Set the stack depth limit when parsing the callchain, anything
+ beyond the specified depth will be ignored. Note that at this point
+ this is just about the presentation part, i.e. the kernel is still
+ not limiting, the overhead of callchains needs to be set via the
+ knobs in --call-graph dwarf.
+
+ Implies '--call-graph dwarf' when --call-graph not present on the
+ command line, on systems where DWARF unwinding was built in.
+
+ Default: /proc/sys/kernel/perf_event_max_stack when present for
+ live sessions (without --input/-i), 127 otherwise.
+
+--min-stack::
+ Set the stack depth limit when parsing the callchain, anything
+ below the specified depth will be ignored. Disabled by default.
+
+ Implies '--call-graph dwarf' when --call-graph not present on the
+ command line, on systems where DWARF unwinding was built in.
+
+--print-sample::
+ Print the PERF_RECORD_SAMPLE PERF_SAMPLE_ info for the
+ raw_syscalls:sys_{enter,exit} tracepoints, for debugging.
+
+--proc-map-timeout::
+ When processing pre-existing threads /proc/XXX/mmap, it may take a long time,
+ because the file may be huge. A time out is needed in such cases.
+ This option sets the time out limit. The default value is 500 ms.
+
+PAGEFAULTS
+----------
+
+When tracing pagefaults, the format of the trace is as follows:
+
+<min|maj>fault [<ip.symbol>+<ip.offset>] => <addr.dso@addr.offset> (<map type><addr level>).
+
+- min/maj indicates whether fault event is minor or major;
+- ip.symbol shows symbol for instruction pointer (the code that generated the
+ fault); if no debug symbols available, perf trace will print raw IP;
+- addr.dso shows DSO for the faulted address;
+- map type is either 'd' for non-executable maps or 'x' for executable maps;
+- addr level is either 'k' for kernel dso or '.' for user dso.
+
+For symbols resolution you may need to install debugging symbols.
+
+Please be aware that duration is currently always 0 and doesn't reflect actual
+time it took for fault to be handled!
+
+When --verbose specified, perf trace tries to print all available information
+for both IP and fault address in the form of dso@symbol+offset.
+
+EXAMPLES
+--------
+
+Trace only major pagefaults:
+
+ $ perf trace --no-syscalls -F
+
+Trace syscalls, major and minor pagefaults:
+
+ $ perf trace -F all
+
+ 1416.547 ( 0.000 ms): python/20235 majfault [CRYPTO_push_info_+0x0] => /lib/x86_64-linux-gnu/libcrypto.so.1.0.0@0x61be0 (x.)
+
+ As you can see, there was major pagefault in python process, from
+ CRYPTO_push_info_ routine which faulted somewhere in libcrypto.so.
+
+SEE ALSO
+--------
+linkperf:perf-record[1], linkperf:perf-script[1]