summaryrefslogtreecommitdiffstats
path: root/tools/perf/Documentation/perf-mem.txt
blob: 19862572e3f2b64dcfc40f0fbf175baadfc284b4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
perf-mem(1)
===========

NAME
----
perf-mem - Profile memory accesses

SYNOPSIS
--------
[verse]
'perf mem' [<options>] (record [<command>] | report)

DESCRIPTION
-----------
"perf mem record" runs a command and gathers memory operation data
from it, into perf.data. Perf record options are accepted and are passed through.

"perf mem report" displays the result. It invokes perf report with the
right set of options to display a memory access profile. By default, loads
and stores are sampled. Use the -t option to limit to loads or stores.

Note that on Intel systems the memory latency reported is the use-latency,
not the pure load (or store latency). Use latency includes any pipeline
queueing delays in addition to the memory subsystem latency.

On Arm64 this uses SPE to sample load and store operations, therefore hardware
and kernel support is required. See linkperf:perf-arm-spe[1] for a setup guide.
Due to the statistical nature of SPE sampling, not every memory operation will
be sampled.

OPTIONS
-------
<command>...::
	Any command you can specify in a shell.

-i::
--input=<file>::
	Input file name.

-f::
--force::
	Don't do ownership validation

-t::
--type=<type>::
	Select the memory operation type: load or store (default: load,store)

-D::
--dump-raw-samples::
	Dump the raw decoded samples on the screen in a format that is easy to parse with
	one sample per line.

-x::
--field-separator=<separator>::
	Specify the field separator used when dump raw samples (-D option). By default,
	The separator is the space character.

-C::
--cpu=<cpu>::
	Monitor only on the list of CPUs provided. Multiple CPUs can be provided as a
        comma-separated list with no space: 0,1. Ranges of CPUs are specified with -: 0-2. Default
        is to monitor all CPUS.
-U::
--hide-unresolved::
	Only display entries resolved to a symbol.

-p::
--phys-data::
	Record/Report sample physical addresses

--data-page-size::
	Record/Report sample data address page size

RECORD OPTIONS
--------------
-e::
--event <event>::
	Event selector. Use 'perf mem record -e list' to list available events.

-K::
--all-kernel::
	Configure all used events to run in kernel space.

-U::
--all-user::
	Configure all used events to run in user space.

-v::
--verbose::
	Be more verbose (show counter open errors, etc)

--ldlat <n>::
	Specify desired latency for loads event. Supported on Intel and Arm64
	processors only. Ignored on other archs.

In addition, for report all perf report options are valid, and for record
all perf record options.

SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-arm-spe[1]