From 070852d8604cece0c31f28ff3eb8d21d9ba415fb Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sun, 28 Apr 2024 09:24:57 +0200 Subject: Adding upstream version 1.3.3. Signed-off-by: Daniel Baumann --- HOWTO.md | 669 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 669 insertions(+) create mode 100644 HOWTO.md (limited to 'HOWTO.md') diff --git a/HOWTO.md b/HOWTO.md new file mode 100644 index 0000000..c1196ce --- /dev/null +++ b/HOWTO.md @@ -0,0 +1,669 @@ +HOWTO - using the library with perf {#howto_perf} +=================================== + +@brief Using command line perf and OpenCSD to collect and decode trace. + +This HOWTO explains how to use the perf command line tools and the openCSD +library to collect and extract program flow traces generated by the +CoreSight IP blocks on a Linux system. The examples have been generated using +an aarch64 Juno-r0 platform. + + +On Target Trace Acquisition - Perf Record +----------------------------------------- + +Compile the perf tool from the same kernel source code version you are using with: + + make -C tools/perf + +This will yield a `perf` executable that will support CoreSight trace collection. + +*Note:* If traces are to be decompressed **off** target, there is no need to download +and compile the openCSD library (on the target). + +If you are instead planning to use perf to record and decode the trace on the target, +compile the perf tool linking against the openCSD library, in the following way: + + make -C tools/perf VF=1 CORESIGHT=1 + +Further information on the needed build environments and options is detailed later +in the section **Off Target Perf Tools Compilation**. + +Before launching a trace run, a sink that will collect trace data needs to be +identified. 
All CoreSight blocks identified by the framework are registered in +sysfs: + + + linaro@linaro-nano:~$ ls /sys/bus/coresight/devices/ + etm0 etm2 etm4 etm6 funnel0 funnel2 funnel4 stm0 tmc_etr0 + etm1 etm3 etm5 etm7 funnel1 funnel3 replicator0 tmc_etf0 + + +CoreSight blocks are listed in the device tree for a specific system and +discovered at boot time. Since tracers can be linked to more than one sink, +the sink that will receive trace data needs to be identified and given as an +option on the perf command line. Once a sink has been identified, trace collection +can start. An easy and yet interesting example is the `uname` command: + + linaro@linaro-nano:~/kernel$ ./tools/perf/perf record -e cs_etm/@tmc_etr0/ --per-thread uname + +This will generate a `perf.data` file where execution has been traced for both +user and kernel space. To narrow the field to either user or kernel space the +`u` and `k` options can be specified. For example the following will limit +traces to user space: + + + linaro@linaro-nano:~/kernel$ ./tools/perf/perf record -vvv -e cs_etm/@tmc_etr0/u --per-thread uname + Problems setting modules path maps, continuing anyway... 
+ ----------------------------------------------------------- + perf_event_attr: + type 8 + size 112 + { sample_period, sample_freq } 1 + sample_type IP|TID|IDENTIFIER + read_format ID + disabled 1 + exclude_kernel 1 + exclude_hv 1 + enable_on_exec 1 + sample_id_all 1 + ------------------------------------------------------------ + sys_perf_event_open: pid 11375 cpu -1 group_fd -1 flags 0x8 + ------------------------------------------------------------ + perf_event_attr: + type 1 + size 112 + config 0x9 + { sample_period, sample_freq } 1 + sample_type IP|TID|IDENTIFIER + read_format ID + disabled 1 + exclude_kernel 1 + exclude_hv 1 + mmap 1 + comm 1 + enable_on_exec 1 + task 1 + sample_id_all 1 + mmap2 1 + comm_exec 1 + ------------------------------------------------------------ + sys_perf_event_open: pid 11375 cpu -1 group_fd -1 flags 0x8 + mmap size 266240B + AUX area mmap length 131072 + perf event ring buffer mmapped per thread + Synthesizing auxtrace information + Linux + auxtrace idx 0 old 0 head 0x11ea0 diff 0x11ea0 + [ perf record: Woken up 1 times to write data ] + overlapping maps: + 7f99daf000-7f99db0000 0 [vdso] + 7f99d84000-7f99db3000 0 /lib/aarch64-linux-gnu/ld-2.21.so + 7f99d84000-7f99daf000 0 /lib/aarch64-linux-gnu/ld-2.21.so + 7f99db0000-7f99db3000 0 /lib/aarch64-linux-gnu/ld-2.21.so + failed to write feature 8 + failed to write feature 9 + failed to write feature 14 + [ perf record: Captured and wrote 0.072 MB perf.data ] + + linaro@linaro-nano:~/kernel$ ls -l ~/.debug/ perf.data + _-rw------- 1 linaro linaro 77888 Mar 2 20:41 perf.data + + /home/linaro/.debug/: + total 16 + drwxr-xr-x 2 linaro linaro 4096 Mar 2 20:40 [kernel.kallsyms] + drwxr-xr-x 2 linaro linaro 4096 Mar 2 20:40 [vdso] + drwxr-xr-x 3 linaro linaro 4096 Mar 2 20:40 bin + drwxr-xr-x 3 linaro linaro 4096 Mar 2 20:40 lib + +Trace data filtering +-------------------- +The amount of trace generated by CoreSight tracers is staggering, even for +the simplest trace scenario. 
Reducing trace generation to specific areas +of interest is desirable to save trace buffer space and avoid getting lost in +trace data that isn't relevant. Supplementing the 'k' and 'u' options +described above is the notion of address filters. + +On CoreSight two types of address filter have been implemented - address range +and start/stop filters: + +**Address range filters:** +With address range filters, traces are generated if the instruction pointer +falls within the specified range. Any work done by the CPU outside of that +range will not be traced. Address range filters can be specified for both +user and kernel space sessions: + + perf record -e cs_etm/@tmc_etr0/k --filter 'filter 0xffffff8008562d0c/0x48' --per-thread uname + + perf record -e cs_etm/@tmc_etr0/u --filter 'filter 0x72c/0x40@/opt/lib/libcstest.so.1.0' --per-thread ./main + +When dealing with kernel space trace, addresses are typically taken from the +'System.map' file. In user space addresses are relocatable and can be +extracted from an objdump output: + + $ aarch64-linux-gnu-objdump -d libcstest.so.1.0 + ... + ... + 000000000000072c : <------------ Beginning of traces + 72c: d10083ff sub sp, sp, #0x20 + 730: b9000fe0 str w0, [sp,#12] + 734: b9001fff str wzr, [sp,#28] + 738: 14000007 b 754 + 73c: b9400fe0 ldr w0, [sp,#12] + 740: 11000800 add w0, w0, #0x2 + 744: b9000fe0 str w0, [sp,#12] + 748: b9401fe0 ldr w0, [sp,#28] + 74c: 11000400 add w0, w0, #0x1 + 750: b9001fe0 str w0, [sp,#28] + 754: b9401fe0 ldr w0, [sp,#28] + 758: 7100101f cmp w0, #0x4 + 75c: 54ffff0d b.le 73c + 760: b9400fe0 ldr w0, [sp,#12] + 764: 910083ff add sp, sp, #0x20 + 768: d65f03c0 ret + ... + ... + +Following the address, the number of bytes is specified and, if tracing in user +space, the full path to the binary (or library) being traced. + +**Start/Stop filters:** +With start/stop filters, trace generation begins when the instruction pointer is +equal to the start address. 
Similarly, traces stop being generated when the +instruction pointer is equal to the stop address. Anything that happens between +these two events is traced: + + perf record -e cs_etm/@tmc_etr0/k --filter 'start 0xffffff800856bc50,stop 0xffffff800856bcb0' --per-thread uname + + perf record -vvv -e cs_etm/@tmc_etr0/u --filter 'start 0x72c@/opt/lib/libcstest.so.1.0, \ + stop 0x40082c@/home/linaro/main' \ + --per-thread ./main + +**Limitation on address filters:** +The only limitations on address filters are the number of address comparators +found on an implementation and the mutual exclusion between range and +start/stop filters. As such the following example would _not_ work: + + perf record -e cs_etm/@tmc_etr0/k --filter 'start 0xffffff800856bc50,stop 0xffffff800856bcb0, \ // start/stop + filter 0x72c/0x40@/opt/lib/libcstest.so.1.0' \ // address range + --per-thread uname + +Additional Trace Options +------------------------ +Additional options can be used during trace collection that add information to the captured trace. + +- Timestamps: These packets are added to the trace streams to allow correlation of different sources where tools support this. +- Cycle Counts: These packets are added to get a count of cycles for blocks of executed instructions. Adding cycle counts will considerably increase the amount of generated trace. +The relationship between cycle counts and executed instructions differs according to the trace protocol. +For example, the ETMv4 protocol will emit counts for groups of instructions according to a minimum count threshold. +Presently this threshold is fixed at 256 cycles for `perf record`. + +Command line options in `perf record` to use these features are part of the options for the `cs_etm` event: + + perf record -e cs_etm/timestamp,cycacc,@tmc_etr0/ --per-thread uname + +In the current version, `perf record` and `perf script` do not use this additional information. 
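The `cs_etm` event strings used throughout this HOWTO share one shape: optional comma-separated options, then `@` followed by the sink name. As a minimal sketch, a small shell function can assemble that string so the sink and options are easy to vary between runs. The helper name `cs_etm_event` is hypothetical; it is not part of perf or openCSD, and it only builds and prints strings rather than invoking perf.

```shell
#!/bin/sh
# Hypothetical helper (not part of perf or openCSD): build a cs_etm event
# string of the form cs_etm/[opt,...]@<sink>/ as used in this HOWTO.
# Usage: cs_etm_event <sink> [option ...]
cs_etm_event() {
    if [ $# -lt 1 ]; then
        echo "usage: cs_etm_event <sink> [option ...]" >&2
        return 1
    fi
    sink=$1
    shift
    opts=""
    # Each option is followed by a comma, matching the
    # cs_etm/timestamp,cycacc,@tmc_etr0/ form shown above.
    for opt in "$@"; do
        opts="${opts}${opt},"
    done
    printf 'cs_etm/%s@%s/\n' "$opts" "$sink"
}

# Print (rather than run) the resulting perf command lines.
echo "perf record -e $(cs_etm_event tmc_etr0) --per-thread uname"
echo "perf record -e $(cs_etm_event tmc_etr0 timestamp cycacc) --per-thread uname"
```

The `u` or `k` modifier can be appended by the caller, for example `"$(cs_etm_event tmc_etr0)u"`, which keeps the helper limited to the part of the string that names the sink and options.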
+ +The cs_etm perf event +--------------------- + +System information for this perf pmu event can be found at: + + /sys/devices/cs_etm + +This contains internal format of the parameters described above: + + root@linaro-developer:~# ls /sys/devices/cs_etm/format + contextid cycacc retstack sinkid timestamp + +and names of registered sinks: + + root@linaro-developer:~# ls /sys/devices/cs_etm/sinks + tmc_etf0 tmc_etr0 tpiu0 + +Note: The `sinkid` parameter is there to document the usage of a 32-bit internal parameter to +pass the sink name used in the cs_etm/@sink/ command to the kernel drivers. It can be used +directly as cs_etm/sinkid=/ but this is not recommended as the values used are +considered opaque and subject to changes. + +On Target Trace Collection +-------------------------- +The entire program flow will have been recorded in the `perf.data` file. +Information about libraries and executable is stored under `$HOME/.debug`: + + linaro@linaro-nano:~/kernel$ tree ~/.debug + .debug + ├── [kernel.kallsyms] + │   └── 0542921808098d591a7acba5a1163e8991897669 + │   └── kallsyms + ├── [vdso] + │   └── 551fbbe29579eb63be3178a04c16830b8d449769 + │   └── vdso + ├── bin + │   └── uname + │   └── ed95e81f97c4471fb2ccc21e356b780eb0c92676 + │   └── elf + └── lib + └── aarch64-linux-gnu + ├── ld-2.21.so + │   └── 94912dc5a1dc8c7ef2c4e4649d4b1639b6ebc8b7 + │   └── elf + └── libc-2.21.so + └── 169a143e9c40cfd9d09695333e45fd67743cd2d6 + └── elf + + 13 directories, 5 files + linaro@linaro-nano:~/kernel$ + + +All this information needs to be collected in order to successfully decode +traces off target: + + linaro@linaro-nano:~/kernel$ tar czf uname.trace.tgz perf.data ~/.debug + + +Note that file `vmlinux` should also be added to the bundle if kernel traces +have also been collected. + + +Off Target OpenCSD Compilation +------------------------------ +The openCSD library is not part of the perf tools. 
It is available on +[github][1] and needs to be compiled before the perf tools. Checkout the +required branch/tag version into a local directory. + + linaro@t430:~/linaro/coresight$ git clone https://github.com/Linaro/OpenCSD.git my-opencsd + Cloning into 'OpenCSD'... + remote: Counting objects: 2063, done. + remote: Total 2063 (delta 0), reused 0 (delta 0), pack-reused 2063 + Receiving objects: 100% (2063/2063), 2.51 MiB | 1.24 MiB/s, done. + Resolving deltas: 100% (1399/1399), done. + Checking connectivity... done. + linaro@t430:~/linaro/coresight$ ls my-opencsd + decoder LICENSE README.md HOWTO.md TODO + +Once the source code has been acquired compilation of the openCSD library can +take place. For Linux two options are available, LINUX and LINUX64, based on +the host's (which has nothing to do with the target) architecture: + + linaro@t430:~/linaro/coresight/$ cd my-opencsd/decoder/build/linux/ + linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ ls + makefile rctdl_c_api_lib ref_trace_decode_lib + + linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ make LINUX64=1 DEBUG=1 + ... + ... + + linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ ls ../../lib/linux64/dbg/ + libopencsd.a libopencsd_c_api.a libopencsd_c_api.so libopencsd.so + +From there the header file and libraries need to be installed on the system, +something that requires root privileges. 
The default installation path is +/usr/include/opencsd for the header files and /usr/lib/ for the libraries: + + linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ sudo make install + linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ ls -l /usr/include/opencsd + total 60 + drwxr-xr-x 2 root root 4096 Dec 12 10:19 c_api + drwxr-xr-x 2 root root 4096 Dec 12 10:19 etmv3 + drwxr-xr-x 2 root root 4096 Dec 12 10:19 etmv4 + -rw-r--r-- 1 root root 28049 Dec 12 10:19 ocsd_if_types.h + drwxr-xr-x 2 root root 4096 Dec 12 10:19 ptm + drwxr-xr-x 2 root root 4096 Dec 12 10:19 stm + -rw-r--r-- 1 root root 7264 Dec 12 10:19 trc_gen_elem_types.h + -rw-r--r-- 1 root root 3972 Dec 12 10:19 trc_pkt_types.h + + linaro@t430:~/linaro/coresight/my-opencsd/decoder/build/linux$ ls -l /usr/lib/libopencsd* + -rw-r--r-- 1 root root 598720 Dec 12 10:19 /usr/lib/libopencsd_c_api.so + -rw-r--r-- 1 root root 4692200 Dec 12 10:19 /usr/lib/libopencsd.so + +A "clean_install" target is also available so that openCSD installed files can +be removed from a system. Going forward the goal is to have the openCSD library +packaged as a Debian or RPM archive so that it can be installed from a +distribution without having to be compiled. + + +Off Target Perf Tools Compilation +--------------------------------- + +As mentioned above the openCSD library is not part of the perf tools' code base +and needs to be installed on a system prior to compilation. Information about +the status of the openCSD library on a system is given at compile time by the +perf tools build script: + + linaro@t430:~/linaro/linux-kernel$ make CORESIGHT=1 VF=1 -C tools/perf + Auto-detecting system features: + ... dwarf: [ on ] + ... dwarf_getlocations: [ on ] + ... glibc: [ on ] + ... gtk2: [ on ] + ... libaudit: [ on ] + ... libbfd: [ OFF ] + ... libelf: [ on ] + ... libnuma: [ OFF ] + ... numa_num_possible_cpus: [ OFF ] + ... libperl: [ on ] + ... libpython: [ on ] + ... libslang: [ on ] + ... 
libcrypto: [ on ] + ... libunwind: [ OFF ] + ... libdw-dwarf-unwind: [ on ] + ... zlib: [ on ] + ... lzma: [ OFF ] + ... get_cpuid: [ on ] + ... bpf: [ on ] + ... libopencsd: [ on ] <------- + + +At the end of the compilation a new perf binary is available in `tools/perf/`: + + linaro@t430:~/linaro/linux-kernel$ ldd tools/perf/perf + linux-vdso.so.1 => (0x00007fff135db000) + libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f15f9176000) + librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f15f8f6e000) + libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f15f8c64000) + libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f15f8a60000) + libopencsd_c_api.so => /usr/lib/libopencsd_c_api.so (0x00007f15f884e000) <------- + libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 (0x00007f15f8635000) + libdw.so.1 => /usr/lib/x86_64-linux-gnu/libdw.so.1 (0x00007f15f83ec000) + libaudit.so.1 => /lib/x86_64-linux-gnu/libaudit.so.1 (0x00007f15f81c5000) + libslang.so.2 => /lib/x86_64-linux-gnu/libslang.so.2 (0x00007f15f7e38000) + libperl.so.5.22 => /usr/lib/x86_64-linux-gnu/libperl.so.5.22 (0x00007f15f7a5d000) + libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f15f7693000) + libpython2.7.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0 (0x00007f15f7104000) + libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f15f6eea000) + /lib64/ld-linux-x86-64.so.2 (0x0000559b88038000) + libopencsd.so => /usr/lib/libopencsd.so (0x00007f15f6c62000) <------- + libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f15f68df000) + libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f15f66c9000) + liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f15f64a6000) + libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f15f6296000) + libcrypt.so.1 => /lib/x86_64-linux-gnu/libcrypt.so.1 (0x00007f15f605e000) + libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f15f5e5a000) + + +Additional debug output from the decoder can be 
compiled in by setting the +`CSTRACE_RAW` environment variable. Setting this to `packed` gets trace frame +output as follows: + + Frame Data; Index 576; RAW_PACKED; d6 d6 d6 d6 d6 d6 d6 d6 fc fb d6 d6 d6 d6 e0 7f + Frame Data; Index 576; ID_DATA[0x14]; d7 d6 d7 d6 d7 d6 d7 d6 fd fb d7 d6 d7 d6 e0 + +Setting it to any other value will remove the RAW_PACKED lines. + +Working with an alternate version of the openCSD library +-------------------------------------------------------- +When compiling the perf tools it is possible to reference a version of +the openCSD library other than the one installed on the system. This is useful when +working with multiple development trees or when system +libraries should be kept intact. Two environment variables are available to tell the perf tools +build script where to get the header files and libraries, namely CSINCLUDES and +CSLIBS: + + linaro@t430:~/linaro/linux-kernel$ export CSINCLUDES=~/linaro/coresight/my-opencsd/decoder/include/ + linaro@t430:~/linaro/linux-kernel$ export CSLIBS=~/linaro/coresight/my-opencsd/decoder/lib/builddir/ + linaro@t430:~/linaro/linux-kernel$ make CORESIGHT=1 VF=1 -C tools/perf + +This will have the effect of compiling and linking against the provided library. +Since the system's openCSD library is in the loader's search path, the +LD_LIBRARY_PATH environment variable needs to be set. + + linaro@t430:~/linaro/linux-kernel$ export LD_LIBRARY_PATH=$CSLIBS + + +Trace Decoding with Perf Report +------------------------------- +Before working with custom traces it is suggested to use a trace bundle that +is known to be working properly. A sample bundle has been made available +here [2]. Trace bundles can be extracted anywhere and have no dependencies on +where the perf tools and openCSD library have been compiled. 
+ + linaro@t430:~/linaro/coresight$ mkdir sept20 + linaro@t430:~/linaro/coresight$ cd sept20 + linaro@t430:~/linaro/coresight/sept20$ wget http://people.linaro.org/~mathieu.poirier/openCSD/uname.v4.user.sept20.tgz + linaro@t430:~/linaro/coresight/sept20$ md5sum uname.v4.user.sept20.tgz + f53f11d687ce72bdbe9de2e67e960ec6 uname.v4.user.sept20.tgz + linaro@t430:~/linaro/coresight/sept20$ tar xf uname.v4.user.sept20.tgz + linaro@t430:~/linaro/coresight/sept20$ ls -la + total 1312 + drwxrwxr-x 3 linaro linaro 4096 Mar 3 10:26 . + drwxrwxr-x 5 linaro linaro 4096 Mar 3 10:13 .. + drwxr-xr-x 7 linaro linaro 4096 Feb 24 12:21 .debug + -rw------- 1 linaro linaro 78016 Feb 24 12:21 perf.data + -rw-rw-r-- 1 linaro linaro 1245881 Feb 24 12:25 uname.v4.user.sept20.tgz + +Perf is expecting files related to the trace capture (`perf.data`) to be located in the `buildid` directory. +By default this is under `~/.debug`. Alternatively the default `buildid` directory can be changed +using the command: + + perf config --system buildid.dir=/my/own/buildid/dir + +This example will remove the current `~/.debug` directory to be sure everything is clean. + + linaro@t430:~/linaro/coresight/sept20$ rm -rf ~/.debug + linaro@t430:~/linaro/coresight/sept20$ cp -dpR .debug ~/ + linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-master/tools/perf/perf report --stdio + + # To display the perf.data header info, please use --header/--header-only options. + # + # + # Total Lost Samples: 0 + # + # Samples: 0 of event 'cs_etm//u' + # Event count (approx.): 0 + # + # Children Self Command Shared Object Symbol + # ........ ........ ....... ............. ...... + # + + + # Samples: 0 of event 'dummy:u' + # Event count (approx.): 0 + # + # Children Self Command Shared Object Symbol + # ........ ........ ....... ............. ...... + # + + + # Samples: 115K of event 'instructions:u' + # Event count (approx.): 522009 + # + # Children Self Command Shared Object Symbol + # ........ ........ ....... 
................ ...................... + # + 4.13% 4.13% uname libc-2.21.so [.] 0x0000000000078758 + 3.81% 3.81% uname libc-2.21.so [.] 0x0000000000078e50 + 2.06% 2.06% uname libc-2.21.so [.] 0x00000000000fcaf4 + 1.65% 1.65% uname libc-2.21.so [.] 0x00000000000fcae4 + 1.59% 1.59% uname ld-2.21.so [.] 0x000000000000a7f4 + 1.50% 1.50% uname libc-2.21.so [.] 0x0000000000078e40 + 1.43% 1.43% uname libc-2.21.so [.] 0x00000000000fcac4 + 1.31% 1.31% uname libc-2.21.so [.] 0x000000000002f0c0 + 1.26% 1.26% uname ld-2.21.so [.] 0x0000000000016888 + 1.24% 1.24% uname libc-2.21.so [.] 0x0000000000078e7c + 1.24% 1.24% uname libc-2.21.so [.] 0x00000000000fcab8 + ... + +A dump of the received trace packets can be obtained using the command: + + mjl@ubuntu-vbox:./perf-opencsd-master/coresight/tools/perf/perf report --stdio --dump + +This produces a large amount of data, with trace looking like: + + 0x618 [0x30]: PERF_RECORD_AUXTRACE size: 0x11ef0 offset: 0 ref: 0x4d881c1f13216016 idx: 0 tid: 15244 cpu: -1 + + . ... CoreSight ETM Trace data: size 73456 bytes + + 0: I_ASYNC : Alignment Synchronisation. + 12: I_TRACE_INFO : Trace Info. + 17: I_TRACE_ON : Trace On. + 18: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000007F89F24D80; Ctxt: AArch64,EL0, NS; + 28: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE + 29: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE + 30: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE + 32: I_ATOM_F6 : Atom format 6.; EEEEN + 33: I_ATOM_F1 : Atom format 1.; E + 34: I_EXCEPT : Exception.; Data Fault; Ret Addr Follows; + 36: I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000007F89F2832C; + 45: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0xFFFFFFC000083400; Ctxt: AArch64,EL1, NS; + 56: I_TRACE_ON : Trace On. 
+ 57: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000007F89F2832C; Ctxt: AArch64,EL0, NS; + 68: I_ATOM_F3 : Atom format 3.; NEE + 69: I_ATOM_F3 : Atom format 3.; NEN + 70: I_ATOM_F3 : Atom format 3.; NNE + 71: I_ATOM_F5 : Atom format 5.; ENENE + 72: I_ATOM_F5 : Atom format 5.; NENEN + 73: I_ATOM_F5 : Atom format 5.; ENENE + 74: I_ATOM_F5 : Atom format 5.; NENEN + 75: I_ATOM_F5 : Atom format 5.; ENENE + 76: I_ATOM_F3 : Atom format 3.; NNE + 77: I_ATOM_F3 : Atom format 3.; NNE + 78: I_ATOM_F3 : Atom format 3.; NNE + 80: I_ATOM_F3 : Atom format 3.; NNE + 81: I_ATOM_F3 : Atom format 3.; ENN + 82: I_EXCEPT : Exception.; Data Fault; Ret Addr Follows; + 84: I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000007F89F283F0; + 93: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0xFFFFFFC000083400; Ctxt: AArch64,EL1, NS; + 104: I_TRACE_ON : Trace On. + 105: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000007F89F283F0; Ctxt: AArch64,EL0, NS; + 116: I_ATOM_F5 : Atom format 5.; NNNNN + 117: I_ATOM_F5 : Atom format 5.; NNNNN + + +Trace Decoding with Perf Script +------------------------------- +Working with perf scripts needs more command line options but yields +interesting results. + + linaro@t430:~/linaro/coresight/sept20$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-master/tools/perf/ + linaro@t430:~/linaro/coresight/sept20$ export SCRIPT_PATH=$EXEC_PATH/scripts/python/ + linaro@t430:~/linaro/coresight/sept20$ export XTOOL_PATH=/your/aarch64/toolchain/path/bin/ + linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-master/tools/perf/perf --exec-path=${EXEC_PATH} script --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump + + 7f89f24d80: 910003e0 mov x0, sp + 7f89f24d84: 94000d53 bl 7f89f282d0 + 7f89f282d0: d11203ff sub sp, sp, #0x480 + 7f89f282d4: a9ba7bfd stp x29, x30, [sp,#-96]! 
+ 7f89f282d8: 910003fd mov x29, sp + 7f89f282dc: a90363f7 stp x23, x24, [sp,#48] + 7f89f282e0: 9101e3b7 add x23, x29, #0x78 + 7f89f282e4: a90573fb stp x27, x28, [sp,#80] + 7f89f282e8: a90153f3 stp x19, x20, [sp,#16] + 7f89f282ec: aa0003fb mov x27, x0 + 7f89f282f0: 910a82e1 add x1, x23, #0x2a0 + 7f89f282f4: a9025bf5 stp x21, x22, [sp,#32] + 7f89f282f8: a9046bf9 stp x25, x26, [sp,#64] + 7f89f282fc: 910102e0 add x0, x23, #0x40 + 7f89f28300: f800841f str xzr, [x0],#8 + 7f89f28304: eb01001f cmp x0, x1 + 7f89f28308: 54ffffc1 b.ne 7f89f28300 + 7f89f28300: f800841f str xzr, [x0],#8 + 7f89f28304: eb01001f cmp x0, x1 + 7f89f28308: 54ffffc1 b.ne 7f89f28300 + 7f89f28300: f800841f str xzr, [x0],#8 + 7f89f28304: eb01001f cmp x0, x1 + 7f89f28308: 54ffffc1 b.ne 7f89f28300 + +Kernel Trace Decoding +--------------------- + +When dealing with kernel space traces the vmlinux file has to be communicated +explicitly to perf using the "--vmlinux" command line option: + + linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-master/tools/perf/perf report --stdio --vmlinux=./vmlinux + ... + ... + linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-master/tools/perf/perf script --vmlinux=./vmlinux + +When using scripts things get a little more convoluted. Using the same example +as above but for kernel traces, the command line becomes: + + linaro@t430:~/linaro/coresight/sept20$ export EXEC_PATH=/home/linaro/coresight/perf-opencsd-master/tools/perf/ + linaro@t430:~/linaro/coresight/sept20$ export SCRIPT_PATH=$EXEC_PATH/scripts/python/ + linaro@t430:~/linaro/coresight/sept20$ export XTOOL_PATH=/your/aarch64/toolchain/path/bin/ + linaro@t430:~/linaro/coresight/sept20$ ../perf-opencsd-master/tools/perf/perf --exec-path=${EXEC_PATH} script \ + --vmlinux=./vmlinux \ + --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- \ + -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump \ + -k ./vmlinux + ... + ... 
+ +The option "--vmlinux=./vmlinux" is interpreted by the "perf script" command +the same way it is for "perf report". The option "-k ./vmlinux" is dependent +on the script being executed and is unrelated to "--vmlinux", though it +is highly advised to keep them synchronized. + + +Perf Test Environment Scripts +----------------------------- + +The decoder library comes with a number of `bash` scripts that ease the setting up of the +offline build and test environment for perf, and executing tests. + +These scripts can be found in + + decoder/tests/perf-test-scripts + +There are three scripts provided: + +- `perf-setup-env.bash` : this sets up all the environment variables mentioned above. +- `perf-test-report.bash` : this runs `perf report` - using the environment set up by `perf-setup-env.bash` +- `perf-test-script.bash` : this runs `perf script` - using the environment set up by `perf-setup-env.bash` + +Use as follows: + +1. Prior to building perf, edit `perf-setup-env.bash` to conform to your environment. There are four lines at the top of the file that will require editing. + +2. Execute the script using the command: + + source perf-setup-env.bash + + This will set up a perf execution environment for using the perf report and script commands. + + Alternatively use the command: + + source perf-setup-env.bash buildenv + + This will add in the build environment variables mentioned in the sections on building above alongside the + environment used by the `perf-test...` scripts to run the tests. + +3. Build perf as described above. +4. Follow the instructions for downloading the test capture, or create a capture from your target. +5. Copy the `perf-test...` scripts into the capture data directory -> the one that contains `perf.data`. + +6. The scripts can now be run. No options are required for the default operation, but any command line options will be added to the perf report / perf script command line. + +e.g. 
+ + ./perf-test-report.bash --dump + +will add the --dump option to the end of the command line and run + + ${PERF_EXEC_PATH}/perf report --stdio --dump + + +Generating coverage files for Feedback Directed Optimization: AutoFDO +--------------------------------------------------------------------- + +See autofdo.md (@ref AutoFDO) for details and scripts. + + +The Linaro CoreSight Team +------------------------- +- Mike Leach +- Mathieu Poirier + + +One Last Thing +-------------- +We welcome help on this project. If you would like to add features or help +improve the way things work, we want to hear from you. + +Best regards, +*The Linaro CoreSight Team* + +-------------------------------------- +[1]: https://github.com/Linaro/OpenCSD + +[2]: http://people.linaro.org/~mathieu.poirier/openCSD/uname.v4.user.sept20.tgz -- cgit v1.2.3