From 2c3c1048746a4622d8c89a29670120dc8fab93c4 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sun, 7 Apr 2024 20:49:45 +0200 Subject: Adding upstream version 6.1.76. Signed-off-by: Daniel Baumann --- tools/perf/Documentation/Build.txt | 73 + tools/perf/Documentation/Makefile | 308 ++++ tools/perf/Documentation/android.txt | 78 + tools/perf/Documentation/arm-coresight.txt | 5 + tools/perf/Documentation/asciidoc.conf | 94 + tools/perf/Documentation/asciidoctor-extensions.rb | 29 + tools/perf/Documentation/build-docdep.perl | 46 + tools/perf/Documentation/build-xed.txt | 19 + .../callchain-overhead-calculation.txt | 108 ++ tools/perf/Documentation/cat-texi.perl | 46 + tools/perf/Documentation/db-export.txt | 41 + tools/perf/Documentation/examples.txt | 225 +++ tools/perf/Documentation/guest-files.txt | 16 + tools/perf/Documentation/guestmount.txt | 11 + tools/perf/Documentation/intel-bts.txt | 86 + tools/perf/Documentation/intel-hybrid.txt | 204 +++ tools/perf/Documentation/intel-pt.txt | 1 + tools/perf/Documentation/itrace.txt | 70 + tools/perf/Documentation/jit-interface.txt | 15 + tools/perf/Documentation/jitdump-specification.txt | 170 ++ tools/perf/Documentation/manpage-1.72.xsl | 14 + tools/perf/Documentation/manpage-base.xsl | 35 + tools/perf/Documentation/manpage-bold-literal.xsl | 17 + tools/perf/Documentation/manpage-normal.xsl | 13 + tools/perf/Documentation/manpage-suppress-sp.xsl | 21 + tools/perf/Documentation/perf-annotate.txt | 157 ++ tools/perf/Documentation/perf-archive.txt | 22 + tools/perf/Documentation/perf-arm-spe.txt | 218 +++ tools/perf/Documentation/perf-bench.txt | 238 +++ tools/perf/Documentation/perf-buildid-cache.txt | 88 + tools/perf/Documentation/perf-buildid-list.txt | 47 + tools/perf/Documentation/perf-c2c.txt | 336 ++++ tools/perf/Documentation/perf-config.txt | 755 ++++++++ tools/perf/Documentation/perf-daemon.txt | 208 +++ tools/perf/Documentation/perf-data.txt | 54 + tools/perf/Documentation/perf-diff.txt | 305 ++++ 
tools/perf/Documentation/perf-dlfilter.txt | 281 +++ tools/perf/Documentation/perf-evlist.txt | 45 + tools/perf/Documentation/perf-ftrace.txt | 148 ++ tools/perf/Documentation/perf-help.txt | 38 + tools/perf/Documentation/perf-inject.txt | 119 ++ tools/perf/Documentation/perf-intel-pt.txt | 1858 ++++++++++++++++++++ tools/perf/Documentation/perf-iostat.txt | 88 + tools/perf/Documentation/perf-kallsyms.txt | 24 + tools/perf/Documentation/perf-kmem.txt | 80 + tools/perf/Documentation/perf-kvm.txt | 153 ++ tools/perf/Documentation/perf-kwork.txt | 180 ++ tools/perf/Documentation/perf-list.txt | 357 ++++ tools/perf/Documentation/perf-lock.txt | 174 ++ tools/perf/Documentation/perf-mem.txt | 96 + tools/perf/Documentation/perf-probe.txt | 313 ++++ tools/perf/Documentation/perf-record.txt | 790 +++++++++ tools/perf/Documentation/perf-report.txt | 583 ++++++ tools/perf/Documentation/perf-sched.txt | 171 ++ tools/perf/Documentation/perf-script-perl.txt | 216 +++ tools/perf/Documentation/perf-script-python.txt | 679 +++++++ tools/perf/Documentation/perf-script.txt | 518 ++++++ tools/perf/Documentation/perf-stat.txt | 596 +++++++ tools/perf/Documentation/perf-test.txt | 36 + tools/perf/Documentation/perf-timechart.txt | 128 ++ tools/perf/Documentation/perf-top.txt | 398 +++++ tools/perf/Documentation/perf-trace.txt | 347 ++++ tools/perf/Documentation/perf-version.txt | 24 + .../Documentation/perf.data-directory-format.txt | 63 + tools/perf/Documentation/perf.data-file-format.txt | 682 +++++++ tools/perf/Documentation/perf.txt | 90 + tools/perf/Documentation/perfconfig.example | 38 + tools/perf/Documentation/security.txt | 237 +++ tools/perf/Documentation/tips.txt | 43 + tools/perf/Documentation/topdown.txt | 344 ++++ 70 files changed, 14110 insertions(+) create mode 100644 tools/perf/Documentation/Build.txt create mode 100644 tools/perf/Documentation/Makefile create mode 100644 tools/perf/Documentation/android.txt create mode 100644 tools/perf/Documentation/arm-coresight.txt 
create mode 100644 tools/perf/Documentation/asciidoc.conf create mode 100644 tools/perf/Documentation/asciidoctor-extensions.rb create mode 100755 tools/perf/Documentation/build-docdep.perl create mode 100644 tools/perf/Documentation/build-xed.txt create mode 100644 tools/perf/Documentation/callchain-overhead-calculation.txt create mode 100755 tools/perf/Documentation/cat-texi.perl create mode 100644 tools/perf/Documentation/db-export.txt create mode 100644 tools/perf/Documentation/examples.txt create mode 100644 tools/perf/Documentation/guest-files.txt create mode 100644 tools/perf/Documentation/guestmount.txt create mode 100644 tools/perf/Documentation/intel-bts.txt create mode 100644 tools/perf/Documentation/intel-hybrid.txt create mode 100644 tools/perf/Documentation/intel-pt.txt create mode 100644 tools/perf/Documentation/itrace.txt create mode 100644 tools/perf/Documentation/jit-interface.txt create mode 100644 tools/perf/Documentation/jitdump-specification.txt create mode 100644 tools/perf/Documentation/manpage-1.72.xsl create mode 100644 tools/perf/Documentation/manpage-base.xsl create mode 100644 tools/perf/Documentation/manpage-bold-literal.xsl create mode 100644 tools/perf/Documentation/manpage-normal.xsl create mode 100644 tools/perf/Documentation/manpage-suppress-sp.xsl create mode 100644 tools/perf/Documentation/perf-annotate.txt create mode 100644 tools/perf/Documentation/perf-archive.txt create mode 100644 tools/perf/Documentation/perf-arm-spe.txt create mode 100644 tools/perf/Documentation/perf-bench.txt create mode 100644 tools/perf/Documentation/perf-buildid-cache.txt create mode 100644 tools/perf/Documentation/perf-buildid-list.txt create mode 100644 tools/perf/Documentation/perf-c2c.txt create mode 100644 tools/perf/Documentation/perf-config.txt create mode 100644 tools/perf/Documentation/perf-daemon.txt create mode 100644 tools/perf/Documentation/perf-data.txt create mode 100644 tools/perf/Documentation/perf-diff.txt create mode 100644 
tools/perf/Documentation/perf-dlfilter.txt create mode 100644 tools/perf/Documentation/perf-evlist.txt create mode 100644 tools/perf/Documentation/perf-ftrace.txt create mode 100644 tools/perf/Documentation/perf-help.txt create mode 100644 tools/perf/Documentation/perf-inject.txt create mode 100644 tools/perf/Documentation/perf-intel-pt.txt create mode 100644 tools/perf/Documentation/perf-iostat.txt create mode 100644 tools/perf/Documentation/perf-kallsyms.txt create mode 100644 tools/perf/Documentation/perf-kmem.txt create mode 100644 tools/perf/Documentation/perf-kvm.txt create mode 100644 tools/perf/Documentation/perf-kwork.txt create mode 100644 tools/perf/Documentation/perf-list.txt create mode 100644 tools/perf/Documentation/perf-lock.txt create mode 100644 tools/perf/Documentation/perf-mem.txt create mode 100644 tools/perf/Documentation/perf-probe.txt create mode 100644 tools/perf/Documentation/perf-record.txt create mode 100644 tools/perf/Documentation/perf-report.txt create mode 100644 tools/perf/Documentation/perf-sched.txt create mode 100644 tools/perf/Documentation/perf-script-perl.txt create mode 100644 tools/perf/Documentation/perf-script-python.txt create mode 100644 tools/perf/Documentation/perf-script.txt create mode 100644 tools/perf/Documentation/perf-stat.txt create mode 100644 tools/perf/Documentation/perf-test.txt create mode 100644 tools/perf/Documentation/perf-timechart.txt create mode 100644 tools/perf/Documentation/perf-top.txt create mode 100644 tools/perf/Documentation/perf-trace.txt create mode 100644 tools/perf/Documentation/perf-version.txt create mode 100644 tools/perf/Documentation/perf.data-directory-format.txt create mode 100644 tools/perf/Documentation/perf.data-file-format.txt create mode 100644 tools/perf/Documentation/perf.txt create mode 100644 tools/perf/Documentation/perfconfig.example create mode 100644 tools/perf/Documentation/security.txt create mode 100644 tools/perf/Documentation/tips.txt create mode 100644 
tools/perf/Documentation/topdown.txt (limited to 'tools/perf/Documentation') diff --git a/tools/perf/Documentation/Build.txt b/tools/perf/Documentation/Build.txt new file mode 100644 index 000000000..3766886c4 --- /dev/null +++ b/tools/perf/Documentation/Build.txt @@ -0,0 +1,73 @@ + +1) perf build +============= +The perf build process consists of several separated building blocks, +which are linked together to form the perf binary: + - libperf library (static) + - perf builtin commands + - traceevent library (static) + - GTK ui library + +Several makefiles govern the perf build: + + - Makefile + top level Makefile working as a wrapper that calls the main + Makefile.perf with a -j option to do parallel builds. + + - Makefile.perf + main makefile that triggers build of all perf objects including + installation and documentation processing. + + - tools/build/Makefile.build + main makefile of the build framework + + - tools/build/Build.include + build framework generic definitions + + - Build makefiles + makefiles that defines build objects + +Please refer to tools/build/Documentation/Build.txt for more +information about build framework. + + +2) perf build +============= +The Makefile.perf triggers the build framework for build objects: + perf, libperf, gtk + +resulting in following objects: + $ ls *-in.o + gtk-in.o libperf-in.o perf-in.o + +Those objects are then used in final linking: + libperf-gtk.so <- gtk-in.o libperf-in.o + perf <- perf-in.o libperf-in.o + + +NOTE this description is omitting other libraries involved, only + focusing on build framework outcomes + +3) Build with ASan or UBSan +========================== + $ cd tools/perf + $ make DESTDIR=/usr + $ make DESTDIR=/usr install + +AddressSanitizer (or ASan) is a GCC feature that detects memory corruption bugs +such as buffer overflows and memory leaks. 
+ + $ cd tools/perf + $ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=address' + $ ASAN_OPTIONS=log_path=asan.log ./perf record -a + +ASan outputs all detected issues into a log file named 'asan.log.<pid>'. + +UndefinedBehaviorSanitizer (or UBSan) is a fast undefined behavior detector +supported by GCC. UBSan detects undefined behaviors of programs at runtime. + + $ cd tools/perf + $ make DEBUG=1 EXTRA_CFLAGS='-fno-omit-frame-pointer -fsanitize=undefined' + $ UBSAN_OPTIONS=print_stacktrace=1 ./perf record -a + +If UBSan detects any problem at runtime, it outputs a “runtime error:” message. diff --git a/tools/perf/Documentation/Makefile b/tools/perf/Documentation/Makefile new file mode 100644 index 000000000..6e7b88917 --- /dev/null +++ b/tools/perf/Documentation/Makefile @@ -0,0 +1,308 @@ +# SPDX-License-Identifier: GPL-2.0-only +include ../../scripts/Makefile.include +include ../../scripts/utilities.mak + +ARTICLES = +# with their own formatting rules. +SP_ARTICLES = + +MAN1_TXT= \ + $(filter-out $(addsuffix .txt, $(ARTICLES) $(SP_ARTICLES)), \ + $(wildcard perf-*.txt)) \ + perf.txt +MAN5_TXT= +MAN7_TXT= + +MAN_TXT = $(MAN1_TXT) $(MAN5_TXT) $(MAN7_TXT) +_MAN_XML=$(patsubst %.txt,%.xml,$(MAN_TXT)) +_MAN_HTML=$(patsubst %.txt,%.html,$(MAN_TXT)) + +MAN_XML=$(addprefix $(OUTPUT),$(_MAN_XML)) +MAN_HTML=$(addprefix $(OUTPUT),$(_MAN_HTML)) + +_DOC_HTML = $(_MAN_HTML) +_DOC_HTML+=$(patsubst %,%.html,$(ARTICLES) $(SP_ARTICLES)) +DOC_HTML=$(addprefix $(OUTPUT),$(_DOC_HTML)) + +_DOC_MAN1=$(patsubst %.txt,%.1,$(MAN1_TXT)) +_DOC_MAN5=$(patsubst %.txt,%.5,$(MAN5_TXT)) +_DOC_MAN7=$(patsubst %.txt,%.7,$(MAN7_TXT)) + +DOC_MAN1=$(addprefix $(OUTPUT),$(_DOC_MAN1)) +DOC_MAN5=$(addprefix $(OUTPUT),$(_DOC_MAN5)) +DOC_MAN7=$(addprefix $(OUTPUT),$(_DOC_MAN7)) + +# Make the path relative to DESTDIR, not prefix +ifndef DESTDIR +prefix?=$(HOME) +endif +bindir?=$(prefix)/bin +htmldir?=$(prefix)/share/doc/perf-doc +pdfdir?=$(prefix)/share/doc/perf-doc +mandir?=$(prefix)/share/man 
+man1dir=$(mandir)/man1 +man5dir=$(mandir)/man5 +man7dir=$(mandir)/man7 + +ASCIIDOC=asciidoc +ASCIIDOC_EXTRA += --unsafe -f asciidoc.conf +ASCIIDOC_HTML = xhtml11 +MANPAGE_XSL = manpage-normal.xsl +XMLTO_EXTRA = +INSTALL?=install +RM ?= rm -f +DOC_REF = origin/man +HTML_REF = origin/html + +ifdef USE_ASCIIDOCTOR +ASCIIDOC = asciidoctor +ASCIIDOC_EXTRA += -a compat-mode +ASCIIDOC_EXTRA += -I. -rasciidoctor-extensions +ASCIIDOC_EXTRA += -a mansource="perf" -a manmanual="perf Manual" +ASCIIDOC_HTML = xhtml5 +endif + +infodir?=$(prefix)/share/info +MAKEINFO=makeinfo +INSTALL_INFO=install-info +DOCBOOK2X_TEXI=docbook2x-texi +DBLATEX=dblatex +XMLTO=xmlto +ifndef PERL_PATH + PERL_PATH = /usr/bin/perl +endif + +-include ../config.mak.autogen +-include ../config.mak + +_tmp_tool_path := $(call get-executable,$(ASCIIDOC)) +ifeq ($(_tmp_tool_path),) + missing_tools = $(ASCIIDOC) +endif + +ifndef USE_ASCIIDOCTOR +_tmp_tool_path := $(call get-executable,$(XMLTO)) +ifeq ($(_tmp_tool_path),) + missing_tools += $(XMLTO) +endif +endif + +# +# For asciidoc ... +# -7.1.2, no extra settings are needed. +# 8.0-, set ASCIIDOC8. +# + +# +# For docbook-xsl ... +# -1.68.1, set ASCIIDOC_NO_ROFF? (based on changelog from 1.73.0) +# 1.69.0, no extra settings are needed? +# 1.69.1-1.71.0, set DOCBOOK_SUPPRESS_SP? +# 1.71.1, no extra settings are needed? +# 1.72.0, set DOCBOOK_XSL_172. +# 1.73.0-, set ASCIIDOC_NO_ROFF +# + +# +# If you had been using DOCBOOK_XSL_172 in an attempt to get rid +# of 'the ".ft C" problem' in your generated manpages, and you +# instead ended up with weird characters around callouts, try +# using ASCIIDOC_NO_ROFF instead (it works fine with ASCIIDOC8). 
+# + +ifdef ASCIIDOC8 +ASCIIDOC_EXTRA += -a asciidoc7compatible +endif +ifdef DOCBOOK_XSL_172 +ASCIIDOC_EXTRA += -a perf-asciidoc-no-roff +MANPAGE_XSL = manpage-1.72.xsl +else + ifdef ASCIIDOC_NO_ROFF + # docbook-xsl after 1.72 needs the regular XSL, but will not + # pass-thru raw roff codes from asciidoc.conf, so turn them off. + ASCIIDOC_EXTRA += -a perf-asciidoc-no-roff + endif +endif +ifdef MAN_BOLD_LITERAL +XMLTO_EXTRA += -m manpage-bold-literal.xsl +endif +ifdef DOCBOOK_SUPPRESS_SP +XMLTO_EXTRA += -m manpage-suppress-sp.xsl +endif + +SHELL_PATH ?= $(SHELL) +# Shell quote; +SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH)) + +# +# Please note that there is a minor bug in asciidoc. +# The version after 6.0.3 _will_ include the patch found here: +# http://marc.theaimsgroup.com/?l=perf&m=111558757202243&w=2 +# +# Until that version is released you may have to apply the patch +# yourself - yes, all 6 characters of it! +# + +QUIET_SUBDIR0 = +$(MAKE) -C # space to separate -C and subdir +QUIET_SUBDIR1 = + +ifneq ($(findstring $(MAKEFLAGS),w),w) +PRINT_DIR = --no-print-directory +else # "make -w" +NO_SUBDIR = : +endif + +ifneq ($(findstring $(MAKEFLAGS),s),s) +ifneq ($(V),1) + QUIET_ASCIIDOC = @echo ' ASCIIDOC '$@; + QUIET_XMLTO = @echo ' XMLTO '$@; + QUIET_DB2TEXI = @echo ' DB2TEXI '$@; + QUIET_MAKEINFO = @echo ' MAKEINFO '$@; + QUIET_DBLATEX = @echo ' DBLATEX '$@; + QUIET_XSLTPROC = @echo ' XSLTPROC '$@; + QUIET_GEN = @echo ' GEN '$@; + QUIET_STDERR = 2> /dev/null + QUIET_SUBDIR0 = +@subdir= + QUIET_SUBDIR1 = ;$(NO_SUBDIR) \ + echo ' SUBDIR ' $$subdir; \ + $(MAKE) $(PRINT_DIR) -C $$subdir + export V +endif +endif + +all: html man info + +html: $(DOC_HTML) + +$(DOC_HTML) $(DOC_MAN1) $(DOC_MAN5) $(DOC_MAN7): asciidoc.conf + +man: man1 man5 man7 +man1: $(DOC_MAN1) +man5: $(DOC_MAN5) +man7: $(DOC_MAN7) + +info: $(OUTPUT)perf.info $(OUTPUT)perfman.info + +install: install-man + +check-man-tools: +ifdef missing_tools + $(error "You need to install $(missing_tools) for man 
pages") +endif + +do-install-man: man + $(call QUIET_INSTALL, Documentation-man) \ + $(INSTALL) -d -m 755 $(DESTDIR)$(man1dir); \ +# $(INSTALL) -d -m 755 $(DESTDIR)$(man5dir); \ +# $(INSTALL) -d -m 755 $(DESTDIR)$(man7dir); \ + $(INSTALL) -m 644 $(DOC_MAN1) $(DESTDIR)$(man1dir); \ +# $(INSTALL) -m 644 $(DOC_MAN5) $(DESTDIR)$(man5dir); \ +# $(INSTALL) -m 644 $(DOC_MAN7) $(DESTDIR)$(man7dir) + +install-man: check-man-tools man do-install-man + +ifdef missing_tools + DO_INSTALL_MAN = $(warning Please install $(missing_tools) to have the man pages installed) +else + DO_INSTALL_MAN = do-install-man +endif + +try-install-man: $(DO_INSTALL_MAN) + +install-info: info + $(call QUIET_INSTALL, Documentation-info) \ + $(INSTALL) -d -m 755 $(DESTDIR)$(infodir); \ + $(INSTALL) -m 644 $(OUTPUT)perf.info $(OUTPUT)perfman.info $(DESTDIR)$(infodir); \ + if test -r $(DESTDIR)$(infodir)/dir; then \ + $(INSTALL_INFO) --info-dir=$(DESTDIR)$(infodir) perf.info ;\ + $(INSTALL_INFO) --info-dir=$(DESTDIR)$(infodir) perfman.info ;\ + else \ + echo "No directory found in $(DESTDIR)$(infodir)" >&2 ; \ + fi + +#install-html: html +# '$(SHELL_PATH_SQ)' ./install-webdoc.sh $(DESTDIR)$(htmldir) + + +# +# Determine "include::" file references in asciidoc files. 
+# +$(OUTPUT)doc.dep : $(wildcard *.txt) build-docdep.perl + $(QUIET_GEN)$(RM) $@+ $@ && \ + $(PERL_PATH) ./build-docdep.perl >$@+ $(QUIET_STDERR) && \ + mv $@+ $@ + +-include $(OUTPUT)doc.dep + +CLEAN_FILES = \ + $(MAN_XML) $(addsuffix +,$(MAN_XML)) \ + $(MAN_HTML) $(addsuffix +,$(MAN_HTML)) \ + $(DOC_HTML) $(DOC_MAN1) $(DOC_MAN5) $(DOC_MAN7) \ + $(OUTPUT)*.texi $(OUTPUT)*.texi+ $(OUTPUT)*.texi++ \ + $(OUTPUT)perf.info $(OUTPUT)perfman.info $(OUTPUT)doc.dep \ + $(OUTPUT)technical/api-*.html $(OUTPUT)technical/api-index.txt +clean: + $(call QUIET_CLEAN, Documentation) $(RM) $(CLEAN_FILES) + +$(MAN_HTML): $(OUTPUT)%.html : %.txt + $(QUIET_ASCIIDOC)$(RM) $@+ $@ && \ + $(ASCIIDOC) -b $(ASCIIDOC_HTML) -d manpage \ + $(ASCIIDOC_EXTRA) -aperf_version=$(PERF_VERSION) -o $@+ $< && \ + mv $@+ $@ + +ifdef USE_ASCIIDOCTOR +$(OUTPUT)%.1 $(OUTPUT)%.5 $(OUTPUT)%.7 : %.txt + $(QUIET_ASCIIDOC)$(RM) $@+ $@ && \ + $(ASCIIDOC) -b manpage -d manpage \ + $(ASCIIDOC_EXTRA) -aperf_version=$(PERF_VERSION) -o $@+ $< && \ + mv $@+ $@ +endif + +$(OUTPUT)%.1 $(OUTPUT)%.5 $(OUTPUT)%.7 : $(OUTPUT)%.xml + $(QUIET_XMLTO)$(RM) $@ && \ + $(XMLTO) -o $(OUTPUT). 
-m $(MANPAGE_XSL) $(XMLTO_EXTRA) man $< + +$(OUTPUT)%.xml : %.txt + $(QUIET_ASCIIDOC)$(RM) $@+ $@ && \ + $(ASCIIDOC) -b docbook -d manpage \ + $(ASCIIDOC_EXTRA) -aperf_version=$(PERF_VERSION) \ + -aperf_date=$(shell git log -1 --pretty="format:%cd" \ + --date=short $<) \ + -o $@+ $< && \ + mv $@+ $@ + +XSLT = docbook.xsl +XSLTOPTS = --xinclude --stringparam html.stylesheet docbook-xsl.css + +$(OUTPUT)perfman.texi: $(MAN_XML) cat-texi.perl + $(QUIET_DB2TEXI)$(RM) $@+ $@ && \ + ($(foreach xml,$(MAN_XML),$(DOCBOOK2X_TEXI) --encoding=UTF-8 \ + --to-stdout $(xml) &&) true) > $@++ && \ + $(PERL_PATH) cat-texi.perl $@ <$@++ >$@+ && \ + rm $@++ && \ + mv $@+ $@ + +$(OUTPUT)perfman.info: $(OUTPUT)perfman.texi + $(QUIET_MAKEINFO)$(MAKEINFO) --no-split --no-validate -o $@ $*.texi + +$(patsubst %.txt,%.texi,$(MAN_TXT)): %.texi : %.xml + $(QUIET_DB2TEXI)$(RM) $@+ $@ && \ + $(DOCBOOK2X_TEXI) --to-stdout $*.xml >$@+ && \ + mv $@+ $@ + +$(patsubst %,%.html,$(ARTICLES)) : %.html : %.txt + $(QUIET_ASCIIDOC)$(ASCIIDOC) -b $(ASCIIDOC_HTML) $*.txt + +WEBDOC_DEST = /pub/software/tools/perf/docs + +# UNIMPLEMENTED +#install-webdoc : html +# '$(SHELL_PATH_SQ)' ./install-webdoc.sh $(WEBDOC_DEST) + +# quick-install: quick-install-man + +# quick-install-man: +# '$(SHELL_PATH_SQ)' ./install-doc-quick.sh $(DOC_REF) $(DESTDIR)$(mandir) + +#quick-install-html: +# '$(SHELL_PATH_SQ)' ./install-doc-quick.sh $(HTML_REF) $(DESTDIR)$(htmldir) diff --git a/tools/perf/Documentation/android.txt b/tools/perf/Documentation/android.txt new file mode 100644 index 000000000..24a59998f --- /dev/null +++ b/tools/perf/Documentation/android.txt @@ -0,0 +1,78 @@ +How to compile perf for Android +========================================= + +I. Set the Android NDK environment +------------------------------------------------ + +(a). Use the Android NDK +------------------------------------------------ +1. You need to download and install the Android Native Development Kit (NDK). 
+Set the NDK variable to point to the path where you installed the NDK: + export NDK=/path/to/android-ndk + +2. Set cross-compiling environment variables for NDK toolchain and sysroot. +For arm: + export NDK_TOOLCHAIN=${NDK}/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi- + export NDK_SYSROOT=${NDK}/platforms/android-24/arch-arm +For x86: + export NDK_TOOLCHAIN=${NDK}/toolchains/x86-4.9/prebuilt/linux-x86_64/bin/i686-linux-android- + export NDK_SYSROOT=${NDK}/platforms/android-24/arch-x86 + +This method is only tested for Android NDK versions Revision 11b and later. +perf uses some bionic enhancements that are not included in prior NDK versions. +You can use method (b) described below instead. + +(b). Use the Android source tree +----------------------------------------------- +1. Download the master branch of the Android source tree. +Set the environment for the target you want using: + source build/envsetup.sh + lunch + +2. Build your own NDK sysroot to contain latest bionic changes and set the +NDK sysroot environment variable. + cd ${ANDROID_BUILD_TOP}/ndk +For arm: + ./build/tools/build-ndk-sysroot.sh --abi=arm + export NDK_SYSROOT=${ANDROID_BUILD_TOP}/ndk/build/platforms/android-3/arch-arm +For x86: + ./build/tools/build-ndk-sysroot.sh --abi=x86 + export NDK_SYSROOT=${ANDROID_BUILD_TOP}/ndk/build/platforms/android-3/arch-x86 + +3. Set the NDK toolchain environment variable. +For arm: + export NDK_TOOLCHAIN=${ANDROID_TOOLCHAIN}/arm-linux-androideabi- +For x86: + export NDK_TOOLCHAIN=${ANDROID_TOOLCHAIN}/i686-linux-android- + +II. Compile perf for Android +------------------------------------------------ +You need to run make with the NDK toolchain and sysroot defined above: +For arm: + make WERROR=0 ARCH=arm CROSS_COMPILE=${NDK_TOOLCHAIN} EXTRA_CFLAGS="-pie --sysroot=${NDK_SYSROOT}" +For x86: + make WERROR=0 ARCH=x86 CROSS_COMPILE=${NDK_TOOLCHAIN} EXTRA_CFLAGS="-pie --sysroot=${NDK_SYSROOT}" + +III. 
Install perf +----------------------------------------------- +You need to connect to your Android device/emulator using adb. +Install perf using: + adb push perf /data/perf + +If you also want to use perf-archive you need busybox tools for Android. +For installing perf-archive, you first need to replace #!/bin/bash with #!/system/bin/sh: + sed 's/#!\/bin\/bash/#!\/system\/bin\/sh/g' perf-archive >> /tmp/perf-archive + chmod +x /tmp/perf-archive + adb push /tmp/perf-archive /data/perf-archive + +IV. Environment settings for running perf +------------------------------------------------ +Some perf features need environment variables to run properly. +You need to set these before running perf on the target: + adb shell + # PERF_PAGER=cat + +V. Run perf +------------------------------------------------ +Run perf on your device/emulator to which you previously connected using adb: + # ./data/perf diff --git a/tools/perf/Documentation/arm-coresight.txt b/tools/perf/Documentation/arm-coresight.txt new file mode 100644 index 000000000..c117fc50a --- /dev/null +++ b/tools/perf/Documentation/arm-coresight.txt @@ -0,0 +1,5 @@ +Arm CoreSight Support +===================== + +For full documentation, see Documentation/trace/coresight/coresight-perf.rst +in the kernel tree. diff --git a/tools/perf/Documentation/asciidoc.conf b/tools/perf/Documentation/asciidoc.conf new file mode 100644 index 000000000..2b62ba1e7 --- /dev/null +++ b/tools/perf/Documentation/asciidoc.conf @@ -0,0 +1,94 @@ +## linkperf: macro +# +# Usage: linkperf:command[manpage-section] +# +# Note, {0} is the manpage section, while {target} is the command. +# +# Show PERF link as: <command>(
<section>); if section is defined, else just show +# the command. + +[macros] +(?su)[\\]?(?P<name>linkperf):(?P<target>\S*?)\[(?P<attrlist>.*?)\]= + +[attributes] +asterisk=&#42; +plus=&#43; +caret=&#94; +startsb=&#91; +endsb=&#93; +tilde=&#126; + +ifdef::backend-docbook[] +[linkperf-inlinemacro] +{0%{target}} +{0#<citerefentry>} +{0#<refentrytitle>{target}</refentrytitle><manvolnum>{0}</manvolnum>} +{0#</citerefentry>} +endif::backend-docbook[] + +ifdef::backend-docbook[] +ifndef::perf-asciidoc-no-roff[] +# "unbreak" docbook-xsl v1.68 for manpages. v1.69 works with or without this. +# v1.72 breaks with this because it replaces dots not in roff requests. +[listingblock] +<example><title>{title}</title> +<literallayout> +ifdef::doctype-manpage[] +&#10;.ft C&#10; +endif::doctype-manpage[] +| +ifdef::doctype-manpage[] +&#10;.ft&#10; +endif::doctype-manpage[] +</literallayout> +{title#}</example> +endif::perf-asciidoc-no-roff[] + +ifdef::perf-asciidoc-no-roff[] +ifdef::doctype-manpage[] +# The following two small workarounds insert a simple paragraph after screen +[listingblock] +<example><title>{title}</title> +<literallayout> +| +</literallayout><simpara></simpara> +{title#}</example> + +[verseblock] +<formalpara{id? id="{id}"}><title>{title}</title><para> +{title%}<literallayout{id? id="{id}"}> +{title#}<literallayout> +| +</literallayout> +{title#}</para></formalpara> +{title%}<simpara></simpara> +endif::doctype-manpage[] +endif::perf-asciidoc-no-roff[] +endif::backend-docbook[] + +ifdef::doctype-manpage[] +ifdef::backend-docbook[] +[header] +template::[header-declarations] +<refentry> +ifdef::perf_date[] +<refentryinfo><date>{perf_date}</date></refentryinfo> +endif::perf_date[] +<refmeta> +<refentrytitle>{mantitle}</refentrytitle> +<manvolnum>{manvolnum}</manvolnum> +<refmiscinfo class="source">perf</refmiscinfo> +<refmiscinfo class="version">{perf_version}</refmiscinfo> +<refmiscinfo class="manual">perf Manual</refmiscinfo> +</refmeta> +<refnamediv> + <refname>{manname}</refname> + <refpurpose>{manpurpose}</refpurpose> +</refnamediv> +endif::backend-docbook[] +endif::doctype-manpage[] + +ifdef::backend-xhtml11[] +[linkperf-inlinemacro] +<a href="{target}.html">{target}{0?({0})}</a> +endif::backend-xhtml11[] diff --git a/tools/perf/Documentation/asciidoctor-extensions.rb b/tools/perf/Documentation/asciidoctor-extensions.rb new file mode 100644 index 000000000..d148fe95c --- /dev/null +++ b/tools/perf/Documentation/asciidoctor-extensions.rb @@ -0,0 +1,29 @@ +require 'asciidoctor' +require 'asciidoctor/extensions' + +module Perf + module Documentation + class LinkPerfProcessor < Asciidoctor::Extensions::InlineMacroProcessor + use_dsl + + named :chrome + + def process(parent, target, attrs) + if parent.document.basebackend? 
'html' + %(<a href="#{target}.html">#{target}(#{attrs[1]})</a>\n) + elsif parent.document.basebackend? 'manpage' + "#{target}(#{attrs[1]})" + elsif parent.document.basebackend? 'docbook' + "<citerefentry>\n" \ + "<refentrytitle>#{target}</refentrytitle>" \ + "<manvolnum>#{attrs[1]}</manvolnum>\n" \ + "</citerefentry>\n" + end + end + end + end +end + +Asciidoctor::Extensions.register do + inline_macro Perf::Documentation::LinkPerfProcessor, :linkperf +end diff --git a/tools/perf/Documentation/build-docdep.perl b/tools/perf/Documentation/build-docdep.perl new file mode 100755 index 000000000..ba4205e03 --- /dev/null +++ b/tools/perf/Documentation/build-docdep.perl @@ -0,0 +1,46 @@ +#!/usr/bin/perl + +my %include = (); +my %included = (); + +for my $text (<*.txt>) { + open I, '<', $text || die "cannot read: $text"; + while (<I>) { + if (/^include::/) { + chomp; + s/^include::\s*//; + s/\[\]//; + $include{$text}{$_} = 1; + $included{$_} = 1; + } + } + close I; +} + +# Do we care about chained includes??? +my $changed = 1; +while ($changed) { + $changed = 0; + while (my ($text, $included) = each %include) { + for my $i (keys %$included) { + # $text has include::$i; if $i includes $j + # $text indirectly includes $j. + if (exists $include{$i}) { + for my $j (keys %{$include{$i}}) { + if (!exists $include{$text}{$j}) { + $include{$text}{$j} = 1; + $included{$j} = 1; + $changed = 1; + } + } + } + } + } +} + +while (my ($text, $included) = each %include) { + if (! exists $included{$text} && + (my $base = $text) =~ s/\.txt$//) { + print "$base.html $base.xml : ", join(" ", keys %$included), "\n"; + } +} diff --git a/tools/perf/Documentation/build-xed.txt b/tools/perf/Documentation/build-xed.txt new file mode 100644 index 000000000..6222c1e72 --- /dev/null +++ b/tools/perf/Documentation/build-xed.txt @@ -0,0 +1,19 @@ + +For --xed the xed tool is needed. 
Here is how to install it: + + $ git clone https://github.com/intelxed/mbuild.git mbuild + $ git clone https://github.com/intelxed/xed + $ cd xed + $ ./mfile.py --share + $ ./mfile.py examples + $ sudo ./mfile.py --prefix=/usr/local install + $ sudo ldconfig + $ sudo cp obj/examples/xed /usr/local/bin + +Basic xed testing: + + $ xed | head -3 + ERROR: required argument(s) were missing + Copyright (C) 2017, Intel Corporation. All rights reserved. + XED version: [v10.0-328-g7d62c8c49b7b] + $ diff --git a/tools/perf/Documentation/callchain-overhead-calculation.txt b/tools/perf/Documentation/callchain-overhead-calculation.txt new file mode 100644 index 000000000..1a7579271 --- /dev/null +++ b/tools/perf/Documentation/callchain-overhead-calculation.txt @@ -0,0 +1,108 @@ +Overhead calculation +-------------------- +The overhead can be shown in two columns as 'Children' and 'Self' when +perf collects callchains. The 'self' overhead is simply calculated by +adding all period values of the entry - usually a function (symbol). +This is the value that perf shows traditionally and sum of all the +'self' overhead values should be 100%. + +The 'children' overhead is calculated by adding all period values of +the child functions so that it can show the total overhead of the +higher level functions even if they don't directly execute much. +'Children' here means functions that are called from another (parent) +function. + +It might be confusing that the sum of all the 'children' overhead +values exceeds 100% since each of them is already an accumulation of +'self' overhead of its child functions. But with this enabled, users +can find which function has the most overhead even if samples are +spread over the children. + +Consider the following example; there are three functions like below. 
+ +----------------------- +void foo(void) { + /* do something */ +} + +void bar(void) { + /* do something */ + foo(); +} + +int main(void) { + bar(); + return 0; +} +----------------------- + +In this case 'foo' is a child of 'bar', and 'bar' is an immediate +child of 'main' so 'foo' also is a child of 'main'. In other words, +'main' is a parent of 'foo' and 'bar', and 'bar' is a parent of 'foo'. + +Suppose all samples are recorded in 'foo' and 'bar' only. When it's +recorded with callchains the output will show something like below +in the usual (self-overhead-only) output of perf report: + +---------------------------------- +Overhead Symbol +........ ..................... + 60.00% foo + | + --- foo + bar + main + __libc_start_main + + 40.00% bar + | + --- bar + main + __libc_start_main +---------------------------------- + +When the --children option is enabled, the 'self' overhead values of +child functions (i.e. 'foo' and 'bar') are added to the parents to +calculate the 'children' overhead. In this case the report could be +displayed as: + +------------------------------------------- +Children Self Symbol +........ ........ .................... + 100.00% 0.00% __libc_start_main + | + --- __libc_start_main + + 100.00% 0.00% main + | + --- main + __libc_start_main + + 100.00% 40.00% bar + | + --- bar + main + __libc_start_main + + 60.00% 60.00% foo + | + --- foo + bar + main + __libc_start_main +------------------------------------------- + +In the above output, the 'self' overhead of 'foo' (60%) was added to the +'children' overhead of 'bar', 'main' and '\_\_libc_start_main'. +Likewise, the 'self' overhead of 'bar' (40%) was added to the +'children' overhead of 'main' and '\_\_libc_start_main'. + +So '\_\_libc_start_main' and 'main' are shown first since they have the +same (100%) 'children' overhead (even though they have zero 'self' +overhead) and they are the parents of 'foo' and 'bar'. 
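The accumulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not perf's implementation: the `overhead` function, the leaf-first callchain lists and the period values are hypothetical stand-ins for the 'foo'/'bar'/'main' example, and counting a symbol at most once per sample is an assumption made here to keep recursive chains from being double counted.

```python
from collections import defaultdict


def overhead(samples):
    """Return {symbol: (children %, self %)} for a list of samples.

    Each sample is (callchain, period); the callchain is leaf-first,
    e.g. ["foo", "bar", "main"] means main -> bar -> foo.
    Hypothetical helper for illustration only.
    """
    total = sum(period for _, period in samples)
    self_period = defaultdict(int)
    children_period = defaultdict(int)
    for chain, period in samples:
        # 'self': only the function the sample actually hit (the leaf).
        self_period[chain[0]] += period
        # 'children': every function on the chain, the leaf included.
        # dict.fromkeys() drops duplicates while keeping order (assumption:
        # a symbol is credited once per sample even if it recurses).
        for sym in dict.fromkeys(chain):
            children_period[sym] += period
    return {
        sym: (100.0 * children_period[sym] / total,
              100.0 * self_period[sym] / total)
        for sym in children_period
    }


# The example from the text: 60% of samples in foo, 40% in bar.
samples = [
    (["foo", "bar", "main", "__libc_start_main"], 60),
    (["bar", "main", "__libc_start_main"], 40),
]
table = overhead(samples)
for sym, (children, self_pct) in sorted(table.items(), key=lambda kv: -kv[1][0]):
    print(f"{children:7.2f}% {self_pct:7.2f}%  {sym}")
```

The 'self' column sums to 100%, while the 'children' column does not, because each sample is credited to every function on its callchain.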
+ +Since v3.16 the 'children' overhead is shown by default and the output +is sorted by its values. The 'children' overhead is disabled by +specifying --no-children option on the command line or by adding +'report.children = false' or 'top.children = false' in the perf config +file. diff --git a/tools/perf/Documentation/cat-texi.perl b/tools/perf/Documentation/cat-texi.perl new file mode 100755 index 000000000..14d2f8341 --- /dev/null +++ b/tools/perf/Documentation/cat-texi.perl @@ -0,0 +1,46 @@ +#!/usr/bin/perl -w + +use strict; +use warnings; + +my @menu = (); +my $output = $ARGV[0]; + +open my $tmp, '>', "$output.tmp"; + +while (<STDIN>) { + next if (/^\\input texinfo/../\@node Top/); + next if (/^\@bye/ || /^\.ft/); + if (s/^\@top (.*)/\@node $1,,,Top/) { + push @menu, $1; + } + s/\(\@pxref\{\[(URLS|REMOTES)\]}\)//; + s/\@anchor\{[^{}]*\}//g; + print $tmp $_; +} +close $tmp; + +print '\input texinfo +@setfilename gitman.info +@documentencoding UTF-8 +@dircategory Development +@direntry +* Git Man Pages: (gitman). Manual pages for Git revision control system +@end direntry +@node Top,,, (dir) +@top Git Manual Pages +@documentlanguage en +@menu +'; + +for (@menu) { + print "* ${_}::\n"; +} +print "\@end menu\n"; +open $tmp, '<', "$output.tmp"; +while (<$tmp>) { + print; +} +close $tmp; +print "\@bye\n"; +unlink "$output.tmp"; diff --git a/tools/perf/Documentation/db-export.txt b/tools/perf/Documentation/db-export.txt new file mode 100644 index 000000000..52ffccb02 --- /dev/null +++ b/tools/perf/Documentation/db-export.txt @@ -0,0 +1,41 @@ +Database Export +=============== + +perf tool's python scripting engine: + + tools/perf/util/scripting-engines/trace-event-python.c + +supports scripts: + + tools/perf/scripts/python/export-to-sqlite.py + tools/perf/scripts/python/export-to-postgresql.py + +which export data to a SQLite3 or PostgreSQL database. 
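A script that reads such a database can probe the schema before relying on it. The following is a minimal sketch using Python's standard sqlite3 module against an in-memory stand-in database. The table and column names (`samples`, `ip`, `calls`, `cpu`) and the helpers `has_table` and `has_column` are hypothetical and chosen for illustration; they are not taken from the export scripts.

```python
import sqlite3


def has_table(conn, name):
    """True if a table called `name` exists in the SQLite database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def has_column(conn, table, column):
    """True if `table` has a column called `column`.

    PRAGMA table_info yields one row per column; field 1 is its name.
    `table` is a trusted constant here, so plain formatting is acceptable.
    """
    return any(row[1] == column
               for row in conn.execute(f"PRAGMA table_info({table})"))


# Stand-in for an exported database (hypothetical, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, ip INTEGER)")
conn.execute("INSERT INTO samples VALUES (1, 4096)")

# Only query what is known to be present.
if has_table(conn, "samples") and has_column(conn, "samples", "ip"):
    rows = conn.execute("SELECT id, ip FROM samples").fetchall()
else:
    rows = []
print(rows)
```

Checking for a table or column first lets the same reader script work against databases written by different versions of the exporter.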
+ +The export process provides records with unique sequential ids which allows the +data to be imported directly to a database and provides the relationships +between tables. + +Over time it is possible to continue to expand the export while maintaining +backward and forward compatibility, by following some simple rules: + +1. Because of the nature of SQL, existing tables and columns can continue to be +used so long as the names and meanings (and to some extent data types) remain +the same. + +2. New tables and columns can be added, without affecting existing SQL queries, +so long as the new names are unique. + +3. Scripts that use a database (e.g. exported-sql-viewer.py) can maintain +backward compatibility by testing for the presence of new tables and columns +before using them. e.g. function IsSelectable() in exported-sql-viewer.py + +4. The export scripts themselves maintain forward compatibility (i.e. an existing +script will continue to work with new versions of perf) by accepting a variable +number of arguments (e.g. def call_return_table(*x)) i.e. perf can pass more +arguments which old scripts will ignore. + +5. The scripting engine tests for the existence of script handler functions +before calling them. The scripting engine can also test for the support of new +or optional features by checking for the existence and value of script global +variables. diff --git a/tools/perf/Documentation/examples.txt b/tools/perf/Documentation/examples.txt new file mode 100644 index 000000000..c0d22fbe9 --- /dev/null +++ b/tools/perf/Documentation/examples.txt @@ -0,0 +1,225 @@ + + ------------------------------ + ****** perf by examples ****** + ------------------------------ + +[ From an e-mail by Ingo Molnar, https://lore.kernel.org/lkml/20090804195717.GA5998@elte.hu ] + + +First, discovery/enumeration of available counters can be done via +'perf list': + +titan:~> perf list + [...] 
+ kmem:kmalloc [Tracepoint event] + kmem:kmem_cache_alloc [Tracepoint event] + kmem:kmalloc_node [Tracepoint event] + kmem:kmem_cache_alloc_node [Tracepoint event] + kmem:kfree [Tracepoint event] + kmem:kmem_cache_free [Tracepoint event] + kmem:mm_page_free [Tracepoint event] + kmem:mm_page_free_batched [Tracepoint event] + kmem:mm_page_alloc [Tracepoint event] + kmem:mm_page_alloc_zone_locked [Tracepoint event] + kmem:mm_page_pcpu_drain [Tracepoint event] + kmem:mm_page_alloc_extfrag [Tracepoint event] + +Then any (or all) of the above event sources can be activated and +measured. For example the page alloc/free properties of a 'hackbench +run' are: + + titan:~> perf stat -e kmem:mm_page_pcpu_drain -e kmem:mm_page_alloc + -e kmem:mm_page_free_batched -e kmem:mm_page_free ./hackbench 10 + Time: 0.575 + + Performance counter stats for './hackbench 10': + + 13857 kmem:mm_page_pcpu_drain + 27576 kmem:mm_page_alloc + 6025 kmem:mm_page_free_batched + 20934 kmem:mm_page_free + + 0.613972165 seconds time elapsed + +You can observe the statistical properties as well, by using the +'repeat the workload N times' feature of perf stat: + + titan:~> perf stat --repeat 5 -e kmem:mm_page_pcpu_drain -e + kmem:mm_page_alloc -e kmem:mm_page_free_batched -e + kmem:mm_page_free ./hackbench 10 + Time: 0.627 + Time: 0.644 + Time: 0.564 + Time: 0.559 + Time: 0.626 + + Performance counter stats for './hackbench 10' (5 runs): + + 12920 kmem:mm_page_pcpu_drain ( +- 3.359% ) + 25035 kmem:mm_page_alloc ( +- 3.783% ) + 6104 kmem:mm_page_free_batched ( +- 0.934% ) + 18376 kmem:mm_page_free ( +- 4.941% ) + + 0.643954516 seconds time elapsed ( +- 2.363% ) + +Furthermore, these tracepoints can be used to sample the workload as +well. For example the page allocations done by a 'git gc' can be +captured the following way: + + titan:~/git> perf record -e kmem:mm_page_alloc -c 1 ./git gc + Counting objects: 1148, done. + Delta compression using up to 2 threads. 
+ Compressing objects: 100% (450/450), done. + Writing objects: 100% (1148/1148), done. + Total 1148 (delta 690), reused 1148 (delta 690) + [ perf record: Captured and wrote 0.267 MB perf.data (~11679 samples) ] + +To check which functions generated page allocations: + + titan:~/git> perf report + # Samples: 10646 + # + # Overhead Command Shared Object + # ........ ............... .......................... + # + 23.57% git-repack /lib64/libc-2.5.so + 21.81% git /lib64/libc-2.5.so + 14.59% git ./git + 11.79% git-repack ./git + 7.12% git /lib64/ld-2.5.so + 3.16% git-repack /lib64/libpthread-2.5.so + 2.09% git-repack /bin/bash + 1.97% rm /lib64/libc-2.5.so + 1.39% mv /lib64/ld-2.5.so + 1.37% mv /lib64/libc-2.5.so + 1.12% git-repack /lib64/ld-2.5.so + 0.95% rm /lib64/ld-2.5.so + 0.90% git-update-serv /lib64/libc-2.5.so + 0.73% git-update-serv /lib64/ld-2.5.so + 0.68% perf /lib64/libpthread-2.5.so + 0.64% git-repack /usr/lib64/libz.so.1.2.3 + +Or to see it on a more finegrained level: + +titan:~/git> perf report --sort comm,dso,symbol +# Samples: 10646 +# +# Overhead Command Shared Object Symbol +# ........ ............... .......................... ...... +# + 9.35% git-repack ./git [.] insert_obj_hash + 9.12% git ./git [.] insert_obj_hash + 7.31% git /lib64/libc-2.5.so [.] memcpy + 6.34% git-repack /lib64/libc-2.5.so [.] _int_malloc + 6.24% git-repack /lib64/libc-2.5.so [.] memcpy + 5.82% git-repack /lib64/libc-2.5.so [.] __GI___fork + 5.47% git /lib64/libc-2.5.so [.] _int_malloc + 2.99% git /lib64/libc-2.5.so [.] memset + +Furthermore, call-graph sampling can be done too, of page +allocations - to see precisely what kind of page allocations there +are: + + titan:~/git> perf record -g -e kmem:mm_page_alloc -c 1 ./git gc + Counting objects: 1148, done. + Delta compression using up to 2 threads. + Compressing objects: 100% (450/450), done. + Writing objects: 100% (1148/1148), done. 
+ Total 1148 (delta 690), reused 1148 (delta 690) + [ perf record: Captured and wrote 0.963 MB perf.data (~42069 samples) ] + + titan:~/git> perf report -g + # Samples: 10686 + # + # Overhead Command Shared Object + # ........ ............... .......................... + # + 23.25% git-repack /lib64/libc-2.5.so + | + |--50.00%-- _int_free + | + |--37.50%-- __GI___fork + | make_child + | + |--12.50%-- ptmalloc_unlock_all2 + | make_child + | + --6.25%-- __GI_strcpy + 21.61% git /lib64/libc-2.5.so + | + |--30.00%-- __GI_read + | | + | --83.33%-- git_config_from_file + | git_config + | | + [...] + +Or you can observe the whole system's page allocations for 10 +seconds: + +titan:~/git> perf stat -a -e kmem:mm_page_pcpu_drain -e +kmem:mm_page_alloc -e kmem:mm_page_free_batched -e +kmem:mm_page_free sleep 10 + + Performance counter stats for 'sleep 10': + + 171585 kmem:mm_page_pcpu_drain + 322114 kmem:mm_page_alloc + 73623 kmem:mm_page_free_batched + 254115 kmem:mm_page_free + + 10.000591410 seconds time elapsed + +Or observe how fluctuating the page allocations are, via statistical +analysis done over ten 1-second intervals: + + titan:~/git> perf stat --repeat 10 -a -e kmem:mm_page_pcpu_drain -e + kmem:mm_page_alloc -e kmem:mm_page_free_batched -e + kmem:mm_page_free sleep 1 + + Performance counter stats for 'sleep 1' (10 runs): + + 17254 kmem:mm_page_pcpu_drain ( +- 3.709% ) + 34394 kmem:mm_page_alloc ( +- 4.617% ) + 7509 kmem:mm_page_free_batched ( +- 4.820% ) + 25653 kmem:mm_page_free ( +- 3.672% ) + + 1.058135029 seconds time elapsed ( +- 3.089% ) + +Or you can annotate the recorded 'git gc' run on a per symbol basis +and check which instructions/source-code generated page allocations: + + titan:~/git> perf annotate __GI___fork + ------------------------------------------------ + Percent | Source code & Disassembly of libc-2.5.so + ------------------------------------------------ + : + : + : Disassembly of section .plt: + : Disassembly of section .text: + : + : 
00000031a2e95560 <__fork>: + [...] + 0.00 : 31a2e95602: b8 38 00 00 00 mov $0x38,%eax + 0.00 : 31a2e95607: 0f 05 syscall + 83.42 : 31a2e95609: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax + 0.00 : 31a2e9560f: 0f 87 4d 01 00 00 ja 31a2e95762 <__fork+0x202> + 0.00 : 31a2e95615: 85 c0 test %eax,%eax + +( this shows that 83.42% of __GI___fork's page allocations come from + the 0x38 system call it performs. ) + +etc. etc. - a lot more is possible. I could list a dozen of +other different usecases straight away - neither of which is +possible via /proc/vmstat. + +/proc/vmstat is not in the same league really, in terms of +expressive power of system analysis and performance +analysis. + +All that the above results needed were those new tracepoints +in include/tracing/events/kmem.h. + + Ingo + + diff --git a/tools/perf/Documentation/guest-files.txt b/tools/perf/Documentation/guest-files.txt new file mode 100644 index 000000000..8cc0b092f --- /dev/null +++ b/tools/perf/Documentation/guest-files.txt @@ -0,0 +1,16 @@ +include::guestmount.txt[] + +--guestkallsyms=:: + Guest OS /proc/kallsyms file copy. perf reads it to get guest + kernel symbols. Users copy it out from guest OS. + +--guestmodules=:: + Guest OS /proc/modules file copy. perf reads it to get guest + kernel module information. Users copy it out from guest OS. + +--guestvmlinux=:: + Guest OS kernel vmlinux. + +--guest-code:: + Indicate that guest code can be found in the hypervisor process, + which is a common case for KVM test programs. diff --git a/tools/perf/Documentation/guestmount.txt b/tools/perf/Documentation/guestmount.txt new file mode 100644 index 000000000..6edf12363 --- /dev/null +++ b/tools/perf/Documentation/guestmount.txt @@ -0,0 +1,11 @@ +--guestmount=:: + Guest OS root file system mount directory. Users mount guest OS + root directories under by a specific filesystem access method, + typically, sshfs. 
+ For example, start 2 guest OS, one's pid is 8888 and the other's is 9999: +[verse] + $ mkdir \~/guestmount + $ cd \~/guestmount + $ sshfs -o allow_other,direct_io -p 5551 localhost:/ 8888/ + $ sshfs -o allow_other,direct_io -p 5552 localhost:/ 9999/ + $ perf {GMEXAMPLECMD} --guestmount=~/guestmount {GMEXAMPLESUBCMD} diff --git a/tools/perf/Documentation/intel-bts.txt b/tools/perf/Documentation/intel-bts.txt new file mode 100644 index 000000000..8bdc93bd7 --- /dev/null +++ b/tools/perf/Documentation/intel-bts.txt @@ -0,0 +1,86 @@ +Intel Branch Trace Store +======================== + +Overview +======== + +Intel BTS could be regarded as a predecessor to Intel PT and has some +similarities because it can also identify every branch a program takes. A +notable difference is that Intel BTS has no timing information and as a +consequence the present implementation is limited to per-thread recording. + +While decoding Intel BTS does not require walking the object code, the object +code is still needed to pair up calls and returns correctly, consequently much +of the Intel PT documentation applies also to Intel BTS. Refer to the Intel PT +documentation and consider that the PMU 'intel_bts' can usually be used in +place of 'intel_pt' in the examples provided, with the proviso that per-thread +recording must also be stipulated i.e. the --per-thread option for +'perf record'. + + +perf record +=========== + +new event +--------- + +The Intel BTS kernel driver creates a new PMU for Intel BTS. The perf record +option is: + + -e intel_bts// + +Currently Intel BTS is limited to per-thread tracing so the --per-thread option +is also needed. + + +snapshot option +--------------- + +The snapshot option is the same as Intel PT (refer Intel PT documentation). + + +auxtrace mmap size option +----------------------- + +The mmap size option is the same as Intel PT (refer Intel PT documentation). 
+ + +perf script +=========== + +By default, perf script will decode trace data found in the perf.data file. +This can be further controlled by option --itrace. The --itrace option is +the same as Intel PT (refer Intel PT documentation) except that neither +"instructions" events nor "transactions" events (and consequently call +chains) are supported. + +To disable trace decoding entirely, use the option --no-itrace. + + +dump option +----------- + +perf script has an option (-D) to "dump" the events i.e. display the binary +data. + +When -D is used, Intel BTS packets are displayed. + +To disable the display of Intel BTS packets, combine the -D option with +--no-itrace. + + +perf report +=========== + +By default, perf report will decode trace data found in the perf.data file. +This can be further controlled by new option --itrace exactly the same as +perf script. + + +perf inject +=========== + +perf inject also accepts the --itrace option in which case tracing data is +removed and replaced with the synthesized events. e.g. + + perf inject --itrace -i perf.data -o perf.data.new diff --git a/tools/perf/Documentation/intel-hybrid.txt b/tools/perf/Documentation/intel-hybrid.txt new file mode 100644 index 000000000..e7a776ad2 --- /dev/null +++ b/tools/perf/Documentation/intel-hybrid.txt @@ -0,0 +1,204 @@ +Intel hybrid support +-------------------- +Support for Intel hybrid events within perf tools. + +For some Intel platforms, such as AlderLake, which is hybrid platform and +it consists of atom cpu and core cpu. Each cpu has dedicated event list. +Part of events are available on core cpu, part of events are available +on atom cpu and even part of events are available on both. + +Kernel exports two new cpu pmus via sysfs: +/sys/devices/cpu_core +/sys/devices/cpu_atom + +The 'cpus' files are created under the directories. 
For example, + +cat /sys/devices/cpu_core/cpus +0-15 + +cat /sys/devices/cpu_atom/cpus +16-23 + +It indicates cpu0-cpu15 are core cpus and cpu16-cpu23 are atom cpus. + +As before, use perf-list to list the symbolic event. + +perf list + +inst_retired.any + [Fixed Counter: Counts the number of instructions retired. Unit: cpu_atom] +inst_retired.any + [Number of instructions retired. Fixed Counter - architectural event. Unit: cpu_core] + +The 'Unit: xxx' is added to brief description to indicate which pmu +the event is belong to. Same event name but with different pmu can +be supported. + +Enable hybrid event with a specific pmu + +To enable a core only event or atom only event, following syntax is supported: + + cpu_core// +or + cpu_atom// + +For example, count the 'cycles' event on core cpus. + + perf stat -e cpu_core/cycles/ + +Create two events for one hardware event automatically + +When creating one event and the event is available on both atom and core, +two events are created automatically. One is for atom, the other is for +core. Most of hardware events and cache events are available on both +cpu_core and cpu_atom. + +For hardware events, they have pre-defined configs (e.g. 0 for cycles). +But on hybrid platform, kernel needs to know where the event comes from +(from atom or from core). The original perf event type PERF_TYPE_HARDWARE +can't carry pmu information. So now this type is extended to be PMU aware +type. The PMU type ID is stored at attr.config[63:32]. + +PMU type ID is retrieved from sysfs. +/sys/devices/cpu_atom/type +/sys/devices/cpu_core/type + +The new attr.config layout for PERF_TYPE_HARDWARE: + +PERF_TYPE_HARDWARE: 0xEEEEEEEE000000AA + AA: hardware event ID + EEEEEEEE: PMU type ID + +Cache event is similar. The type PERF_TYPE_HW_CACHE is extended to be +PMU aware type. The PMU type ID is stored at attr.config[63:32]. 
+ +The new attr.config layout for PERF_TYPE_HW_CACHE: + +PERF_TYPE_HW_CACHE: 0xEEEEEEEE00DDCCBB + BB: hardware cache ID + CC: hardware cache op ID + DD: hardware cache op result ID + EEEEEEEE: PMU type ID + +When enabling a hardware event without specified pmu, such as, +perf stat -e cycles -a (use system-wide in this example), two events +are created automatically. + + ------------------------------------------------------------ + perf_event_attr: + size 120 + config 0x400000000 + sample_type IDENTIFIER + read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING + disabled 1 + inherit 1 + exclude_guest 1 + ------------------------------------------------------------ + +and + + ------------------------------------------------------------ + perf_event_attr: + size 120 + config 0x800000000 + sample_type IDENTIFIER + read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING + disabled 1 + inherit 1 + exclude_guest 1 + ------------------------------------------------------------ + +type 0 is PERF_TYPE_HARDWARE. +0x4 in 0x400000000 indicates it's cpu_core pmu. +0x8 in 0x800000000 indicates it's cpu_atom pmu (atom pmu type id is random). + +The kernel creates 'cycles' (0x400000000) on cpu0-cpu15 (core cpus), +and create 'cycles' (0x800000000) on cpu16-cpu23 (atom cpus). + +For perf-stat result, it displays two events: + + Performance counter stats for 'system wide': + + 6,744,979 cpu_core/cycles/ + 1,965,552 cpu_atom/cycles/ + +The first 'cycles' is core event, the second 'cycles' is atom event. + +Thread mode example: + +perf-stat reports the scaled counts for hybrid event and with a percentage +displayed. The percentage is the event's running time/enabling time. + +One example, 'triad_loop' runs on cpu16 (atom core), while we can see the +scaled value for core cycles is 160,444,092 and the percentage is 0.47%. + +perf stat -e cycles \-- taskset -c 16 ./triad_loop + +As previous, two events are created. 
+ +------------------------------------------------------------ +perf_event_attr: + size 120 + config 0x400000000 + sample_type IDENTIFIER + read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING + disabled 1 + inherit 1 + enable_on_exec 1 + exclude_guest 1 +------------------------------------------------------------ + +and + +------------------------------------------------------------ +perf_event_attr: + size 120 + config 0x800000000 + sample_type IDENTIFIER + read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING + disabled 1 + inherit 1 + enable_on_exec 1 + exclude_guest 1 +------------------------------------------------------------ + + Performance counter stats for 'taskset -c 16 ./triad_loop': + + 233,066,666 cpu_core/cycles/ (0.43%) + 604,097,080 cpu_atom/cycles/ (99.57%) + +perf-record: + +If there is no '-e' specified in perf record, on a hybrid platform, +it creates two default 'cycles' events and adds them to the event list. One +is for core, the other is for atom. + +perf-stat: + +If there is no '-e' specified in perf stat, on a hybrid platform, +besides software events, the following events are created and +added to the event list in order. + +cpu_core/cycles/, +cpu_atom/cycles/, +cpu_core/instructions/, +cpu_atom/instructions/, +cpu_core/branches/, +cpu_atom/branches/, +cpu_core/branch-misses/, +cpu_atom/branch-misses/ + +Of course, both perf-stat and perf-record support enabling a +hybrid event with a specific pmu. + +e.g. +perf stat -e cpu_core/cycles/ +perf stat -e cpu_atom/cycles/ +perf stat -e cpu_core/r1a/ +perf stat -e cpu_atom/L1-icache-loads/ +perf stat -e cpu_core/cycles/,cpu_atom/instructions/ +perf stat -e '{cpu_core/cycles/,cpu_core/instructions/}' + +But '{cpu_core/cycles/,cpu_atom/instructions/}' will return a +warning and disable grouping, because the pmus in the group do +not match (cpu_core vs. cpu_atom).
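The PMU-aware attr.config layouts described earlier in this file (0xEEEEEEEE000000AA for PERF_TYPE_HARDWARE and 0xEEEEEEEE00DDCCBB for PERF_TYPE_HW_CACHE) can be illustrated with a small Python sketch. The helper names below are hypothetical and exist only for this example; `read_pmu_type` assumes the sysfs 'type' files mentioned above are present, which is only the case on a hybrid system.

```python
def read_pmu_type(pmu_name):
    """Read the PMU type ID from sysfs, e.g. pmu_name='cpu_core'.

    Hypothetical helper; only works where /sys/devices/<pmu>/type exists.
    """
    with open(f"/sys/devices/{pmu_name}/type") as f:
        return int(f.read().strip())

def encode_hw_config(pmu_type, event_id):
    """PERF_TYPE_HARDWARE: PMU type in bits 63:32, event ID in bits 7:0."""
    return ((pmu_type & 0xFFFFFFFF) << 32) | (event_id & 0xFF)

def decode_hw_config(config):
    """Return (pmu_type, event_id) from a PERF_TYPE_HARDWARE config."""
    return config >> 32, config & 0xFF

def encode_cache_config(pmu_type, cache_id, op_id, result_id):
    """PERF_TYPE_HW_CACHE: BB=cache ID, CC=op ID, DD=op result ID."""
    return (((pmu_type & 0xFFFFFFFF) << 32) |
            ((result_id & 0xFF) << 16) |
            ((op_id & 0xFF) << 8) |
            (cache_id & 0xFF))

def decode_cache_config(config):
    """Return (pmu_type, cache_id, op_id, result_id)."""
    return (config >> 32,
            config & 0xFF,
            (config >> 8) & 0xFF,
            (config >> 16) & 0xFF)

if __name__ == "__main__":
    # The two 'cycles' configs shown in the perf_event_attr dumps above.
    for config in (0x400000000, 0x800000000):
        pmu_type, event_id = decode_hw_config(config)
        print(f"{config:#x}: pmu type {pmu_type}, hardware event {event_id}")
```

Decoding the two config values from the dumps above gives PMU type 4 and PMU type 8 with hardware event ID 0, matching the cpu_core and cpu_atom 'cycles' events in that example. The type IDs themselves are system specific and should be read from sysfs rather than hard-coded.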
diff --git a/tools/perf/Documentation/intel-pt.txt b/tools/perf/Documentation/intel-pt.txt new file mode 100644 index 000000000..fd9241a1b --- /dev/null +++ b/tools/perf/Documentation/intel-pt.txt @@ -0,0 +1 @@ +Documentation for support for Intel Processor Trace within perf tools' has moved to file perf-intel-pt.txt diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt new file mode 100644 index 000000000..0916bbfe6 --- /dev/null +++ b/tools/perf/Documentation/itrace.txt @@ -0,0 +1,70 @@ + i synthesize instructions events + b synthesize branches events (branch misses for Arm SPE) + c synthesize branches events (calls only) + r synthesize branches events (returns only) + x synthesize transactions events + w synthesize ptwrite events + p synthesize power events (incl. PSB events for Intel PT) + o synthesize other events recorded due to the use + of aux-output (refer to perf record) + I synthesize interrupt or similar (asynchronous) events + (e.g. Intel PT Event Trace) + e synthesize error events + d create a debug log + f synthesize first level cache events + m synthesize last level cache events + M synthesize memory events + t synthesize TLB events + a synthesize remote access events + g synthesize a call chain (use with i or x) + G synthesize a call chain on existing event records + l synthesize last branch entries (use with i or x) + L synthesize last branch entries on existing event records + s skip initial number of events + q quicker (less detailed) decoding + A approximate IPC + Z prefer to ignore timestamps (so-called "timeless" decoding) + + The default is all events i.e. the same as --itrace=ibxwpe, + except for perf script where it is --itrace=ce + + In addition, the period (default 100000, except for perf script where it is 1) + for instructions events can be specified in units of: + + i instructions + t ticks + ms milliseconds + us microseconds + ns nanoseconds (default) + + Also the call chain size (default 16, max. 
1024) for instructions or + transactions events can be specified. + + Also the number of last branch entries (default 64, max. 1024) for + instructions or transactions events can be specified. + + Similar to options g and l, size may also be specified for options G and L. + On x86, note that G and L work poorly when data has been recorded with + large PEBS. Refer linkperf:perf-intel-pt[1] man page for details. + + It is also possible to skip events generated (instructions, branches, transactions, + ptwrite, power) at the beginning. This is useful to ignore initialization code. + + --itrace=i0nss1000000 + + skips the first million instructions. + + The 'e' option may be followed by flags which affect what errors will or + will not be reported. Each flag must be preceded by either '+' or '-'. + The flags are: + o overflow + l trace data lost + + If supported, the 'd' option may be followed by flags which affect what + debug messages will or will not be logged. Each flag must be preceded + by either '+' or '-'. The flags are: + a all perf events + e output only on errors (size configurable - see linkperf:perf-config[1]) + o output to stdout + + If supported, the 'q' option may be repeated to increase the effect. diff --git a/tools/perf/Documentation/jit-interface.txt b/tools/perf/Documentation/jit-interface.txt new file mode 100644 index 000000000..a8656f564 --- /dev/null +++ b/tools/perf/Documentation/jit-interface.txt @@ -0,0 +1,15 @@ +perf supports a simple JIT interface to resolve symbols for dynamic code generated +by a JIT. + +The JIT has to write a /tmp/perf-%d.map (%d = pid of process) file + +This is a text file. + +Each line has the following format, fields separated with spaces: + +START SIZE symbolname + +START and SIZE are hex numbers without 0x. +symbolname is the rest of the line, so it could contain special characters. + +The ownership of the file has to match the process. 
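As a rough illustration of the map file format described above, a JIT runtime could emit its symbols along these lines. This is a minimal sketch: the function names and the `directory` parameter are made up for the example (the file is expected in /tmp), and a real runtime would typically append entries as code is generated.

```python
import os

def format_map_line(start, size, name):
    """One map entry: START and SIZE in hex without 0x, then the name."""
    return f"{start:x} {size:x} {name}"

def write_perf_map(symbols, pid=None, directory="/tmp"):
    """Write <directory>/perf-<pid>.map for (start, size, name) tuples.

    `directory` is a hypothetical knob for testing; perf looks in /tmp.
    The file is created by the calling process, so its ownership
    matches that process.
    """
    if pid is None:
        pid = os.getpid()
    path = os.path.join(directory, f"perf-{pid}.map")
    with open(path, "w") as f:
        for start, size, name in symbols:
            f.write(format_map_line(start, size, name) + "\n")
    return path

if __name__ == "__main__":
    # Illustrative addresses and names only.
    path = write_perf_map([
        (0x7f0000001000, 0x40, "MyClass::method(int)"),
        (0x7f0000001040, 0x1a0, "interpreter loop stub"),
    ])
    print("wrote", path)
```

Because the symbol name is the rest of the line, names containing spaces or punctuation are written as-is.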
diff --git a/tools/perf/Documentation/jitdump-specification.txt b/tools/perf/Documentation/jitdump-specification.txt new file mode 100644 index 000000000..79936355d --- /dev/null +++ b/tools/perf/Documentation/jitdump-specification.txt @@ -0,0 +1,170 @@ +JITDUMP specification version 2 +Last Revised: 09/15/2016 +Author: Stephane Eranian + +-------------------------------------------------------- +| Revision | Date | Description | +-------------------------------------------------------- +| 1 | 09/07/2016 | Initial revision | +-------------------------------------------------------- +| 2 | 09/15/2016 | Add JIT_CODE_UNWINDING_INFO | +-------------------------------------------------------- + + +I/ Introduction + + +This document describes the jitdump file format. The file is generated by Just-In-time compiler runtimes to save meta-data information about the generated code, such as address, size, and name of generated functions, the native code generated, the source line information. The data may then be used by performance tools, such as Linux perf to generate function and assembly level profiles. + +The format is not specific to any particular programming language. It can be extended as need be. + +The format of the file is binary. It is self-describing in terms of endianness and is portable across multiple processor architectures. + + +II/ Overview of the format + + +The format requires only sequential accesses, i.e., append only mode. The file starts with a fixed size file header describing the version of the specification, the endianness. + +The header is followed by a series of records, each starting with a fixed size header describing the type of record and its size. It is, itself, followed by the payload for the record. Records can have a variable size even for a given type. + +Each entry in the file is timestamped. All timestamps must use the same clock source. The CLOCK_MONOTONIC clock source is recommended. 
+ + +III/ Jitdump file header format + +Each jitdump file starts with a fixed size header containing the following fields in order: + + +* uint32_t magic : a magic number tagging the file type. The value is 4 bytes long and represents the string "JiTD" in ASCII form. It is written as 0x4A695444. The reader will detect an endian mismatch when it reads 0x4454694a. +* uint32_t version : a 4-byte value representing the format version. It is currently set to 1 +* uint32_t total_size: size in bytes of file header +* uint32_t elf_mach : ELF architecture encoding (ELF e_machine value as specified in /usr/include/elf.h) +* uint32_t pad1 : padding. Reserved for future use +* uint32_t pid : JIT runtime process identification (OS specific) +* uint64_t timestamp : timestamp of when the file was created +* uint64_t flags : a bitmask of flags + +The flags currently defined are as follows: + * bit 0: JITDUMP_FLAGS_ARCH_TIMESTAMP : set if the jitdump file is using an architecture-specific timestamp clock source. For instance, on x86, one could use TSC directly + +IV/ Record header + +The file header is immediately followed by records. Each record starts with a fixed size header describing the record that follows. + +The record header is specified in order as follows: +* uint32_t id : a value identifying the record type (see below) +* uint32_t total_size: the size in bytes of the record including the header. +* uint64_t timestamp : a timestamp of when the record was created.
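The two fixed-size headers above map directly onto packed binary structures. The following Python sketch shows one way to pack them and to detect the file's byte order from the magic number. It is an illustration only: the helper names are invented here, and the writer side assumes a little-endian producer for simplicity, whereas a real writer would use its native byte order.

```python
import struct

JITDUMP_MAGIC = 0x4A695444          # "JiTD"
JITDUMP_MAGIC_SWAPPED = 0x4454694A  # what a reader sees on endian mismatch

# magic, version, total_size, elf_mach, pad1, pid, timestamp, flags
FILE_HEADER = struct.Struct("<IIIIIIQQ")   # 40 bytes, no implicit padding
# id, total_size, timestamp
RECORD_HEADER = struct.Struct("<IIQ")      # 16 bytes

def pack_file_header(elf_mach, pid, timestamp, flags=0, version=1):
    """Pack the file header; total_size is the header's own size."""
    return FILE_HEADER.pack(JITDUMP_MAGIC, version, FILE_HEADER.size,
                            elf_mach, 0, pid, timestamp, flags)

def pack_record_header(record_id, payload_size, timestamp):
    """Pack a record header; total_size includes the header itself."""
    return RECORD_HEADER.pack(record_id,
                              RECORD_HEADER.size + payload_size,
                              timestamp)

def detect_byte_order(first_four_bytes):
    """Return '<' or '>' (struct prefixes) for the file's byte order."""
    (magic,) = struct.unpack("<I", first_four_bytes)
    if magic == JITDUMP_MAGIC:
        return "<"
    if magic == JITDUMP_MAGIC_SWAPPED:
        return ">"
    raise ValueError("not a jitdump file")

if __name__ == "__main__":
    header = pack_file_header(elf_mach=62, pid=1234, timestamp=0)  # 62 = EM_X86_64
    print(len(header), detect_byte_order(header[:4]))
```

A reader that finds the swapped magic would then unpack the remaining fields with the opposite byte order prefix.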
+ +The following record types are defined: + * Value 0 : JIT_CODE_LOAD : record describing a jitted function + * Value 1 : JIT_CODE_MOVE : record describing an already jitted function which is moved + * Value 2 : JIT_CODE_DEBUG_INFO: record describing the debug information for a jitted function + * Value 3 : JIT_CODE_CLOSE : record marking the end of the jit runtime (optional) + * Value 4 : JIT_CODE_UNWINDING_INFO: record describing a function unwinding information + + The payload of the record must immediately follow the record header without padding. + +V/ JIT_CODE_LOAD record + + + The record has the following fields following the fixed-size record header in order: + * uint32_t pid: OS process id of the runtime generating the jitted code + * uint32_t tid: OS thread identification of the runtime thread generating the jitted code + * uint64_t vma: virtual address of jitted code start + * uint64_t code_addr: code start address for the jitted code. By default vma = code_addr + * uint64_t code_size: size in bytes of the generated jitted code + * uint64_t code_index: unique identifier for the jitted code (see below) + * char[n]: function name in ASCII including the null termination + * native code: raw byte encoding of the jitted code + + The record header total_size field is inclusive of all components: + * record header + * fixed-sized fields + * function name string, including termination + * native code length + * record specific variable data (e.g., array of data entries) + +The code_index is used to uniquely identify each jitted function. The index can be a monotonically increasing 64-bit value. Each time a function is jitted it gets a new number. This value is used in case the code for a function is moved and avoids having to issue another JIT_CODE_LOAD record. + +The format supports empty functions with no native code. + + +VI/ JIT_CODE_MOVE record + + The record type is optional. 
+ + The record has the following fields following the fixed-size record header in order: + * uint32_t pid : OS process id of the runtime generating the jitted code + * uint32_t tid : OS thread identification of the runtime thread generating the jitted code + * uint64_t vma : new virtual address of jitted code start + * uint64_t old_code_addr: previous code address for the same function + * uint64_t new_code_addr: alternate new code start address for the jitted code. By default it should be equal to the vma address. + * uint64_t code_size : size in bytes of the jitted code + * uint64_t code_index : index referring to the JIT_CODE_LOAD code_index record of when the function was initially jitted + + +The MOVE record can be used in case an already jitted function is simply moved by the runtime inside the code cache. + +The JIT_CODE_MOVE record cannot come before the JIT_CODE_LOAD record for the same function name. The function cannot have changed name, otherwise a new JIT_CODE_LOAD record must be emitted. + +The code size of the function cannot change. + + +VII/ JIT_CODE_DEBUG_INFO record + +The record type is optional. + +The record contains source lines debug information, i.e., a way to map a code address back to a source line. This information may be used by the performance tool. + +The record has the following fields following the fixed-size record header in order: + * uint64_t code_addr: address of function for which the debug information is generated + * uint64_t nr_entry : number of debug entries for the function + * debug_entry[n]: array of nr_entry debug entries for the function + +The debug_entry describes the source line information.
It is defined as follows in order: +* uint64_t code_addr: address of function for which the debug information is generated +* uint32_t line : source file line number (starting at 1) +* uint32_t discrim : column discriminator, 0 is default +* char name[n] : source file name in ASCII, including null termination + +The debug_entry entries are saved in sequence but given that they have variable sizes due to the file name string, they cannot be indexed directly. +They need to be walked sequentially. The next debug_entry starts sizeof(debug_entry) + strlen(name) + 1 bytes after the start of the current one, where sizeof(debug_entry) covers the fixed-size fields only. + +IMPORTANT: + The JIT_CODE_DEBUG_INFO for a given function must always be generated BEFORE the JIT_CODE_LOAD for the function. This greatly simplifies the parser for the jitdump file. + + +VIII/ JIT_CODE_CLOSE record + + +The record type is optional. + +The record is used as a marker for the end of the jitted runtime. It can be replaced by the end of the file. + +The JIT_CODE_CLOSE record does not have any specific fields; the record header contains all the information needed. + + +IX/ JIT_CODE_UNWINDING_INFO + + +The record type is optional. + +The record is used to describe the unwinding information for a jitted function.
+ +The record has the following fields following the fixed-size record header in order: + +uint64_t unwind_data_size : the size in bytes of the unwinding data table at the end of the record +uint64_t eh_frame_hdr_size : the size in bytes of the DWARF EH Frame Header at the start of the unwinding data table at the end of the record +uint64_t mapped_size : the size of the unwinding data mapped in memory +const char unwinding_data[n]: an array of unwinding data, consisting of the EH Frame Header, followed by the actual EH Frame + + +The EH Frame header follows the Linux Standard Base (LSB) specification as described in the document at https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/ehframehdr.html + + +The EH Frame follows the LSB specification as described in the document at https://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/ehframechpt.html + + +NOTE: The mapped_size is generally either the same as unwind_data_size (if the unwinding data was mapped in memory by the running process) or zero (if the unwinding data is not mapped by the process). If the unwinding data was not mapped, then only the EH Frame Header will be read, which can be used to specify FP based unwinding for a function which does not have unwinding information. 
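To make the JIT_CODE_UNWINDING_INFO layout above concrete, here is a small Python sketch that splits a record payload (the bytes after the fixed-size record header) into its parts. It is an assumption-laden illustration rather than perf's parser: the function name is hypothetical, little-endian data is assumed, and any bytes after the unwinding data table are simply ignored.

```python
import struct

# unwind_data_size, eh_frame_hdr_size, mapped_size
UNWIND_FIXED = struct.Struct("<QQQ")

def parse_unwinding_info(payload):
    """Split a JIT_CODE_UNWINDING_INFO payload into its components."""
    unwind_data_size, eh_frame_hdr_size, mapped_size = \
        UNWIND_FIXED.unpack_from(payload, 0)
    start = UNWIND_FIXED.size
    data = payload[start:start + unwind_data_size]
    if len(data) != unwind_data_size:
        raise ValueError("truncated unwinding data")
    if eh_frame_hdr_size > unwind_data_size:
        raise ValueError("EH Frame Header larger than unwinding data")
    return {
        # The EH Frame Header sits at the start of the table.
        "eh_frame_hdr": data[:eh_frame_hdr_size],
        # The actual EH Frame follows the header.
        "eh_frame": data[eh_frame_hdr_size:],
        # Per the NOTE above: usually unwind_data_size or zero.
        "mapped_size": mapped_size,
    }

if __name__ == "__main__":
    # Dummy bytes standing in for real EH Frame data.
    payload = struct.pack("<QQQ", 6, 2, 6) + b"HHFFFF"
    print(parse_unwinding_info(payload))
```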
diff --git a/tools/perf/Documentation/manpage-1.72.xsl b/tools/perf/Documentation/manpage-1.72.xsl new file mode 100644 index 000000000..b4d315cb8 --- /dev/null +++ b/tools/perf/Documentation/manpage-1.72.xsl @@ -0,0 +1,14 @@ + + + + + + + + + + diff --git a/tools/perf/Documentation/manpage-base.xsl b/tools/perf/Documentation/manpage-base.xsl new file mode 100644 index 000000000..a264fa616 --- /dev/null +++ b/tools/perf/Documentation/manpage-base.xsl @@ -0,0 +1,35 @@ + + + + + + + + + + + + + + sp + + + + + + + + br + + + diff --git a/tools/perf/Documentation/manpage-bold-literal.xsl b/tools/perf/Documentation/manpage-bold-literal.xsl new file mode 100644 index 000000000..608eb5df6 --- /dev/null +++ b/tools/perf/Documentation/manpage-bold-literal.xsl @@ -0,0 +1,17 @@ + + + + + + + fB + + + fR + + + diff --git a/tools/perf/Documentation/manpage-normal.xsl b/tools/perf/Documentation/manpage-normal.xsl new file mode 100644 index 000000000..a48f5b11f --- /dev/null +++ b/tools/perf/Documentation/manpage-normal.xsl @@ -0,0 +1,13 @@ + + + + + + +\ +. + + diff --git a/tools/perf/Documentation/manpage-suppress-sp.xsl b/tools/perf/Documentation/manpage-suppress-sp.xsl new file mode 100644 index 000000000..a63c7632a --- /dev/null +++ b/tools/perf/Documentation/manpage-suppress-sp.xsl @@ -0,0 +1,21 @@ + + + + + + + + + + + + + + + diff --git a/tools/perf/Documentation/perf-annotate.txt b/tools/perf/Documentation/perf-annotate.txt new file mode 100644 index 000000000..980fe2c29 --- /dev/null +++ b/tools/perf/Documentation/perf-annotate.txt @@ -0,0 +1,157 @@ +perf-annotate(1) +================ + +NAME +---- +perf-annotate - Read perf.data (created by perf record) and display annotated code + +SYNOPSIS +-------- +[verse] +'perf annotate' [-i | --input=file] [symbol_name] + +DESCRIPTION +----------- +This command reads the input file and displays an annotated version of the +code. If the object file has debug symbols then the source code will be +displayed alongside assembly code. 
+ +If there is no debug info in the object, then annotated assembly is displayed. + +OPTIONS +------- +-i:: +--input=:: + Input file name. (default: perf.data unless stdin is a fifo) + +-d:: +--dsos=:: + Only consider symbols in these dsos. +-s:: +--symbol=:: + Symbol to annotate. + +-f:: +--force:: + Don't do ownership validation. + +-v:: +--verbose:: + Be more verbose. (Show symbol address, etc) + +-q:: +--quiet:: + Do not show any warnings or messages. (Suppress -v) + +-n:: +--show-nr-samples:: + Show the number of samples for each symbol. + +-D:: +--dump-raw-trace:: + Dump raw trace in ASCII. + +-k:: +--vmlinux=:: + vmlinux pathname. + +--ignore-vmlinux:: + Ignore vmlinux files. + +--itrace:: + Options for decoding instruction tracing data. The options are: + +include::itrace.txt[] + + To disable decoding entirely, use --no-itrace. + +-m:: +--modules:: + Load module symbols. WARNING: use only with -k and LIVE kernel. + +-l:: +--print-line:: + Print matching source lines (may be slow). + +-P:: +--full-paths:: + Don't shorten the displayed pathnames. + +--stdio:: Use the stdio interface. + +--stdio2:: Use the stdio2 interface, non-interactive, uses the TUI formatting. + +--stdio-color=:: + 'always', 'never' or 'auto', allowing configuring color output + via the command line, in addition to via "color.ui" .perfconfig. + Use '--stdio-color always' to generate color even when redirecting + to a pipe or file. Using just '--stdio-color' is equivalent to + using 'always'. + +--tui:: Use the TUI interface. Use of --tui requires a tty; if one is not + present, as when piping to other commands, the stdio interface is + used. This interface starts by centering on the line with more + samples; TAB/UNTAB cycles through the lines with more samples. + +--gtk:: Use the GTK interface. + +-C:: +--cpu=:: Only report samples for the list of CPUs provided. Multiple CPUs can + be provided as a comma-separated list with no space: 0,1. Ranges of + CPUs are specified with -: 0-2.
Default is to report samples on all + CPUs. + +--asm-raw:: + Show raw instruction encoding of assembly instructions. + +--show-total-period:: Show a column with the sum of periods. + +--source:: + Interleave source code with assembly code. Enabled by default, + disable with --no-source. + +--symfs=:: + Look for files with symbols relative to this directory. + +-M:: +--disassembler-style=:: Set disassembler style for objdump. + +--objdump=:: + Path to objdump binary. + +--prefix=PREFIX:: +--prefix-strip=N:: + Remove first N entries from source file path names in executables + and add PREFIX. This allows to display source code compiled on systems + with different file system layout. + +--skip-missing:: + Skip symbols that cannot be annotated. + +--group:: + Show event group information together + +--demangle:: + Demangle symbol names to human readable form. It's enabled by default, + disable with --no-demangle. + +--demangle-kernel:: + Demangle kernel symbol names to human readable form (for C++ kernels). + +--percent-type:: + Set annotation percent type from following choices: + global-period, local-period, global-hits, local-hits + + The local/global keywords set if the percentage is computed + in the scope of the function (local) or the whole data (global). + The period/hits keywords set the base the percentage is computed + on - the samples period or the number of samples (hits). + +--percent-limit:: + Do not show functions which have an overhead under that percent on + stdio or stdio2 (Default: 0). Note that this is about selection of + functions to display, not about lines within the function. 
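The four --percent-type choices described above can be illustrated with a short sketch. It is not how perf computes the values internally; the `Sample` tuple and the `percent` function are hypothetical names used only to show how the local/global scope and the period/hits base combine.

```python
from collections import namedtuple

# Hypothetical sample record for illustration; not a perf data structure.
Sample = namedtuple("Sample", "symbol line period")


def percent(samples, symbol, line, percent_type):
    """Percentage for one source line under one --percent-type choice."""
    scope, base = percent_type.split("-")  # e.g. "local-period"
    if scope not in ("local", "global") or base not in ("period", "hits"):
        raise ValueError(f"unknown percent type: {percent_type}")

    def weight(s):
        # "period" sums the sample periods; "hits" counts the samples.
        return s.period if base == "period" else 1

    numerator = sum(weight(s) for s in samples
                    if s.symbol == symbol and s.line == line)
    if scope == "local":
        # local: relative to the function that contains the line
        denominator = sum(weight(s) for s in samples if s.symbol == symbol)
    else:
        # global: relative to the whole data
        denominator = sum(weight(s) for s in samples)
    return 100.0 * numerator / denominator if denominator else 0.0
```

With the made-up samples `Sample("f", 10, 300)`, `Sample("f", 11, 100)` and `Sample("g", 5, 600)`, line 10 of `f` gets 75% for local-period, 30% for global-period and 50% for local-hits.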
+ +SEE ALSO +-------- +linkperf:perf-record[1], linkperf:perf-report[1] diff --git a/tools/perf/Documentation/perf-archive.txt b/tools/perf/Documentation/perf-archive.txt new file mode 100644 index 000000000..ac6ecbb3e --- /dev/null +++ b/tools/perf/Documentation/perf-archive.txt @@ -0,0 +1,22 @@ +perf-archive(1) +=============== + +NAME +---- +perf-archive - Create archive with object files with build-ids found in perf.data file + +SYNOPSIS +-------- +[verse] +'perf archive' [file] + +DESCRIPTION +----------- +This command runs perf-buildid-list --with-hits, and collects the files with the +buildids found so that analysis of perf.data contents can be possible on another +machine. + + +SEE ALSO +-------- +linkperf:perf-record[1], linkperf:perf-buildid-list[1], linkperf:perf-report[1] diff --git a/tools/perf/Documentation/perf-arm-spe.txt b/tools/perf/Documentation/perf-arm-spe.txt new file mode 100644 index 000000000..bf03222e9 --- /dev/null +++ b/tools/perf/Documentation/perf-arm-spe.txt @@ -0,0 +1,218 @@ +perf-arm-spe(1) +================ + +NAME +---- +perf-arm-spe - Support for Arm Statistical Profiling Extension within Perf tools + +SYNOPSIS +-------- +[verse] +'perf record' -e arm_spe// + +DESCRIPTION +----------- + +The SPE (Statistical Profiling Extension) feature provides accurate attribution of latencies and + events down to individual instructions. Rather than being interrupt-driven, it picks an +instruction to sample and then captures data for it during execution. Data includes execution time +in cycles. For loads and stores it also includes data address, cache miss events, and data origin. + +The sampling has 5 stages: + + 1. Choose an operation + 2. Collect data about the operation + 3. Optionally discard the record based on a filter + 4. Write the record to memory + 5. 
Interrupt when the buffer is full + +Choose an operation +~~~~~~~~~~~~~~~~~~~ + +The operation is chosen from a sample population; for SPE this is an IMPLEMENTATION DEFINED choice of all +architectural instructions or all micro-ops. Sampling happens at a programmable interval. The +architecture provides a mechanism for the SPE driver to infer the minimum interval at which it should +sample. This minimum interval is used by the driver if no interval is specified. A pseudo-random +perturbation is also added to the sampling interval by default. + +Collect data about the operation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Program counter, PMU events, timings and data addresses related to the operation are recorded. +Sampling ensures there is only one sampled operation in flight. + +Optionally discard the record based on a filter +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Based on programmable criteria, choose whether to keep the record or discard it. If the record is +discarded then the flow stops here for this sample. + +Write the record to memory +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The record is appended to a memory buffer. + +Interrupt when the buffer is full +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When the buffer fills, an interrupt is sent and the driver signals Perf to collect the records. +Perf saves the raw data in the perf.data file. + +Opening the file +---------------- + +Up until this point no decoding of the SPE data was done by either the kernel or Perf. Only when the +recorded file is opened with 'perf report' or 'perf script' does the decoding happen. When decoding +the data, Perf generates "synthetic samples" as if these were generated at the time of the +recording. These samples are the same as if normal sampling was done by Perf without using SPE, +although they may have more attributes associated with them. For example a normal sample may have +just the instruction pointer, but an SPE sample can have data addresses and latency attributes. + +Why Sampling?
+------------- + + - Sampling, rather than tracing, cuts down the profiling problem to something more manageable for + hardware. Only one sampled operation is in flight at a time. + + - Allows precise attribution data, including: Full PC of instruction, data virtual and physical + addresses. + + - Allows correlation between an instruction and events, such as TLB and cache miss. (Data source + indicates which particular cache was hit, but the meaning is implementation defined because + different implementations can have different cache configurations.) + +However, SPE does not provide any call-graph information, and relies on statistical methods. + +Collisions +---------- + +When an operation is sampled while a previous sampled operation has not finished, a collision +occurs. The new sample is dropped. Collisions affect the integrity of the data, so the sample rate +should be set to avoid collisions. + +The 'sample_collision' PMU event can be used to determine the number of lost samples. Although this +count is based on collisions _before_ filtering occurs. Therefore this can not be used as an exact +number for samples dropped that would have made it through the filter, but can be a rough +guide. + +The effect of microarchitectural sampling +----------------------------------------- + +If an implementation samples micro-operations instead of instructions, the results of sampling must +be weighted accordingly. + +For example, if a given instruction A is always converted into two micro-operations, A0 and A1, it +becomes twice as likely to appear in the sample population. + +The coarse effect of conversions, and, if applicable, sampling of speculative operations, can be +estimated from the 'sample_pop' and 'inst_retired' PMU events. + +Kernel Requirements +------------------- + +The ARM_SPE_PMU config must be set to build as either a module or statically. + +Depending on CPU model, the kernel may need to be booted with page table isolation disabled +(kpti=off). 
If KPTI needs to be disabled, this will fail with a console message "profiling buffer +inaccessible. Try passing 'kpti=off' on the kernel command line". + +Capturing SPE with perf command-line tools +------------------------------------------ + +You can record a session with SPE samples: + + perf record -e arm_spe// -- ./mybench + +The sample period is set from the -c option, and because the minimum interval is used by default +it's recommended to set this to a higher value. The value is written to PMSIRR.INTERVAL. + +Config parameters +~~~~~~~~~~~~~~~~~ + +These are placed between the // in the event and comma separated. For example '-e +arm_spe/load_filter=1,min_latency=10/' + + branch_filter=1 - collect branches only (PMSFCR.B) + event_filter= - filter on specific events (PMSEVFR) - see bitfield description below + jitter=1 - use jitter to avoid resonance when sampling (PMSIRR.RND) + load_filter=1 - collect loads only (PMSFCR.LD) + min_latency= - collect only samples with this latency or higher* (PMSLATFR) + pa_enable=1 - collect physical address (as well as VA) of loads/stores (PMSCR.PA) - requires privilege + pct_enable=1 - collect physical timestamp instead of virtual timestamp (PMSCR.PCT) - requires privilege + store_filter=1 - collect stores only (PMSFCR.ST) + ts_enable=1 - enable timestamping with value of generic timer (PMSCR.TS) + ++++*+++ Latency is the total latency from the point at which sampling started on that instruction, rather +than only the execution latency. + +Only some events can be filtered on; these include: + + bit 1 - instruction retired (i.e. 
omit speculative instructions) + bit 3 - L1D refill + bit 5 - TLB refill + bit 7 - mispredict + bit 11 - misaligned access + +So to sample just retired instructions: + + perf record -e arm_spe/event_filter=2/ -- ./mybench + +or just mispredicted branches: + + perf record -e arm_spe/event_filter=0x80/ -- ./mybench + +Viewing the data +~~~~~~~~~~~~~~~~~ + +By default perf report and perf script will assign samples to separate groups depending on the +attributes/events of the SPE record. Because instructions can have multiple events associated with +them, the samples in these groups are not necessarily unique. For example perf report shows these +groups: + + Available samples + 0 arm_spe// + 0 dummy:u + 21 l1d-miss + 897 l1d-access + 5 llc-miss + 7 llc-access + 2 tlb-miss + 1K tlb-access + 36 branch-miss + 0 remote-access + 900 memory + +The arm_spe// and dummy:u events are implementation details and are expected to be empty. + +To get a full list of unique samples that are not sorted into groups, set the itrace option to +generate 'instruction' samples. The period option is also taken into account, so set it to 1 +instruction unless you want to further downsample the already sampled SPE data: + + perf report --itrace=i1i + +Memory access details are also stored on the samples and this can be viewed with: + + perf report --mem-mode + +Common errors +~~~~~~~~~~~~~ + + - "Cannot find PMU `arm_spe'. Missing kernel support?" + + Module not built or loaded, KPTI not disabled (see above), or running on a VM + + - "Arm SPE CONTEXT packets not found in the traces." + + Root privilege is required to collect context packets. But these only increase the accuracy of + assigning PIDs to kernel samples. For userspace sampling this can be ignored. 
+ + - Excessively large perf.data file size + + Increase sampling interval (see above) + + +SEE ALSO +-------- + +linkperf:perf-record[1], linkperf:perf-script[1], linkperf:perf-report[1], +linkperf:perf-inject[1] diff --git a/tools/perf/Documentation/perf-bench.txt b/tools/perf/Documentation/perf-bench.txt new file mode 100644 index 000000000..a0529c7fa --- /dev/null +++ b/tools/perf/Documentation/perf-bench.txt @@ -0,0 +1,238 @@ +perf-bench(1) +============= + +NAME +---- +perf-bench - General framework for benchmark suites + +SYNOPSIS +-------- +[verse] +'perf bench' [] [] + +DESCRIPTION +----------- +This 'perf bench' command is a general framework for benchmark suites. + +COMMON OPTIONS +-------------- +-r:: +--repeat=:: +Specify amount of times to repeat the run (default 10). + +-f:: +--format=:: +Specify format style. +Current available format styles are: + +'default':: +Default style. This is mainly for human reading. +--------------------- +% perf bench sched pipe # with no style specified +(executing 1000000 pipe operations between two tasks) + Total time:5.855 sec + 5.855061 usecs/op + 170792 ops/sec +--------------------- + +'simple':: +This simple style is friendly for automated +processing by scripts. +--------------------- +% perf bench --format=simple sched pipe # specified simple +5.988 +--------------------- + +SUBSYSTEM +--------- + +'sched':: + Scheduler and IPC mechanisms. + +'syscall':: + System call performance (throughput). + +'mem':: + Memory access performance. + +'numa':: + NUMA scheduling and MM benchmarks. + +'futex':: + Futex stressing benchmarks. + +'epoll':: + Eventpoll (epoll) stressing benchmarks. + +'internals':: + Benchmark internal perf functionality. + +'all':: + All benchmark subsystems. + +SUITES FOR 'sched' +~~~~~~~~~~~~~~~~~~ +*messaging*:: +Suite for evaluating performance of scheduler and IPC mechanisms. +Based on hackbench by Rusty Russell. 
+ +Options of *messaging* +^^^^^^^^^^^^^^^^^^^^^^ +-p:: +--pipe:: +Use pipe() instead of socketpair(). + +-t:: +--thread:: +Use multiple threads instead of multiple processes. + +-g:: +--group=:: +Specify number of groups. + +-l:: +--nr_loops=:: +Specify number of loops. + +Example of *messaging* +^^^^^^^^^^^^^^^^^^^^^^ + +--------------------- +% perf bench sched messaging # run with default +options (20 sender and receiver processes per group) +(10 groups == 400 processes run) + + Total time:0.308 sec + +% perf bench sched messaging -t -g 20 # be multi-thread, with 20 groups +(20 sender and receiver threads per group) +(20 groups == 800 threads run) + + Total time:0.582 sec +--------------------- + +*pipe*:: +Suite for pipe() system call. +Based on pipe-test-1m.c by Ingo Molnar. + +Options of *pipe* +^^^^^^^^^^^^^^^^^ +-l:: +--loop=:: +Specify number of loops. + +Example of *pipe* +^^^^^^^^^^^^^^^^^ + +--------------------- +% perf bench sched pipe +(executing 1000000 pipe operations between two tasks) + + Total time:8.091 sec + 8.091833 usecs/op + 123581 ops/sec + +% perf bench sched pipe -l 1000 # loop 1000 +(executing 1000 pipe operations between two tasks) + + Total time:0.016 sec + 16.948000 usecs/op + 59004 ops/sec +--------------------- + +SUITES FOR 'syscall' +~~~~~~~~~~~~~~~~~~~~ +*basic*:: +Suite for evaluating performance of core system call throughput (both usecs/op and ops/sec metrics). +This uses a single thread simply doing getppid(2), which is a simple syscall where the result is not +cached by glibc. + + +SUITES FOR 'mem' +~~~~~~~~~~~~~~~~ +*memcpy*:: +Suite for evaluating performance of simple memory copy in various ways. + +Options of *memcpy* +^^^^^^^^^^^^^^^^^^^ +-s:: +--size:: +Specify size of memory to copy (default: 1MB). +Available units are B, KB, MB, GB and TB (case insensitive). + +-f:: +--function:: +Specify function to copy (default: default). +Available functions depend on the architecture.
+On x86-64, x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported. + +-l:: +--nr_loops:: +Repeat memcpy invocation this number of times. + +-c:: +--cycles:: +Use perf's cpu-cycles event instead of gettimeofday syscall. + +*memset*:: +Suite for evaluating performance of simple memory set in various ways. + +Options of *memset* +^^^^^^^^^^^^^^^^^^^ +-s:: +--size:: +Specify size of memory to set (default: 1MB). +Available units are B, KB, MB, GB and TB (case insensitive). + +-f:: +--function:: +Specify function to set (default: default). +Available functions depend on the architecture. +On x86-64, x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported. + +-l:: +--nr_loops:: +Repeat memset invocation this number of times. + +-c:: +--cycles:: +Use perf's cpu-cycles event instead of gettimeofday syscall. + +SUITES FOR 'numa' +~~~~~~~~~~~~~~~~~ +*mem*:: +Suite for evaluating NUMA workloads. + +SUITES FOR 'futex' +~~~~~~~~~~~~~~~~~~ +*hash*:: +Suite for evaluating hash tables. + +*wake*:: +Suite for evaluating wake calls. + +*wake-parallel*:: +Suite for evaluating parallel wake calls. + +*requeue*:: +Suite for evaluating requeue calls. + +*lock-pi*:: +Suite for evaluating futex lock_pi calls. + +SUITES FOR 'epoll' +~~~~~~~~~~~~~~~~~~ +*wait*:: +Suite for evaluating concurrent epoll_wait calls. + +*ctl*:: +Suite for evaluating multiple epoll_ctl calls. + +SUITES FOR 'internals' +~~~~~~~~~~~~~~~~~~~~~~ +*synthesize*:: +Suite for evaluating perf's event synthesis performance. + +SEE ALSO +-------- +linkperf:perf[1] diff --git a/tools/perf/Documentation/perf-buildid-cache.txt b/tools/perf/Documentation/perf-buildid-cache.txt new file mode 100644 index 000000000..7e44b419d --- /dev/null +++ b/tools/perf/Documentation/perf-buildid-cache.txt @@ -0,0 +1,88 @@ +perf-buildid-cache(1) +===================== + +NAME +---- +perf-buildid-cache - Manage build-id cache.
+ +SYNOPSIS +-------- +[verse] +'perf buildid-cache ' + +DESCRIPTION +----------- +This command manages the build-id cache. It can add, remove, update and purge +files to/from the cache. In the future it should as well set upper limits for +the space used by the cache, etc. +This also scans the target binary for SDT (Statically Defined Tracing) and +record it along with the buildid-cache, which will be used by perf-probe. +For more details, see linkperf:perf-probe[1]. + +OPTIONS +------- +-a:: +--add=:: + Add specified file to the cache. +-f:: +--force:: + Don't complain, do it. +-k:: +--kcore:: + Add specified kcore file to the cache. For the current host that is + /proc/kcore which requires root permissions to read. Be aware that + running 'perf buildid-cache' as root may update root's build-id cache + not the user's. Use the -v option to see where the file is created. + Note that the copied file contains only code sections not the whole core + image. Note also that files "kallsyms" and "modules" must also be in the + same directory and are also copied. All 3 files are created with read + permissions for root only. kcore will not be added if there is already a + kcore in the cache (with the same build-id) that has the same modules at + the same addresses. Use the -v option to see if a copy of kcore is + actually made. +-r:: +--remove=:: + Remove a cached binary which has same build-id of specified file + from the cache. +-p:: +--purge=:: + Purge all cached binaries including older caches which have specified + path from the cache. +-P:: +--purge-all:: + Purge all cached binaries. This will flush out entire cache. +-M:: +--missing=:: + List missing build ids in the cache for the specified file. +-u:: +--update=:: + Update specified file of the cache. Note that this doesn't remove + older entries since those may be still needed for annotating old + (or remote) perf.data. 
Only if there is already a cache which has + exactly same build-id, that is replaced by new one. It can be used + to update kallsyms and kernel dso to vmlinux in order to support + annotation. +-l:: +--list:: + List all valid binaries from cache. +-v:: +--verbose:: + Be more verbose. + +--target-ns=PID: + Obtain mount namespace information from the target pid. This is + used when creating a uprobe for a process that resides in a + different mount namespace from the perf(1) utility. + +--debuginfod[=URLs]:: + Specify debuginfod URL to be used when retrieving perf.data binaries, + it follows the same syntax as the DEBUGINFOD_URLS variable, like: + + buildid-cache.debuginfod=http://192.168.122.174:8002 + + If the URLs is not specified, the value of DEBUGINFOD_URLS + system environment variable is used. + +SEE ALSO +-------- +linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-buildid-list[1] diff --git a/tools/perf/Documentation/perf-buildid-list.txt b/tools/perf/Documentation/perf-buildid-list.txt new file mode 100644 index 000000000..e1e8fdbe0 --- /dev/null +++ b/tools/perf/Documentation/perf-buildid-list.txt @@ -0,0 +1,47 @@ +perf-buildid-list(1) +==================== + +NAME +---- +perf-buildid-list - List the buildids in a perf.data file + +SYNOPSIS +-------- +[verse] +'perf buildid-list ' + +DESCRIPTION +----------- +This command displays the buildids found in a perf.data file, so that other +tools can be used to fetch packages with matching symbol tables for use by +perf report. + +It can also be used to show the build id of the running kernel or in an ELF +file using -i/--input. + +OPTIONS +------- +-H:: +--with-hits:: + Show only DSOs with hits. +-i:: +--input=:: + Input file name. (default: perf.data unless stdin is a fifo) +-f:: +--force:: + Don't do ownership validation. +-k:: +--kernel:: + Show running kernel build id. +-m:: +--kernel-maps:: + Show buildid, start/end text address, and path of running kernel and + its modules. 
+-v:: +--verbose:: + Be more verbose. + +SEE ALSO +-------- +linkperf:perf-record[1], linkperf:perf-top[1], +linkperf:perf-report[1] diff --git a/tools/perf/Documentation/perf-c2c.txt b/tools/perf/Documentation/perf-c2c.txt new file mode 100644 index 000000000..5c5eb2def --- /dev/null +++ b/tools/perf/Documentation/perf-c2c.txt @@ -0,0 +1,336 @@ +perf-c2c(1) +=========== + +NAME +---- +perf-c2c - Shared Data C2C/HITM Analyzer. + +SYNOPSIS +-------- +[verse] +'perf c2c record' [] +'perf c2c record' [] \-- [] +'perf c2c report' [] + +DESCRIPTION +----------- +C2C stands for Cache To Cache. + +The perf c2c tool provides means for Shared Data C2C/HITM analysis. It allows +you to track down the cacheline contentions. + +On Intel, the tool is based on load latency and precise store facility events +provided by Intel CPUs. On PowerPC, the tool uses random instruction sampling +with thresholding feature. On AMD, the tool uses IBS op pmu (due to hardware +limitations, perf c2c is not supported on Zen3 cpus). + +These events provide: + - memory address of the access + - type of the access (load and store details) + - latency (in cycles) of the load access + +The c2c tool provide means to record this data and report back access details +for cachelines with highest contention - highest number of HITM accesses. + +The basic workflow with this tool follows the standard record/report phase. +User uses the record command to record events data and report command to +display it. + + +RECORD OPTIONS +-------------- +-e:: +--event=:: + Select the PMU event. Use 'perf c2c record -e list' + to list available events. + +-v:: +--verbose:: + Be more verbose (show counter open errors, etc). + +-l:: +--ldlat:: + Configure mem-loads latency. Supported on Intel and Arm64 processors + only. Ignored on other archs. + +-k:: +--all-kernel:: + Configure all used events to run in kernel space. + +-u:: +--all-user:: + Configure all used events to run in user space. 
+ +REPORT OPTIONS +-------------- +-k:: +--vmlinux=:: + vmlinux pathname + +-v:: +--verbose:: + Be more verbose (show counter open errors, etc). + +-i:: +--input:: + Specify the input file to process. + +-N:: +--node-info:: + Show extra node info in report (see NODE INFO section) + +-c:: +--coalesce:: + Specify sorting fields for single cacheline display. + Following fields are available: tid,pid,iaddr,dso + (see COALESCE) + +-g:: +--call-graph:: + Setup callchains parameters. + Please refer to perf-report man page for details. + +--stdio:: + Force the stdio output (see STDIO OUTPUT) + +--stats:: + Display only statistic tables and force stdio mode. + +--full-symbols:: + Display full length of symbols. + +--no-source:: + Do not display Source:Line column. + +--show-all:: + Show all captured HITM lines, with no regard to HITM % 0.0005 limit. + +-f:: +--force:: + Don't do ownership validation. + +-d:: +--display:: + Switch to HITM type (rmt, lcl) or peer snooping type (peer) to display + and sort on. Total HITMs (tot) as default, except Arm64 uses peer mode + as default. + +--stitch-lbr:: + Show callgraph with stitched LBRs, which may give a more complete + callgraph. The perf.data file must have been obtained using + perf c2c record --call-graph lbr. + Disabled by default. In common cases with call stack overflows, + it can recreate better call stacks than the default lbr call stack + output. But this approach is not foolproof. There can be cases + where it creates incorrect call stacks from incorrect matches. + The known limitations include exception handling such as + setjmp/longjmp, where calls and returns do not match. + +C2C RECORD +---------- +The perf c2c record command sets up options related to HITM cacheline analysis +and calls the standard perf record command.
+ +Following perf record options are configured by default: +(check perf record man page for details) + + -W,-d,--phys-data,--sample-cpu + +Unless specified otherwise with '-e' option, following events are monitored by +default on Intel: + + cpu/mem-loads,ldlat=30/P + cpu/mem-stores/P + +following on AMD: + + ibs_op// + +and following on PowerPC: + + cpu/mem-loads/ + cpu/mem-stores/ + +User can pass any 'perf record' option behind '--' mark, like (to enable +callchains and system wide monitoring): + + $ perf c2c record -- -g -a + +Please check RECORD OPTIONS section for specific c2c record options. + +C2C REPORT +---------- +The perf c2c report command displays shared data analysis. It comes in two +display modes: stdio and tui (default). + +The report command workflow is following: + - sort all the data based on the cacheline address + - store access details for each cacheline + - sort all cachelines based on user settings + - display data + +In general perf report output consist of 2 basic views: + 1) most expensive cachelines list + 2) offsets details for each cacheline + +For each cacheline in the 1) list we display following data: +(Both stdio and TUI modes follow the same fields output) + + Index + - zero based index to identify the cacheline + + Cacheline + - cacheline address (hex number) + + Rmt/Lcl Hitm (Display with HITM types) + - cacheline percentage of all Remote/Local HITM accesses + + Peer Snoop (Display with peer type) + - cacheline percentage of all peer accesses + + LLC Load Hitm - Total, LclHitm, RmtHitm (For display with HITM types) + - count of Total/Local/Remote load HITMs + + Load Peer - Total, Local, Remote (For display with peer type) + - count of Total/Local/Remote load from peer cache or DRAM + + Total records + - sum of all cachelines accesses + + Total loads + - sum of all load accesses + + Total stores + - sum of all store accesses + + Store Reference - L1Hit, L1Miss, N/A + L1Hit - store accesses that hit L1 + L1Miss - store accesses 
that missed L1 + N/A - store accesses with memory level is not available + + Core Load Hit - FB, L1, L2 + - count of load hits in FB (Fill Buffer), L1 and L2 cache + + LLC Load Hit - LlcHit, LclHitm + - count of LLC load accesses, includes LLC hits and LLC HITMs + + RMT Load Hit - RmtHit, RmtHitm + - count of remote load accesses, includes remote hits and remote HITMs; + on Arm neoverse cores, RmtHit is used to account remote accesses, + includes remote DRAM or any upward cache level in remote node + + Load Dram - Lcl, Rmt + - count of local and remote DRAM accesses + +For each offset in the 2) list we display following data: + + HITM - Rmt, Lcl (Display with HITM types) + - % of Remote/Local HITM accesses for given offset within cacheline + + Peer Snoop - Rmt, Lcl (Display with peer type) + - % of Remote/Local peer accesses for given offset within cacheline + + Store Refs - L1 Hit, L1 Miss, N/A + - % of store accesses that hit L1, missed L1 and N/A (no available) memory + level for given offset within cacheline + + Data address - Offset + - offset address + + Pid + - pid of the process responsible for the accesses + + Tid + - tid of the process responsible for the accesses + + Code address + - code address responsible for the accesses + + cycles - rmt hitm, lcl hitm, load (Display with HITM types) + - sum of cycles for given accesses - Remote/Local HITM and generic load + + cycles - rmt peer, lcl peer, load (Display with peer type) + - sum of cycles for given accesses - Remote/Local peer load and generic load + + cpu cnt + - number of cpus that participated on the access + + Symbol + - code symbol related to the 'Code address' value + + Shared Object + - shared object name related to the 'Code address' value + + Source:Line + - source information related to the 'Code address' value + + Node + - nodes participating on the access (see NODE INFO section) + +NODE INFO +--------- +The 'Node' field displays nodes that accesses given cacheline +offset. 
Its output comes in 3 flavors: + - node IDs separated by ',' + - node IDs with stats for each ID, in following format: + Node{cpus %hitms %stores} (Display with HITM types) + Node{cpus %peers %stores} (Display with peer type) + - node IDs with list of affected CPUs in following format: + Node{cpu list} + +User can switch between above flavors with -N option or +use 'n' key to interactively switch in TUI mode. + +COALESCE +-------- +User can specify how to sort offsets for cacheline. + +Following fields are available and governs the final +output fields set for cacheline offsets output: + + tid - coalesced by process TIDs + pid - coalesced by process PIDs + iaddr - coalesced by code address, following fields are displayed: + Code address, Code symbol, Shared Object, Source line + dso - coalesced by shared object + +By default the coalescing is setup with 'pid,iaddr'. + +STDIO OUTPUT +------------ +The stdio output displays data on standard output. + +Following tables are displayed: + Trace Event Information + - overall statistics of memory accesses + + Global Shared Cache Line Event Information + - overall statistics on shared cachelines + + Shared Data Cache Line Table + - list of most expensive cachelines + + Shared Cache Line Distribution Pareto + - list of all accessed offsets for each cacheline + +TUI OUTPUT +---------- +The TUI output provides interactive interface to navigate +through cachelines list and to display offset details. + +For details please refer to the help window by pressing '?' key. + +CREDITS +------- +Although Don Zickus, Dick Fowles and Joe Mario worked together +to get this implemented, we got lots of early help from Arnaldo +Carvalho de Melo, Stephane Eranian, Jiri Olsa and Andi Kleen. 
+ +C2C BLOG +-------- +Check Joe's blog on c2c tool for detailed use case explanation: + https://joemario.github.io/blog/2016/09/01/c2c-blog/ + +SEE ALSO +-------- +linkperf:perf-record[1], linkperf:perf-mem[1] diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt new file mode 100644 index 000000000..39c890ead --- /dev/null +++ b/tools/perf/Documentation/perf-config.txt @@ -0,0 +1,755 @@ +perf-config(1) +============== + +NAME +---- +perf-config - Get and set variables in a configuration file. + +SYNOPSIS +-------- +[verse] +'perf config' [] [section.name[=value] ...] +or +'perf config' [] -l | --list + +DESCRIPTION +----------- +You can manage variables in a configuration file with this command. + +OPTIONS +------- + +-l:: +--list:: + Show current config variables, name and value, for all sections. + +--user:: + For writing and reading options: write to user + '$HOME/.perfconfig' file or read it. + +--system:: + For writing and reading options: write to system-wide + '$(sysconfdir)/perfconfig' or read it. + +CONFIGURATION FILE +------------------ + +The perf configuration file contains many variables to change various +aspects of each of its tools, including output, disk usage, etc. +The '$HOME/.perfconfig' file is used to store a per-user configuration. +The file '$(sysconfdir)/perfconfig' can be used to +store a system-wide default configuration. + +One can disable reading config files by setting the PERF_CONFIG environment +variable to /dev/null, or provide an alternate config file by setting that +variable. + +When reading or writing, the values are read from the system and user +configuration files by default, and options '--system' and '--user' +can be used to tell the command to read from or write to only that location. + +Syntax +~~~~~~ + +The file consists of sections. A section starts with its name +surrounded by square brackets and continues till the next section +begins.
Each variable must be in a section, and have the form +'name = value', for example: + + [section] + name1 = value1 + name2 = value2 + +Section names are case sensitive and can contain any characters except +newline (double quote `"` and backslash have to be escaped as `\"` and `\\`, +respectively). Section headers can't span multiple lines. + +Example +~~~~~~~ + +Given a $HOME/.perfconfig like this: + +# +# This is the config file, and +# a '#' or ';' character indicates a comment +# + + [colors] + # Color variables + top = red, default + medium = green, default + normal = lightgray, default + selected = white, lightgray + jump_arrows = blue, default + addr = magenta, default + root = white, blue + + [tui] + # Defaults if linked with libslang + report = on + annotate = on + top = on + + [buildid] + # Default, disable using /dev/null + dir = ~/.debug + + [annotate] + # Defaults + hide_src_code = false + use_offset = true + jump_arrows = true + show_nr_jumps = false + + [help] + # Format can be man, info, web or html + format = man + autocorrect = 0 + + [ui] + show-headers = true + + [call-graph] + # fp (framepointer), dwarf + record-mode = fp + print-type = graph + order = caller + sort-key = function + + [report] + # Defaults + sort_order = comm,dso,symbol + percent-limit = 0 + queue-size = 0 + children = true + group = true + skip-empty = true + + [llvm] + dump-obj = true + clang-opt = -g + +You can hide the source code in the annotate output by setting the config to true with + + % perf config annotate.hide_src_code=true + +If you want to add or modify several config items, you can do + + % perf config ui.show-headers=false kmem.default=slab + +To modify the sort order of report functionality in user config file (i.e. `~/.perfconfig`), do + + % perf config --user report.sort_order=srcline + +To change colors of selected line to other foreground and background colors +in system config file (i.e.
`$(sysconfdir)/perfconfig`), do + + % perf config --system colors.selected=yellow,green + +To query the record mode of call graph, do + + % perf config call-graph.record-mode + +If you want to know multiple config key/value pairs, you can do + + % perf config report.queue-size call-graph.order report.children + +To query the config value of sort key of call graph in user config file (i.e. `~/.perfconfig`), do + + % perf config --user call-graph.sort-key + +To query the config value of buildid directory in system config file (i.e. `$(sysconfdir)/perfconfig`), do + + % perf config --system buildid.dir + +Variables +~~~~~~~~~ + +colors.*:: + The variables for customizing the colors used in the output for the + 'report', 'top' and 'annotate' in the TUI. They should specify the + foreground and background colors, separated by a comma, for example: + + medium = green, lightgray + + If you want to use the color configured for your terminal, just leave it + as 'default', for example: + + medium = default, lightgray + + Available colors: + red, yellow, green, cyan, gray, black, blue, + white, default, magenta, lightgray + + colors.top:: + 'top' means an overhead percentage of more than 5%. + And values of this variable specify percentage colors. + Default values are foreground-color 'red' and + background-color 'default'. + colors.medium:: + 'medium' means an overhead percentage of more than 0.5%. + Default values are 'green' and 'default'. + colors.normal:: + 'normal' means the rest of overhead percentages + except 'top', 'medium', 'selected'. + Default values are 'lightgray' and 'default'. + colors.selected:: + This selects the colors for the current entry in a list of entries + from sub-commands (top, report, annotate). + Default values are 'black' and 'lightgray'. + colors.jump_arrows:: + Colors for jump arrows on assembly code listings + such as 'jns', 'jmp', 'jne', etc. + Default values are 'blue', 'default'.
+ colors.addr:: + This selects colors for addresses from 'annotate'. + Default values are 'magenta', 'default'. + colors.root:: + Colors for headers in the output of a sub-commands (top, report). + Default values are 'white', 'blue'. + +core.*:: + core.proc-map-timeout:: + Sets a timeout (in milliseconds) for parsing /proc//maps files. + Can be overridden by the --proc-map-timeout option on supported + subcommands. The default timeout is 500ms. + +tui.*, gtk.*:: + Subcommands that can be configured here are 'top', 'report' and 'annotate'. + These values are booleans, for example: + + [tui] + top = true + + will make the TUI be the default for the 'top' subcommand. Those will be + available if the required libs were detected at tool build time. + +buildid.*:: + buildid.dir:: + Each executable and shared library in modern distributions comes with a + content based identifier that, if available, will be inserted in a + 'perf.data' file header to, at analysis time find what is needed to do + symbol resolution, code annotation, etc. + + The recording tools also stores a hard link or copy in a per-user + directory, $HOME/.debug/, of binaries, shared libraries, /proc/kallsyms + and /proc/kcore files to be used at analysis time. + + The buildid.dir variable can be used to either change this directory + cache location, or to disable it altogether. If you want to disable it, + set buildid.dir to /dev/null. The default is $HOME/.debug + +buildid-cache.*:: + buildid-cache.debuginfod=URLs + Specify debuginfod URLs to be used when retrieving perf.data binaries, + it follows the same syntax as the DEBUGINFOD_URLS variable, like: + + buildid-cache.debuginfod=http://192.168.122.174:8002 + +annotate.*:: + These are in control of addresses, jump function, source code + in lines of assembly code from a specific program. 
+ + annotate.disassembler_style:: + Use this to change the default disassembler style to some other value + supported by binutils, such as "intel", see the '-M' option help in the + 'objdump' man page. + + annotate.hide_src_code:: + If a program which is analyzed has source code, + this option controls whether 'annotate' prints the assembly code interleaved with the source code. + For example, let's see a part of a program. There are four lines. + If this option is 'true', they are printed + without source code from a program as below. + + │ push %rbp + │ mov %rsp,%rbp + │ sub $0x10,%rsp + │ mov (%rdi),%rdx + + But if this option is 'false', source code of the part + is also printed as below. Default is 'false'. + + │ struct rb_node *rb_next(const struct rb_node *node) + │ { + │ push %rbp + │ mov %rsp,%rbp + │ sub $0x10,%rsp + │ struct rb_node *parent; + │ + │ if (RB_EMPTY_NODE(node)) + │ mov (%rdi),%rdx + │ return n; + + This option works with tui, stdio2 browsers. + + annotate.use_offset:: + Offsets can be shown relative to the first address of a loaded function. + Instead of using original addresses of assembly code, + addresses subtracted from a base address can be printed. + Let's illustrate an example. + If a base address is 0xffffffff81624d50 as below, + + ffffffff81624d50 + + an address on assembly code has a specific absolute address as below + + ffffffff816250b8:│ mov 0x8(%r14),%rdi + + but if use_offset is 'true', an address subtracted from a base address is printed. + Default is true. This option is only applied to TUI. + + 368:│ mov 0x8(%r14),%rdi + + This option works with tui, stdio2 browsers. + + annotate.jump_arrows:: + There can be jump instructions in assembly code. + Depending on a boolean value of jump_arrows, + arrows showing where each jump instruction lands + can be printed or not, as below. + + │ ┌──jmp 1333 + │ │ xchg %ax,%ax + │1330:│ mov %r15,%r10 + │1333:└─→cmp %r15,%r14 + + If jump_arrows is 'false', the arrows aren't printed, as below.
+ Default is 'true'. + + │ ↓ jmp 1333 + │ xchg %ax,%ax + │1330: mov %r15,%r10 + │1333: cmp %r15,%r14 + + This option works with tui browser. + + annotate.show_linenr:: + When showing source code if this option is 'true', + line numbers are printed as below. + + │1628 if (type & PERF_SAMPLE_IDENTIFIER) { + │ ↓ jne 508 + │1628 data->id = *array; + │1629 array++; + │1630 } + + However if this option is 'false', they aren't printed as below. + Default is 'false'. + + │ if (type & PERF_SAMPLE_IDENTIFIER) { + │ ↓ jne 508 + │ data->id = *array; + │ array++; + │ } + + This option works with tui, stdio2 browsers. + + annotate.show_nr_jumps:: + Let's see a part of assembly code. + + │1382: movb $0x1,-0x270(%rbp) + + If this option is 'true', the number of branches jumping to that address is printed as below. + Default is 'false'. + + │1 1382: movb $0x1,-0x270(%rbp) + + This option works with tui, stdio2 browsers. + + annotate.show_total_period:: + To compare two records on a per-instruction basis, with this option + provided, display total number of samples that belong to a line + in assembly code. If this option is 'true', total periods are printed + instead of percent values as below. + + 302 │ mov %eax,%eax + + But if this option is 'false', percent values for overhead are printed, as below. + Default is 'false'. + + 99.93 │ mov %eax,%eax + + This option works with tui, stdio2, stdio browsers. + + annotate.show_nr_samples:: + By default perf annotate shows percentage of samples. This option + can be used to print absolute number of samples. Ex, when set as + false: + + Percent│ + 74.03 │ mov %fs:0x28,%rax + + When set as true: + + Samples│ + 6 │ mov %fs:0x28,%rax + + This option works with tui, stdio2, stdio browsers. + + annotate.offset_level:: + Default is '1', meaning just jump targets will have offsets shown right beside + the instruction. When set to '2' 'call' instructions will also have their offsets + shown, 3 or higher will show offsets for all instructions.
+ + This option works with tui, stdio2 browsers. + + annotate.demangle:: + Demangle symbol names to human readable form. Default is 'true'. + + annotate.demangle_kernel:: + Demangle kernel symbol names to human readable form. Default is 'true'. + +hist.*:: + hist.percentage:: + This option control the way to calculate overhead of filtered entries - + that means the value of this option is effective only if there's a + filter (by comm, dso or symbol name). Suppose a following example: + + Overhead Symbols + ........ ....... + 33.33% foo + 33.33% bar + 33.33% baz + + This is an original overhead and we'll filter out the first 'foo' + entry. The value of 'relative' would increase the overhead of 'bar' + and 'baz' to 50.00% for each, while 'absolute' would show their + current overhead (33.33%). + +ui.*:: + ui.show-headers:: + This option controls display of column headers (like 'Overhead' and 'Symbol') + in 'report' and 'top'. If this option is false, they are hidden. + This option is only applied to TUI. + +call-graph.*:: + The following controls the handling of call-graphs (obtained via the + -g/--call-graph options). + + call-graph.record-mode:: + The mode for user space can be 'fp' (frame pointer), 'dwarf' + and 'lbr'. The value 'dwarf' is effective only if libunwind + (or a recent version of libdw) is present on the system; + the value 'lbr' only works for certain cpus. The method for + kernel space is controlled not by this option but by the + kernel config (CONFIG_UNWINDER_*). + + call-graph.dump-size:: + The size of stack to dump in order to do post-unwinding. Default is 8192 (byte). + When using dwarf into record-mode, the default size will be used if omitted. + + call-graph.print-type:: + The print-types can be graph (graph absolute), fractal (graph relative), + flat and folded. This option controls a way to show overhead for each callchain + entry. Suppose a following example. + + Overhead Symbols + ........ ....... 
+ 40.00% foo + | + ---foo + | + |--50.00%--bar + | main + | + --50.00%--baz + main + + This output is a 'fractal' format. The 'foo' came from 'bar' and 'baz' exactly + half and half so 'fractal' shows 50.00% for each + (meaning that it assumes 100% total overhead of 'foo'). + + The 'graph' uses absolute overhead value of 'foo' as total so each of + 'bar' and 'baz' callchain will have 20.00% of overhead. + If 'flat' is used, call chains are shown in a single column, linearly. + 'folded' means call chains are displayed in a line, separated by semicolons. + + call-graph.order:: + This option controls print order of callchains. The default is + 'callee' which means callee is printed at top and then followed by its + caller and so on. The 'caller' prints it in reverse order. + + If this option is not set and report.children or top.children is + set to true (or the equivalent command line option is given), + the default value of this option is changed to 'caller' for the + execution of 'perf report' or 'perf top'. Other commands will + still default to 'callee'. + + call-graph.sort-key:: + The callchains are merged if they contain same information. + The sort-key option determines a way to compare the callchains. + A value of 'sort-key' can be 'function' or 'address'. + The default is 'function'. + + call-graph.threshold:: + When there are many callchains, printing them all would produce tons of lines. So perf omits + small callchains under a certain overhead (threshold) and this option + controls the threshold. Default is 0.5 (%). How the overhead is calculated + depends on call-graph.print-type. + + call-graph.print-limit:: + This is a maximum number of lines of callchain printed for a single + histogram entry. Default is 0 which means no limitation. + +report.*:: + report.sort_order:: + Allows changing the default sort order from "comm,dso,symbol" to + some other default, for instance "sym,dso" may be more fitting for + kernel developers.
+ report.percent-limit:: + This one is mostly the same as call-graph.threshold but works for + histogram entries. Entries having an overhead lower than this + percentage will not be printed. Default is '0'. If percent-limit + is '10', only entries which have more than 10% of overhead will be + printed. + + report.queue-size:: + This option sets up the maximum allocation size of the internal + event queue for ordering events. Default is 0, meaning no limit. + + report.children:: + 'Children' means functions called from another function. + If this option is true, 'perf report' cumulates callchains of children + and show (accumulated) total overhead as well as 'Self' overhead. + Please refer to the 'perf report' manual. The default is 'true'. + + report.group:: + This option is to show event group information together. + Example output with this turned on, notice that there is one column + per event in the group, ref-cycles and cycles: + + # group: {ref-cycles,cycles} + # ======== + # + # Samples: 7K of event 'anon group { ref-cycles, cycles }' + # Event count (approx.): 6876107743 + # + # Overhead Command Shared Object Symbol + # ................ ....... ................. ................... + # + 99.84% 99.76% noploop noploop [.] main + 0.07% 0.00% noploop ld-2.15.so [.] strcmp + 0.03% 0.00% noploop [kernel.kallsyms] [k] timerqueue_del + + report.skip-empty:: + This option can change default stat behavior with empty results. + If it's set true, 'perf report --stat' will not show 0 stats. + +top.*:: + top.children:: + Same as 'report.children'. So if it is enabled, the output of 'top' + command will have 'Children' overhead column as well as 'Self' overhead + column by default. + The default is 'true'. + + top.call-graph:: + This is identical to 'call-graph.record-mode', except it is + applicable only for 'top' subcommand. This option ONLY setup + the unwind method. To enable 'perf top' to actually use it, + the command line option -g must be specified. 
+ +man.*:: + man.viewer:: + This option can assign a tool to view manual pages when 'help' + subcommand is invoked. Supported tools are 'man', 'woman' + (with emacs client) and 'konqueror'. Default is 'man'. + + New man viewer tool can be also added using 'man..cmd' + or use different path using 'man..path' config option. + +pager.*:: + pager.:: + When the subcommand is run on stdio, determine whether it uses + pager or not based on this value. Default is 'unspecified'. + +kmem.*:: + kmem.default:: + This option decides which allocator is to be analyzed if neither + '--slab' nor '--page' option is used. Default is 'slab'. + +record.*:: + record.build-id:: + This option can be 'cache', 'no-cache', 'skip' or 'mmap'. + 'cache' is to post-process data and save/update the binaries into + the build-id cache (in ~/.debug). This is the default. + But if this option is 'no-cache', it will not update the build-id cache. + 'skip' skips post-processing and does not update the cache. + 'mmap' skips post-processing and reads build-ids from MMAP events. + + record.call-graph:: + This is identical to 'call-graph.record-mode', except it is + applicable only for 'record' subcommand. This option ONLY sets up + the unwind method. To enable 'perf record' to actually use it, + the command line option -g must be specified. + + record.aio:: + Use 'n' control blocks in asynchronous (Posix AIO) trace writing + mode ('n' default: 1, max: 4). + + record.debuginfod:: + Specify debuginfod URL to be used when caching perf.data binaries, + it follows the same syntax as the DEBUGINFOD_URLS variable, like: + + http://192.168.122.174:8002 + + If the value is 'system', the value of DEBUGINFOD_URLS system environment + variable is used. + +diff.*:: + diff.order:: + This option sets the column number by which to sort the result. + The default is 0, which means sorting by baseline. + Setting it to 1 will sort the result by delta (or other + compute method selected).
+ + diff.compute:: + This option sets the method for computing the diff result. + Possible values are 'delta', 'delta-abs', 'ratio' and + 'wdiff'. Default is 'delta'. + +trace.*:: + trace.add_events:: + Allows adding a set of events to the ones specified + by the user, or using it as the default set if none was specified. + The initial use case is to add augmented_raw_syscalls.o to + activate the 'perf trace' logic that looks for syscall + pointer contents after the normal tracepoint payload. + + trace.args_alignment:: + Number of columns to align the argument list, default is 70, + use 40 for the strace default, zero for no alignment. + + trace.no_inherit:: + Do not follow children threads. + + trace.show_arg_names:: + Should syscall argument names be printed? If not then trace.show_zeros + will be set. + + trace.show_duration:: + Show syscall duration. + + trace.show_prefix:: + If set to 'yes' will show common string prefixes in tables. The default + is to remove the common prefix in things like "MAP_SHARED", showing just "SHARED". + + trace.show_timestamp:: + Show syscall start timestamp. + + trace.show_zeros:: + Do not suppress syscall arguments that are equal to zero. + + trace.tracepoint_beautifiers:: + Use "libtraceevent" to use that library to augment the tracepoint arguments, + "libbeauty", the default, to use the same argument beautifiers used in the + strace-like sys_enter+sys_exit lines. + +ftrace.*:: + ftrace.tracer:: + Can be used to select the default tracer when neither -G nor + -F option is specified. Possible values are 'function' and + 'function_graph'. + +llvm.*:: + llvm.clang-path:: + Path to clang. If omitted, it is searched for in $PATH. + + llvm.clang-bpf-cmd-template:: + Cmdline template. Below lines show its default value. Environment + variables are used to pass options.
+ "$CLANG_EXEC -D__KERNEL__ -D__NR_CPUS__=$NR_CPUS "\ + "-DLINUX_VERSION_CODE=$LINUX_VERSION_CODE " \ + "$CLANG_OPTIONS $PERF_BPF_INC_OPTIONS $KERNEL_INC_OPTIONS " \ + "-Wno-unused-value -Wno-pointer-sign " \ + "-working-directory $WORKING_DIR " \ + "-c \"$CLANG_SOURCE\" -target bpf $CLANG_EMIT_LLVM -O2 -o - $LLVM_OPTIONS_PIPE" + + llvm.clang-opt:: + Options passed to clang. + + llvm.kbuild-dir:: + kbuild directory. If not set, use /lib/modules/`uname -r`/build. + If set to "" deliberately, skip kernel header auto-detector. + + llvm.kbuild-opts:: + Options passed to 'make' when detecting kernel header options. + + llvm.dump-obj:: + Enable perf dump BPF object files compiled by LLVM. + + llvm.opts:: + Options passed to llc. + +samples.*:: + + samples.context:: + Define how many ns worth of time to show + around samples in perf report sample context browser. + +scripts.*:: + + Any option defines a script that is added to the scripts menu + in the interactive perf browser and whose output is displayed. + The name of the option is the name, the value is a script command line. + The script gets the same options passed as a full perf script, + in particular -i perfdata file, --cpu, --tid + +convert.*:: + + convert.queue-size:: + Limit the size of ordered_events queue, so we could control + allocation size of perf data files without proper finished + round events. +stat.*:: + + stat.big-num:: + (boolean) Change the default for "--big-num". To make + "--no-big-num" the default, set "stat.big-num=false". + +intel-pt.*:: + + intel-pt.cache-divisor:: + + intel-pt.mispred-all:: + If set, Intel PT decoder will set the mispred flag on all + branches. + + intel-pt.max-loops:: + If set and non-zero, the maximum number of unconditional + branches decoded without consuming any trace packets. If + the maximum is exceeded there will be a "Never-ending loop" + error. The default is 100000. + +auxtrace.*:: + + auxtrace.dumpdir:: + s390 only. 
The directory to save the auxiliary trace buffer + can be changed using this option. Ex, auxtrace.dumpdir=/tmp. + If the directory does not exist or has the wrong file type, + the current directory is used. + +itrace.*:: + + itrace.debug-log-buffer-size:: + Log size in bytes to output when using the option --itrace=d+e. + Refer to the 'itrace' option of linkperf:perf-script[1] or + linkperf:perf-report[1]. The default is 16384. + +daemon.*:: + + daemon.base:: + Base path for daemon data. All sessions data are stored under + this path. + +session-.*:: + + session-.run:: + + Defines new record session for daemon. The value is record's + command line without the 'record' keyword. + + +SEE ALSO +-------- +linkperf:perf[1] diff --git a/tools/perf/Documentation/perf-daemon.txt b/tools/perf/Documentation/perf-daemon.txt new file mode 100644 index 000000000..f558f8e4b --- /dev/null +++ b/tools/perf/Documentation/perf-daemon.txt @@ -0,0 +1,208 @@ +perf-daemon(1) +============== + + +NAME +---- +perf-daemon - Run record sessions in the background + + +SYNOPSIS +-------- +[verse] +'perf daemon' +'perf daemon' [] +'perf daemon start' [] +'perf daemon stop' [] +'perf daemon signal' [] +'perf daemon ping' [] + + +DESCRIPTION +----------- +This command runs a simple daemon process that starts and +monitors configured record sessions. + +You can think of 'perf daemon' as a background process with several +'perf record' child tasks, like: + + # ps axjf + ... + 1 916507 ... perf daemon start + 916507 916508 ... \_ perf record --control=fifo:control,ack -m 10M -e cycles --overwrite --switch-output -a + 916507 916509 ... \_ perf record --control=fifo:control,ack -m 20M -e sched:* --overwrite --switch-output -a + +Not every 'perf record' session is suitable for running under daemon.
+The user needs a perf session that either produces data on request, like the +flight recorder sessions in the example above, or a session that is configured +to produce data periodically, like with the --switch-output configuration +for time and size. + +Each session is started with control setup (with perf record --control +options). + +Sessions are configured through config file, see CONFIG FILE section +with EXAMPLES. + + +OPTIONS +------- +-v:: +--verbose:: + Be more verbose. + +--config=:: + Config file path. If not provided, perf will check system and default + locations (/etc/perfconfig, $HOME/.perfconfig). + +--base=:: + Base directory path. Each daemon instance is running on top + of base directory. Only one instance of server can run on + top of one directory at a time. + +All generic options are available also under commands. + + +START COMMAND +------------- +The start command creates the daemon process. + +-f:: +--foreground:: + Do not put the process in background. + + +STOP COMMAND +------------ +The stop command stops all the sessions and the daemon process. + + +SIGNAL COMMAND +-------------- +The signal command sends signal to configured sessions. + +--session:: + Send signal to specific session. + + +PING COMMAND +------------ +The ping command sends control ping to configured sessions. + +--session:: + Send ping to specific session. + + +CONFIG FILE +----------- +The daemon is configured within standard perf config file by +following new variables: + +daemon.base: + Base path for daemon data. All sessions data are + stored under this path. + +session-.run: + Defines new record session. The value is record's command + line without the 'record' keyword. + +Each perf record session is run in daemon.base/ directory.
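The following sketch is an editorial illustration and not part of the upstream patch or of perf itself. It shows one way the CONFIG FILE variables could be read: each `session-*` section yields a 'perf record' command line built from its `run` value, plus a session directory under `daemon.base`. Using Python's `configparser` is an assumption; it happens to accept this simple subset of the perfconfig syntax and is not how perf parses its config. The helper name `daemon_sessions` is hypothetical. The `--control` arguments that the daemon adds on its own (visible in the `ps` output in DESCRIPTION) are deliberately left out.

```python
import configparser
import posixpath

# Sample config text mirroring the two-session layout used in EXAMPLES.
CONFIG_TEXT = """\
[daemon]
base=/opt/perfdata

[session-cycles]
run = -m 10M -e cycles --overwrite --switch-output -a

[session-sched]
run = -m 20M -e sched:* --overwrite --switch-output -a
"""


def daemon_sessions(text):
    """Return (base, [(name, command, directory), ...]) for a daemon config.

    Hypothetical helper: it only models the documented meaning of
    daemon.base and the per-session 'run' variable.
    """
    # interpolation=None keeps any '%' in a command line literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    base = parser.get("daemon", "base")
    sessions = []
    for section in parser.sections():
        if not section.startswith("session-"):
            continue
        name = section[len("session-"):]
        # 'run' holds record's command line without the 'record' keyword.
        command = "perf record " + parser.get(section, "run")
        # Session data lives in a per-session directory under the base.
        directory = posixpath.join(base, section)
        sessions.append((name, command, directory))
    return base, sessions


if __name__ == "__main__":
    base, sessions = daemon_sessions(CONFIG_TEXT)
    print("base:", base)
    for name, command, directory in sessions:
        print(f"[{name}] {command}")
        print(f"    base: {directory}")
```

The directory layout used here (`<base>/session-<name>`) follows the 'base' paths shown by 'perf daemon -v' in EXAMPLES.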
+ + +EXAMPLES +-------- +Example with 2 record sessions: + + # cat ~/.perfconfig + [daemon] + base=/opt/perfdata + + [session-cycles] + run = -m 10M -e cycles --overwrite --switch-output -a + + [session-sched] + run = -m 20M -e sched:* --overwrite --switch-output -a + + +Starting the daemon: + + # perf daemon start + + +Check sessions: + + # perf daemon + [603349:daemon] base: /opt/perfdata + [603350:cycles] perf record -m 10M -e cycles --overwrite --switch-output -a + [603351:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a + +First line is daemon process info with configured daemon base. + + +Check sessions with more info: + + # perf daemon -v + [603349:daemon] base: /opt/perfdata + output: /opt/perfdata/output + lock: /opt/perfdata/lock + up: 1 minutes + [603350:cycles] perf record -m 10M -e cycles --overwrite --switch-output -a + base: /opt/perfdata/session-cycles + output: /opt/perfdata/session-cycles/output + control: /opt/perfdata/session-cycles/control + ack: /opt/perfdata/session-cycles/ack + up: 1 minutes + [603351:sched] perf record -m 20M -e sched:* --overwrite --switch-output -a + base: /opt/perfdata/session-sched + output: /opt/perfdata/session-sched/output + control: /opt/perfdata/session-sched/control + ack: /opt/perfdata/session-sched/ack + up: 1 minutes + +The 'base' path is daemon/session base. +The 'lock' file is daemon's lock file guarding that no other +daemon is running on top of the base. +The 'output' file is perf record output for specific session. +The 'control' and 'ack' files are perf control files. +The 'up' number shows minutes daemon/session is running. 
+ + +Make sure control session is online: + + # perf daemon ping + OK cycles + OK sched + + +Send USR2 signal to session 'cycles' to generate perf.data file: + + # perf daemon signal --session cycles + signal 12 sent to session 'cycles [603452]' + + # tail -2 /opt/perfdata/session-cycles/output + [ perf record: dump data: Woken up 1 times ] + [ perf record: Dump perf.data.2020123017013149 ] + + +Send USR2 signal to all sessions: + + # perf daemon signal + signal 12 sent to session 'cycles [603452]' + signal 12 sent to session 'sched [603453]' + + # tail -2 /opt/perfdata/session-cycles/output + [ perf record: dump data: Woken up 1 times ] + [ perf record: Dump perf.data.2020123017024689 ] + # tail -2 /opt/perfdata/session-sched/output + [ perf record: dump data: Woken up 1 times ] + [ perf record: Dump perf.data.2020123017024713 ] + + +Stop daemon: + + # perf daemon stop + + +SEE ALSO +-------- +linkperf:perf-record[1], linkperf:perf-config[1] diff --git a/tools/perf/Documentation/perf-data.txt b/tools/perf/Documentation/perf-data.txt new file mode 100644 index 000000000..417bf17e2 --- /dev/null +++ b/tools/perf/Documentation/perf-data.txt @@ -0,0 +1,54 @@ +perf-data(1) +============ + +NAME +---- +perf-data - Data file related processing + +SYNOPSIS +-------- +[verse] +'perf data' [] []", + +DESCRIPTION +----------- +Data file related processing. + +COMMANDS +-------- +convert:: + Converts perf data file into another format. + It's possible to set data-convert debug variable to get debug messages from conversion, + like: + perf --debug data-convert data convert ... + +OPTIONS for 'convert' +--------------------- +--to-ctf:: + Triggers the CTF conversion, specify the path of CTF data directory. + +--to-json:: + Triggers JSON conversion. Specify the JSON filename to output. + +--tod:: + Convert time to wall clock time. + +-i:: + Specify input perf data file path. + +-f:: +--force:: + Don't complain, do it. 
+ +-v:: +--verbose:: + Be more verbose (show counter open errors, etc). + +--all:: + Convert all events, including non-sample events (comm, fork, ...), to output. + Default is off, only convert samples. + +SEE ALSO +-------- +linkperf:perf[1] +[1] Common Trace Format - http://www.efficios.com/ctf diff --git a/tools/perf/Documentation/perf-diff.txt b/tools/perf/Documentation/perf-diff.txt new file mode 100644 index 000000000..f3067a4af --- /dev/null +++ b/tools/perf/Documentation/perf-diff.txt @@ -0,0 +1,305 @@ +perf-diff(1) +============ + +NAME +---- +perf-diff - Read perf.data files and display the differential profile + +SYNOPSIS +-------- +[verse] +'perf diff' [baseline file] [data file1] [[data file2] ... ] + +DESCRIPTION +----------- +This command displays the performance difference amongst two or more perf.data +files captured via perf record. + +If no parameters are passed it will assume perf.data.old and perf.data. + +The differential profile is displayed only for events matching both +specified perf.data files. + +If no parameters are passed the samples will be sorted by dso and symbol. +As the perf.data files could come from different binaries, the symbols addresses +could vary. So perf diff is based on the comparison of the files and +symbols name. + +OPTIONS +------- +-D:: +--dump-raw-trace:: + Dump raw trace in ASCII. + +--kallsyms=:: + kallsyms pathname + +-m:: +--modules:: + Load module symbols. WARNING: use only with -k and LIVE kernel + +-d:: +--dsos=:: + Only consider symbols in these dsos. CSV that understands + file://filename entries. This option will affect the percentage + of the Baseline/Delta column. See --percentage for more info. + +-C:: +--comms=:: + Only consider symbols in these comms. CSV that understands + file://filename entries. This option will affect the percentage + of the Baseline/Delta column. See --percentage for more info. + +-S:: +--symbols=:: + Only consider these symbols. CSV that understands + file://filename entries. 
This option will affect the percentage + of the Baseline/Delta column. See --percentage for more info. + +-s:: +--sort=:: + Sort by key(s): pid, comm, dso, symbol, cpu, parent, srcline. + Please see description of --sort in the perf-report man page. + +-t:: +--field-separator=:: + + Use a special separator character and don't pad with spaces, replacing + all occurrences of this separator in symbol names (and other output) + with a '.' character, so that '.' is the only invalid separator. + +-v:: +--verbose:: + Be verbose, for instance, show the raw counts in addition to the + diff. + +-q:: +--quiet:: + Do not show any warnings or messages. (Suppress -v) + +-f:: +--force:: + Don't do ownership validation. + +--symfs=:: + Look for files with symbols relative to this directory. + +-b:: +--baseline-only:: + Show only items with match in baseline. + +-c:: +--compute:: + Differential computation selection - delta, ratio, wdiff, cycles, + delta-abs (default is delta-abs). Default can be changed using + diff.compute config option. See COMPARISON METHODS section for + more info. + +--cycles-hist:: + Report a histogram and the standard deviation for cycles data. + It can help to judge whether the reported cycles data is noisy or + not. This option should be used with '-c cycles'. + +-p:: +--period:: + Show period values for both compared hist entries. + +-F:: +--formula:: + Show formula for given computation. + +-o:: +--order:: + Specify compute sorting column number. 0 means sorting by baseline + overhead and 1 (default) means sorting by computed value of column 1 + (data from the first file other than the baseline). Values more than 1 + can be used only if enough data files are provided. + The default value can be set using the diff.order config option. + +--percentage:: + Determine how to display the overhead percentage of filtered entries. + Filters can be applied by --comms, --dsos and/or --symbols options.
+ + "relative" means it's relative to filtered entries only so that the + sum of shown entries will always be 100%. "absolute" means it retains + the original value before and after the filter is applied. + +--time:: + Analyze samples within a given time window. It supports time + percent with multiple time ranges. Time string is 'a%/n,b%/m,...' + or 'a%-b%,c%-d%,...'. + + For example: + + Select the second 10% time slice to diff: + + perf diff --time 10%/2 + + Select from 0% to 10% time slice to diff: + + perf diff --time 0%-10% + + Select the first and the second 10% time slices to diff: + + perf diff --time 10%/1,10%/2 + + Select from 0% to 10% and 30% to 40% slices to diff: + + perf diff --time 0%-10%,30%-40% + + It also supports analyzing samples within a given time window + <start>,<stop>. Times have the format seconds.nanoseconds. If 'start' + is not given (i.e. time string is ',x.y') then analysis starts at + the beginning of the file. If stop time is not given (i.e. time + string is 'x.y,') then analysis goes to the end of the file. + Multiple ranges can be separated by spaces, which requires the argument + to be quoted e.g. --time "1234.567,1234.789 1235," + Time string is 'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps + for different perf.data files. + + For example, we get the timestamp information from 'perf script'. + + perf script -i perf.data.old + mgen 13940 [000] 3946.361400: ... + + perf script -i perf.data + mgen 13940 [000] 3971.150589 ... + + perf diff --time 3946.361400,:3971.150589, + + It analyzes the perf.data.old from the timestamp 3946.361400 to + the end of perf.data.old and analyzes the perf.data from the + timestamp 3971.150589 to the end of perf.data. + +--cpu:: Only diff samples for the list of CPUs provided. Multiple CPUs can + be provided as a comma-separated list with no space: 0,1. Ranges of + CPUs are specified with -: 0-2. Default is to report samples on all + CPUs.
+ +--pid=:: + Only diff samples for given process ID (comma separated list). + +--tid=:: + Only diff samples for given thread ID (comma separated list). + +--stream:: + Enable hot streams comparison. Stream can be a callchain which is + aggregated by the branch records from samples. + +COMPARISON +---------- +The comparison is governed by the baseline file. The baseline perf.data +file is iterated for samples. All other perf.data files specified on +the command line are searched for the baseline sample pair. If the pair +is found, specified computation is made and result is displayed. + +All samples from non-baseline perf.data files, that do not match any +baseline entry, are displayed with empty space within baseline column +and possible computation results (delta) in their related column. + +Example files samples: +- file A with samples f1, f2, f3, f4, f6 +- file B with samples f2, f4, f5 +- file C with samples f1, f2, f5 + +Example output: + x - computation takes place for pair + b - baseline sample percentage + +- perf diff A B C + + baseline/A compute/B compute/C samples + --------------------------------------- + b x f1 + b x x f2 + b f3 + b x f4 + b f6 + x x f5 + +- perf diff B A C + + baseline/B compute/A compute/C samples + --------------------------------------- + b x x f2 + b x f4 + b x f5 + x x f1 + x f3 + x f6 + +- perf diff C B A + + baseline/C compute/B compute/A samples + --------------------------------------- + b x f1 + b x x f2 + b x f5 + x f3 + x x f4 + x f6 + +COMPARISON METHODS +------------------ +delta +~~~~~ +If specified the 'Delta' column is displayed with value 'd' computed as: + + d = A->period_percent - B->period_percent + +with: + - A/B being matching hist entry from data/baseline file specified + (or perf.data/perf.data.old) respectively. 
+ + - period_percent being the % of the hist entry period value within + single data file + + - with filtering by -C, -d and/or -S, period_percent might be changed + relative to how entries are filtered. Use --percentage=absolute to + prevent such fluctuation. + +delta-abs +~~~~~~~~~ +Same as the 'delta' method, but sorts the result by absolute value. + +ratio +~~~~~ +If specified the 'Ratio' column is displayed with value 'r' computed as: + + r = A->period / B->period + +with: + - A/B being matching hist entry from data/baseline file specified + (or perf.data/perf.data.old) respectively. + + - period being the hist entry period value + +wdiff:WEIGHT-B,WEIGHT-A +~~~~~~~~~~~~~~~~~~~~~~~ +If specified the 'Weighted diff' column is displayed with value 'd' computed as: + + d = B->period * WEIGHT-A - A->period * WEIGHT-B + + - A/B being matching hist entry from data/baseline file specified + (or perf.data/perf.data.old) respectively. + + - period being the hist entry period value + + - WEIGHT-A/WEIGHT-B being user supplied weights in the '-c' option + after the ':' separator like '-c wdiff:1,2'. + - WEIGHT-A being the weight of the data file + - WEIGHT-B being the weight of the baseline data file + +cycles +~~~~~~ +If specified the '[Program Block Range] Cycles Diff' column is displayed. +It displays the cycles difference of the same program basic block between +two perf.data files. The program basic block is the code between two branches. + +'[Program Block Range]' indicates the range of a program basic block. +The source line is reported if it can be found; otherwise symbol+offset +is used instead.
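The delta, ratio and wdiff computations described above can be sketched in C as follows. This is only an illustration under stated assumptions: 'struct sketch_entry' and the three helper functions are hypothetical names, not perf's internal types, and returning 0.0 from the ratio helper when the baseline period is zero is a choice made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down hist entry: only the two values the formulas use. */
struct sketch_entry {
	uint64_t period;	/* hist entry period value */
	double period_percent;	/* % of the period within its data file */
};

/* delta: d = A->period_percent - B->period_percent */
static double sketch_delta(const struct sketch_entry *a,
			   const struct sketch_entry *b)
{
	assert(a != NULL && b != NULL);
	return a->period_percent - b->period_percent;
}

/*
 * ratio: r = A->period / B->period
 * Returning 0.0 for a zero baseline period is an assumption of this sketch.
 */
static double sketch_ratio(const struct sketch_entry *a,
			   const struct sketch_entry *b)
{
	assert(a != NULL && b != NULL);
	if (b->period == 0)
		return 0.0;
	return (double)a->period / (double)b->period;
}

/* wdiff: d = B->period * WEIGHT-A - A->period * WEIGHT-B */
static int64_t sketch_wdiff(const struct sketch_entry *a,
			    const struct sketch_entry *b,
			    int64_t weight_a, int64_t weight_b)
{
	assert(a != NULL && b != NULL);
	return (int64_t)b->period * weight_a - (int64_t)a->period * weight_b;
}
```

Here 'a' stands for the entry from the data file and 'b' for the matching entry from the baseline file, as in the formulas above.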
+ +SEE ALSO +-------- +linkperf:perf-record[1], linkperf:perf-report[1] diff --git a/tools/perf/Documentation/perf-dlfilter.txt b/tools/perf/Documentation/perf-dlfilter.txt new file mode 100644 index 000000000..fb22e3b31 --- /dev/null +++ b/tools/perf/Documentation/perf-dlfilter.txt @@ -0,0 +1,281 @@ +perf-dlfilter(1) +================ + +NAME +---- +perf-dlfilter - Filter sample events using a dynamically loaded shared +object file + +SYNOPSIS +-------- +[verse] +'perf script' [--dlfilter file.so ] [ --dlarg arg ]... + +DESCRIPTION +----------- + +This option is used to process data through a custom filter provided by a +dynamically loaded shared object file. Arguments can be passed using --dlarg +and retrieved using perf_dlfilter_fns.args(). + +If 'file.so' does not contain "/", then it will be looked up in the current +directory, in the perf tools exec path, which is ~/libexec/perf-core/dlfilters for +a local build and install (refer perf --exec-path), or in the dynamic linker +paths. + +API +--- + +The API for filtering consists of the following: + +[source,c] +---- +#include <perf/perf_dlfilter.h> + +struct perf_dlfilter_fns perf_dlfilter_fns; + +int start(void **data, void *ctx); +int stop(void *data, void *ctx); +int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx); +int filter_event_early(void *data, const struct perf_dlfilter_sample *sample, void *ctx); +const char *filter_description(const char **long_description); +---- + +If implemented, 'start' will be called at the beginning, before any +calls to 'filter_event' or 'filter_event_early'. Return 0 to indicate success, +or return a negative error code. '*data' can be assigned for use by other +functions. 'ctx' is needed for calls to perf_dlfilter_fns, but most +perf_dlfilter_fns are not valid when called from 'start'. + +If implemented, 'stop' will be called at the end, after any calls to +'filter_event' or 'filter_event_early'. Return 0 to indicate success, or +return a negative error code.
'data' is set by 'start'. 'ctx' is needed +for calls to perf_dlfilter_fns, but most perf_dlfilter_fns are not valid +when called from 'stop'. + +If implemented, 'filter_event' will be called for each sample event. +Return 0 to keep the sample event, 1 to filter it out, or return a negative +error code. 'data' is set by 'start'. 'ctx' is needed for calls to +'perf_dlfilter_fns'. + +'filter_event_early' is the same as 'filter_event' except it is called before +internal filtering. + +If implemented, 'filter_description' should return a one-line description +of the filter, and optionally a longer description. + +The perf_dlfilter_sample structure +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +'filter_event' and 'filter_event_early' are passed a perf_dlfilter_sample +structure, which contains the following fields: +[source,c] +---- +/* + * perf sample event information (as per perf script and <linux/perf_event.h>) + */ +struct perf_dlfilter_sample { + __u32 size; /* Size of this structure (for compatibility checking) */ + __u16 ins_lat; /* Refer PERF_SAMPLE_WEIGHT_TYPE in <linux/perf_event.h> */ + __u16 p_stage_cyc; /* Refer PERF_SAMPLE_WEIGHT_TYPE in <linux/perf_event.h> */ + __u64 ip; + __s32 pid; + __s32 tid; + __u64 time; + __u64 addr; + __u64 id; + __u64 stream_id; + __u64 period; + __u64 weight; /* Refer PERF_SAMPLE_WEIGHT_TYPE in <linux/perf_event.h> */ + __u64 transaction; /* Refer PERF_SAMPLE_TRANSACTION in <linux/perf_event.h> */ + __u64 insn_cnt; /* For instructions-per-cycle (IPC) */ + __u64 cyc_cnt; /* For instructions-per-cycle (IPC) */ + __s32 cpu; + __u32 flags; /* Refer PERF_DLFILTER_FLAG_* above */ + __u64 data_src; /* Refer PERF_SAMPLE_DATA_SRC in <linux/perf_event.h> */ + __u64 phys_addr; /* Refer PERF_SAMPLE_PHYS_ADDR in <linux/perf_event.h> */ + __u64 data_page_size; /* Refer PERF_SAMPLE_DATA_PAGE_SIZE in <linux/perf_event.h> */ + __u64 code_page_size; /* Refer PERF_SAMPLE_CODE_PAGE_SIZE in <linux/perf_event.h> */ + __u64 cgroup; /* Refer PERF_SAMPLE_CGROUP in <linux/perf_event.h> */ + __u8 cpumode; /* Refer CPUMODE_MASK etc in <linux/perf_event.h> */ + __u8 addr_correlates_sym; /* True => resolve_addr() can be called */ + __u16 misc; /* Refer perf_event_header in <linux/perf_event.h> */ + __u32
raw_size; /* Refer PERF_SAMPLE_RAW in <linux/perf_event.h> */ + const void *raw_data; /* Refer PERF_SAMPLE_RAW in <linux/perf_event.h> */ + __u64 brstack_nr; /* Number of brstack entries */ + const struct perf_branch_entry *brstack; /* Refer <linux/perf_event.h> */ + __u64 raw_callchain_nr; /* Number of raw_callchain entries */ + const __u64 *raw_callchain; /* Refer <linux/perf_event.h> */ + const char *event; + __s32 machine_pid; + __s32 vcpu; +}; +---- + +Note: 'machine_pid' and 'vcpu' are not original members, but were added together later. +'size' can be used to determine their presence at run time. +PERF_DLFILTER_HAS_MACHINE_PID will be defined if they are present at compile time. +For example: +[source,c] +---- +#include <perf/perf_dlfilter.h> +#include <stddef.h> +#include <stdbool.h> + +static inline bool have_machine_pid(const struct perf_dlfilter_sample *sample) +{ +#ifdef PERF_DLFILTER_HAS_MACHINE_PID + return sample->size >= offsetof(struct perf_dlfilter_sample, vcpu) + sizeof(sample->vcpu); +#else + return false; +#endif +} +---- + +The perf_dlfilter_fns structure +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The 'perf_dlfilter_fns' structure is populated with function pointers when the +file is loaded. The functions can be called by 'filter_event' or +'filter_event_early'. + +[source,c] +---- +struct perf_dlfilter_fns { + const struct perf_dlfilter_al *(*resolve_ip)(void *ctx); + const struct perf_dlfilter_al *(*resolve_addr)(void *ctx); + char **(*args)(void *ctx, int *dlargc); + __s32 (*resolve_address)(void *ctx, __u64 address, struct perf_dlfilter_al *al); + const __u8 *(*insn)(void *ctx, __u32 *length); + const char *(*srcline)(void *ctx, __u32 *line_number); + struct perf_event_attr *(*attr)(void *ctx); + __s32 (*object_code)(void *ctx, __u64 ip, void *buf, __u32 len); + void *(*reserved[120])(void *); +}; +---- + +'resolve_ip' returns information about ip. + +'resolve_addr' returns information about addr (if addr_correlates_sym). + +'args' returns arguments from --dlarg options. + +'resolve_address' provides information about 'address'. al->size must be set +before calling.
Returns 0 on success, -1 otherwise. + +'insn' returns instruction bytes and length. + +'srcline' returns source file name and line number. + +'attr' returns perf_event_attr, refer <linux/perf_event.h>. + +'object_code' reads object code and returns the number of bytes read. + +The perf_dlfilter_al structure +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The 'perf_dlfilter_al' structure contains information about an address. + +[source,c] +---- +/* + * Address location (as per perf script) + */ +struct perf_dlfilter_al { + __u32 size; /* Size of this structure (for compatibility checking) */ + __u32 symoff; + const char *sym; + __u64 addr; /* Mapped address (from dso) */ + __u64 sym_start; + __u64 sym_end; + const char *dso; + __u8 sym_binding; /* STB_LOCAL, STB_GLOBAL or STB_WEAK, refer <elf.h> */ + __u8 is_64_bit; /* Only valid if dso is not NULL */ + __u8 is_kernel_ip; /* True if in kernel space */ + __u32 buildid_size; + __u8 *buildid; + /* Below members are only populated by resolve_ip() */ + __u8 filtered; /* true if this sample event will be filtered out */ + const char *comm; +}; +---- + +perf_dlfilter_sample flags +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The 'flags' member of 'perf_dlfilter_sample' corresponds with the flags field +of perf script.
The bits of the flags are as follows: + +[source,c] +---- +/* Definitions for perf_dlfilter_sample flags */ +enum { + PERF_DLFILTER_FLAG_BRANCH = 1ULL << 0, + PERF_DLFILTER_FLAG_CALL = 1ULL << 1, + PERF_DLFILTER_FLAG_RETURN = 1ULL << 2, + PERF_DLFILTER_FLAG_CONDITIONAL = 1ULL << 3, + PERF_DLFILTER_FLAG_SYSCALLRET = 1ULL << 4, + PERF_DLFILTER_FLAG_ASYNC = 1ULL << 5, + PERF_DLFILTER_FLAG_INTERRUPT = 1ULL << 6, + PERF_DLFILTER_FLAG_TX_ABORT = 1ULL << 7, + PERF_DLFILTER_FLAG_TRACE_BEGIN = 1ULL << 8, + PERF_DLFILTER_FLAG_TRACE_END = 1ULL << 9, + PERF_DLFILTER_FLAG_IN_TX = 1ULL << 10, + PERF_DLFILTER_FLAG_VMENTRY = 1ULL << 11, + PERF_DLFILTER_FLAG_VMEXIT = 1ULL << 12, +}; +---- + +EXAMPLE +------- + +Filter out everything except branches from "foo" to "bar": + +[source,c] +---- +#include <perf/perf_dlfilter.h> +#include <string.h> + +struct perf_dlfilter_fns perf_dlfilter_fns; + +int filter_event(void *data, const struct perf_dlfilter_sample *sample, void *ctx) +{ + const struct perf_dlfilter_al *al; + const struct perf_dlfilter_al *addr_al; + + if (!sample->ip || !sample->addr_correlates_sym) + return 1; + + al = perf_dlfilter_fns.resolve_ip(ctx); + if (!al || !al->sym || strcmp(al->sym, "foo")) + return 1; + + addr_al = perf_dlfilter_fns.resolve_addr(ctx); + if (!addr_al || !addr_al->sym || strcmp(addr_al->sym, "bar")) + return 1; + + return 0; +} +---- + +To build the shared object, assuming perf has been installed for the local user +i.e. perf_dlfilter.h is in ~/include/perf : + + gcc -c -I ~/include -fpic dlfilter-example.c + gcc -shared -o dlfilter-example.so dlfilter-example.o + +To use the filter with perf script: + + perf script --dlfilter dlfilter-example.so + +NOTES +----- + +The dlfilter .so file will be dependent on shared libraries. If those change, +it may be necessary to rebuild the .so. Also there may be unexpected results +if the .so uses different versions of the shared libraries that perf uses. +Versions can be checked using the ldd command.
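As a further illustration of the 'filter_event' return convention (0 keeps the sample, 1 filters it out) and of the flag bits listed earlier, the following is a minimal, self-contained sketch of the decision logic for keeping only call branches. It is not a loadable dlfilter: 'struct sketch_sample', the SKETCH_FLAG_* names and 'keep_calls_only' are hypothetical stand-ins so that the code compiles without perf_dlfilter.h. A real filter would include that header, read 'flags' from 'struct perf_dlfilter_sample' and test the PERF_DLFILTER_FLAG_* bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-ins so the sketch compiles without perf_dlfilter.h. The bit
 * positions mirror PERF_DLFILTER_FLAG_BRANCH, _CALL and _RETURN as listed
 * in this document; a real filter should use the header's definitions.
 */
enum {
	SKETCH_FLAG_BRANCH = 1U << 0,
	SKETCH_FLAG_CALL   = 1U << 1,
	SKETCH_FLAG_RETURN = 1U << 2,
};

/* Hypothetical cut-down sample: only the member this sketch reads. */
struct sketch_sample {
	uint32_t flags;
};

/*
 * Same return convention as filter_event: 0 keeps the sample event,
 * 1 filters it out. Only samples flagged as both branch and call are kept.
 */
static int keep_calls_only(const struct sketch_sample *sample)
{
	const uint32_t want = SKETCH_FLAG_BRANCH | SKETCH_FLAG_CALL;

	assert(sample != NULL);
	return (sample->flags & want) == want ? 0 : 1;
}
```

In an actual dlfilter the same test would sit inside 'filter_event', operating on 'sample->flags'.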
+ +SEE ALSO +-------- +linkperf:perf-script[1] diff --git a/tools/perf/Documentation/perf-evlist.txt b/tools/perf/Documentation/perf-evlist.txt new file mode 100644 index 000000000..9af8b8dfb --- /dev/null +++ b/tools/perf/Documentation/perf-evlist.txt @@ -0,0 +1,45 @@ +perf-evlist(1) +============== + +NAME +---- +perf-evlist - List the event names in a perf.data file + +SYNOPSIS +-------- +[verse] +'perf evlist ' + +DESCRIPTION +----------- +This command displays the names of events sampled in a perf.data file. + +OPTIONS +------- +-i:: +--input=:: + Input file name. (default: perf.data unless stdin is a fifo) + +-f:: +--force:: + Don't complain, do it. + +-F:: +--freq=:: + Show just the sample frequency used for each event. + +-v:: +--verbose:: + Show all fields. + +-g:: +--group:: + Show event group information. + +--trace-fields:: + Show tracepoint field names. + +SEE ALSO +-------- +linkperf:perf-record[1], linkperf:perf-list[1], +linkperf:perf-report[1] diff --git a/tools/perf/Documentation/perf-ftrace.txt b/tools/perf/Documentation/perf-ftrace.txt new file mode 100644 index 000000000..df4595563 --- /dev/null +++ b/tools/perf/Documentation/perf-ftrace.txt @@ -0,0 +1,148 @@ +perf-ftrace(1) +============== + +NAME +---- +perf-ftrace - simple wrapper for kernel's ftrace functionality + + +SYNOPSIS +-------- +[verse] +'perf ftrace' {trace|latency} + +DESCRIPTION +----------- +The 'perf ftrace' command provides a collection of subcommands which use +kernel's ftrace infrastructure. + + 'perf ftrace trace' is a simple wrapper of the ftrace. It only supports + single thread tracing currently and just reads trace_pipe in text and then + write it to stdout. + + 'perf ftrace latency' calculates execution latency of a given function + (optionally with BPF) and display it as a histogram. + +The following options apply to perf ftrace. + +COMMON OPTIONS +-------------- + +-p:: +--pid=:: + Trace on existing process id (comma separated list). 
+ +--tid=:: + Trace on existing thread id (comma separated list). + +-a:: +--all-cpus:: + Force system-wide collection. Scripts run without a <command> + normally use -a by default, while scripts run with a <command> + normally don't - this option allows the latter to be run in + system-wide mode. + +-C:: +--cpu=:: + Only trace for the list of CPUs provided. Multiple CPUs can + be provided as a comma separated list with no space like: 0,1. + Ranges of CPUs are specified with -: 0-2. + Default is to trace on all online CPUs. + +-v:: +--verbose:: + Increase the verbosity level. + + +OPTIONS for 'perf ftrace trace' +------------------------------- + +-t:: +--tracer=:: + Tracer to use when neither the -G nor the -F option is + specified: function_graph or function. + +-F:: +--funcs:: + List available functions to trace. It accepts a pattern to + list only the functions of interest. + +-D:: +--delay:: + Time (ms) to wait before starting tracing after program start. + +-m:: +--buffer-size:: + Set the size of per-cpu tracing buffer; the size is expected to + be a number with an appended unit character - B/K/M/G. + +--inherit:: + Trace child processes spawned by our target. + +-T:: +--trace-funcs=:: + Select function tracer and set function filter on the given + function (or a glob pattern). Multiple functions can be given + by using this option more than once. The function argument also + can be a glob pattern. It will be passed to 'set_ftrace_filter' + in tracefs. + +-N:: +--notrace-funcs=:: + Select function tracer and do not trace functions given by the + argument. Like -T option, this can be used more than once to + specify multiple functions (or glob patterns). It will be + passed to 'set_ftrace_notrace' in tracefs. + +--func-opts:: + List of options allowed to set: + call-graph - Display kernel stack trace for function tracer. + irq-info - Display irq context info for function tracer.
+ +-G:: +--graph-funcs=:: + Select function_graph tracer and set graph filter on the given + function (or a glob pattern). This is useful to trace for + functions executed from the given function. This can be used more + than once to specify multiple functions. It will be passed to + 'set_graph_function' in tracefs. + +-g:: +--nograph-funcs=:: + Select function_graph tracer and set graph notrace filter on the + given function (or a glob pattern). Like -G option, this is useful + for the function_graph tracer only and disables tracing for function + executed from the given function. This can be used more than once to + specify multiple functions. It will be passed to 'set_graph_notrace' + in tracefs. + +--graph-opts:: + List of options allowed to set: + nosleep-time - Measure on-CPU time only for function_graph tracer. + noirqs - Ignore functions that happen inside interrupt. + verbose - Show process names, PIDs, timestamps, etc. + thresh= - Setup trace duration threshold in microseconds. + depth= - Set max depth for function graph tracer to follow. + + +OPTIONS for 'perf ftrace latency' +--------------------------------- + +-T:: +--trace-funcs=:: + Set the function name to get the histogram. Unlike perf ftrace trace, + it only allows single function to calculate the histogram. + +-b:: +--use-bpf:: + Use BPF to measure function latency instead of using the ftrace (it + uses function_graph tracer internally). + +-n:: +--use-nsec:: + Use nano-second instead of micro-second as a base unit of the histogram. 
+ + +SEE ALSO +-------- +linkperf:perf-record[1], linkperf:perf-trace[1] diff --git a/tools/perf/Documentation/perf-help.txt b/tools/perf/Documentation/perf-help.txt new file mode 100644 index 000000000..514391818 --- /dev/null +++ b/tools/perf/Documentation/perf-help.txt @@ -0,0 +1,38 @@ +perf-help(1) +============ + +NAME +---- +perf-help - display help information about perf + +SYNOPSIS +-------- +'perf help' [-a|--all] [COMMAND] + +DESCRIPTION +----------- + +With no options and no COMMAND given, the synopsis of the 'perf' +command and a list of the most commonly used perf commands are printed +on the standard output. + +If the option '--all' or '-a' is given, then all available commands are +printed on the standard output. + +If a perf command is named, a manual page for that command is brought +up. The 'man' program is used by default for this purpose, but this +can be overridden by other options or configuration variables. + +Note that `perf --help ...` is identical to `perf help ...` because the +former is internally converted into the latter. + +OPTIONS +------- +-a:: +--all:: + Prints all the available commands on the standard output. This + option supersedes any other option. + +PERF +---- +Part of the linkperf:perf[1] suite diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt new file mode 100644 index 000000000..c972032f4 --- /dev/null +++ b/tools/perf/Documentation/perf-inject.txt @@ -0,0 +1,119 @@ +perf-inject(1) +============== + +NAME +---- +perf-inject - Filter to augment the events stream with additional information + +SYNOPSIS +-------- +[verse] +'perf inject ' + +DESCRIPTION +----------- +perf-inject reads a perf-record event stream and repipes it to stdout. At any +point the processing code can inject other events into the event stream - in +this case build-ids (-b option) are read and injected as needed into the event +stream. 
+ +Build-ids are just the first user of perf-inject - potentially anything that +needs userspace processing to augment the events stream with additional +information could make use of this facility. + +OPTIONS +------- +-b:: +--build-ids:: + Inject build-ids of DSOs hit by samples into the output stream. + This means it needs to process all SAMPLE records to find the DSOs. + +--buildid-all:: + Inject build-ids of all DSOs into the output stream regardless of hits + and skip SAMPLE processing. + +--known-build-ids=:: + Override build-ids to inject using these comma-separated pairs of + build-id and path. Understands file://filename to read these pairs + from a file, which can be generated with perf buildid-list. + +-v:: +--verbose:: + Be more verbose. +-i:: +--input=:: + Input file name. (default: stdin) +-o:: +--output=:: + Output file name. (default: stdout) +-s:: +--sched-stat:: + Merge sched_stat and sched_switch for getting events where and how long + tasks slept. sched_switch contains a callchain where a task slept and + sched_stat contains a timeslice how long a task slept. + +-k:: +--vmlinux=:: + vmlinux pathname + +--ignore-vmlinux:: + Ignore vmlinux files. + +--kallsyms=:: + kallsyms pathname + +--itrace:: + Decode Instruction Tracing data, replacing it with synthesized events. + Options are: + +include::itrace.txt[] + +--strip:: + Use with --itrace to strip out non-synthesized events. + +-j:: +--jit:: + Process jitdump files by injecting the mmap records corresponding to jitted + functions. This option also generates the ELF images for each jitted function + found in the jitdumps files captured in the input perf.data file. Use this option + if you are monitoring environment using JIT runtimes, such as Java, DART or V8. + +-f:: +--force:: + Don't complain, do it. + +--vm-time-correlation[=OPTIONS]:: + Some architectures may capture AUX area data which contains timestamps + affected by virtualization. 
This option will update those timestamps + in place, to correlate with host timestamps. The in-place update means + that an output file is not specified, and instead the input file is + modified. The options are architecture specific, except that they may + start with "dry-run" which will cause the file to be processed but + without updating it. Currently this option is supported only by + Intel PT, refer linkperf:perf-intel-pt[1] + +--guest-data=,[,