From 01a69402cf9d38ff180345d55c2ee51c7e89fbc7 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sat, 18 May 2024 20:50:03 +0200 Subject: Adding upstream version 6.8.9. Signed-off-by: Daniel Baumann --- Documentation/admin-guide/perf/dwc_pcie_pmu.rst | 94 +++++++++++++++++++++++++ Documentation/admin-guide/perf/imx-ddr.rst | 45 +++++++++--- Documentation/admin-guide/perf/index.rst | 1 + 3 files changed, 132 insertions(+), 8 deletions(-) create mode 100644 Documentation/admin-guide/perf/dwc_pcie_pmu.rst (limited to 'Documentation/admin-guide/perf') diff --git a/Documentation/admin-guide/perf/dwc_pcie_pmu.rst b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst new file mode 100644 index 0000000000..d47cd229d7 --- /dev/null +++ b/Documentation/admin-guide/perf/dwc_pcie_pmu.rst @@ -0,0 +1,94 @@ +====================================================================== +Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU) +====================================================================== + +DesignWare Cores (DWC) PCIe PMU +=============================== + +The PMU is a PCIe configuration space register block provided by each PCIe Root +Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error +injection, and Statistics). + +As the name indicates, the RAS DES capability supports system level +debugging, AER error injection, and collection of statistics. To facilitate +collection of statistics, Synopsys DesignWare Cores PCIe controller +provides the following two features: + +- one 64-bit counter for Time Based Analysis (RX/TX data throughput and + time spent in each low-power LTSSM state) and +- one 32-bit counter for Event Counting (error and non-error events for + a specified lane) + +Note: There is no interrupt for counter overflow. + +Time Based Analysis +------------------- + +Using this feature you can obtain information regarding RX/TX data +throughput and time spent in each low-power LTSSM state by the controller. +The PMU measures data in two categories: + +- Group#0: Percentage of time the controller stays in LTSSM states. +- Group#1: Amount of data processed (Units of 16 bytes). + +Lane Event counters +------------------- + +Using this feature you can obtain Error and Non-Error information in +specific lane by the controller. The PMU event is selected by all of: + +- Group i +- Event j within the Group i +- Lane k + +Some of the events only exist for specific configurations. + +DesignWare Cores (DWC) PCIe PMU Driver +======================================= + +This driver adds PMU devices for each PCIe Root Port named based on the BDF of +the Root Port. For example, + + 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) + +the PMU device name for this Root Port is dwc_rootport_3018. + +The DWC PCIe PMU driver registers a perf PMU driver, which provides +description of available events and configuration options in sysfs, see +/sys/bus/event_source/devices/dwc_rootport_{bdf}. + +The "format" directory describes format of the config fields of the +perf_event_attr structure. The "events" directory provides configuration +templates for all documented events. For example, +"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1". + +The "perf list" command shall list the available events from sysfs, e.g.:: + + $# perf list | grep dwc_rootport + <...> + dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event] + <...> + dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event] + +Time Based Analysis Event Usage +------------------------------- + +Example usage of counting PCIe RX TLP data payload (Units of bytes):: + + $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ + +The average RX/TX bandwidth can be calculated using the following formula: + + PCIe RX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window + PCIe TX Bandwidth = Tx_PCIe_TLP_Data_Payload / Measure_Time_Window + +Lane Event Usage +------------------------------- + +Each lane has the same event set and to avoid generating a list of hundreds +of events, the user need to specify the lane ID explicitly, e.g.:: + + $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/ + +The driver does not support sampling, therefore "perf record" will not +work. Per-task (without "-a") perf sessions are not supported. diff --git a/Documentation/admin-guide/perf/imx-ddr.rst b/Documentation/admin-guide/perf/imx-ddr.rst index 90926d0fb8..77418ae5a2 100644 --- a/Documentation/admin-guide/perf/imx-ddr.rst +++ b/Documentation/admin-guide/perf/imx-ddr.rst @@ -13,8 +13,8 @@ is one register for each counter. Counter 0 is special in that it always counts interrupt is raised. If any other counter overflows, it continues counting, and no interrupt is raised. -The "format" directory describes format of the config (event ID) and config1 -(AXI filtering) fields of the perf_event_attr structure, see /sys/bus/event_source/ +The "format" directory describes format of the config (event ID) and config1/2 +(AXI filter setting) fields of the perf_event_attr structure, see /sys/bus/event_source/ devices/imx8_ddr0/format/. The "events" directory describes the events types hardware supported that can be used with perf tool, see /sys/bus/event_source/ devices/imx8_ddr0/events/. The "caps" directory describes filter features implemented @@ -28,12 +28,11 @@ in DDR PMU, see /sys/bus/events_source/devices/imx8_ddr0/caps/. AXI filtering is only used by CSV modes 0x41 (axid-read) and 0x42 (axid-write) to count reading or writing matches filter setting. Filter setting is various from different DRAM controller implementations, which is distinguished by quirks -in the driver. You also can dump info from userspace, filter in "caps" directory -indicates whether PMU supports AXI ID filter or not; enhanced_filter indicates -whether PMU supports enhanced AXI ID filter or not. Value 0 for un-supported, and -value 1 for supported. +in the driver. You also can dump info from userspace, "caps" directory show the +type of AXI filter (filter, enhanced_filter and super_filter). Value 0 for +un-supported, and value 1 for supported. -* With DDR_CAP_AXI_ID_FILTER quirk(filter: 1, enhanced_filter: 0). +* With DDR_CAP_AXI_ID_FILTER quirk(filter: 1, enhanced_filter: 0, super_filter: 0). Filter is defined with two configuration parts: --AXI_ID defines AxID matching value. --AXI_MASKING defines which bits of AxID are meaningful for the matching. @@ -65,7 +64,37 @@ value 1 for supported. perf stat -a -e imx8_ddr0/axid-read,axi_id=0x12/ cmd, which will monitor ARID=0x12 -* With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk(filter: 1, enhanced_filter: 1). +* With DDR_CAP_AXI_ID_FILTER_ENHANCED quirk(filter: 1, enhanced_filter: 1, super_filter: 0). This is an extension to the DDR_CAP_AXI_ID_FILTER quirk which permits counting the number of bytes (as opposed to the number of bursts) from DDR read and write transactions concurrently with another set of data counters. + +* With DDR_CAP_AXI_ID_PORT_CHANNEL_FILTER quirk(filter: 0, enhanced_filter: 0, super_filter: 1). + There is a limitation in previous AXI filter, it cannot filter different IDs + at the same time as the filter is shared between counters. This quirk is the + extension of AXI ID filter. One improvement is that counter 1-3 has their own + filter, means that it supports concurrently filter various IDs. Another + improvement is that counter 1-3 supports AXI PORT and CHANNEL selection. Support + selecting address channel or data channel. + + Filter is defined with 2 configuration registers per counter 1-3. + --Counter N MASK COMP register - including AXI_ID and AXI_MASKING. + --Counter N MUX CNTL register - including AXI CHANNEL and AXI PORT. + + - 0: address channel + - 1: data channel + + PMU in DDR subsystem, only one single port0 exists, so axi_port is reserved + which should be 0. + + .. code-block:: bash + + perf stat -a -e imx8_ddr0/axid-read,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd + perf stat -a -e imx8_ddr0/axid-write,axi_mask=0xMMMM,axi_id=0xDDDD,axi_channel=0xH/ cmd + + .. note:: + + axi_channel is inverted in userspace, and it will be reverted in driver + automatically. So that users do not need specify axi_channel if want to + monitor data channel from DDR transactions, since data channel is more + meaningful. diff --git a/Documentation/admin-guide/perf/index.rst b/Documentation/admin-guide/perf/index.rst index a2e6f2c811..f4a4513c52 100644 --- a/Documentation/admin-guide/perf/index.rst +++ b/Documentation/admin-guide/perf/index.rst @@ -19,6 +19,7 @@ Performance monitor support arm_dsu_pmu thunderx2-pmu alibaba_pmu + dwc_pcie_pmu nvidia-pmu meson-ddr-pmu cxl -- cgit v1.2.3