author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-18 17:35:05 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-18 17:39:31 +0000 |
commit | 85c675d0d09a45a135bddd15d7b385f8758c32fb (patch) | |
tree | 76267dbc9b9a130337be3640948fe397b04ac629 /Documentation/powerpc/imc.rst | |
parent | Adding upstream version 6.6.15. (diff) | |
Adding upstream version 6.7.7. (upstream/6.7.7)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'Documentation/powerpc/imc.rst')
-rw-r--r-- | Documentation/powerpc/imc.rst | 199 |
1 files changed, 0 insertions, 199 deletions
diff --git a/Documentation/powerpc/imc.rst b/Documentation/powerpc/imc.rst
deleted file mode 100644
index 633bcee7dc..0000000000
--- a/Documentation/powerpc/imc.rst
+++ /dev/null
@@ -1,199 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-.. _imc:
-
-===================================
-IMC (In-Memory Collection Counters)
-===================================
-
-Anju T Sudhakar, 10 May 2019
-
-.. contents::
-    :depth: 3
-
-
-Basic overview
-==============
-
-IMC (In-Memory Collection counters) is a hardware monitoring facility that
-collects large numbers of hardware performance events at Nest level (these are
-on-chip but off-core), Core level and Thread level.
-
-The Nest PMU counters are handled by a Nest IMC microcode which runs in the OCC
-(On-Chip Controller) complex. The microcode collects the counter data and moves
-the nest IMC counter data to memory.
-
-The Core and Thread IMC PMU counters are handled in the core. Core level PMU
-counters give us the IMC counters' data per core and thread level PMU counters
-give us the IMC counters' data per CPU thread.
-
-OPAL obtains the IMC PMU and supported events information from the IMC Catalog
-and passes it on to the kernel via the device tree. The event information
-contains:
-
-- Event name
-- Event Offset
-- Event description
-
-and possibly also:
-
-- Event scale
-- Event unit
-
-Some PMUs may have common scale and unit values for all their supported
-events. For those cases, the scale and unit properties for those events must be
-inherited from the PMU.
-
-The event offset in the memory is where the counter data gets accumulated.
-
-IMC catalog is available at:
-	https://github.com/open-power/ima-catalog
-
-The kernel discovers the IMC counters information in the device tree at the
-`imc-counters` device node which has a compatible field
-`ibm,opal-in-memory-counters`. From the device tree, the kernel parses the PMUs
-and their event information and registers each PMU and its attributes in the
-kernel.
-
-IMC example usage
-=================
-
-.. code-block:: sh
-
-  # perf list
-  [...]
-  nest_mcs01/PM_MCS01_64B_RD_DISP_PORT01/           [Kernel PMU event]
-  nest_mcs01/PM_MCS01_64B_RD_DISP_PORT23/           [Kernel PMU event]
-  [...]
-  core_imc/CPM_0THRD_NON_IDLE_PCYC/                 [Kernel PMU event]
-  core_imc/CPM_1THRD_NON_IDLE_INST/                 [Kernel PMU event]
-  [...]
-  thread_imc/CPM_0THRD_NON_IDLE_PCYC/               [Kernel PMU event]
-  thread_imc/CPM_1THRD_NON_IDLE_INST/               [Kernel PMU event]
-
-To see per-chip data for nest_mcs01/PM_MCS01_64B_WR_DISP_PORT01/:
-
-.. code-block:: sh
-
-  # ./perf stat -e "nest_mcs01/PM_MCS01_64B_WR_DISP_PORT01/" -a --per-socket
-
-To see non-idle instructions for core 0:
-
-.. code-block:: sh
-
-  # ./perf stat -e "core_imc/CPM_NON_IDLE_INST/" -C 0 -I 1000
-
-To see non-idle instructions for a "make":
-
-.. code-block:: sh
-
-  # ./perf stat -e "thread_imc/CPM_NON_IDLE_PCYC/" make
-
-
-IMC Trace-mode
-===============
-
-POWER9 supports two modes for IMC: Accumulation mode and Trace
-mode. In Accumulation mode, event counts are accumulated in system memory.
-The hypervisor then reads the posted counts periodically or when requested. In
-IMC Trace mode, the 64-bit trace SCOM value is initialized with the event
-information. The CPMCxSEL and CPMC_LOAD fields in the trace SCOM specify the
-event to be monitored and the sampling duration. On each overflow in CPMCxSEL,
-hardware snapshots the program counter along with event counts and writes them
-into the memory pointed to by LDBAR.
-
-LDBAR is a 64-bit special-purpose per-thread register; it has bits to indicate
-whether hardware is configured for accumulation or trace mode.
-
-LDBAR Register Layout
----------------------
-
-  +-------+----------------------+
-  | 0     | Enable/Disable       |
-  +-------+----------------------+
-  | 1     | 0: Accumulation Mode |
-  |       +----------------------+
-  |       | 1: Trace Mode        |
-  +-------+----------------------+
-  | 2:3   | Reserved             |
-  +-------+----------------------+
-  | 4:6   | PB scope             |
-  +-------+----------------------+
-  | 7     | Reserved             |
-  +-------+----------------------+
-  | 8:50  | Counter Address      |
-  +-------+----------------------+
-  | 51:63 | Reserved             |
-  +-------+----------------------+
-
-TRACE_IMC_SCOM bit representation
----------------------------------
-
-  +-------+------------+
-  | 0:1   | SAMPSEL    |
-  +-------+------------+
-  | 2:33  | CPMC_LOAD  |
-  +-------+------------+
-  | 34:40 | CPMC1SEL   |
-  +-------+------------+
-  | 41:47 | CPMC2SEL   |
-  +-------+------------+
-  | 48:50 | BUFFERSIZE |
-  +-------+------------+
-  | 51:63 | RESERVED   |
-  +-------+------------+
-
-CPMC_LOAD contains the sampling duration. SAMPSEL and CPMCxSEL determine the
-event to count. BUFFERSIZE indicates the memory range. On each overflow,
-hardware snapshots the program counter along with event counts, updates the
-memory, and reloads the CPMC_LOAD value for the next sampling duration. IMC
-hardware does not support exceptions, so it quietly wraps around if the memory
-buffer reaches the end.
-
-*Currently the event monitored for trace-mode is fixed as cycle.*
-
-Trace IMC example usage
-=======================
-
-.. code-block:: sh
-
-  # perf list
-  [....]
-  trace_imc/trace_cycles/                            [Kernel PMU event]
-
-To record an application/process with trace-imc event:
-
-.. code-block:: sh
-
-  # perf record -e trace_imc/trace_cycles/ yes > /dev/null
-  [ perf record: Woken up 1 times to write data ]
-  [ perf record: Captured and wrote 0.012 MB perf.data (21 samples) ]
-
-The generated `perf.data` can be read using perf report.
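
The LDBAR and TRACE_IMC_SCOM layouts above can be turned into a small field decoder. The following is only an illustrative sketch, not kernel code: it assumes the bit positions in the tables use MSB-0 numbering (bit 0 is the most significant bit of the 64-bit value), and the names `extract_field`, `decode`, `LDBAR_FIELDS` and `TRACE_IMC_SCOM_FIELDS` are hypothetical helpers introduced here. The register values in the usage note are made up for demonstration.

```python
# Illustrative sketch only. Assumption: the tables number bits MSB-first,
# so bit 0 is the most significant bit of the 64-bit register.

# (start_bit, end_bit), inclusive, copied from the layout tables.
LDBAR_FIELDS = {
    "enable": (0, 0),
    "trace_mode": (1, 1),          # 0: accumulation mode, 1: trace mode
    "pb_scope": (4, 6),
    "counter_address": (8, 50),    # raw field value, not interpreted here
}

TRACE_IMC_SCOM_FIELDS = {
    "SAMPSEL": (0, 1),
    "CPMC_LOAD": (2, 33),
    "CPMC1SEL": (34, 40),
    "CPMC2SEL": (41, 47),
    "BUFFERSIZE": (48, 50),
}


def extract_field(value, start, end, width=64):
    """Return bits start..end (inclusive, MSB-0 numbering) of a width-bit value."""
    nbits = end - start + 1
    shift = width - 1 - end        # distance of the field's last bit from the LSB
    return (value >> shift) & ((1 << nbits) - 1)


def decode(value, fields):
    """Decode every named field of a register value into a dict."""
    return {name: extract_field(value, s, e) for name, (s, e) in fields.items()}
```

For example, `decode(0xC000000000000000, LDBAR_FIELDS)` reports both `enable` and `trace_mode` as 1 with all other fields zero, which under the stated numbering assumption corresponds to an enabled register configured for trace mode.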
-
-Benefits of using IMC trace-mode
-================================
-
-PMI (Performance Monitoring Interrupt) handling is avoided, since IMC
-trace mode snapshots the program counter and writes it to memory. This
-also provides a way for the operating system to do instruction sampling in real
-time without PMI processing overhead.
-
-Performance data using `perf top` with and without the trace-imc event.
-
-PMI counts when the `perf top` command is executed without, and then with, the
-trace-imc event:
-
-.. code-block:: sh
-
-  # grep PMI /proc/interrupts
-  PMI:          0          0          0          0   Performance monitoring interrupts
-  # ./perf top
-  ...
-  # grep PMI /proc/interrupts
-  PMI:      39735       8710      17338      17801   Performance monitoring interrupts
-  # ./perf top -e trace_imc/trace_cycles/
-  ...
-  # grep PMI /proc/interrupts
-  PMI:      39735       8710      17338      17801   Performance monitoring interrupts
-
-
-That is, the PMI interrupt counts do not increment when using the `trace_imc` event.
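
The before/after comparison of the PMI line can also be done programmatically. The sketch below is an illustration under assumptions: `parse_pmi_counts` and `pmi_delta` are hypothetical helper names, and the input is assumed to be text in the same shape as the `grep PMI /proc/interrupts` output shown above (a `PMI:` label, one count per CPU, then a description).

```python
# Illustrative sketch: compare two snapshots of the PMI line from
# /proc/interrupts. Helper names are hypothetical, not part of perf.

def parse_pmi_counts(interrupts_text):
    """Return the per-CPU PMI counts found in /proc/interrupts-style text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "PMI:":
            counts = []
            for token in fields[1:]:
                if not token.isdigit():
                    break              # reached the trailing description
                counts.append(int(token))
            return counts
    return []                          # no PMI line present


def pmi_delta(before_text, after_text):
    """Return the per-CPU increase in PMI counts between two snapshots."""
    before = parse_pmi_counts(before_text)
    after = parse_pmi_counts(after_text)
    return [a - b for b, a in zip(before, after)]


if __name__ == "__main__":
    # Sample lines taken from the example output above.
    idle = "PMI: 0 0 0 0 Performance monitoring interrupts"
    after_top = "PMI: 39735 8710 17338 17801 Performance monitoring interrupts"
    print(pmi_delta(idle, after_top))        # plain `perf top` run
    print(pmi_delta(after_top, after_top))   # run with trace_imc/trace_cycles/
```

With the sample lines from the document, the first delta equals the counts accumulated during the plain `perf top` run, and the second delta is all zeros, matching the observation that PMI counts do not increment with the `trace_imc` event.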