diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-18 18:50:03 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-18 18:50:03 +0000 |
commit | 01a69402cf9d38ff180345d55c2ee51c7e89fbc7 (patch) | |
tree | b406c5242a088c4f59c6e4b719b783f43aca6ae9 /Documentation/accel | |
parent | Adding upstream version 6.7.12. (diff) | |
download | linux-01a69402cf9d38ff180345d55c2ee51c7e89fbc7.tar.xz linux-01a69402cf9d38ff180345d55c2ee51c7e89fbc7.zip |
Adding upstream version 6.8.9.upstream/6.8.9
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'Documentation/accel')
-rw-r--r-- | Documentation/accel/introduction.rst | 4 | ||||
-rw-r--r-- | Documentation/accel/qaic/aic100.rst | 11 | ||||
-rw-r--r-- | Documentation/accel/qaic/qaic.rst | 37 |
3 files changed, 46 insertions, 6 deletions
diff --git a/Documentation/accel/introduction.rst b/Documentation/accel/introduction.rst index 89984dfece..ae30301366 100644 --- a/Documentation/accel/introduction.rst +++ b/Documentation/accel/introduction.rst @@ -101,8 +101,8 @@ External References email threads ------------- -* `Initial discussion on the New subsystem for acceleration devices <https://lkml.org/lkml/2022/7/31/83>`_ - Oded Gabbay (2022) -* `patch-set to add the new subsystem <https://lkml.org/lkml/2022/10/22/544>`_ - Oded Gabbay (2022) +* `Initial discussion on the New subsystem for acceleration devices <https://lore.kernel.org/lkml/CAFCwf11=9qpNAepL7NL+YAV_QO=Wv6pnWPhKHKAepK3fNn+2Dg@mail.gmail.com/>`_ - Oded Gabbay (2022) +* `patch-set to add the new subsystem <https://lore.kernel.org/lkml/20221022214622.18042-1-ogabbay@kernel.org/>`_ - Oded Gabbay (2022) Conference talks ---------------- diff --git a/Documentation/accel/qaic/aic100.rst b/Documentation/accel/qaic/aic100.rst index c80d0f1307..590dae77ea 100644 --- a/Documentation/accel/qaic/aic100.rst +++ b/Documentation/accel/qaic/aic100.rst @@ -36,8 +36,9 @@ AIC100 DID (0xa100). AIC100 does not implement FLR (function level reset). -AIC100 implements MSI but does not implement MSI-X. AIC100 requires 17 MSIs to -operate (1 for MHI, 16 for the DMA Bridge). +AIC100 implements MSI but does not implement MSI-X. AIC100 prefers 17 MSIs to +operate (1 for MHI, 16 for the DMA Bridge). Falling back to 1 MSI is possible in +scenarios where reserving 32 MSIs isn't feasible. As a PCIe device, AIC100 utilizes BARs to provide host interfaces to the device hardware. AIC100 provides 3, 64-bit BARs. @@ -220,10 +221,14 @@ of the defined channels, and their uses. +----------------+---------+----------+----------------------------------------+ | QAIC_DEBUG | 18 & 19 | AMSS | Not used. | +----------------+---------+----------+----------------------------------------+ -| QAIC_TIMESYNC | 20 & 21 | SBL/AMSS | Used to synchronize timestamps in the | +| QAIC_TIMESYNC | 20 & 21 | SBL | Used to synchronize timestamps in the | | | | | device side logs with the host time | | | | | source. | +----------------+---------+----------+----------------------------------------+ +| QAIC_TIMESYNC | 22 & 23 | AMSS | Used to periodically synchronize | +| _PERIODIC | | | timestamps in the device side logs with| +| | | | the host time source. | ++----------------+---------+----------+----------------------------------------+ DMA Bridge ========== diff --git a/Documentation/accel/qaic/qaic.rst b/Documentation/accel/qaic/qaic.rst index c885023831..efb7771273 100644 --- a/Documentation/accel/qaic/qaic.rst +++ b/Documentation/accel/qaic/qaic.rst @@ -10,6 +10,9 @@ accelerator products. Interrupts ========== +IRQ Storm Mitigation +-------------------- + While the AIC100 DMA Bridge hardware implements an IRQ storm mitigation mechanism, it is still possible for an IRQ storm to occur. A storm can happen if the workload is particularly quick, and the host is responsive. If the host @@ -35,6 +38,26 @@ generates 100k IRQs per second (per /proc/interrupts) is reduced to roughly 64 IRQs over 5 minutes while keeping the host system stable, and having the same workload throughput performance (within run to run noise variation). +Single MSI Mode +--------------- + +MultiMSI is not well supported on all systems; virtualized ones even less so +(circa 2023). Between hypervisors masking the PCIe MSI capability structure to +large memory requirements for vIOMMUs (required for supporting MultiMSI), it is +useful to be able to fall back to a single MSI when needed. + +To support this fallback, we allow the case where only one MSI is able to be +allocated, and share that one MSI between MHI and the DBCs. The device detects +when only one MSI has been configured and directs the interrupts for the DBCs +to the interrupt normally used for MHI. Unfortunately this means that the +interrupt handlers for every DBC and MHI wake up for every interrupt that +arrives; however, the DBC threaded irq handlers only are started when work to be +done is detected (MHI will always start its threaded handler). + +If the DBC is configured to force MSI interrupts, this can circumvent the +software IRQ storm mitigation mentioned above. Since the MSI is shared it is +never disabled, allowing each new entry to the FIFO to trigger a new interrupt. + Neural Network Control (NNC) Protocol ===================================== @@ -70,8 +93,15 @@ commands (does not impact QAIC). uAPI ==== +QAIC creates an accel device per phsyical PCIe device. This accel device exists +for as long as the PCIe device is known to Linux. + +The PCIe device may not be in the state to accept requests from userspace at +all times. QAIC will trigger KOBJ_ONLINE/OFFLINE uevents to advertise when the +device can accept requests (ONLINE) and when the device is no longer accepting +requests (OFFLINE) because of a reset or other state transition. + QAIC defines a number of driver specific IOCTLs as part of the userspace API. -This section describes those APIs. DRM_IOCTL_QAIC_MANAGE This IOCTL allows userspace to send a NNC request to the QSM. The call will @@ -178,3 +208,8 @@ overrides this for that call. Default is 5000 (5 seconds). Sets the polling interval in microseconds (us) when datapath polling is active. Takes effect at the next polling interval. Default is 100 (100 us). + +**timesync_delay_ms (unsigned int)** + +Sets the time interval in milliseconds (ms) between two consecutive timesync +operations. Default is 1000 (1000 ms). |