From 7f3a4257159dea8e7ef66d1a539dc6df708b8ed3 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Wed, 7 Aug 2024 15:17:46 +0200 Subject: Adding upstream version 6.10.3. Signed-off-by: Daniel Baumann --- Documentation/gpu/amdgpu/debugging.rst | 80 ++++++++++++++++++++++ .../gpu/amdgpu/display/display-contributing.rst | 2 +- Documentation/gpu/amdgpu/index.rst | 1 + Documentation/gpu/driver-uapi.rst | 5 ++ Documentation/gpu/drm-kms.rst | 22 ++++++ Documentation/gpu/i915.rst | 9 +++ Documentation/gpu/panfrost.rst | 9 +++ Documentation/gpu/rfc/i915_vm_bind.h | 11 ++- 8 files changed, 132 insertions(+), 7 deletions(-) create mode 100644 Documentation/gpu/amdgpu/debugging.rst (limited to 'Documentation/gpu') diff --git a/Documentation/gpu/amdgpu/debugging.rst b/Documentation/gpu/amdgpu/debugging.rst new file mode 100644 index 0000000000..e75f97d0e4 --- /dev/null +++ b/Documentation/gpu/amdgpu/debugging.rst @@ -0,0 +1,80 @@ +=============== + GPU Debugging +=============== + +GPUVM Debugging +=============== + +To aid in debugging GPU virtual memory related problems, the driver supports a +number of options module parameters: + +`vm_fault_stop` - If non-0, halt the GPU memory controller on a GPU page fault. + +`vm_update_mode` - If non-0, use the CPU to update GPU page tables rather than +the GPU. + + +Decoding a GPUVM Page Fault +=========================== + +If you see a GPU page fault in the kernel log, you can decode it to figure +out what is going wrong in your application. A page fault in your kernel +log may look something like this: + +:: + + [gfxhub0] no-retry page fault (src_id:0 ring:24 vmid:3 pasid:32777, for process glxinfo pid 2424 thread glxinfo:cs0 pid 2425) + in page starting at address 0x0000800102800000 from IH client 0x1b (UTCL2) + VM_L2_PROTECTION_FAULT_STATUS:0x00301030 + Faulty UTCL2 client ID: TCP (0x8) + MORE_FAULTS: 0x0 + WALKER_ERROR: 0x0 + PERMISSION_FAULTS: 0x3 + MAPPING_ERROR: 0x0 + RW: 0x0 + +First you have the memory hub, gfxhub and mmhub. gfxhub is the memory +hub used for graphics, compute, and sdma on some chips. mmhub is the +memory hub used for multi-media and sdma on some chips. + +Next you have the vmid and pasid. If the vmid is 0, this fault was likely +caused by the kernel driver or firmware. If the vmid is non-0, it is generally +a fault in a user application. The pasid is used to link a vmid to a system +process id. If the process is active when the fault happens, the process +information will be printed. + +The GPU virtual address that caused the fault comes next. + +The client ID indicates the GPU block that caused the fault. +Some common client IDs: + +- CB/DB: The color/depth backend of the graphics pipe +- CPF: Command Processor Frontend +- CPC: Command Processor Compute +- CPG: Command Processor Graphics +- TCP/SQC/SQG: Shaders +- SDMA: SDMA engines +- VCN: Video encode/decode engines +- JPEG: JPEG engines + +PERMISSION_FAULTS describe what faults were encountered: + +- bit 0: the PTE was not valid +- bit 1: the PTE read bit was not set +- bit 2: the PTE write bit was not set +- bit 3: the PTE execute bit was not set + +Finally, RW, indicates whether the access was a read (0) or a write (1). + +In the example above, a shader (cliend id = TCP) generated a read (RW = 0x0) to +an invalid page (PERMISSION_FAULTS = 0x3) at GPU virtual address +0x0000800102800000. The user can then inspect their shader code and resource +descriptor state to determine what caused the GPU page fault. + +UMR +=== + +`umr `_ is a general purpose +GPU debugging and diagnostics tool. Please see the umr +`documentation `_ for more information +about its capabilities. diff --git a/Documentation/gpu/amdgpu/display/display-contributing.rst b/Documentation/gpu/amdgpu/display/display-contributing.rst index fdb2bea01d..36f3077eee 100644 --- a/Documentation/gpu/amdgpu/display/display-contributing.rst +++ b/Documentation/gpu/amdgpu/display/display-contributing.rst @@ -135,7 +135,7 @@ Enable underlay --------------- AMD display has this feature called underlay (which you can read more about at -'Documentation/GPU/amdgpu/display/mpo-overview.rst') which is intended to +'Documentation/gpu/amdgpu/display/mpo-overview.rst') which is intended to save power when playing a video. The basic idea is to put a video in the underlay plane at the bottom and the desktop in the plane above it with a hole in the video area. This feature is enabled in ChromeOS, and from our data diff --git a/Documentation/gpu/amdgpu/index.rst b/Documentation/gpu/amdgpu/index.rst index 912e699fd3..847e049240 100644 --- a/Documentation/gpu/amdgpu/index.rst +++ b/Documentation/gpu/amdgpu/index.rst @@ -15,4 +15,5 @@ Next (GCN), Radeon DNA (RDNA), and Compute DNA (CDNA) architectures. ras thermal driver-misc + debugging amdgpu-glossary diff --git a/Documentation/gpu/driver-uapi.rst b/Documentation/gpu/driver-uapi.rst index e5070a0e95..971cdb4816 100644 --- a/Documentation/gpu/driver-uapi.rst +++ b/Documentation/gpu/driver-uapi.rst @@ -18,6 +18,11 @@ VM_BIND / EXEC uAPI .. kernel-doc:: include/uapi/drm/nouveau_drm.h +drm/panthor uAPI +================ + +.. kernel-doc:: include/uapi/drm/panthor_drm.h + drm/xe uAPI =========== diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst index 13d3627d8b..abfe220764 100644 --- a/Documentation/gpu/drm-kms.rst +++ b/Documentation/gpu/drm-kms.rst @@ -398,6 +398,21 @@ Plane Damage Tracking Functions Reference .. kernel-doc:: include/drm/drm_damage_helper.h :internal: +Plane Panic Feature +------------------- + +.. kernel-doc:: drivers/gpu/drm/drm_panic.c + :doc: overview + +Plane Panic Functions Reference +------------------------------- + +.. kernel-doc:: include/drm/drm_panic.h + :internal: + +.. kernel-doc:: drivers/gpu/drm/drm_panic.c + :export: + Display Modes Function Reference ================================ @@ -496,6 +511,13 @@ addition to the one mentioned above: * An IGT test must be submitted where reasonable. +For historical reasons, non-standard, driver-specific properties exist. If a KMS +driver wants to add support for one of those properties, the requirements for +new properties apply where possible. Additionally, the documented behavior must +match the de facto semantics of the existing property to ensure compatibility. +Developers of the driver that first added the property should help with those +tasks and must ACK the documented behavior if possible. + Property Types and Blob Property Support ---------------------------------------- diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst index 0ca1550fd9..17261ba183 100644 --- a/Documentation/gpu/i915.rst +++ b/Documentation/gpu/i915.rst @@ -204,6 +204,15 @@ DMC Firmware Support .. kernel-doc:: drivers/gpu/drm/i915/display/intel_dmc.c :internal: +DMC wakelock support +-------------------- + +.. kernel-doc:: drivers/gpu/drm/i915/display/intel_dmc_wl.c + :doc: DMC wakelock support + +.. kernel-doc:: drivers/gpu/drm/i915/display/intel_dmc_wl.c + :internal: + Video BIOS Table (VBT) ---------------------- diff --git a/Documentation/gpu/panfrost.rst b/Documentation/gpu/panfrost.rst index b80e41f4b2..51ba375fd8 100644 --- a/Documentation/gpu/panfrost.rst +++ b/Documentation/gpu/panfrost.rst @@ -38,3 +38,12 @@ the currently possible format options: Possible `drm-engine-` key names are: `fragment`, and `vertex-tiler`. `drm-curfreq-` values convey the current operating frequency for that engine. + +Users must bear in mind that engine and cycle sampling are disabled by default, +because of power saving concerns. `fdinfo` users and benchmark applications which +query the fdinfo file must make sure to toggle the job profiling status of the +driver by writing into the appropriate sysfs node:: + + echo > /sys/bus/platform/drivers/panfrost/[a-f0-9]*.gpu/profiling + +Where `N` is either `0` or `1`, depending on the desired enablement status. diff --git a/Documentation/gpu/rfc/i915_vm_bind.h b/Documentation/gpu/rfc/i915_vm_bind.h index 8a8fcd4fce..bc26dc1261 100644 --- a/Documentation/gpu/rfc/i915_vm_bind.h +++ b/Documentation/gpu/rfc/i915_vm_bind.h @@ -93,12 +93,11 @@ struct drm_i915_gem_timeline_fence { * Multiple VA mappings can be created to the same section of the object * (aliasing). * - * The @start, @offset and @length must be 4K page aligned. However the DG2 - * and XEHPSDV has 64K page size for device local memory and has compact page - * table. On those platforms, for binding device local-memory objects, the - * @start, @offset and @length must be 64K aligned. Also, UMDs should not mix - * the local memory 64K page and the system memory 4K page bindings in the same - * 2M range. + * The @start, @offset and @length must be 4K page aligned. However the DG2 has + * 64K page size for device local memory and has compact page table. On that + * platform, for binding device local-memory objects, the @start, @offset and + * @length must be 64K aligned. Also, UMDs should not mix the local memory 64K + * page and the system memory 4K page bindings in the same 2M range. * * Error code -EINVAL will be returned if @start, @offset and @length are not * properly aligned. In version 1 (See I915_PARAM_VM_BIND_VERSION), error code -- cgit v1.2.3