summaryrefslogtreecommitdiffstats
path: root/Documentation/gpu/drm-mm.rst
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--Documentation/gpu/drm-mm.rst506
1 files changed, 506 insertions, 0 deletions
diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
new file mode 100644
index 000000000..9abee1589
--- /dev/null
+++ b/Documentation/gpu/drm-mm.rst
@@ -0,0 +1,506 @@
+=====================
+DRM Memory Management
+=====================
+
+Modern Linux systems require large amount of graphics memory to store
+frame buffers, textures, vertices and other graphics-related data. Given
+the very dynamic nature of many of that data, managing graphics memory
+efficiently is thus crucial for the graphics stack and plays a central
+role in the DRM infrastructure.
+
+The DRM core includes two memory managers, namely Translation Table Maps
+(TTM) and Graphics Execution Manager (GEM). TTM was the first DRM memory
+manager to be developed and tried to be a one-size-fits-them all
+solution. It provides a single userspace API to accommodate the need of
+all hardware, supporting both Unified Memory Architecture (UMA) devices
+and devices with dedicated video RAM (i.e. most discrete video cards).
+This resulted in a large, complex piece of code that turned out to be
+hard to use for driver development.
+
+GEM started as an Intel-sponsored project in reaction to TTM's
+complexity. Its design philosophy is completely different: instead of
+providing a solution to every graphics memory-related problems, GEM
+identified common code between drivers and created a support library to
+share it. GEM has simpler initialization and execution requirements than
+TTM, but has no video RAM management capabilities and is thus limited to
+UMA devices.
+
+The Translation Table Manager (TTM)
+===================================
+
+TTM design background and information belongs here.
+
+TTM initialization
+------------------
+
+ **Warning**
+ This section is outdated.
+
+Drivers wishing to support TTM must pass a filled :c:type:`ttm_bo_driver
+<ttm_bo_driver>` structure to ttm_bo_device_init, together with an
+initialized global reference to the memory manager. The ttm_bo_driver
+structure contains several fields with function pointers for
+initializing the TTM, allocating and freeing memory, waiting for command
+completion and fence synchronization, and memory migration.
+
+The :c:type:`struct drm_global_reference <drm_global_reference>` is made
+up of several fields:
+
+.. code-block:: c
+
+ struct drm_global_reference {
+ enum ttm_global_types global_type;
+ size_t size;
+ void *object;
+ int (*init) (struct drm_global_reference *);
+ void (*release) (struct drm_global_reference *);
+ };
+
+
+There should be one global reference structure for your memory manager
+as a whole, and there will be others for each object created by the
+memory manager at runtime. Your global TTM should have a type of
+TTM_GLOBAL_TTM_MEM. The size field for the global object should be
+sizeof(struct ttm_mem_global), and the init and release hooks should
+point at your driver-specific init and release routines, which probably
+eventually call ttm_mem_global_init and ttm_mem_global_release,
+respectively.
+
+Once your global TTM accounting structure is set up and initialized by
+calling ttm_global_item_ref() on it, you need to create a buffer
+object TTM to provide a pool for buffer object allocation by clients and
+the kernel itself. The type of this object should be
+TTM_GLOBAL_TTM_BO, and its size should be sizeof(struct
+ttm_bo_global). Again, driver-specific init and release functions may
+be provided, likely eventually calling ttm_bo_global_ref_init() and
+ttm_bo_global_ref_release(), respectively. Also, like the previous
+object, ttm_global_item_ref() is used to create an initial reference
+count for the TTM, which will call your initialization function.
+
+See the radeon_ttm.c file for an example of usage.
+
+The Graphics Execution Manager (GEM)
+====================================
+
+The GEM design approach has resulted in a memory manager that doesn't
+provide full coverage of all (or even all common) use cases in its
+userspace or kernel API. GEM exposes a set of standard memory-related
+operations to userspace and a set of helper functions to drivers, and
+let drivers implement hardware-specific operations with their own
+private API.
+
+The GEM userspace API is described in the `GEM - the Graphics Execution
+Manager <http://lwn.net/Articles/283798/>`__ article on LWN. While
+slightly outdated, the document provides a good overview of the GEM API
+principles. Buffer allocation and read and write operations, described
+as part of the common GEM API, are currently implemented using
+driver-specific ioctls.
+
+GEM is data-agnostic. It manages abstract buffer objects without knowing
+what individual buffers contain. APIs that require knowledge of buffer
+contents or purpose, such as buffer allocation or synchronization
+primitives, are thus outside of the scope of GEM and must be implemented
+using driver-specific ioctls.
+
+On a fundamental level, GEM involves several operations:
+
+- Memory allocation and freeing
+- Command execution
+- Aperture management at command execution time
+
+Buffer object allocation is relatively straightforward and largely
+provided by Linux's shmem layer, which provides memory to back each
+object.
+
+Device-specific operations, such as command execution, pinning, buffer
+read & write, mapping, and domain ownership transfers are left to
+driver-specific ioctls.
+
+GEM Initialization
+------------------
+
+Drivers that use GEM must set the DRIVER_GEM bit in the struct
+:c:type:`struct drm_driver <drm_driver>` driver_features
+field. The DRM core will then automatically initialize the GEM core
+before calling the load operation. Behind the scene, this will create a
+DRM Memory Manager object which provides an address space pool for
+object allocation.
+
+In a KMS configuration, drivers need to allocate and initialize a
+command ring buffer following core GEM initialization if required by the
+hardware. UMA devices usually have what is called a "stolen" memory
+region, which provides space for the initial framebuffer and large,
+contiguous memory regions required by the device. This space is
+typically not managed by GEM, and must be initialized separately into
+its own DRM MM object.
+
+GEM Objects Creation
+--------------------
+
+GEM splits creation of GEM objects and allocation of the memory that
+backs them in two distinct operations.
+
+GEM objects are represented by an instance of struct :c:type:`struct
+drm_gem_object <drm_gem_object>`. Drivers usually need to
+extend GEM objects with private information and thus create a
+driver-specific GEM object structure type that embeds an instance of
+struct :c:type:`struct drm_gem_object <drm_gem_object>`.
+
+To create a GEM object, a driver allocates memory for an instance of its
+specific GEM object type and initializes the embedded struct
+:c:type:`struct drm_gem_object <drm_gem_object>` with a call
+to drm_gem_object_init(). The function takes a pointer
+to the DRM device, a pointer to the GEM object and the buffer object
+size in bytes.
+
+GEM uses shmem to allocate anonymous pageable memory.
+drm_gem_object_init() will create an shmfs file of the
+requested size and store it into the struct :c:type:`struct
+drm_gem_object <drm_gem_object>` filp field. The memory is
+used as either main storage for the object when the graphics hardware
+uses system memory directly or as a backing store otherwise.
+
+Drivers are responsible for the actual physical pages allocation by
+calling shmem_read_mapping_page_gfp() for each page.
+Note that they can decide to allocate pages when initializing the GEM
+object, or to delay allocation until the memory is needed (for instance
+when a page fault occurs as a result of a userspace memory access or
+when the driver needs to start a DMA transfer involving the memory).
+
+Anonymous pageable memory allocation is not always desired, for instance
+when the hardware requires physically contiguous system memory as is
+often the case in embedded devices. Drivers can create GEM objects with
+no shmfs backing (called private GEM objects) by initializing them with a call
+to drm_gem_private_object_init() instead of drm_gem_object_init(). Storage for
+private GEM objects must be managed by drivers.
+
+GEM Objects Lifetime
+--------------------
+
+All GEM objects are reference-counted by the GEM core. References can be
+acquired and release by calling drm_gem_object_get() and drm_gem_object_put()
+respectively.
+
+When the last reference to a GEM object is released the GEM core calls
+the :c:type:`struct drm_driver <drm_driver>` gem_free_object_unlocked
+operation. That operation is mandatory for GEM-enabled drivers and must
+free the GEM object and all associated resources.
+
+void (\*gem_free_object) (struct drm_gem_object \*obj); Drivers are
+responsible for freeing all GEM object resources. This includes the
+resources created by the GEM core, which need to be released with
+drm_gem_object_release().
+
+GEM Objects Naming
+------------------
+
+Communication between userspace and the kernel refers to GEM objects
+using local handles, global names or, more recently, file descriptors.
+All of those are 32-bit integer values; the usual Linux kernel limits
+apply to the file descriptors.
+
+GEM handles are local to a DRM file. Applications get a handle to a GEM
+object through a driver-specific ioctl, and can use that handle to refer
+to the GEM object in other standard or driver-specific ioctls. Closing a
+DRM file handle frees all its GEM handles and dereferences the
+associated GEM objects.
+
+To create a handle for a GEM object drivers call drm_gem_handle_create(). The
+function takes a pointer to the DRM file and the GEM object and returns a
+locally unique handle. When the handle is no longer needed drivers delete it
+with a call to drm_gem_handle_delete(). Finally the GEM object associated with a
+handle can be retrieved by a call to drm_gem_object_lookup().
+
+Handles don't take ownership of GEM objects, they only take a reference
+to the object that will be dropped when the handle is destroyed. To
+avoid leaking GEM objects, drivers must make sure they drop the
+reference(s) they own (such as the initial reference taken at object
+creation time) as appropriate, without any special consideration for the
+handle. For example, in the particular case of combined GEM object and
+handle creation in the implementation of the dumb_create operation,
+drivers must drop the initial reference to the GEM object before
+returning the handle.
+
+GEM names are similar in purpose to handles but are not local to DRM
+files. They can be passed between processes to reference a GEM object
+globally. Names can't be used directly to refer to objects in the DRM
+API, applications must convert handles to names and names to handles
+using the DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls
+respectively. The conversion is handled by the DRM core without any
+driver-specific support.
+
+GEM also supports buffer sharing with dma-buf file descriptors through
+PRIME. GEM-based drivers must use the provided helpers functions to
+implement the exporting and importing correctly. See ?. Since sharing
+file descriptors is inherently more secure than the easily guessable and
+global GEM names it is the preferred buffer sharing mechanism. Sharing
+buffers through GEM names is only supported for legacy userspace.
+Furthermore PRIME also allows cross-device buffer sharing since it is
+based on dma-bufs.
+
+GEM Objects Mapping
+-------------------
+
+Because mapping operations are fairly heavyweight GEM favours
+read/write-like access to buffers, implemented through driver-specific
+ioctls, over mapping buffers to userspace. However, when random access
+to the buffer is needed (to perform software rendering for instance),
+direct access to the object can be more efficient.
+
+The mmap system call can't be used directly to map GEM objects, as they
+don't have their own file handle. Two alternative methods currently
+co-exist to map GEM objects to userspace. The first method uses a
+driver-specific ioctl to perform the mapping operation, calling
+do_mmap() under the hood. This is often considered
+dubious, seems to be discouraged for new GEM-enabled drivers, and will
+thus not be described here.
+
+The second method uses the mmap system call on the DRM file handle. void
+\*mmap(void \*addr, size_t length, int prot, int flags, int fd, off_t
+offset); DRM identifies the GEM object to be mapped by a fake offset
+passed through the mmap offset argument. Prior to being mapped, a GEM
+object must thus be associated with a fake offset. To do so, drivers
+must call drm_gem_create_mmap_offset() on the object.
+
+Once allocated, the fake offset value must be passed to the application
+in a driver-specific way and can then be used as the mmap offset
+argument.
+
+The GEM core provides a helper method drm_gem_mmap() to
+handle object mapping. The method can be set directly as the mmap file
+operation handler. It will look up the GEM object based on the offset
+value and set the VMA operations to the :c:type:`struct drm_driver
+<drm_driver>` gem_vm_ops field. Note that drm_gem_mmap() doesn't map memory to
+userspace, but relies on the driver-provided fault handler to map pages
+individually.
+
+To use drm_gem_mmap(), drivers must fill the struct :c:type:`struct drm_driver
+<drm_driver>` gem_vm_ops field with a pointer to VM operations.
+
+The VM operations is a :c:type:`struct vm_operations_struct <vm_operations_struct>`
+made up of several fields, the more interesting ones being:
+
+.. code-block:: c
+
+ struct vm_operations_struct {
+ void (*open)(struct vm_area_struct * area);
+ void (*close)(struct vm_area_struct * area);
+ vm_fault_t (*fault)(struct vm_fault *vmf);
+ };
+
+
+The open and close operations must update the GEM object reference
+count. Drivers can use the drm_gem_vm_open() and drm_gem_vm_close() helper
+functions directly as open and close handlers.
+
+The fault operation handler is responsible for mapping individual pages
+to userspace when a page fault occurs. Depending on the memory
+allocation scheme, drivers can allocate pages at fault time, or can
+decide to allocate memory for the GEM object at the time the object is
+created.
+
+Drivers that want to map the GEM object upfront instead of handling page
+faults can implement their own mmap file operation handler.
+
+For platforms without MMU the GEM core provides a helper method
+drm_gem_cma_get_unmapped_area(). The mmap() routines will call this to get a
+proposed address for the mapping.
+
+To use drm_gem_cma_get_unmapped_area(), drivers must fill the struct
+:c:type:`struct file_operations <file_operations>` get_unmapped_area field with
+a pointer on drm_gem_cma_get_unmapped_area().
+
+More detailed information about get_unmapped_area can be found in
+Documentation/admin-guide/mm/nommu-mmap.rst
+
+Memory Coherency
+----------------
+
+When mapped to the device or used in a command buffer, backing pages for
+an object are flushed to memory and marked write combined so as to be
+coherent with the GPU. Likewise, if the CPU accesses an object after the
+GPU has finished rendering to the object, then the object must be made
+coherent with the CPU's view of memory, usually involving GPU cache
+flushing of various kinds. This core CPU<->GPU coherency management is
+provided by a device-specific ioctl, which evaluates an object's current
+domain and performs any necessary flushing or synchronization to put the
+object into the desired coherency domain (note that the object may be
+busy, i.e. an active render target; in that case, setting the domain
+blocks the client and waits for rendering to complete before performing
+any necessary flushing operations).
+
+Command Execution
+-----------------
+
+Perhaps the most important GEM function for GPU devices is providing a
+command execution interface to clients. Client programs construct
+command buffers containing references to previously allocated memory
+objects, and then submit them to GEM. At that point, GEM takes care to
+bind all the objects into the GTT, execute the buffer, and provide
+necessary synchronization between clients accessing the same buffers.
+This often involves evicting some objects from the GTT and re-binding
+others (a fairly expensive operation), and providing relocation support
+which hides fixed GTT offsets from clients. Clients must take care not
+to submit command buffers that reference more objects than can fit in
+the GTT; otherwise, GEM will reject them and no rendering will occur.
+Similarly, if several objects in the buffer require fence registers to
+be allocated for correct rendering (e.g. 2D blits on pre-965 chips),
+care must be taken not to require more fence registers than are
+available to the client. Such resource management should be abstracted
+from the client in libdrm.
+
+GEM Function Reference
+----------------------
+
+.. kernel-doc:: include/drm/drm_gem.h
+ :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem.c
+ :export:
+
+GEM CMA Helper Functions Reference
+----------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
+ :doc: cma helpers
+
+.. kernel-doc:: include/drm/drm_gem_cma_helper.h
+ :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem_cma_helper.c
+ :export:
+
+GEM SHMEM Helper Function Reference
+-----------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem_shmem_helper.c
+ :doc: overview
+
+.. kernel-doc:: include/drm/drm_gem_shmem_helper.h
+ :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem_shmem_helper.c
+ :export:
+
+GEM VRAM Helper Functions Reference
+-----------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem_vram_helper.c
+ :doc: overview
+
+.. kernel-doc:: include/drm/drm_gem_vram_helper.h
+ :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem_vram_helper.c
+ :export:
+
+GEM TTM Helper Functions Reference
+-----------------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem_ttm_helper.c
+ :doc: overview
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem_ttm_helper.c
+ :export:
+
+VMA Offset Manager
+==================
+
+.. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
+ :doc: vma offset manager
+
+.. kernel-doc:: include/drm/drm_vma_manager.h
+ :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_vma_manager.c
+ :export:
+
+.. _prime_buffer_sharing:
+
+PRIME Buffer Sharing
+====================
+
+PRIME is the cross device buffer sharing framework in drm, originally
+created for the OPTIMUS range of multi-gpu platforms. To userspace PRIME
+buffers are dma-buf based file descriptors.
+
+Overview and Lifetime Rules
+---------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_prime.c
+ :doc: overview and lifetime rules
+
+PRIME Helper Functions
+----------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_prime.c
+ :doc: PRIME Helpers
+
+PRIME Function References
+-------------------------
+
+.. kernel-doc:: include/drm/drm_prime.h
+ :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_prime.c
+ :export:
+
+DRM MM Range Allocator
+======================
+
+Overview
+--------
+
+.. kernel-doc:: drivers/gpu/drm/drm_mm.c
+ :doc: Overview
+
+LRU Scan/Eviction Support
+-------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_mm.c
+ :doc: lru scan roster
+
+DRM MM Range Allocator Function References
+------------------------------------------
+
+.. kernel-doc:: include/drm/drm_mm.h
+ :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_mm.c
+ :export:
+
+DRM Cache Handling
+==================
+
+.. kernel-doc:: drivers/gpu/drm/drm_cache.c
+ :export:
+
+DRM Sync Objects
+===========================
+
+.. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
+ :doc: Overview
+
+.. kernel-doc:: include/drm/drm_syncobj.h
+ :internal:
+
+.. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
+ :export:
+
+GPU Scheduler
+=============
+
+Overview
+--------
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+ :doc: Overview
+
+Scheduler Function References
+-----------------------------
+
+.. kernel-doc:: include/drm/gpu_scheduler.h
+ :internal:
+
+.. kernel-doc:: drivers/gpu/drm/scheduler/sched_main.c
+ :export: