diff options
Diffstat (limited to 'docs/renderer.md')
-rw-r--r-- | docs/renderer.md | 302 |
1 files changed, 302 insertions, 0 deletions
diff --git a/docs/renderer.md b/docs/renderer.md new file mode 100644 index 0000000..3104b0d --- /dev/null +++ b/docs/renderer.md @@ -0,0 +1,302 @@ +# Rendering content: pl_frame, pl_renderer, and pl_queue + +This example roughly builds off the [previous entry](./basic-rendering.md), +and as such will not cover the basics of how to create a window, initialize a +`pl_gpu` and get pixels onto the screen. + +## Renderer + +The `pl_renderer` set of APIs represents the highest-level interface into +libplacebo, and is what most users who simply want to display e.g. a video +feed on-screen will want to be using. + +The basic initialization is straightforward, requiring no extra parameters: + +``` c linenums="1" +pl_renderer renderer; + +init() +{ + renderer = pl_renderer_create(pllog, gpu); + if (!renderer) + goto error; + + // ... +} + +uninit() +{ + pl_renderer_destroy(&renderer); +} +``` + +What makes the renderer powerful is the large number of `pl_render_params` it +exposes. By default, libplacebo provides several presets to use: + +* **pl_render_fast_params**: Disables everything except for defaults. This is + the fastest possible configuration. +* **pl_render_default_params**: Contains the recommended default parameters, + including some slightly higher quality scaling, as well as dithering. +* **pl_render_high_quality_params**: A preset of reasonable defaults for a + higher-end machine (i.e. anything with a discrete GPU). This enables most + of the basic functionality, including upscaling, downscaling, debanding + and better HDR tone mapping. + +Covering all of the possible options exposed by `pl_render_params` is +out-of-scope of this example and would be better served by looking at [the API +documentation](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/renderer.h#L94). + +### Frames + +[`pl_frame`](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/renderer.h#L503) +is the struct libplacebo uses to group textures and their metadata together +into a coherent unit that can be rendered using the renderer. This is not +currently a dynamically allocated or refcounted heap object, it is merely a +struct that can live on the stack (or anywhere else). The actual data lives in +corresponding `pl_tex` objects referenced in each of the frame's planes. + +``` c linenums="1" +bool render_frame(const struct pl_frame *image, + const struct pl_swapchain_frame *swframe) +{ + struct pl_frame target; + pl_frame_from_swapchain(&target, swframe); + + return pl_render_image(renderer, image, target, + &pl_render_default_params); +} +``` + +!!! note "Renderer state" + The `pl_renderer` is conceptually (almost) stateless. The only thing that + is needed to get a different result is to change the render params, which + can be varied freely on every call, if the user desires. + + The one case where this is not entirely true is when using frame mixing + (see below), or when using HDR peak detection. In this case, the renderer + can be explicitly reset using `pl_renderer_flush_cache`. + +To upload frames, the easiest methods are made available as dedicated helpers +in +[`<libplacebo/utils/upload.h>`](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/utils/upload.h), +and +[`<libplacebo/utils/libav.h>`](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/utils/libav.h) +(for AVFrames). In general, I recommend checking out the [demo +programs](https://code.videolan.org/videolan/libplacebo/-/tree/master/demos) +for a clearer illustration of how to use them in practice. + +### Shader cache + +The renderer internally generates, compiles and caches a potentially large +number of shader programs, some of which can be complex. On some platforms +(notably D3D11), these can be quite costly to recompile on every program +launch. + +As such, the renderer offers a way to save/restore its internal shader cache +from some external location (managed by the API user). The use of this API is +highly recommended: + +``` c linenums="1" hl_lines="1-2 10-14 21-27" +static uint8_t *load_saved_cache(); +static void store_saved_cache(uint8_t *cache, size_t bytes); + +void init() +{ + renderer = pl_renderer_create(pllog, gpu); + if (!renderer) + goto error; + + uint8_t *cache = load_saved_cache(); + if (cache) { + pl_renderer_load(renderer, cache); + free(cache); + } + + // ... +} + +void uninit() +{ + size_t cache_bytes = pl_renderer_save(renderer, NULL); + uint8_t *cache = malloc(cache_bytes); + if (cache) { + pl_renderer_save(renderer, cache); + store_saved_cache(cache, cache_bytes); + free(cache); + } + + pl_renderer_destroy(&renderer); +} +``` + +!!! warning "Cache safety" + libplacebo performs only minimal validity checking on the shader cache, + and in general, cannot possibly guard against malicious alteration of such + files. Loading a cache from an untrusted source represents a remote code + execution vector. + +## Frame mixing + +One of the renderer's most powerful features is its ability to compensate +for differences in framerates between the source and display by using [frame +mixing](https://github.com/mpv-player/mpv/wiki/Interpolation) to blend +adjacent frames together. + +Using this API requires presenting the renderer, at each vsync, with a +`pl_frame_mix` struct, describing the current state of the vsync. In +principle, such structs can be constructed by hand. To do this, all of the +relevant frames (nearby the vsync timestamp) must be collected, and their +relative distances to the vsync determined, by normalizing all PTS values such +that the vsync represents time `0.0` (and a distance of `1.0` represents the +nominal duration between adjacent frames). Note that timing vsyncs, and +determining the correct vsync duration, are both left as problems for the user +to solve.[^timing]. Here could be an example of a valid struct: + +[^timing]: However, this may change in the future, as the recent introduction of + the Vulkan display timing extension may result in display timing feedback + being added to the `pl_swapchain` API. That said, as of writing, this has + not yet happened. + +``` c +(struct pl_frame_mix) { + .num_frames = 6 + .frames = (const struct pl_frame *[]) { + /* frame 0 */ + /* frame 1 */ + /* ... */ + /* frame 5 */ + }, + .signatures = (uint64_t[]) { + 0x0, 0x1, 0x2, 0x3, 0x4, 0x5 // (1) + }, + .timestamps = (float[]) { + -2.4, -1.4, -0.4, 0.6, 1.6, 2.6, // (2) + }, + .vsync_duration = 0.4, // 24 fps video on 60 fps display +} +``` + +1. These must be unique per frame, but always refer to the same frame. For + example, this could be based on the frame's PTS, the frame's numerical ID + (in order of decoding), or some sort of hash. The details don't matter, + only that this uniquely identifies specific frames. + +2. Typically, for CFR sources, frame timestamps will always be separated in + this list by a distance of 1.0. In this example, the vsync falls roughly + halfway (but not quite) in between two adjacent frames (with IDs 0x2 and + 0x3). + +!!! note "Frame mixing radius" + In this example, the frame mixing radius (as determined by + `pl_frame_mix_radius` is `3.0`, so we include all frames that fall within + the timestamp interval of `[-3, 3)`. In general, you should consult this + function to determine what frames need to be included in the + `pl_frame_mix` - though including more frames than needed is not an error. + +### Frame queue + +Because this API is rather unwieldy and clumsy to use directly, libplacebo +provides a helper abstraction known as `pl_queue` to assist in transforming +some arbitrary source of frames (such as a video decoder) into nicely packed +`pl_frame_mix` structs ready for consumption by the `pl_renderer`: + +``` c linenums="1" +#include <libplacebo/utils/frame_queue.h> + +pl_queue queue; + +void init() +{ + queue = pl_queue_create(gpu); +} + +void uninit() +{ + pl_queue_destroy(&queue); + // ... +} +``` + +This queue can be interacted with through a number of mechanisms: either +pushing frames (blocking or non-blocking), or by having the queue poll frames +(via blocking or non-blocking callback) as-needed. For a full overview of the +various methods of pushing and polling frames, check the [API +documentation](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/utils/frame_queue.h#L115). + +In this example, I will assume that we have a separate decoder thread pushing +frames into the `pl_queue` in a blocking manner: + +``` c linenums="1" +static void decoder_thread(void) +{ + void *frame; + + while ((frame = /* decode new frame */)) { + pl_queue_push_block(queue, UINT64_MAX, &(struct pl_source_frame) { + .pts = /* frame pts */, + .duration = /* frame duration */, + .map = /* map callback */, + .unmap = /* unmap callback */, + .frame_data = frame, + }); + } + + pl_queue_push(queue, NULL); // signal EOF +} +``` + +Now, in our render loop, we want to call `pl_queue_update` with appropriate +values to retrieve the correct frame mix for each vsync: + +``` c linenums="1" hl_lines="3-10 12-21 27" +bool render_frame(const struct pl_swapchain_frame *swframe) +{ + struct pl_frame_mix mix; + enum pl_queue_status res; + res = pl_queue_update(queue, &mix, pl_queue_params( + .pts = /* time of next vsync */, + .radius = pl_frame_mix_radius(&render_params), + .vsync_duration = /* if known */, + .timeout = UINT64_MAX, // (2) + )); + + switch (res) { + case PL_QUEUE_OK: + break; + case PL_QUEUE_EOF: + /* no more frames */ + return false; + case PL_QUEUE_ERR: + goto error; + // (1) + } + + + struct pl_frame target; + pl_frame_from_swapchain(&target, swframe); + + return pl_render_image_mix(renderer, &mix, target, + &pl_render_default_params); +} +``` + +1. There is a fourth status, `PL_QUEUE_MORE`, which is returned only if the + resulting frame mix is incomplete (and the timeout was reached) - + basically this can only happen if the queue runs dry due to frames not + being supplied fast enough. + + In this example, since we are setting `timeout` to `UINT64_MAX`, we will + never get this return value. + +2. Setting this makes `pl_queue_update` block indefinitely until sufficiently + many frames have been pushed into the `pl_queue` from our separate + decoding thread. + +### Deinterlacing + +The frame queue also vastly simplifies the process of performing +motion-adaptive temporal deinterlacing, by automatically linking together +adjacent fields/frames. To take advantage of this, all you need to do is set +the appropriate field (`pl_source_frame.first_frame`), as well as enabling +[deinterlacing +parameters](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/renderer.h#L186). |