# Rendering content: pl_frame, pl_renderer, and pl_queue

This example roughly builds off the [previous entry](./basic-rendering.md),
and as such will not cover the basics of how to create a window, initialize a
`pl_gpu` and get pixels onto the screen.

## Renderer

The `pl_renderer` set of APIs represents the highest-level interface into
libplacebo, and is what most users who simply want to display e.g. a video
feed on-screen will want to be using.

The basic initialization is straightforward, requiring no extra parameters:

``` c linenums="1"
pl_renderer renderer;

void init()
{
    renderer = pl_renderer_create(pllog, gpu);
    if (!renderer)
        goto error;

    // ...
}

void uninit()
{
    pl_renderer_destroy(&renderer);
}
```

What makes the renderer powerful is the large number of `pl_render_params` it
exposes. By default, libplacebo provides several presets to use:

* **pl_render_fast_params**: Disables everything except for defaults. This is
  the fastest possible configuration.
* **pl_render_default_params**: Contains the recommended default parameters,
  including some slightly higher quality scaling, as well as dithering.
* **pl_render_high_quality_params**: A preset of reasonable defaults for a
  higher-end machine (i.e. anything with a discrete GPU). This enables most
  of the basic functionality, including upscaling, downscaling, debanding
  and better HDR tone mapping.

Covering all of the possible options exposed by `pl_render_params` is
beyond the scope of this example; for the full list, consult [the API
documentation](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/renderer.h#L94).

### Frames

[`pl_frame`](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/renderer.h#L503)
is the struct libplacebo uses to group textures and their metadata together
into a coherent unit that can be rendered using the renderer. This is not
currently a dynamically allocated or refcounted heap object; it is merely a
struct that can live on the stack (or anywhere else). The actual data lives in
corresponding `pl_tex` objects referenced in each of the frame's planes.

``` c linenums="1"
bool render_frame(const struct pl_frame *image,
                  const struct pl_swapchain_frame *swframe)
{
    struct pl_frame target;
    pl_frame_from_swapchain(&target, swframe);

    return pl_render_image(renderer, image, &target,
                           &pl_render_default_params);
}
```

!!! note "Renderer state"
    The `pl_renderer` is conceptually (almost) stateless. The only thing that
    is needed to get a different result is to change the render params, which
    can be varied freely on every call, if the user desires.

    The one case where this is not entirely true is when using frame mixing
    (see below), or when using HDR peak detection. In this case, the renderer
    can be explicitly reset using `pl_renderer_flush_cache`.

To upload frames, the easiest methods are made available as dedicated helpers
in
[`<libplacebo/utils/upload.h>`](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/utils/upload.h),
and
[`<libplacebo/utils/libav.h>`](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/utils/libav.h)
(for AVFrames). In general, I recommend checking out the [demo
programs](https://code.videolan.org/videolan/libplacebo/-/tree/master/demos)
for a clearer illustration of how to use them in practice.

### Shader cache

The renderer internally generates, compiles and caches a potentially large
number of shader programs, some of which can be complex. On some platforms
(notably D3D11), these can be quite costly to recompile on every program
launch.

As such, the renderer offers a way to save/restore its internal shader cache
from some external location (managed by the API user). The use of this API is
highly recommended:

``` c linenums="1" hl_lines="1-2 10-14 21-27"
static uint8_t *load_saved_cache();
static void store_saved_cache(uint8_t *cache, size_t bytes);

void init()
{
    renderer = pl_renderer_create(pllog, gpu);
    if (!renderer)
        goto error;

    uint8_t *cache = load_saved_cache();
    if (cache) {
        pl_renderer_load(renderer, cache);
        free(cache);
    }

    // ...
}

void uninit()
{
    size_t cache_bytes = pl_renderer_save(renderer, NULL);
    uint8_t *cache = malloc(cache_bytes);
    if (cache) {
        pl_renderer_save(renderer, cache);
        store_saved_cache(cache, cache_bytes);
        free(cache);
    }

    pl_renderer_destroy(&renderer);
}
```

!!! warning "Cache safety"
    libplacebo performs only minimal validity checking on the shader cache,
    and in general, cannot possibly guard against malicious alteration of such
    files. Loading a cache from an untrusted source represents a remote code
    execution vector.
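
The `load_saved_cache` and `store_saved_cache` functions declared above are
left to the API user. As a minimal sketch, assuming the cache simply lives in
a single file, they could look something like this. The `CACHE_PATH` name and
location are hypothetical, and nothing here is part of libplacebo itself:

``` c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical cache location; a real application would more likely use a
// per-user cache directory.
#define CACHE_PATH "libplacebo_shaders.cache"

// Reads the whole cache file into a malloc'd buffer. Returns NULL if the
// file is missing, empty, or cannot be read. The caller frees the result.
static uint8_t *load_saved_cache(void)
{
    FILE *f = fopen(CACHE_PATH, "rb");
    if (!f)
        return NULL;

    uint8_t *buf = NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);
        if (size > 0 && fseek(f, 0, SEEK_SET) == 0) {
            buf = malloc((size_t) size);
            if (buf && fread(buf, 1, (size_t) size, f) != (size_t) size) {
                free(buf);
                buf = NULL;
            }
        }
    }

    fclose(f);
    return buf;
}

// Writes the serialized cache to disk. On any failure the partial file is
// removed, so that a truncated cache is not picked up on the next launch.
static void store_saved_cache(uint8_t *cache, size_t bytes)
{
    FILE *f = fopen(CACHE_PATH, "wb");
    if (!f)
        return;

    size_t written = fwrite(cache, 1, bytes, f);
    if (fclose(f) != 0 || written != bytes)
        remove(CACHE_PATH);
}
```

Given the warning above, such a file should only ever be stored somewhere
that other users or processes cannot write to.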

## Frame mixing

One of the renderer's most powerful features is its ability to compensate
for differences in framerates between the source and display by using [frame
mixing](https://github.com/mpv-player/mpv/wiki/Interpolation) to blend
adjacent frames together.

Using this API requires presenting the renderer, at each vsync, with a
`pl_frame_mix` struct describing the current state of the vsync. In
principle, such structs can be constructed by hand. To do this, all of the
relevant frames (those near the vsync timestamp) must be collected, and their
relative distances to the vsync determined, by normalizing all PTS values such
that the vsync represents time `0.0` (and a distance of `1.0` represents the
nominal duration between adjacent frames). Note that timing vsyncs, and
determining the correct vsync duration, are both left as problems for the user
to solve.[^timing] The following is an example of a valid struct:

[^timing]: However, this may change in the future, as the recent introduction of
  the Vulkan display timing extension may result in display timing feedback
  being added to the `pl_swapchain` API. That said, as of writing, this has
  not yet happened.

``` c
(struct pl_frame_mix) {
    .num_frames = 6,
    .frames = (const struct pl_frame *[]) {
        /* frame 0 */
        /* frame 1 */
        /* ... */
        /* frame 5 */
    },
    .signatures = (uint64_t[]) {
        0x0, 0x1, 0x2, 0x3, 0x4, 0x5 // (1)
    },
    .timestamps = (float[]) {
        -2.4, -1.4, -0.4, 0.6, 1.6, 2.6, // (2)
    },
    .vsync_duration = 0.4, // 24 fps video on 60 fps display
}
```

1.  Each signature must be unique to its frame, and must stay the same every
    time that frame appears in a mix. For example, this could be based on the
    frame's PTS, the frame's numerical ID (in order of decoding), or some sort
    of hash. The details don't matter, only that this uniquely identifies
    specific frames.

2.  Typically, for CFR sources, frame timestamps will always be separated in
    this list by a distance of 1.0. In this example, the vsync falls roughly
    halfway (but not quite) in between two adjacent frames (with IDs 0x2 and
    0x3).
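
The normalization described above can be sketched in plain C. This is only an
illustration, not libplacebo code: `normalize_pts` and `normalize_vsync` are
hypothetical helpers, and all inputs are assumed to be in seconds.

``` c
#include <stddef.h>

// Hypothetical helper (not part of libplacebo): converts absolute frame PTS
// values (in seconds) into the relative units used by `pl_frame_mix`, where
// the vsync sits at 0.0 and 1.0 is one nominal frame duration.
static void normalize_pts(const double *pts, size_t num, double vsync_pts,
                          double frame_duration, float *out_timestamps)
{
    for (size_t i = 0; i < num; i++)
        out_timestamps[i] = (float) ((pts[i] - vsync_pts) / frame_duration);
}

// Hypothetical helper: expresses the display's vsync interval in the same
// units, e.g. (1/60) / (1/24) = 0.4 for 24 fps video on a 60 Hz display.
static float normalize_vsync(double vsync_interval, double frame_duration)
{
    return (float) (vsync_interval / frame_duration);
}
```

For instance, six 24 fps frames starting at PTS `0.0`, with a vsync at `0.1`
seconds, would produce the timestamps used in the struct above.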

!!! note "Frame mixing radius"
    In this example, the frame mixing radius (as determined by
    `pl_frame_mix_radius`) is `3.0`, so we include all frames that fall within
    the timestamp interval of `[-3, 3)`. In general, you should consult this
    function to determine what frames need to be included in the
    `pl_frame_mix`, though including more frames than needed is not an error.
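
When building a mix by hand, picking the frames to include might look roughly
like the following sketch. Both functions are hypothetical helpers rather than
libplacebo API, and they assume the half-open interval described in the note;
in real code, `radius` would come from `pl_frame_mix_radius`.

``` c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper (not part of libplacebo): returns true if a frame with
// the given normalized timestamp falls inside the half-open mixing interval
// [-radius, radius).
static bool in_mix_radius(float timestamp, float radius)
{
    return timestamp >= -radius && timestamp < radius;
}

// Stores the indices of all frames inside the interval into `out_indices`
// (which must have room for `num` entries) and returns how many were kept.
static size_t select_mix_frames(const float *timestamps, size_t num,
                                float radius, size_t *out_indices)
{
    size_t kept = 0;
    for (size_t i = 0; i < num; i++) {
        if (in_mix_radius(timestamps[i], radius))
            out_indices[kept++] = i;
    }
    return kept;
}
```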

### Frame queue

Because this API is rather unwieldy and clumsy to use directly, libplacebo
provides a helper abstraction known as `pl_queue` to assist in transforming
some arbitrary source of frames (such as a video decoder) into nicely packed
`pl_frame_mix` structs ready for consumption by the `pl_renderer`:

``` c linenums="1"
#include <libplacebo/utils/frame_queue.h>

pl_queue queue;

void init()
{
    queue = pl_queue_create(gpu);
}

void uninit()
{
    pl_queue_destroy(&queue);
    // ...
}
```

This queue can be interacted with through a number of mechanisms: either
pushing frames (blocking or non-blocking), or by having the queue poll frames
(via blocking or non-blocking callback) as-needed. For a full overview of the
various methods of pushing and polling frames, check the [API
documentation](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/utils/frame_queue.h#L115).

In this example, I will assume that we have a separate decoder thread pushing
frames into the `pl_queue` in a blocking manner:

``` c linenums="1"
static void decoder_thread(void)
{
    void *frame;

    while ((frame = /* decode new frame */)) {
        pl_queue_push_block(queue, UINT64_MAX, &(struct pl_source_frame) {
            .pts        = /* frame pts */,
            .duration   = /* frame duration */,
            .map        = /* map callback */,
            .unmap      = /* unmap callback */,
            .frame_data = frame,
        });
    }

    pl_queue_push(queue, NULL); // signal EOF
}
```

Now, in our render loop, we want to call `pl_queue_update` with appropriate
values to retrieve the correct frame mix for each vsync:

``` c linenums="1" hl_lines="3-10 12-21 27"
bool render_frame(const struct pl_swapchain_frame *swframe)
{
    struct pl_frame_mix mix;
    enum pl_queue_status res;
    res = pl_queue_update(queue, &mix, pl_queue_params(
        .pts            = /* time of next vsync */,
        .radius         = pl_frame_mix_radius(&pl_render_default_params),
        .vsync_duration = /* if known */,
        .timeout        = UINT64_MAX, // (2)
    ));

    switch (res) {
    case PL_QUEUE_OK:
        break;
    case PL_QUEUE_EOF:
        /* no more frames */
        return false;
    case PL_QUEUE_ERR:
        goto error;
    // (1)
    }


    struct pl_frame target;
    pl_frame_from_swapchain(&target, swframe);

    return pl_render_image_mix(renderer, &mix, &target,
                               &pl_render_default_params);
}
```

1.  There is a fourth status, `PL_QUEUE_MORE`, which is returned only if the
    resulting frame mix is incomplete (and the timeout was reached). This can
    only happen if the queue runs dry because frames are not being supplied
    fast enough.

    In this example, since we are setting `timeout` to `UINT64_MAX`, we will
    never get this return value.

2.  Setting this makes `pl_queue_update` block indefinitely until sufficiently
    many frames have been pushed into the `pl_queue` from our separate
    decoding thread.

### Deinterlacing

The frame queue also vastly simplifies the process of performing
motion-adaptive temporal deinterlacing, by automatically linking together
adjacent fields/frames. To take advantage of this, all you need to do is set
the appropriate field (`pl_source_frame.first_field`) and enable the
[deinterlacing
parameters](https://code.videolan.org/videolan/libplacebo/-/blob/master/src/include/libplacebo/renderer.h#L186).