1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
|
.. _apz:
Asynchronous Panning and Zooming
================================
**This document is a work in progress. Some information may be missing
or incomplete.**
.. image:: AsyncPanZoomArchitecture.png
Goals
-----
We need to be able to provide a visual response to user input with
minimal latency. In particular, on devices with touch input, content
must track the finger exactly while panning, or the user experience is
very poor. According to the UX team, 120ms is an acceptable latency
between user input and response.
Context and surrounding architecture
------------------------------------
The fundamental problem we are trying to solve with the Asynchronous
Panning and Zooming (APZ) code is that of responsiveness. By default,
web browsers operate in a “game loop” that looks like this:
::
while true:
process input
do computations
repaint content
display repainted content
In browsers the “do computation” step can be arbitrarily expensive
because it can involve running event handlers in web content. Therefore,
there can be an arbitrary delay between the input being received and the
on-screen display getting updated.
Responsiveness is always good, and with touch-based interaction it is
even more important than with mouse or keyboard input. In order to
ensure responsiveness, we split the “game loop” model of the browser
into a multithreaded variant which looks something like this:
::
Thread 1 (compositor thread)
while true:
receive input
send a copy of input to thread 2
adjust painted content based on input
display adjusted painted content
Thread 2 (main thread)
while true:
receive input from thread 1
do computations
repaint content
update the copy of painted content in thread 1
This multithreaded model is called off-main-thread compositing (OMTC),
because the compositing (where the content is displayed on-screen)
happens on a separate thread from the main thread. Note that this is a
very very simplified model, but in this model the “adjust painted
content based on input” is the primary function of the APZ code.
The “painted content” is stored on a set of “layers”, that are
conceptually double-buffered. That is, when the main thread does its
repaint, it paints into one set of layers (the “client” layers). The
update that is sent to the compositor thread copies all the changes from
the client layers into another set of layers that the compositor holds.
These layers are called the “shadow” layers or the “compositor” layers.
The compositor in theory can continuously composite these shadow layers
to the screen while the main thread is busy doing other things and
painting a new set of client layers.
The APZ code takes the input events that are coming in from the hardware
and uses them to figure out what the user is trying to do (e.g. pan the
page, zoom in). It then expresses this user intention in the form of
translation and/or scale transformation matrices. These transformation
matrices are applied to the shadow layers at composite time, so that
what the user sees on-screen reflects what they are trying to do as
closely as possible.
Technical overview
------------------
As per the heavily simplified model described above, the fundamental
purpose of the APZ code is to take input events and produce
transformation matrices. This section attempts to break that down and
identify the different problems that make this task non-trivial.
Checkerboarding
~~~~~~~~~~~~~~~
The content area that is painted and stored in a shadow layer is called
the “displayport”. The APZ code is responsible for determining how large
the displayport should be. On the one hand, we want the displayport to
be as large as possible. At the very least it needs to be larger than
what is visible on-screen, because otherwise, as soon as the user pans,
there will be some unpainted area of the page exposed. However, we
cannot always set the displayport to be the entire page, because the
page can be arbitrarily long and this would require an unbounded amount
of memory to store. Therefore, a good displayport size is one that is
larger than the visible area but not so large that it is a huge drain on
memory. Because the displayport is usually smaller than the whole page,
it is always possible for the user to scroll so fast that they end up in
an area of the page outside the displayport. When this happens, they see
unpainted content; this is referred to as “checkerboarding”, and we try
to avoid it where possible.
There are many possible ways to determine what the displayport should be
in order to balance the tradeoffs involved (i.e. having one that is too
big is bad for memory usage, and having one that is too small results in
excessive checkerboarding). Ideally, the displayport should cover
exactly the area that we know the user will make visible. Although we
cannot know this for sure, we can use heuristics based on current
panning velocity and direction to ensure a reasonably-chosen displayport
area. This calculation is done in the APZ code, and a new desired
displayport is frequently sent to the main thread as the user is panning
around.
Multiple layers
~~~~~~~~~~~~~~~
Consider, for example, a scrollable page that contains an iframe which
itself is scrollable. The iframe can be scrolled independently of the
top-level page, and we would like both the page and the iframe to scroll
responsively. This means that we want independent asynchronous panning
for both the top-level page and the iframe. In addition to iframes,
elements that have the overflow:scroll CSS property set are also
scrollable, and also end up on separate scrollable layers. In the
general case, the layers are arranged in a tree structure, and so within
the APZ code we have a matching tree of AsyncPanZoomController (APZC)
objects, one for each scrollable layer. To manage this tree of APZC
instances, we have a single APZCTreeManager object. Each APZC is
relatively independent and handles the scrolling for its associated
layer, but there are some cases in which they need to interact; these
cases are described in the sections below.
Hit detection
~~~~~~~~~~~~~
Consider again the case where we have a scrollable page that contains an
iframe which itself is scrollable. As described above, we will have two
APZC instances - one for the page and one for the iframe. When the user
puts their finger down on the screen and moves it, we need to do some
sort of hit detection in order to determine whether their finger is on
the iframe or on the top-level page. Based on where their finger lands,
the appropriate APZC instance needs to handle the input. This hit
detection is also done in the APZCTreeManager, as it has the necessary
information about the sizes and positions of the layers. Currently this
hit detection is not perfect, as it uses rects and does not account for
things like rounded corners and opacity.
Also note that for some types of input (e.g. when the user puts two
fingers down to do a pinch) we do not want the input to be “split”
across two different APZC instances. In the case of a pinch, for
example, we find a “common ancestor” APZC instance - one that is
zoomable and contains all of the touch input points, and direct the
input to that APZC instance.
Scroll Handoff
~~~~~~~~~~~~~~
Consider yet again the case where we have a scrollable page that
contains an iframe which itself is scrollable. Say the user scrolls the
iframe so that it reaches the bottom. If the user continues panning on
the iframe, the expectation is that the top-level page will start
scrolling. However, as discussed in the section on hit detection, the
APZC instance for the iframe is separate from the APZC instance for the
top-level page. Thus, we need the two APZC instances to communicate in
some way such that input events on the iframe result in scrolling on the
top-level page. This behaviour is referred to as “scroll handoff” (or
“fling handoff” in the case where analogous behaviour results from the
scrolling momentum of the page after the user has lifted their finger).
Input event untransformation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The APZC architecture by definition results in two copies of a “scroll
position” for each scrollable layer. There is the original copy on the
main thread that is accessible to web content and the layout and
painting code. And there is a second copy on the compositor side, which
is updated asynchronously based on user input, and corresponds to what
the user visually sees on the screen. Although these two copies may
diverge temporarily, they are reconciled periodically. In particular,
they diverge while the APZ code is performing an async pan or zoom
action on behalf of the user, and are reconciled when the APZ code
requests a repaint from the main thread.
Because of the way input events are stored, this has some unfortunate
consequences. Input events are stored relative to the device screen - so
if the user touches at the same physical spot on the device, the same
input events will be delivered regardless of the content scroll
position. When the main thread receives a touch event, it combines that
with the content scroll position in order to figure out what DOM element
the user touched. However, because we now have two different scroll
positions, this process may not work perfectly. A concrete example
follows:
Consider a device with screen size 600 pixels tall. On this device, a
user is viewing a document that is 1000 pixels tall, and that is
scrolled down by 200 pixels. That is, the vertical section of the
document from 200px to 800px is visible. Now, if the user touches a
point 100px from the top of the physical display, the hardware will
generate a touch event with y=100. This will get sent to the main
thread, which will add the scroll position (200) and get a
document-relative touch event with y=300. This new y-value will be used
in hit detection to figure out what the user touched. If the document
had a absolute-positioned div at y=300, then that would receive the
touch event.
Now let us add some async scrolling to this example. Say that the user
additionally scrolls the document by another 10 pixels asynchronously
(i.e. only on the compositor thread), and then does the same touch
event. The same input event is generated by the hardware, and as before,
the document will deliver the touch event to the div at y=300. However,
visually, the document is scrolled by an additional 10 pixels so this
outcome is wrong. What needs to happen is that the APZ code needs to
intercept the touch event and account for the 10 pixels of asynchronous
scroll. Therefore, the input event with y=100 gets converted to y=110 in
the APZ code before being passed on to the main thread. The main thread
then adds the scroll position it knows about and determines that the
user touched at a document-relative position of y=310.
Analogous input event transformations need to be done for horizontal
scrolling and zooming.
Content independently adjusting scrolling
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As described above, there are two copies of the scroll position in the
APZ architecture - one on the main thread and one on the compositor
thread. Usually for architectures like this, there is a single “source
of truth” value and the other value is simply a copy. However, in this
case that is not easily possible to do. The reason is that both of these
values can be legitimately modified. On the compositor side, the input
events the user is triggering modify the scroll position, which is then
propagated to the main thread. However, on the main thread, web content
might be running Javascript code that programmatically sets the scroll
position (via window.scrollTo, for example). Scroll changes driven from
the main thread are just as legitimate and need to be propagated to the
compositor thread, so that the visual display updates in response.
Because the cross-thread messaging is asynchronous, reconciling the two
types of scroll changes is a tricky problem. Our design solves this
using various flags and generation counters. The general heuristic we
have is that content-driven scroll position changes (e.g. scrollTo from
JS) are never lost. For instance, if the user is doing an async scroll
with their finger and content does a scrollTo in the middle, then some
of the async scroll would occur before the “jump” and the rest after the
“jump”.
Content preventing default behaviour of input events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Another problem that we need to deal with is that web content is allowed
to intercept touch events and prevent the “default behaviour” of
scrolling. This ability is defined in web standards and is
non-negotiable. Touch event listeners in web content are allowed call
preventDefault() on the touchstart or first touchmove event for a touch
point; doing this is supposed to “consume” the event and prevent
touch-based panning. As we saw in a previous section, the input event
needs to be untransformed by the APZ code before it can be delivered to
content. But, because of the preventDefault problem, we cannot fully
process the touch event in the APZ code until content has had a chance
to handle it. Web browsers in general solve this problem by inserting a
delay of up to 300ms before processing the input - that is, web content
is allowed up to 300ms to process the event and call preventDefault on
it. If web content takes longer than 300ms, or if it completes handling
of the event without calling preventDefault, then the browser
immediately starts processing the events.
The way the APZ implementation deals with this is that upon receiving a
touch event, it immediately returns an untransformed version that can be
dispatched to content. It also schedules a 400ms timeout (600ms on
Android) during which content is allowed to prevent scrolling. There is
an API that allows the main-thread event dispatching code to notify the
APZ as to whether or not the default action should be prevented. If the
APZ content response timeout expires, or if the main-thread event
dispatching code notifies the APZ of the preventDefault status, then the
APZ continues with the processing of the events (which may involve
discarding the events).
The touch-action CSS property from the pointer-events spec is intended
to allow eliminating this 400ms delay in many cases (although for
backwards compatibility it will still be needed for a while). Note that
even with touch-action implemented, there may be cases where the APZ
code does not know the touch-action behaviour of the point the user
touched. In such cases, the APZ code will still wait up to 400ms for the
main thread to provide it with the touch-action behaviour information.
Technical details
-----------------
This section describes various pieces of the APZ code, and goes into
more specific detail on APIs and code than the previous sections. The
primary purpose of this section is to help people who plan on making
changes to the code, while also not going into so much detail that it
needs to be updated with every patch.
Overall flow of input events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This section describes how input events flow through the APZ code.
1. Input events arrive from the hardware/widget code into the APZ via
APZCTreeManager::ReceiveInputEvent. The thread that invokes this is
called the input thread, and may or may not be the same as the Gecko
main thread.
2. Conceptually the first thing that the APZCTreeManager does is to
associate these events with “input blocks”. An input block is a set
of events that share certain properties, and generally are intended
to represent a single gesture. For example with touch events, all
events following a touchstart up to but not including the next
touchstart are in the same block. All of the events in a given block
will go to the same APZC instance and will either all be processed
or all be dropped.
3. Using the first event in the input block, the APZCTreeManager does a
hit-test to see which APZC it hits. This hit-test uses the event
regions populated on the layers, which may be larger than the true
hit area of the layer. If no APZC is hit, the events are discarded
and we jump to step 6. Otherwise, the input block is tagged with the
hit APZC as a tentative target and put into a global APZ input
queue.
4.
i. If the input events landed outside the dispatch-to-content event
region for the layer, any available events in the input block
are processed. These may trigger behaviours like scrolling or
tap gestures.
ii. If the input events landed inside the dispatch-to-content event
region for the layer, the events are left in the queue and a
400ms timeout is initiated. If the timeout expires before step 9
is completed, the APZ assumes the input block was not cancelled
and the tentative target is correct, and processes them as part
of step 10.
5. The call stack unwinds back to APZCTreeManager::ReceiveInputEvent,
which does an in-place modification of the input event so that any
async transforms are removed.
6. The call stack unwinds back to the widget code that called
ReceiveInputEvent. This code now has the event in the coordinate
space Gecko is expecting, and so can dispatch it to the Gecko main
thread.
7. Gecko performs its own usual hit-testing and event dispatching for
the event. As part of this, it records whether any touch listeners
cancelled the input block by calling preventDefault(). It also
activates inactive scrollframes that were hit by the input events.
8. The call stack unwinds back to the widget code, which sends two
notifications to the APZ code on the input thread. The first
notification is via APZCTreeManager::ContentReceivedInputBlock, and
informs the APZ whether the input block was cancelled. The second
notification is via APZCTreeManager::SetTargetAPZC, and informs the
APZ of the results of the Gecko hit-test during event dispatch. Note
that Gecko may report that the input event did not hit any
scrollable frame at all. The SetTargetAPZC notification happens only
once per input block, while the ContentReceivedInputBlock
notification may happen once per block, or multiple times per block,
depending on the input type.
9.
i. If the events were processed as part of step 4(i), the
notifications from step 8 are ignored and step 10 is skipped.
ii. If events were queued as part of step 4(ii), and steps 5-8 take
less than 400ms, the arrival of both notifications from step 8
will mark the input block ready for processing.
iii. If events were queued as part of step 4(ii), but steps 5-8 take
longer than 400ms, the notifications from step 8 will be
ignored and step 10 will already have happened.
10. If events were queued as part of step 4(ii) they are now either
processed (if the input block was not cancelled and Gecko detected a
scrollframe under the input event, or if the timeout expired) or
dropped (all other cases). Note that the APZC that processes the
events may be different at this step than the tentative target from
step 3, depending on the SetTargetAPZC notification. Processing the
events may trigger behaviours like scrolling or tap gestures.
If the CSS touch-action property is enabled, the above steps are
modified as follows: \* In step 4, the APZC also requires the allowed
touch-action behaviours for the input event. This might have been
determined as part of the hit-test in APZCTreeManager; if not, the
events are queued. \* In step 6, the widget code determines the content
element at the point under the input element, and notifies the APZ code
of the allowed touch-action behaviours. This notification is sent via a
call to APZCTreeManager::SetAllowedTouchBehavior on the input thread. \*
In step 9(ii), the input block will only be marked ready for processing
once all three notifications arrive.
Threading considerations
^^^^^^^^^^^^^^^^^^^^^^^^
The bulk of the input processing in the APZ code happens on what we call
“the input thread”. In practice the input thread could be the Gecko main
thread, the compositor thread, or some other thread. There are obvious
downsides to using the Gecko main thread - that is, “asynchronous”
panning and zooming is not really asynchronous as input events can only
be processed while Gecko is idle. In an e10s environment, using the
Gecko main thread of the chrome process is acceptable, because the code
running in that process is more controllable and well-behaved than
arbitrary web content. Using the compositor thread as the input thread
could work on some platforms, but may be inefficient on others. For
example, on Android (Fennec) we receive input events from the system on
a dedicated UI thread. We would have to redispatch the input events to
the compositor thread if we wanted to the input thread to be the same as
the compositor thread. This introduces a potential for higher latency,
particularly if the compositor does any blocking operations - blocking
SwapBuffers operations, for example. As a result, the APZ code itself
does not assume that the input thread will be the same as the Gecko main
thread or the compositor thread.
Active vs. inactive scrollframes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The number of scrollframes on a page is potentially unbounded. However,
we do not want to create a separate layer for each scrollframe right
away, as this would require large amounts of memory. Therefore,
scrollframes as designated as either “active” or “inactive”. Active
scrollframes are the ones that do have their contents put on a separate
layer (or set of layers), and inactive ones do not.
Consider a page with a scrollframe that is initially inactive. When
layout generates the layers for this page, the content of the
scrollframe will be flattened into some other PaintedLayer (call it P).
The layout code also adds the area (or bounding region in case of weird
shapes) of the scrollframe to the dispatch-to-content region of P.
When the user starts interacting with that content, the hit-test in the
APZ code finds the dispatch-to-content region of P. The input block
therefore has a tentative target of P when it goes into step 4(ii) in
the flow above. When gecko processes the input event, it must detect the
inactive scrollframe and activate it, as part of step 7. Finally, the
widget code sends the SetTargetAPZC notification in step 8 to notify the
APZ that the input block should really apply to this new layer. The
issue here is that the layer transaction containing the new layer must
reach the compositor and APZ before the SetTargetAPZC notification. If
this does not occur within the 400ms timeout, the APZ code will be
unable to update the tentative target, and will continue to use P for
that input block. Input blocks that start after the layer transaction
will get correctly routed to the new layer as there will now be a layer
and APZC instance for the active scrollframe.
This model implies that when the user initially attempts to scroll an
inactive scrollframe, it may end up scrolling an ancestor scrollframe.
(This is because in the absence of the SetTargetAPZC notification, the
input events will get applied to the closest ancestor scrollframe’s
APZC.) Only after the round-trip to the gecko thread is complete is
there a layer for async scrolling to actually occur on the scrollframe
itself. At that point the scrollframe will start receiving new input
blocks and will scroll normally.
WebRender Integration
~~~~~~~~~~~~~~~~~~~~~
The APZ code was originally written to work with the "layers" graphics
backend. Many of the concepts (and therefore variable/function names)
stem from the integration with the layers backend. After the WebRender
backend was added, the existing code evolved over time to integrate
with that backend as well, resulting in a bit of a hodge-podge effect.
With that cautionary note out of the way, there are three main pieces
that need to be understood to grasp the integration between the APZ
code and WebRender. These are detailed below.
HitTestingTree
^^^^^^^^^^^^^^
The APZCTreeManager keeps as part of its internal state a tree of
HitTestingTreeNode instances. This is referred to as the HitTestingTree.
As the name implies, this was used for hit-testing purposes, so that
APZ could determine which scrollframe a particular incoming input event
would be targeting. Doing the hit-test requires access to a bunch of state,
such as CSS transforms and clip rects, as well as ancillary data like
event regions, which affect how APZ reacts to input events.
With the layers backend, all this information was provided by a layer tree
update, and so the HitTestingTree was created to mirror the layer tree,
allowing APZ access to that information from other threads. The structure
of the tree was identical to the layer tree. But with WebRender, there
is no "layer tree" per se, and instead we "fake it" by creating a
HitTestingTree structure that is similar to what it would be like on the
equivalent layer tree. But the bigger difference is that with WebRender,
the HitTestingTree is not actually used for hit-testing at all; instead
we get WebRender to do the hit-test for us, as it can do so using its
own internal state and produce a more precise result.
Information stored in the HitTestingTree (e.g. CSS transforms) is still
used by other pieces of APZ (e.g. some of the scrollbar manipulation code)
so it is still needed, even with the WebRender backend. For this reason,
and for consistency between the two backends, we try to populate as much
information in the HitTestingTree that we can, even with the WebRender
backend.
With the layers backend, the way the HitTestingTree is created is by
walking the layer tree with a LayerMetricsWrapper class. This wraps
a layer tree but also expands layers with multiple ScrollMetadata into
multiple nodes. The equivalent in the WebRender world is the
WebRenderScrollDataWrapper, which wraps a WebRenderScrollData object. The
WebRenderScrollData object is roughly analogous to a layer tree, but
is something that is constructed deliberately rather than being a natural
output from the WebRender paint transaction (i.e. we create it explicitly
for APZ's consumption, rather than something that we would create anyway
for WebRender's consumption).
The WebRenderScrollData structure contains within it a tree of
WebRenderLayerScrollData instances, which are analogous to individual
layers in a layer tree. These instances contain various fields like
CSS transforms, fixed/sticky position info, etc. that would normally be
found on individual layers in the layer tree. This allows the code
that builds the HitTestingTree to consume either a WebRenderScrollData
or a layer tree in a more-or-less unified fashion.
Working backwards a bit more, the WebRenderLayerScrollData instances
are created as we traverse the Gecko display list and build the
WebRender display list. In the layers world, the code in FrameLayerBuilder
was responsible for building the layer tree from the Gecko display list,
but in the WebRender world, this happens primarily in WebRenderCommandBuilder.
As of this writing, the architecture for this is that, as we walk
the Gecko display list, we query it to see if it contains any information
that APZ might need to know (e.g. CSS transforms) via a call to
`nsDisplayItem::UpdateScrollData(nullptr, nullptr)`. If this call
returns true, we create a WebRenderLayerScrollData instance for the item,
and populate it with the necessary information in
`WebRenderLayerScrollData::Initialize`. We also create
WebRenderLayerScrollData instances if we detect (via ASR changes) that
we are now processing a Gecko display item that is in a different scrollframe
than the previous item. This is equivalent to how FrameLayerBuilder will
flatten items with different ASRs into different layers, so that it
is cheap to scroll scrollframes in the compositor.
The main sources of complexity in this code come from:
1. Ensuring the ScrollMetadata instances end on the proper
WebRenderLayerScrollData instances (such that every path from a leaf
WebRenderLayerScrollData node to the root has a consistent ordering of
scrollframes without duplications).
2. The deferred-transform optimization that is described in more detail
at the declaration of StackingContextHelper::mDeferredTransformItem.
Hit-testing
^^^^^^^^^^^
Since the HitTestingTree is not used for actual hit-testing purposes
with the WebRender backend (see previous section), this section describes
how hit-testing actually works with WebRender.
With both layers and WebRender, the Gecko display list contains display items
(`nsDisplayCompositorHitTestInfo`) that store hit-testing state. These
items implement the `CreateWebRenderCommands` method and generate a "hit-test
item" into the WebRender display list. This is basically just a rectangle
item in the WebRender display list that is no-op for painting purposes,
but contains information that should be returned by the hit-test (specifically
the hit info flags and the scrollId of the enclosing scrollframe). The
hit-test item gets clipped and transformed in the same way that all the other
items in the WebRender display list do, via clip chains and enclosing
reference frame/stacking context items.
When WebRender needs to do a hit-test, it goes through its display list,
taking into account the current clips and transforms, adjusted for the
most recent async scroll/zoom, and determines which hit-test item(s) are under
the target point, and returns those items. APZ can then take the frontmost
item from that list (or skip over it if it happens to be inside a OOP
subdocument that's pointer-events:none) and use that as the hit target.
It's important to note that when APZ does hit-testing for the layers backend,
it uses the most recent available async transforms, even if those transforms
have not yet been composited. With WebRender, the hit-test uses the last
transform provided by the `SampleForWebRender` API (see next section) which
generally reflects the last composite, and doesn't take into account further
changes to the transforms that have occurred since then.
When debugging hit-test issues, it is often useful to apply the patches
on bug 1656260, which introduce a guid on Gecko display items and propagate
it all the way through to where APZ gets the hit-test result. This allows
answering the question "which nsDisplayCompositorHitTestInfo was responsible
for this hit-test result?" which is often a very good first step in
solving the bug. From there, one can determine if there was some other
display item in front that should have generated a
nsDisplayCompositorHitTestInfo but didn't, or if display item itself had
incorrect information. The second patch on that bug further allows exposing
hand-written debug info to the APZ code, so that the WR hit-testing
mechanism itself can be more effectively debugged, in case there is a problem
with the WR display items getting improperly transformed or clipped.
Sampling
^^^^^^^^
With both the layers and WebRender backend, the compositing step needs to
read the latest async transforms from APZ in order to ensure scrollframes
are rendered at the right position. In both cases, the API for this is
exposed via the `APZSampler` class. The difference is that with the layers
backend, the `AsyncCompositionManager` walks the layer tree and queries
the transform components for each layer individually via the various getters
on `APZSampler`. In contrast, with the WebRender backend, there is a single
`APZSampler::SampleForWebRender` API that returns all the information needed
for all the scrollframes, scrollthumbs, etc. Conceptually though, the
functionality is pretty similar, because the compositor needs the same
information from APZ regardless of which backend is in use.
Along with sampling the APZ transforms, the compositor also triggers APZ
animations to advance to the next timestep (usually the next vsync). Again,
with both the WebRender and layers backend, this happens just before reading
the APZ transforms. The only difference is that with the layers backend,
the `AsyncCompositionManager` invokes the `APZSampler::AdvanceAnimations` API
directly, whereas with the WebRender backend this happens as part of the
`APZSampler::SampleForWebRender` implementation.
Threading / Locking Overview
----------------------------
Threads
~~~~~~~
There are three threads relevant to APZ: the **controller thread**,
the **updater thread**, and the **sampler thread**. This table lists
which threads play these roles on each platform / configuration:
===================== ========== =========== ============= ============== ========== =============
APZ Thread Name Desktop Desktop+GPU Desktop+WR Desktop+WR+GPU Android Android+WR
===================== ========== =========== ============= ============== ========== =============
**controller thread** UI main GPU main UI main GPU main Java UI Java UI
**updater thread** Compositor Compositor SceneBuilder SceneBuilder Compositor SceneBuilder
**sampler thread** Compositor Compositor RenderBackend RenderBackend Compositor RenderBackend
===================== ========== =========== ============= ============== ========== =============
Locks
~~~~~
There are also a number of locks used in APZ code:
======================= ==============================
Lock type How many instances
======================= ==============================
APZ tree lock one per APZCTreeManager
APZC map lock one per APZCTreeManager
APZC instance lock one per AsyncPanZoomController
APZ test lock one per APZCTreeManager
Checkerboard event lock one per AsyncPanZoomController
======================= ==============================
Thread / Lock Ordering
~~~~~~~~~~~~~~~~~~~~~~
To avoid deadlocks, the threads and locks have a global **ordering**
which must be respected.
Respecting the ordering means the following:
- Let "A < B" denote that A occurs earlier than B in the ordering
- Thread T may only acquire lock L, if T < L
- A thread may only acquire lock L2 while holding lock L1, if L1 < L2
- A thread may only block on a response from another thread T while holding a lock L, if L < T
**The lock ordering is as follows**:
1. UI main
2. GPU main (only if GPU enabled)
3. Compositor thread
4. SceneBuilder thread (only if WR enabled)
5. **APZ tree lock**
6. RenderBackend thread (only if WR enabled)
7. **APZC map lock**
8. **APZC instance lock**
9. **APZ test lock**
10. **Checkerboard event lock**
Example workflows
^^^^^^^^^^^^^^^^^
Here are some example APZ workflows. Observe how they all obey
the global thread/lock ordering. Feel free to add others:
- **Input handling** (in WR+GPU) case: UI main -> GPU main -> APZ tree lock -> RenderBackend thread
- **Sync messages** in ``PCompositorBridge.ipdl``: UI main thread -> Compositor thread
- **GetAPZTestData**: Compositor thread -> SceneBuilder thread -> test lock
- **Scene swap**: SceneBuilder thread -> APZ tree lock -> RenderBackend thread
- **Updating hit-testing tree**: SceneBuilder thread -> APZ tree lock -> APZC instance lock
- **Updating APZC map**: SceneBuilder thread -> APZ tree lock -> APZC map lock
- **Sampling and animation deferred tasks** [1]_: RenderBackend thread -> APZC map lock -> APZC instance lock
- **Advancing animations**: RenderBackend thread -> APZC instance lock
.. [1] It looks like there are two deferred tasks that actually need the tree lock,
``AsyncPanZoomController::HandleSmoothScrollOverscroll`` and
``AsyncPanZoomController::HandleFlingOverscroll``. We should be able to rewrite
these to use the map lock instead of the tree lock.
This will allow us to continue running the deferred tasks on the sampler
thread rather than having to bounce them to another thread.
|