diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-07 09:22:09 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-07 09:22:09 +0000 |
commit | 43a97878ce14b72f0981164f87f2e35e14151312 (patch) | |
tree | 620249daf56c0258faa40cbdcf9cfba06de2a846 /gfx/docs/AsyncPanZoom.rst | |
parent | Initial commit. (diff) | |
download | firefox-upstream.tar.xz firefox-upstream.zip |
Adding upstream version 110.0.1.upstream/110.0.1upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'gfx/docs/AsyncPanZoom.rst')
-rw-r--r-- | gfx/docs/AsyncPanZoom.rst | 929 |
1 files changed, 929 insertions, 0 deletions
diff --git a/gfx/docs/AsyncPanZoom.rst b/gfx/docs/AsyncPanZoom.rst new file mode 100644 index 0000000000..39bdc48c7d --- /dev/null +++ b/gfx/docs/AsyncPanZoom.rst @@ -0,0 +1,929 @@ +.. _apz: + +Asynchronous Panning and Zooming +================================ + +**This document is a work in progress. Some information may be missing +or incomplete.** + +.. image:: AsyncPanZoomArchitecture.png + +Goals +----- + +We need to be able to provide a visual response to user input with +minimal latency. In particular, on devices with touch input, content +must track the finger exactly while panning, or the user experience is +very poor. According to the UX team, 120ms is an acceptable latency +between user input and response. + +Context and surrounding architecture +------------------------------------ + +The fundamental problem we are trying to solve with the Asynchronous +Panning and Zooming (APZ) code is that of responsiveness. By default, +web browsers operate in a “game loop” that looks like this: + +:: + + while true: + process input + do computations + repaint content + display repainted content + +In browsers the “do computation” step can be arbitrarily expensive +because it can involve running event handlers in web content. Therefore, +there can be an arbitrary delay between the input being received and the +on-screen display getting updated. + +Responsiveness is always good, and with touch-based interaction it is +even more important than with mouse or keyboard input. In order to +ensure responsiveness, we split the “game loop” model of the browser +into a multithreaded variant which looks something like this: + +:: + + Thread 1 (compositor thread) + while true: + receive input + send a copy of input to thread 2 + adjust rendered content based on input + display adjusted rendered content + + Thread 2 (main thread) + while true: + receive input from thread 1 + do computations + rerender content + update the copy of rendered content in thread 1 + +This multithreaded model is called off-main-thread compositing (OMTC), +because the compositing (where the content is displayed on-screen) +happens on a separate thread from the main thread. Note that this is a +very very simplified model, but in this model the “adjust rendered +content based on input” is the primary function of the APZ code. + +A couple of notes on APZ's relationship to other browser architecture +improvements: + +1. Due to Electrolysis (e10s), Site Isolation (Fission), and GPU Process + isolation, the above two threads often actually run in different + processes. APZ is largely agnostic to this, as all communication + between the two threads for APZ purposes happens using an IPC layer + that abstracts over communication between threads vs. processes. +2. With the WebRender graphics backend, part of the rendering pipeline is + also offloaded from the main thread. In this architecture, the + information sent from the main thread consists of a display list, and + scrolling-related metadata referencing content in that display list. + The metadata is kept in a queue until the display list undergoes an + additional rendering step in the compositor (scene building). At this + point, we are ready to tell APZ about the new content and have it + start applying adjustments to it, as further rendering steps beyond + scene building are done synchronously on each composite. + +The compositor in theory can continuously composite previously rendered +content (adjusted on each composite by APZ) to the screen while the +main thread is busy doing other things and rendering new content. + +The APZ code takes the input events that are coming in from the hardware +and uses them to figure out what the user is trying to do (e.g. pan the +page, zoom in). It then expresses this user intention in the form of +translation and/or scale transformation matrices. These transformation +matrices are applied to the rendered content at composite time, so that +what the user sees on-screen reflects what they are trying to do as +closely as possible. + +Technical overview +------------------ + +As per the heavily simplified model described above, the fundamental +purpose of the APZ code is to take input events and produce +transformation matrices. This section attempts to break that down and +identify the different problems that make this task non-trivial. + +Checkerboarding +~~~~~~~~~~~~~~~ + +The area of page content for which a display list is built and sent to +the compositor is called the “displayport”. The APZ code is responsible +for determining how large the displayport should be. On the one hand, we +want the displayport to be as large as possible. At the very least it +needs to be larger than what is visible on-screen, because otherwise, as +soon as the user pans, there will be some unpainted area of the page +exposed. However, we cannot always set the displayport to be the entire +page, because the page can be arbitrarily long and this would require an +unbounded amount of memory to store. Therefore, a good displayport size +is one that is larger than the visible area but not so large that it is a +huge drain on memory. Because the displayport is usually smaller than the +whole page, it is always possible for the user to scroll so fast that +they end up in an area of the page outside the displayport. When this +happens, they see unpainted content; this is referred to as +“checkerboarding”, and we try to avoid it where possible. + +There are many possible ways to determine what the displayport should be +in order to balance the tradeoffs involved (i.e. having one that is too +big is bad for memory usage, and having one that is too small results in +excessive checkerboarding). Ideally, the displayport should cover +exactly the area that we know the user will make visible. Although we +cannot know this for sure, we can use heuristics based on current +panning velocity and direction to ensure a reasonably-chosen displayport +area. This calculation is done in the APZ code, and a new desired +displayport is frequently sent to the main thread as the user is panning +around. + +Multiple scrollable elements +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Consider, for example, a scrollable page that contains an iframe which +itself is scrollable. The iframe can be scrolled independently of the +top-level page, and we would like both the page and the iframe to scroll +responsively. This means that we want independent asynchronous panning +for both the top-level page and the iframe. In addition to iframes, +elements that have the overflow:scroll CSS property set are also +scrollable. In the display list, scrollable elements are arranged in a +tree structure, and in the APZ code we have a matching tree of +AsyncPanZoomController (APZC) objects, one for each scrollable element. +To manage this tree of APZC instances, we have a single APZCTreeManager +object. Each APZC is relatively independent and handles the scrolling for +its associated scrollable element, but there are some cases in which they +need to interact; these cases are described in the sections below. + +Hit detection +~~~~~~~~~~~~~ + +Consider again the case where we have a scrollable page that contains an +iframe which itself is scrollable. As described above, we will have two +APZC instances - one for the page and one for the iframe. When the user +puts their finger down on the screen and moves it, we need to do some +sort of hit detection in order to determine whether their finger is on +the iframe or on the top-level page. Based on where their finger lands, +the appropriate APZC instance needs to handle the input. + +This hit detection is done by APZCTreeManager in collaboration with +WebRender, which has more detailed information about the structure of +the page content than is stored in APZ directly. See +:ref:`this section <wr-hit-test-details>` for more details. + +Also note that for some types of input (e.g. when the user puts two +fingers down to do a pinch) we do not want the input to be “split” +across two different APZC instances. In the case of a pinch, for +example, we find a “common ancestor” APZC instance - one that is +zoomable and contains all of the touch input points, and direct the +input to that APZC instance. + +Scroll Handoff +~~~~~~~~~~~~~~ + +Consider yet again the case where we have a scrollable page that +contains an iframe which itself is scrollable. Say the user scrolls the +iframe so that it reaches the bottom. If the user continues panning on +the iframe, the expectation is that the top-level page will start +scrolling. However, as discussed in the section on hit detection, the +APZC instance for the iframe is separate from the APZC instance for the +top-level page. Thus, we need the two APZC instances to communicate in +some way such that input events on the iframe result in scrolling on the +top-level page. This behaviour is referred to as “scroll handoff” (or +“fling handoff” in the case where analogous behaviour results from the +scrolling momentum of the page after the user has lifted their finger). + +.. _input-event-untransformation: + +Input event untransformation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The APZC architecture by definition results in two copies of a “scroll +position” for each scrollable element. There is the original copy on the +main thread that is accessible to web content and the layout and +painting code. And there is a second copy on the compositor side, which +is updated asynchronously based on user input, and corresponds to what +the user visually sees on the screen. Although these two copies may +diverge temporarily, they are reconciled periodically. In particular, +they diverge while the APZ code is performing an async pan or zoom +action on behalf of the user, and are reconciled when the APZ code +requests a repaint from the main thread. + +Because of the way input events are represented, this has some +unfortunate consequences. Input event coordinates are represented +relative to the device screen - so if the user touches at the same +physical spot on the device, the same input events will be delivered +regardless of the content scroll position. When the main thread receives +a touch event, it combines that with the content scroll position in order +to figure out what DOM element the user touched. However, because we now +have two different scroll positions, this process may not work perfectly. +A concrete example follows: + +Consider a device with screen size 600 pixels tall. On this device, a +user is viewing a document that is 1000 pixels tall, and that is +scrolled down by 200 pixels. That is, the vertical section of the +document from 200px to 800px is visible. Now, if the user touches a +point 100px from the top of the physical display, the hardware will +generate a touch event with y=100. This will get sent to the main +thread, which will add the scroll position (200) and get a +document-relative touch event with y=300. This new y-value will be used +in hit detection to figure out what the user touched. If the document +had a absolute-positioned div at y=300, then that would receive the +touch event. + +Now let us add some async scrolling to this example. Say that the user +additionally scrolls the document by another 10 pixels asynchronously +(i.e. only on the compositor thread), and then does the same touch +event. The same input event is generated by the hardware, and as before, +the document will deliver the touch event to the div at y=300. However, +visually, the document is scrolled by an additional 10 pixels so this +outcome is wrong. What needs to happen is that the APZ code needs to +intercept the touch event and account for the 10 pixels of asynchronous +scroll. Therefore, the input event with y=100 gets converted to y=110 in +the APZ code before being passed on to the main thread. The main thread +then adds the scroll position it knows about and determines that the +user touched at a document-relative position of y=310. + +Analogous input event transformations need to be done for horizontal +scrolling and zooming. + +Content independently adjusting scrolling +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As described above, there are two copies of the scroll position in the +APZ architecture - one on the main thread and one on the compositor +thread. Usually for architectures like this, there is a single “source +of truth” value and the other value is simply a copy. However, in this +case that is not easily possible to do. The reason is that both of these +values can be legitimately modified. On the compositor side, the input +events the user is triggering modify the scroll position, which is then +propagated to the main thread. However, on the main thread, web content +might be running Javascript code that programmatically sets the scroll +position (via window.scrollTo, for example). Scroll changes driven from +the main thread are just as legitimate and need to be propagated to the +compositor thread, so that the visual display updates in response. + +Because the cross-thread messaging is asynchronous, reconciling the two +types of scroll changes is a tricky problem. Our design solves this +using various flags and generation counters. The general heuristic we +have is that content-driven scroll position changes (e.g. scrollTo from +JS) are never lost. For instance, if the user is doing an async scroll +with their finger and content does a scrollTo in the middle, then some +of the async scroll would occur before the “jump” and the rest after the +“jump”. + +Content preventing default behaviour of input events +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Another problem that we need to deal with is that web content is allowed +to intercept touch events and prevent the “default behaviour” of +scrolling. This ability is defined in web standards and is +non-negotiable. Touch event listeners in web content are allowed call +preventDefault() on the touchstart or first touchmove event for a touch +point; doing this is supposed to “consume” the event and prevent +touch-based panning. As we saw in a previous section, the input event +needs to be untransformed by the APZ code before it can be delivered to +content. But, because of the preventDefault problem, we cannot fully +process the touch event in the APZ code until content has had a chance +to handle it. + +To balance the needs of correctness (which calls for allowing web content +to successfully prevent default handling of events if it wishes to) and +responsiveness (which calls for avoiding blocking on web content +Javascript for a potentially-unbounded amount of time before reacting to +an event), APZ gives web content a "deadline" to process the event and +tell APZ whether preventDefault() was called on the event. The deadline +is 400ms from the time APZ receives the event on desktop, and 600ms on +mobile. If web content is able to process the event before this deadline, +the decision to preventDefault() the event or not will be respected. If +web content fails to process the event before the deadline, APZ assumes +preventDefault() will not be called and goes ahead and processes the +event. + +To implement this, upon receiving a touch event, APZ immediately returns +an untransformed version that can be dispatched to content. It also +schedules the 400ms or 600ms timeout. There is an API that allows the +main-thread event dispatching code to notify APZ as to whether or not the +default action should be prevented. If the APZ content response timeout +expires, or if the main-thread event dispatching code notifies the APZ of +the preventDefault status, then the APZ continues with the processing of +the events (which may involve discarding the events). + +To limit the responsiveness impact of this round-trip to content, APZ +tries to identify cases where it can rule out preventDefault() as a +possible outcome. To this end, the hit-testing information sent to the +compositor includes information about which regions of the page are +occupied by elements that have a touch event listener. If an event +targets an area outside of these regions, preventDefault() can be ruled +out, and the round-trip skipped. + +Additionally, recent enhancements to web standards have given page +authors new tools that can further limit the responsiveness impact of +preventDefault(): + +1. Event listeners can be registered as "passive", which means they + are not allowed to call preventDefault(). Authors can use this flag + when writing listeners that only need to observe the events, not alter + their behaviour via preventDefault(). The presence of passive event + listeners does not cause APZ to perform the content round-trip. +2. If page authors wish to disable certain types of touch interactions + completely, they can use the ``touch-action`` CSS property from the + pointer-events spec to do so declaratively, instead of registering + event listeners that call preventDefault(). Touch-action flags are + also included in the hit-test information sent to the compositor, and + APZ uses this information to respect ``touch-action``. (Note that the + touch-action information sent to the compositor is not always 100% + accurate, and sometimes APZ needs to fall back on asking the main + thread for touch-action information, which again involves a + round-trip.) + +Other event types +~~~~~~~~~~~~~~~~~ + +The above sections talk mostly about touch events, but over time APZ has +been extended to handle a variety of other event types, such as trackpad +and mousewheel scrolling, scrollbar thumb dragging, and keyboard +scrolling in some cases. Much of the above applies to these other event +types too (for example, wheel events can be prevent-defaulted as well). + +Importantly, the "untransformation" described above needs to happen even +for event types which are not handled in APZ, such as mouse click events, +since async scrolling can still affect the correct targeting of such +events. + + +Technical details +----------------- + +This section describes various pieces of the APZ code, and goes into +more specific detail on APIs and code than the previous sections. The +primary purpose of this section is to help people who plan on making +changes to the code, while also not going into so much detail that it +needs to be updated with every patch. + +Overall flow of input events +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This section describes how input events flow through the APZ code. + +Disclaimer: some details in this section are out of date (for example, +it assumes the case where the main thread and compositor thread are +in the same process, which is rarely the case these days, so in practice +e.g. steps 6 and 8 involve IPC, not just "stack unwinding"). + +1. Input events arrive from the hardware/widget code into the APZ via + APZCTreeManager::ReceiveInputEvent. The thread that invokes this is + called the "controller thread", and may or may not be the same as the + Gecko main thread. +2. Conceptually the first thing that the APZCTreeManager does is to + associate these events with “input blocks”. An input block is a set + of events that share certain properties, and generally are intended + to represent a single gesture. For example with touch events, all + events following a touchstart up to but not including the next + touchstart are in the same block. All of the events in a given block + will go to the same APZC instance and will either all be processed + or all be dropped. +3. Using the first event in the input block, the APZCTreeManager does a + hit-test to see which APZC it hits. If no APZC is hit, the events are + discarded and we jump to step 6. Otherwise, the input block is tagged + with the hit APZC as a tentative target and put into a global APZ + input queue. In addition the target APZC, the result of the hit test + also includes whether the input event landed on a "dispatch-to-content" + region. These are regions of the page where there is something going + on that requires dispatching the event to content and waiting for + a response _before_ processing the event in APZ; an example of this + is a region containing an element with a non-passive event listener, + as described above. (TODO: Add a section that talks about the other + uses of the dispatch-to-content mechanism.) +4. + + i. If the input events landed outside a dispatch-to-content region, + any available events in the input block are processed. These may + trigger behaviours like scrolling or tap gestures. + ii. If the input events landed inside a dispatch-to-content region, + the events are left in the queue and a timeout is initiated. If + the timeout expires before step 9 is completed, the APZ assumes + the input block was not cancelled and the tentative target is + correct, and processes them as part of step 10. + +5. The call stack unwinds back to APZCTreeManager::ReceiveInputEvent, + which does an in-place modification of the input event so that any + async transforms are removed. +6. The call stack unwinds back to the widget code that called + ReceiveInputEvent. This code now has the event in the coordinate + space Gecko is expecting, and so can dispatch it to the Gecko main + thread. +7. Gecko performs its own usual hit-testing and event dispatching for + the event. As part of this, it records whether any touch listeners + cancelled the input block by calling preventDefault(). It also + activates inactive scrollframes that were hit by the input events. +8. The call stack unwinds back to the widget code, which sends two + notifications to the APZ code on the controller thread. The first + notification is via APZCTreeManager::ContentReceivedInputBlock, and + informs the APZ whether the input block was cancelled. The second + notification is via APZCTreeManager::SetTargetAPZC, and informs the + APZ of the results of the Gecko hit-test during event dispatch. Note + that Gecko may report that the input event did not hit any + scrollable frame at all. The SetTargetAPZC notification happens only + once per input block, while the ContentReceivedInputBlock + notification may happen once per block, or multiple times per block, + depending on the input type. +9. + + i. If the events were processed as part of step 4(i), the + notifications from step 8 are ignored and step 10 is skipped. + ii. If events were queued as part of step 4(ii), and steps 5-8 + complete before the timeout, the arrival of both notifications + from step 8 will mark the input block ready for processing. + iii. If events were queued as part of step 4(ii), but steps 5-8 take + longer than the timeout, the notifications from step 8 will be + ignored and step 10 will already have happened. + +10. If events were queued as part of step 4(ii) they are now either + processed (if the input block was not cancelled and Gecko detected a + scrollframe under the input event, or if the timeout expired) or + dropped (all other cases). Note that the APZC that processes the + events may be different at this step than the tentative target from + step 3, depending on the SetTargetAPZC notification. Processing the + events may trigger behaviours like scrolling or tap gestures. + +If the CSS touch-action property is enabled, the above steps are +modified as follows: + +* In step 4, the APZC also requires the allowed touch-action behaviours + for the input event. This might have been determined as part of the + hit-test in APZCTreeManager; if not, the events are queued. +* In step 6, the widget code determines the content element at the point + under the input element, and notifies the APZ code of the allowed + touch-action behaviours. This notification is sent via a call to + APZCTreeManager::SetAllowedTouchBehavior on the input thread. +* In step 9(ii), the input block will only be marked ready for processing + once all three notifications arrive. + +Threading considerations +^^^^^^^^^^^^^^^^^^^^^^^^ + +The bulk of the input processing in the APZ code happens on what we call +“the controller thread”. In practice the controller thread could be the +Gecko main thread, the compositor thread, or some other thread. There are +obvious downsides to using the Gecko main thread - that is,“asynchronous” +panning and zooming is not really asynchronous as input events can only +be processed while Gecko is idle. In an e10s environment, using the Gecko +main thread of the chrome process is acceptable, because the code running +in that process is more controllable and well-behaved than arbitrary web +content. Using the compositor thread as the controller thread could work +on some platforms, but may be inefficient on others. For example, on +Android (Fennec) we receive input events from the system on a dedicated +UI thread. We would have to redispatch the input events to the compositor +thread if we wanted to the input thread to be the same as the compositor +thread. This introduces a potential for higher latency, particularly if +the compositor does any blocking operations - blocking SwapBuffers +operations, for example. As a result, the APZ code itself does not assume +that the controller thread will be the same as the Gecko main thread or +the compositor thread. + +Active vs. inactive scrollframes +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The number of scrollframes on a page is potentially unbounded. However, +we do not want to create a separate displayport for each scrollframe +right away, as this would require large amounts of memory. Therefore, +scrollframes as designated as either “active” or “inactive”. Active +scrollframes get a displayport, and an APZC on the compositor side. +Inactive scrollframes do not get a displayport (a display list is only +built for their viewport, i.e. what is currently visible) and do not get +an APZC. + +Consider a page with a scrollframe that is initially inactive. This +scroll frame does not get an APZC, and therefore events targeting it will +target the APZC for the nearest active scrollable ancestor (let's call it +P; note, the rootmost scroll frame in a given process is always active). +However, the presence of the inactive scroll frame is reflected by a +dispatch-to-content region that prevents events over the frame from +erroneously scrolling P. + +When the user starts interacting with that content, the hit-test in the +APZ code hits the dispatch-to-content region of P. The input block +therefore has a tentative target of P when it goes into step 4(ii) in the +flow above. When gecko processes the input event, it must detect the +inactive scrollframe and activate it, as part of step 7. Finally, the +widget code sends the SetTargetAPZC notification in step 8 to notify the +APZ that the input block should really apply to this new APZC. An issue +here is that the transaction containing metadata for the newly active +scroll frame must reach the compositor and APZ before the SetTargetAPZC +notification. If this does not occur within the 400ms timeout, the APZ +code will be unable to update the tentative target, and will continue to +use P for that input block. Input blocks that start after the transaction +will get correctly routed to the new scroll frame as there will now be an +APZC instance for the active scrollframe. + +This model implies that when the user initially attempts to scroll an +inactive scrollframe, it may end up scrolling an ancestor scrollframe. +Only after the round-trip to the gecko thread is complete is there an +APZC for async scrolling to actually occur on the scrollframe itself. At +that point the scrollframe will start receiving new input blocks and will +scroll normally. + +Note: with Fission (where inactive scroll frames would make it impossible +to target the correct process in all situations; see +:ref:`this section <fission-hit-testing>` for more details) and WebRender +(which makes displayports more lightweight as the actual rendering is +offloaded to the compositor and can be done on demand), inactive scroll +frames are being phased out, and we are moving towards a model where all +scroll frames with nonempty scroll ranges are active and get a +displayport and an APZC. To conserve memory, displayports for scroll +frames which have not been recently scrolled are kept to a "minimal" size +equal to the viewport size. + +WebRender Integration +~~~~~~~~~~~~~~~~~~~~~ + +This section describes how APZ interacts with the WebRender graphics +backend. + +Note that APZ predates WebRender, having initially been written to work +with the earlier Layers graphics backend. The design of Layers has +influenced APZ significantly, and this still shows in some places in the +code. Now that the Layers backend has been removed, there may be +opportunities to streamline the interaction between APZ and WebRender. + + +HitTestingTree +^^^^^^^^^^^^^^ + +The APZCTreeManager keeps as part of its internal state a tree of +HitTestingTreeNode instances. This is referred to as the HitTestingTree. + +The main purpose of the HitTestingTree is to model the spatial +relationships between content that's affected by async scrolling. Tree +nodes fall roughly into the following categories: + +* Nodes representing scrollable content in an active scroll frame. These + nodes are associated with the scroll frame's APZC. +* Nodes representing other content that may move in special ways in + response to async scrolling, such as fixed content, sticky content, and + scrollbars. +* (Non-leaf) nodes which do not represent any content, just metadata + (e.g. a transform) that applies to its descendant nodes. + +An APZC may be associated with multiple nodes, if e.g. a scroll frame +scrolls two pieces of content that are interleaved with non-scrolling +content. + +Arranging these nodes in a tree allows modelling relationships such as +what content is scrolled by a given scroll frame, what the scroll handoff +relationships are between APZCs, and what content is subject to what +transforms. + +An additional use of the HitTestingTree is to allow APZ to keep content +processes up to date about enclosing transforms that they are subject to. +See :ref:`this section <sending-transforms-to-content-processes>` for +more details. + +(In the past, with the Layers backend, the HitTestingTree was also used +for compositor hit testing, hence the name. This is no longer the case, +and there may be opportunities to simplify the tree as a result.) + +The HitTestingTree is created from another tree data structure called +WebRenderScrollData. The relevant types here are: + +* WebRenderScrollData which stores the entire tree. +* WebRenderLayerScrollData, which represents a single "layer" of content, + i.e. a group of display items that move together when scrolling (or + metadata applying to a subtree of such layers). In the Layers backend, + such content would be rendered into a single texture which could then + be moved asynchronously at composite time. Since a layer of content can + be scrolled by multiple (nested) scroll frames, a + WebRenderLayerScrollData may contain scroll metadata for more than one + scroll frame. +* WebRenderScrollDataWrapper, which wraps WebRenderLayerScrollData + but "expanded" in a way that each node only stores metadata for + a single scroll frame. WebRenderScrollDataWrapper nodes have a + 1:1 correspondence with HitTestingTreeNodes. + +It's not clear whether the distinction between WebRenderLayerScrollData +and WebRenderScrollDataWrapper is still useful in a WebRender-only world. +The code could potentially be revised such that we directly build and +store nodes of a single type with the behaviour of +WebRenderScrollDataWrapper. + +The WebRenderScrollData structure is built on the main thread, and +then shipped over IPC to the compositor where it's used to construct +the HitTestingTree. + +WebRenderScrollData is built in WebRenderCommandBuilder, during the +same traversal of the Gecko display list that is used to build the +WebRender display list. As of this writing, the architecture for this is +that, as we walk the Gecko display list, we query it to see if it +contains any information that APZ might need to know (e.g. CSS +transforms) via a call to ``nsDisplayItem::UpdateScrollData(nullptr, +nullptr)``. If this call returns true, we create a +WebRenderLayerScrollData instance for the item, and populate it with the +necessary information in ``WebRenderLayerScrollData::Initialize``. We also +create WebRenderLayerScrollData instances if we detect (via ASR changes) +that we are now processing a Gecko display item that is in a different +scrollframe than the previous item. + +The main sources of complexity in this code come from: + +1. Ensuring the ScrollMetadata instances end on the proper + WebRenderLayerScrollData instances (such that every path from a leaf + WebRenderLayerScrollData node to the root has a consistent ordering of + scrollframes without duplications). +2. The deferred-transform optimization that is described in more detail + at the declaration of ``StackingContextHelper::mDeferredTransformItem``. + +.. _wr-hit-test-details: + +Hit-testing +^^^^^^^^^^^ + +Since the HitTestingTree is not used for actual hit-testing purposes +with the WebRender backend (see previous section), this section describes +how hit-testing actually works with WebRender. + +The Gecko display list contains display items +(``nsDisplayCompositorHitTestInfo``) that store hit-testing state. These +items implement the ``CreateWebRenderCommands`` method and generate a "hit-test +item" into the WebRender display list. This is basically just a rectangle +item in the WebRender display list that is no-op for painting purposes, +but contains information that should be returned by the hit-test (specifically +the hit info flags and the scrollId of the enclosing scrollframe). The +hit-test item gets clipped and transformed in the same way that all the other +items in the WebRender display list do, via clip chains and enclosing +reference frame/stacking context items. + +When WebRender needs to do a hit-test, it goes through its display list, +taking into account the current clips and transforms, adjusted for the +most recent async scroll/zoom, and determines which hit-test item(s) are under +the target point, and returns those items. APZ can then take the frontmost +item from that list (or skip over it if it happens to be inside a OOP +subdocument that's ``pointer-events:none``) and use that as the hit target. +Note that the hit-test uses the last transform provided by the +``SampleForWebRender`` API (see next section) which generally reflects the +last composite, and doesn't take into account further changes to the +transforms that have occurred since then. In practice, we should be +compositing frequently enough that this doesn't matter much. + +When debugging hit-test issues, it is often useful to apply the patches +on bug 1656260, which introduce a guid on Gecko display items and propagate +it all the way through to where APZ gets the hit-test result. This allows +answering the question "which nsDisplayCompositorHitTestInfo was responsible +for this hit-test result?" which is often a very good first step in +solving the bug. From there, one can determine if there was some other +display item in front that should have generated a +nsDisplayCompositorHitTestInfo but didn't, or if display item itself had +incorrect information. The second patch on that bug further allows exposing +hand-written debug info to the APZ code, so that the WR hit-testing +mechanism itself can be more effectively debugged, in case there is a problem +with the WR display items getting improperly transformed or clipped. + +The information returned by WebRender to APZ in response to the hit test +is enough for APZ to identify a HitTestingTreeNode as the target of the +event. APZ can then take actions such as scrolling the target node's +associated APZC, or other appropriate actions (e.g. initiating a scrollbar +drag if a scrollbar thumb node was targeted by a mouse-down event). + +Sampling +^^^^^^^^ + +The compositing step needs to read the latest async transforms from APZ +in order to ensure scrollframes are rendered at the right position. The API for this is +exposed via the ``APZSampler`` class. When WebRender is ready to do a composite, +it invokes ``APZSampler::SampleForWebRender``. In here, APZ gathers all async +transforms that WebRender needs to know about, including transforms to apply +to scrolled content, fixed and sticky content, and scrollbar thumbs. + +Along with sampling the APZ transforms, the compositor also triggers APZ +animations to advance to the next timestep (usually the next vsync). This +happens just before reading the APZ transforms. + +Fission Integration +~~~~~~~~~~~~~~~~~~~ + +This section describes how APZ interacts with the Fission (Site Isolation) +project. + +Introduction +^^^^^^^^^^^^ + +Fission is an architectural change motivated by security considerations, +where web content from each origin is isolated in its own process. Since +a page can contain a mixture of content from different origins (for +example, the top level page can be content from origin A, and it can +contain an iframe with content from origin B), that means that rendering +and interacting with a page can now involve coordination between APZ and +multiple content processes. + +.. _fission-hit-testing: + +Content Process Selection for Input Events +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Input events are initially received in the browser's parent process. +With Fission, the browser needs to decide which of possibly several +content processes an event is targeting. + +Since process boundaries correspond to iframe (subdocument) boundaries, +and every (html) document has a root scroll frame, process boundaries are +therefore also scroll frame boundaries. Since APZ already needs a hit +test mechanism to be able to determine which scroll frame an event +targets, this hit test mechanism was a good fit to also use to determine +which content process an event targets. + +APZ's hit test was therefore expanded to serve this purpose as well. This +mostly required only minor modifications, such as making sure that APZ +knows about the root scroll frames of iframes even if they're not +scrollable. Since APZ already needs to process all input events to +potentially apply :ref:`untransformations <input-event-untransformation>` +related to async scrolling, as part of this process it now also labels +input events with information identifying which content process they +target. + +Hit Testing Accuracy +^^^^^^^^^^^^^^^^^^^^ + +Prior to Fission, APZ's hit test could afford to be somewhat inaccurate, +as it could fall back on the dispatch-to-content mechanism to wait for +a more accurate answer from the main thread if necessary, suffering a +performance cost only (not a correctness cost). + +With Fission, an inaccurate compositor hit test now implies a correctness +cost, as there is no cross-process main-thread fallback mechanism. +(Such a mechanism was considered, but judged to require too much +complexity and IPC traffic to be worth it.) + +Luckily, with WebRender the compositor has much more detailed information +available to use for hit testing than it did with Layers. For example, +the compositor can perform accurate hit testing even in the presence of +irregular shapes such as rounded corners. + +APZ leverages WebRender's more accurate hit testing ability to aim to +accurately select the target process (and target scroll frame) for an +event in general. + +One consequence of this is that the dispatch-to-content mechanism is now +used less often than before (its primary remaining use is handling +`preventDefault()`). + +.. _sending-transforms-to-content-processes: + +Sending Transforms To Content Processes +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Content processes sometimes need to be able to convert between screen +coordinates and their local coordinates. To do this, they need to know +about any transforms that their containing iframe and its ancestors are +subject to, including async transforms (particularly in cases where the +async transforms persist for more than just a few frames). + +APZ has information about these transforms in its HitTestingTree. With +Fission, APZ periodically sends content processes information about these +transforms so that they are kept relatively up to date. + +Testing +------- + +APZ makes use of several test frameworks to verify the expected behavior +is seen. + +Mochitest +~~~~~~~~~ + +The APZ specific mochitests are useful when specific gestures or events need to be tested +with specific content. The APZ mochitests are located in `gfx/layers/apz/test/mochitest`_. +To run all of the APZ mochitests, run something like the following: + +:: + + ./mach mochitest ./gfx/layers/apz/test/mochitest + +The APZ mochitests are often organized as subtests that run in a group. For example, +the `test_group_hittest-2.html`_ contains >20 subtests like +`helper_hittest_overscroll.html`_. When working on a specific subtest, it is often +helpful to use the `apz.subtest` preference to filter the subtests run to just the +tests you are working on. For example, the following would only run the +`helper_hittest_overscroll.html`_ subtest of the `test_group_hittest-2.html`_ group. + +:: + + ./mach mochitest --setpref apz.subtest=helper_hittest_overscroll.html \ + ./gfx/layers/apz/test/mochitest/test_group_hittest-2.html + +For more information on mochitest, see the `Mochitest Documentation`_. + +.. _gfx/layers/apz/test/mochitest: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/mochitest +.. _test_group_hittest-2.html: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/mochitest/test_group_hittest-2.html +.. _helper_hittest_overscroll.html: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/mochitest/helper_hittest_overscroll.html +.. _Mochitest Documentation: /testing/mochitest-plain/index.html + +GTest +~~~~~ + +The APZ specific GTests can be found in `gfx/layers/apz/test/gtest/`_. To run +these tests, run something like the following: + +:: + + ./mach gtest "APZ*" + +For more information, see the `GTest Documentation`_. + +.. _GTest Documentation: /gtest/index.html +.. _gfx/layers/apz/test/gtest/: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/gtest/ + +Reftests +~~~~~~~~ + +The APZ reftests can be found in `layout/reftests/async-scrolling/`_ and +`gfx/layers/apz/test/reftest`_. To run the relevant reftests for APZ, run +a large portion of the APZ reftests, run something like the following: + +:: + + ./mach reftest ./layout/reftests/async-scrolling/ + +Useful information about the reftests can be found in the `Reftest Documentation`_. + +There is no defined process for choosing which directory the APZ reftests +should be placed in, but in general reftests should exist where other +similar tests do. + +.. _layout/reftests/async-scrolling/: https://searchfox.org/mozilla-central/source/layout/reftests/async-scrolling/ +.. _gfx/layers/apz/test/reftest: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/reftest/ +.. _Reftest Documentation: /layout/Reftest.html + +Threading / Locking Overview +---------------------------- + +Threads +~~~~~~~ + +There are three threads relevant to APZ: the **controller thread**, +the **updater thread**, and the **sampler thread**. This table lists +which threads play these roles on each platform / configuration: + +===================== ============= ============== ============= +APZ Thread Name Desktop Desktop+GPU Android +===================== ============= ============== ============= +**controller thread** UI main GPU main Java UI +**updater thread** SceneBuilder SceneBuilder SceneBuilder +**sampler thread** RenderBackend RenderBackend RenderBackend +===================== ============= ============== ============= + +Locks +~~~~~ + +There are also a number of locks used in APZ code: + +======================= ============================== +Lock type How many instances +======================= ============================== +APZ tree lock one per APZCTreeManager +APZC map lock one per APZCTreeManager +APZC instance lock one per AsyncPanZoomController +APZ test lock one per APZCTreeManager +Checkerboard event lock one per AsyncPanZoomController +======================= ============================== + +Thread / Lock Ordering +~~~~~~~~~~~~~~~~~~~~~~ + +To avoid deadlocks, the threads and locks have a global **ordering** +which must be respected. + +Respecting the ordering means the following: + +- Let "A < B" denote that A occurs earlier than B in the ordering +- Thread T may only acquire lock L, if T < L +- A thread may only acquire lock L2 while holding lock L1, if L1 < L2 +- A thread may only block on a response from another thread T while holding a lock L, if L < T + +**The lock ordering is as follows**: + +1. UI main +2. GPU main (only if GPU process enabled) +3. Compositor thread +4. SceneBuilder thread +5. **APZ tree lock** +6. RenderBackend thread +7. **APZC map lock** +8. **APZC instance lock** +9. **APZ test lock** +10. **Checkerboard event lock** + +Example workflows +^^^^^^^^^^^^^^^^^ + +Here are some example APZ workflows. Observe how they all obey +the global thread/lock ordering. Feel free to add others: + +- **Input handling** (with GPU process): UI main -> GPU main -> APZ tree lock -> RenderBackend thread +- **Sync messages** in ``PCompositorBridge.ipdl``: UI main thread -> Compositor thread +- **GetAPZTestData**: Compositor thread -> SceneBuilder thread -> test lock +- **Scene swap**: SceneBuilder thread -> APZ tree lock -> RenderBackend thread +- **Updating hit-testing tree**: SceneBuilder thread -> APZ tree lock -> APZC instance lock +- **Updating APZC map**: SceneBuilder thread -> APZ tree lock -> APZC map lock +- **Sampling and animation deferred tasks** [1]_: RenderBackend thread -> APZC map lock -> APZC instance lock +- **Advancing animations**: RenderBackend thread -> APZC instance lock + +.. [1] It looks like there are two deferred tasks that actually need the tree lock, + ``AsyncPanZoomController::HandleSmoothScrollOverscroll`` and + ``AsyncPanZoomController::HandleFlingOverscroll``. We should be able to rewrite + these to use the map lock instead of the tree lock. + This will allow us to continue running the deferred tasks on the sampler + thread rather than having to bounce them to another thread. |