Adding upstream version 124.0.1.upstream/124.0.1

Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-19 00:47:55 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-19 00:47:55 +0000
commit: 26a029d407be480d791972afb5975cf62c9360a6 (patch)
tree: f435a8308119effd964b339f76abb83a57c29483 /gfx/docs/AsyncPanZoom.rst
parent: Initial commit. (diff)
download: firefox-upstream/124.0.1.tar.xz
firefox-upstream/124.0.1.zip
1 files changed, 929 insertions, 0 deletions
diff --git a/gfx/docs/AsyncPanZoom.rst b/gfx/docs/AsyncPanZoom.rst
new file mode 100644
index 0000000000..761c8fbb4f
--- /dev/null
+++ b/gfx/docs/AsyncPanZoom.rst
@@ -0,0 +1,929 @@
+.. _apz:
+
+Asynchronous Panning and Zooming
+================================
+
+**This document is a work in progress. Some information may be missing
+or incomplete.**
+
+.. image:: AsyncPanZoomArchitecture.png
+
+Goals
+-----
+
+We need to be able to provide a visual response to user input with
+minimal latency. In particular, on devices with touch input, content
+must track the finger exactly while panning, or the user experience is
+very poor. According to the UX team, 120ms is an acceptable latency
+between user input and response.
+
+Context and surrounding architecture
+------------------------------------
+
+The fundamental problem we are trying to solve with the Asynchronous
+Panning and Zooming (APZ) code is that of responsiveness. By default,
+web browsers operate in a “game loop” that looks like this:
+
+::
+
+       while true:
+           process input
+           do computations
+           repaint content
+           display repainted content
+
+In browsers the “do computation” step can be arbitrarily expensive
+because it can involve running event handlers in web content. Therefore,
+there can be an arbitrary delay between the input being received and the
+on-screen display getting updated.
+
+Responsiveness is always good, and with touch-based interaction it is
+even more important than with mouse or keyboard input. In order to
+ensure responsiveness, we split the “game loop” model of the browser
+into a multithreaded variant which looks something like this:
+
+::
+
+       Thread 1 (compositor thread)
+       while true:
+           receive input
+           send a copy of input to thread 2
+           adjust rendered content based on input
+           display adjusted rendered content
+
+       Thread 2 (main thread)
+       while true:
+           receive input from thread 1
+           do computations
+           rerender content
+           update the copy of rendered content in thread 1
+
+This multithreaded model is called off-main-thread compositing (OMTC),
+because the compositing (where the content is displayed on-screen)
+happens on a separate thread from the main thread. Note that this is a
+very very simplified model, but in this model the “adjust rendered
+content based on input” is the primary function of the APZ code.
+
+A couple of notes on APZ's relationship to other browser architecture
+improvements:
+
+1. Due to Electrolysis (e10s), Site Isolation (Fission), and GPU Process
+   isolation, the above two threads often actually run in different
+   processes. APZ is largely agnostic to this, as all communication
+   between the two threads for APZ purposes happens using an IPC layer
+   that abstracts over communication between threads vs. processes.
+2. With the WebRender graphics backend, part of the rendering pipeline is
+   also offloaded from the main thread. In this architecture, the
+   information sent from the main thread consists of a display list, and
+   scrolling-related metadata referencing content in that display list.
+   The metadata is kept in a queue until the display list undergoes an
+   additional rendering step in the compositor (scene building). At this
+   point, we are ready to tell APZ about the new content and have it
+   start applying adjustments to it, as further rendering steps beyond
+   scene building are done synchronously on each composite.
+
+The compositor in theory can continuously composite previously rendered
+content (adjusted on each composite by APZ) to the screen while the
+main thread is busy doing other things and rendering new content.
+
+The APZ code takes the input events that are coming in from the hardware
+and uses them to figure out what the user is trying to do (e.g. pan the
+page, zoom in). It then expresses this user intention in the form of
+translation and/or scale transformation matrices. These transformation
+matrices are applied to the rendered content at composite time, so that
+what the user sees on-screen reflects what they are trying to do as
+closely as possible.
+
+Technical overview
+------------------
+
+As per the heavily simplified model described above, the fundamental
+purpose of the APZ code is to take input events and produce
+transformation matrices. This section attempts to break that down and
+identify the different problems that make this task non-trivial.
+
+Checkerboarding
+~~~~~~~~~~~~~~~
+
+The area of page content for which a display list is built and sent to
+the compositor is called the “displayport”. The APZ code is responsible
+for determining how large the displayport should be. On the one hand, we
+want the displayport to be as large as possible. At the very least it
+needs to be larger than what is visible on-screen, because otherwise, as
+soon as the user pans, there will be some unpainted area of the page
+exposed. However, we cannot always set the displayport to be the entire
+page, because the page can be arbitrarily long and this would require an
+unbounded amount of memory to store. Therefore, a good displayport size
+is one that is larger than the visible area but not so large that it is a
+huge drain on memory. Because the displayport is usually smaller than the
+whole page, it is always possible for the user to scroll so fast that
+they end up in an area of the page outside the displayport. When this
+happens, they see unpainted content; this is referred to as
+“checkerboarding”, and we try to avoid it where possible.
+
+There are many possible ways to determine what the displayport should be
+in order to balance the tradeoffs involved (i.e. having one that is too
+big is bad for memory usage, and having one that is too small results in
+excessive checkerboarding). Ideally, the displayport should cover
+exactly the area that we know the user will make visible. Although we
+cannot know this for sure, we can use heuristics based on current
+panning velocity and direction to ensure a reasonably-chosen displayport
+area. This calculation is done in the APZ code, and a new desired
+displayport is frequently sent to the main thread as the user is panning
+around.
+
+Multiple scrollable elements
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Consider, for example, a scrollable page that contains an iframe which
+itself is scrollable. The iframe can be scrolled independently of the
+top-level page, and we would like both the page and the iframe to scroll
+responsively. This means that we want independent asynchronous panning
+for both the top-level page and the iframe. In addition to iframes,
+elements that have the overflow:scroll CSS property set are also
+scrollable. In the display list, scrollable elements are arranged in a
+tree structure, and in the APZ code we have a matching tree of
+AsyncPanZoomController (APZC) objects, one for each scrollable element.
+To manage this tree of APZC instances, we have a single APZCTreeManager
+object. Each APZC is relatively independent and handles the scrolling for
+its associated scrollable element, but there are some cases in which they
+need to interact; these cases are described in the sections below.
+
+Hit detection
+~~~~~~~~~~~~~
+
+Consider again the case where we have a scrollable page that contains an
+iframe which itself is scrollable. As described above, we will have two
+APZC instances - one for the page and one for the iframe. When the user
+puts their finger down on the screen and moves it, we need to do some
+sort of hit detection in order to determine whether their finger is on
+the iframe or on the top-level page. Based on where their finger lands,
+the appropriate APZC instance needs to handle the input.
+
+This hit detection is done by APZCTreeManager in collaboration with
+WebRender, which has more detailed information about the structure of
+the page content than is stored in APZ directly. See
+:ref:`this section <wr-hit-test-details>` for more details.
+
+Also note that for some types of input (e.g. when the user puts two
+fingers down to do a pinch) we do not want the input to be “split”
+across two different APZC instances. In the case of a pinch, for
+example, we find a “common ancestor” APZC instance - one that is
+zoomable and contains all of the touch input points, and direct the
+input to that APZC instance.
+
+Scroll Handoff
+~~~~~~~~~~~~~~
+
+Consider yet again the case where we have a scrollable page that
+contains an iframe which itself is scrollable. Say the user scrolls the
+iframe so that it reaches the bottom. If the user continues panning on
+the iframe, the expectation is that the top-level page will start
+scrolling. However, as discussed in the section on hit detection, the
+APZC instance for the iframe is separate from the APZC instance for the
+top-level page. Thus, we need the two APZC instances to communicate in
+some way such that input events on the iframe result in scrolling on the
+top-level page. This behaviour is referred to as “scroll handoff” (or
+“fling handoff” in the case where analogous behaviour results from the
+scrolling momentum of the page after the user has lifted their finger).
+
+.. _input-event-untransformation:
+
+Input event untransformation
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The APZC architecture by definition results in two copies of a “scroll
+position” for each scrollable element. There is the original copy on the
+main thread that is accessible to web content and the layout and
+painting code. And there is a second copy on the compositor side, which
+is updated asynchronously based on user input, and corresponds to what
+the user visually sees on the screen. Although these two copies may
+diverge temporarily, they are reconciled periodically. In particular,
+they diverge while the APZ code is performing an async pan or zoom
+action on behalf of the user, and are reconciled when the APZ code
+requests a repaint from the main thread.
+
+Because of the way input events are represented, this has some
+unfortunate consequences. Input event coordinates are represented
+relative to the device screen - so if the user touches at the same
+physical spot on the device, the same input events will be delivered
+regardless of the content scroll position. When the main thread receives
+a touch event, it combines that with the content scroll position in order
+to figure out what DOM element the user touched. However, because we now
+have two different scroll positions, this process may not work perfectly.
+A concrete example follows:
+
+Consider a device with screen size 600 pixels tall. On this device, a
+user is viewing a document that is 1000 pixels tall, and that is
+scrolled down by 200 pixels. That is, the vertical section of the
+document from 200px to 800px is visible. Now, if the user touches a
+point 100px from the top of the physical display, the hardware will
+generate a touch event with y=100. This will get sent to the main
+thread, which will add the scroll position (200) and get a
+document-relative touch event with y=300. This new y-value will be used
+in hit detection to figure out what the user touched. If the document
+had a absolute-positioned div at y=300, then that would receive the
+touch event.
+
+Now let us add some async scrolling to this example. Say that the user
+additionally scrolls the document by another 10 pixels asynchronously
+(i.e. only on the compositor thread), and then does the same touch
+event. The same input event is generated by the hardware, and as before,
+the document will deliver the touch event to the div at y=300. However,
+visually, the document is scrolled by an additional 10 pixels so this
+outcome is wrong. What needs to happen is that the APZ code needs to
+intercept the touch event and account for the 10 pixels of asynchronous
+scroll. Therefore, the input event with y=100 gets converted to y=110 in
+the APZ code before being passed on to the main thread. The main thread
+then adds the scroll position it knows about and determines that the
+user touched at a document-relative position of y=310.
+
+Analogous input event transformations need to be done for horizontal
+scrolling and zooming.
+
+Content independently adjusting scrolling
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+As described above, there are two copies of the scroll position in the
+APZ architecture - one on the main thread and one on the compositor
+thread. Usually for architectures like this, there is a single “source
+of truth” value and the other value is simply a copy. However, in this
+case that is not easily possible to do. The reason is that both of these
+values can be legitimately modified. On the compositor side, the input
+events the user is triggering modify the scroll position, which is then
+propagated to the main thread. However, on the main thread, web content
+might be running Javascript code that programmatically sets the scroll
+position (via window.scrollTo, for example). Scroll changes driven from
+the main thread are just as legitimate and need to be propagated to the
+compositor thread, so that the visual display updates in response.
+
+Because the cross-thread messaging is asynchronous, reconciling the two
+types of scroll changes is a tricky problem. Our design solves this
+using various flags and generation counters. The general heuristic we
+have is that content-driven scroll position changes (e.g. scrollTo from
+JS) are never lost. For instance, if the user is doing an async scroll
+with their finger and content does a scrollTo in the middle, then some
+of the async scroll would occur before the “jump” and the rest after the
+“jump”.
+
+Content preventing default behaviour of input events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Another problem that we need to deal with is that web content is allowed
+to intercept touch events and prevent the “default behaviour” of
+scrolling. This ability is defined in web standards and is
+non-negotiable. Touch event listeners in web content are allowed call
+preventDefault() on the touchstart or first touchmove event for a touch
+point; doing this is supposed to “consume” the event and prevent
+touch-based panning. As we saw in a previous section, the input event
+needs to be untransformed by the APZ code before it can be delivered to
+content. But, because of the preventDefault problem, we cannot fully
+process the touch event in the APZ code until content has had a chance
+to handle it.
+
+To balance the needs of correctness (which calls for allowing web content
+to successfully prevent default handling of events if it wishes to) and
+responsiveness (which calls for avoiding blocking on web content
+Javascript for a potentially-unbounded amount of time before reacting to
+an event), APZ gives web content a "deadline" to process the event and
+tell APZ whether preventDefault() was called on the event. The deadline
+is 400ms from the time APZ receives the event on desktop, and 600ms on
+mobile. If web content is able to process the event before this deadline,
+the decision to preventDefault() the event or not will be respected. If
+web content fails to process the event before the deadline, APZ assumes
+preventDefault() will not be called and goes ahead and processes the
+event.
+
+To implement this, upon receiving a touch event, APZ immediately returns
+an untransformed version that can be dispatched to content. It also
+schedules the 400ms or 600ms timeout. There is an API that allows the
+main-thread event dispatching code to notify APZ as to whether or not the
+default action should be prevented. If the APZ content response timeout
+expires, or if the main-thread event dispatching code notifies the APZ of
+the preventDefault status, then the APZ continues with the processing of
+the events (which may involve discarding the events).
+
+To limit the responsiveness impact of this round-trip to content, APZ
+tries to identify cases where it can rule out preventDefault() as a
+possible outcome. To this end, the hit-testing information sent to the
+compositor includes information about which regions of the page are
+occupied by elements that have a touch event listener. If an event
+targets an area outside of these regions, preventDefault() can be ruled
+out, and the round-trip skipped.
+
+Additionally, recent enhancements to web standards have given page
+authors new tools that can further limit the responsiveness impact of
+preventDefault():
+
+1. Event listeners can be registered as "passive", which means they
+   are not allowed to call preventDefault(). Authors can use this flag
+   when writing listeners that only need to observe the events, not alter
+   their behaviour via preventDefault(). The presence of passive event
+   listeners does not cause APZ to perform the content round-trip.
+2. If page authors wish to disable certain types of touch interactions
+   completely, they can use the ``touch-action`` CSS property from the
+   pointer-events spec to do so declaratively, instead of registering
+   event listeners that call preventDefault(). Touch-action flags are
+   also included in the hit-test information sent to the compositor, and
+   APZ uses this information to respect ``touch-action``. (Note that the
+   touch-action information sent to the compositor is not always 100%
+   accurate, and sometimes APZ needs to fall back on asking the main
+   thread for touch-action information, which again involves a
+   round-trip.)
+
+Other event types
+~~~~~~~~~~~~~~~~~
+
+The above sections talk mostly about touch events, but over time APZ has
+been extended to handle a variety of other event types, such as trackpad
+and mousewheel scrolling, scrollbar thumb dragging, and keyboard
+scrolling in some cases. Much of the above applies to these other event
+types too (for example, wheel events can be prevent-defaulted as well).
+
+Importantly, the "untransformation" described above needs to happen even
+for event types which are not handled in APZ, such as mouse click events,
+since async scrolling can still affect the correct targeting of such
+events.
+
+
+Technical details
+-----------------
+
+This section describes various pieces of the APZ code, and goes into
+more specific detail on APIs and code than the previous sections. The
+primary purpose of this section is to help people who plan on making
+changes to the code, while also not going into so much detail that it
+needs to be updated with every patch.
+
+Overall flow of input events
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This section describes how input events flow through the APZ code.
+
+Disclaimer: some details in this section are out of date (for example,
+it assumes the case where the main thread and compositor thread are
+in the same process, which is rarely the case these days, so in practice
+e.g. steps 6 and 8 involve IPC, not just "stack unwinding").
+
+1.  Input events arrive from the hardware/widget code into the APZ via
+    APZCTreeManager::ReceiveInputEvent. The thread that invokes this is
+    called the "controller thread", and may or may not be the same as the
+    Gecko main thread.
+2.  Conceptually the first thing that the APZCTreeManager does is to
+    associate these events with “input blocks”. An input block is a set
+    of events that share certain properties, and generally are intended
+    to represent a single gesture. For example with touch events, all
+    events following a touchstart up to but not including the next
+    touchstart are in the same block. All of the events in a given block
+    will go to the same APZC instance and will either all be processed
+    or all be dropped.
+3.  Using the first event in the input block, the APZCTreeManager does a
+    hit-test to see which APZC it hits. If no APZC is hit, the events are
+    discarded and we jump to step 6. Otherwise, the input block is tagged
+    with the hit APZC as a tentative target and put into a global APZ
+    input queue. In addition the target APZC, the result of the hit test
+    also includes whether the input event landed on a "dispatch-to-content"
+    region. These are regions of the page where there is something going
+    on that requires dispatching the event to content and waiting for
+    a response _before_ processing the event in APZ; an example of this
+    is a region containing an element with a non-passive event listener,
+    as described above. (TODO: Add a section that talks about the other
+    uses of the dispatch-to-content mechanism.)
+4.
+
+    i.  If the input events landed outside a dispatch-to-content region,
+        any available events in the input block are processed. These may
+        trigger behaviours like scrolling or tap gestures.
+    ii. If the input events landed inside a dispatch-to-content region,
+        the events are left in the queue and a timeout is initiated. If
+        the timeout expires before step 9 is completed, the APZ assumes
+        the input block was not cancelled and the tentative target is
+        correct, and processes them as part of step 10.
+
+5.  The call stack unwinds back to APZCTreeManager::ReceiveInputEvent,
+    which does an in-place modification of the input event so that any
+    async transforms are removed.
+6.  The call stack unwinds back to the widget code that called
+    ReceiveInputEvent. This code now has the event in the coordinate
+    space Gecko is expecting, and so can dispatch it to the Gecko main
+    thread.
+7.  Gecko performs its own usual hit-testing and event dispatching for
+    the event. As part of this, it records whether any touch listeners
+    cancelled the input block by calling preventDefault(). It also
+    activates inactive scrollframes that were hit by the input events.
+8.  The call stack unwinds back to the widget code, which sends two
+    notifications to the APZ code on the controller thread. The first
+    notification is via APZCTreeManager::ContentReceivedInputBlock, and
+    informs the APZ whether the input block was cancelled. The second
+    notification is via APZCTreeManager::SetTargetAPZC, and informs the
+    APZ of the results of the Gecko hit-test during event dispatch. Note
+    that Gecko may report that the input event did not hit any
+    scrollable frame at all. The SetTargetAPZC notification happens only
+    once per input block, while the ContentReceivedInputBlock
+    notification may happen once per block, or multiple times per block,
+    depending on the input type.
+9.
+
+    i.   If the events were processed as part of step 4(i), the
+         notifications from step 8 are ignored and step 10 is skipped.
+    ii.  If events were queued as part of step 4(ii), and steps 5-8
+         complete before the timeout, the arrival of both notifications
+         from step 8 will mark the input block ready for processing.
+    iii. If events were queued as part of step 4(ii), but steps 5-8 take
+         longer than the timeout, the notifications from step 8 will be
+         ignored and step 10 will already have happened.
+
+10. If events were queued as part of step 4(ii) they are now either
+    processed (if the input block was not cancelled and Gecko detected a
+    scrollframe under the input event, or if the timeout expired) or
+    dropped (all other cases). Note that the APZC that processes the
+    events may be different at this step than the tentative target from
+    step 3, depending on the SetTargetAPZC notification. Processing the
+    events may trigger behaviours like scrolling or tap gestures.
+
+If the CSS touch-action property is enabled, the above steps are
+modified as follows:
+
+* In step 4, the APZC also requires the allowed touch-action behaviours
+  for the input event. This might have been determined as part of the
+  hit-test in APZCTreeManager; if not, the events are queued.
+* In step 6, the widget code determines the content element at the point
+  under the input element, and notifies the APZ code of the allowed
+  touch-action behaviours. This notification is sent via a call to
+  APZCTreeManager::SetAllowedTouchBehavior on the input thread.
+* In step 9(ii), the input block will only be marked ready for processing
+  once all three notifications arrive.
+
+Threading considerations
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The bulk of the input processing in the APZ code happens on what we call
+“the controller thread”. In practice the controller thread could be the
+Gecko main thread, the compositor thread, or some other thread. There are
+obvious downsides to using the Gecko main thread - that is,“asynchronous”
+panning and zooming is not really asynchronous as input events can only
+be processed while Gecko is idle. In an e10s environment, using the Gecko
+main thread of the chrome process is acceptable, because the code running
+in that process is more controllable and well-behaved than arbitrary web
+content. Using the compositor thread as the controller thread could work
+on some platforms, but may be inefficient on others. For example, on
+Android (Fennec) we receive input events from the system on a dedicated
+UI thread. We would have to redispatch the input events to the compositor
+thread if we wanted to the input thread to be the same as the compositor
+thread. This introduces a potential for higher latency, particularly if
+the compositor does any blocking operations - blocking SwapBuffers
+operations, for example. As a result, the APZ code itself does not assume
+that the controller thread will be the same as the Gecko main thread or
+the compositor thread.
+
+Active vs. inactive scrollframes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The number of scrollframes on a page is potentially unbounded. However,
+we do not want to create a separate displayport for each scrollframe
+right away, as this would require large amounts of memory. Therefore,
+scrollframes as designated as either “active” or “inactive”. Active
+scrollframes get a displayport, and an APZC on the compositor side.
+Inactive scrollframes do not get a displayport (a display list is only
+built for their viewport, i.e. what is currently visible) and do not get
+an APZC.
+
+Consider a page with a scrollframe that is initially inactive. This
+scroll frame does not get an APZC, and therefore events targeting it will
+target the APZC for the nearest active scrollable ancestor (let's call it
+P; note, the rootmost scroll frame in a given process is always active).
+However, the presence of the inactive scroll frame is reflected by a
+dispatch-to-content region that prevents events over the frame from
+erroneously scrolling P.
+
+When the user starts interacting with that content, the hit-test in the
+APZ code hits the dispatch-to-content region of P. The input block
+therefore has a tentative target of P when it goes into step 4(ii) in the
+flow above. When gecko processes the input event, it must detect the
+inactive scrollframe and activate it, as part of step 7. Finally, the
+widget code sends the SetTargetAPZC notification in step 8 to notify the
+APZ that the input block should really apply to this new APZC. An issue
+here is that the transaction containing metadata for the newly active
+scroll frame must reach the compositor and APZ before the SetTargetAPZC
+notification. If this does not occur within the 400ms timeout, the APZ
+code will be unable to update the tentative target, and will continue to
+use P for that input block. Input blocks that start after the transaction
+will get correctly routed to the new scroll frame as there will now be an
+APZC instance for the active scrollframe.
+
+This model implies that when the user initially attempts to scroll an
+inactive scrollframe, it may end up scrolling an ancestor scrollframe.
+Only after the round-trip to the gecko thread is complete is there an
+APZC for async scrolling to actually occur on the scrollframe itself. At
+that point the scrollframe will start receiving new input blocks and will
+scroll normally.
+
+Note: with Fission (where inactive scroll frames would make it impossible
+to target the correct process in all situations; see
+:ref:`this section <fission-hit-testing>` for more details) and WebRender
+(which makes displayports more lightweight as the actual rendering is
+offloaded to the compositor and can be done on demand), inactive scroll
+frames are being phased out, and we are moving towards a model where all
+scroll frames with nonempty scroll ranges are active and get a
+displayport and an APZC. To conserve memory, displayports for scroll
+frames which have not been recently scrolled are kept to a "minimal" size
+equal to the viewport size.
+
+WebRender Integration
+~~~~~~~~~~~~~~~~~~~~~
+
+This section describes how APZ interacts with the WebRender graphics
+backend.
+
+Note that APZ predates WebRender, having initially been written to work
+with the earlier Layers graphics backend. The design of Layers has
+influenced APZ significantly, and this still shows in some places in the
+code. Now that the Layers backend has been removed, there may be
+opportunities to streamline the interaction between APZ and WebRender.
+
+
+HitTestingTree
+^^^^^^^^^^^^^^
+
+The APZCTreeManager keeps as part of its internal state a tree of
+HitTestingTreeNode instances. This is referred to as the HitTestingTree.
+
+The main purpose of the HitTestingTree is to model the spatial
+relationships between content that's affected by async scrolling. Tree
+nodes fall roughly into the following categories:
+
+* Nodes representing scrollable content in an active scroll frame. These
+  nodes are associated with the scroll frame's APZC.
+* Nodes representing other content that may move in special ways in
+  response to async scrolling, such as fixed content, sticky content, and
+  scrollbars.
+* (Non-leaf) nodes which do not represent any content, just metadata
+  (e.g. a transform) that applies to its descendant nodes.
+
+An APZC may be associated with multiple nodes, if e.g. a scroll frame
+scrolls two pieces of content that are interleaved with non-scrolling
+content.
+
+Arranging these nodes in a tree allows modelling relationships such as
+what content is scrolled by a given scroll frame, what the scroll handoff
+relationships are between APZCs, and what content is subject to what
+transforms.
+
+An additional use of the HitTestingTree is to allow APZ to keep content
+processes up to date about enclosing transforms that they are subject to.
+See :ref:`this section <sending-transforms-to-content-processes>` for
+more details.
+
+(In the past, with the Layers backend, the HitTestingTree was also used
+for compositor hit testing, hence the name. This is no longer the case,
+and there may be opportunities to simplify the tree as a result.)
+
+The HitTestingTree is created from another tree data structure called
+WebRenderScrollData. The relevant types here are:
+
+* WebRenderScrollData which stores the entire tree.
+* WebRenderLayerScrollData, which represents a single "layer" of content,
+  i.e. a group of display items that move together when scrolling (or
+  metadata applying to a subtree of such layers). In the Layers backend,
+  such content would be rendered into a single texture which could then
+  be moved asynchronously at composite time. Since a layer of content can
+  be scrolled by multiple (nested) scroll frames, a
+  WebRenderLayerScrollData may contain scroll metadata for more than one
+  scroll frame.
+* WebRenderScrollDataWrapper, which wraps WebRenderLayerScrollData
+  but "expanded" in a way that each node only stores metadata for
+  a single scroll frame. WebRenderScrollDataWrapper nodes have a
+  1:1 correspondence with HitTestingTreeNodes.
+
+It's not clear whether the distinction between WebRenderLayerScrollData
+and WebRenderScrollDataWrapper is still useful in a WebRender-only world.
+The code could potentially be revised such that we directly build and
+store nodes of a single type with the behaviour of
+WebRenderScrollDataWrapper.
+
+The WebRenderScrollData structure is built on the main thread, and
+then shipped over IPC to the compositor where it's used to construct
+the HitTestingTree.
+
+WebRenderScrollData is built in WebRenderCommandBuilder, during the
+same traversal of the Gecko display list that is used to build the
+WebRender display list. As of this writing, the architecture for this is
+that, as we walk the Gecko display list, we query it to see if it
+contains any information that APZ might need to know (e.g. CSS
+transforms) via a call to ``nsDisplayItem::UpdateScrollData(nullptr,
+nullptr)``. If this call returns true, we create a
+WebRenderLayerScrollData instance for the item, and populate it with the
+necessary information in ``WebRenderLayerScrollData::Initialize``. We also
+create WebRenderLayerScrollData instances if we detect (via ASR changes)
+that we are now processing a Gecko display item that is in a different
+scrollframe than the previous item.
+
+The main sources of complexity in this code come from:
+
+1. Ensuring the ScrollMetadata instances end on the proper
+   WebRenderLayerScrollData instances (such that every path from a leaf
+   WebRenderLayerScrollData node to the root has a consistent ordering of
+   scrollframes without duplications).
+2. The deferred-transform optimization that is described in more detail
+   at the declaration of ``StackingContextHelper::mDeferredTransformItem``.
+
+.. _wr-hit-test-details:
+
+Hit-testing
+^^^^^^^^^^^
+
+Since the HitTestingTree is not used for actual hit-testing purposes
+with the WebRender backend (see previous section), this section describes
+how hit-testing actually works with WebRender.
+
+The Gecko display list contains display items
+(``nsDisplayCompositorHitTestInfo``) that store hit-testing state. These
+items implement the ``CreateWebRenderCommands`` method and generate a "hit-test
+item" into the WebRender display list. This is basically just a rectangle
+item in the WebRender display list that is no-op for painting purposes,
+but contains information that should be returned by the hit-test (specifically
+the hit info flags and the scrollId of the enclosing scrollframe). The
+hit-test item gets clipped and transformed in the same way that all the other
+items in the WebRender display list do, via clip chains and enclosing
+reference frame/stacking context items.
+
+When WebRender needs to do a hit-test, it goes through its display list,
+taking into account the current clips and transforms, adjusted for the
+most recent async scroll/zoom, and determines which hit-test item(s) are under
+the target point, and returns those items. APZ can then take the frontmost
+item from that list (or skip over it if it happens to be inside a OOP
+subdocument that's ``pointer-events:none``) and use that as the hit target.
+Note that the hit-test uses the last transform provided by the
+``SampleForWebRender`` API (see next section) which generally reflects the
+last composite, and doesn't take into account further changes to the
+transforms that have occurred since then. In practice, we should be
+compositing frequently enough that this doesn't matter much.
+
+When debugging hit-test issues, it is often useful to apply the patches
+on bug 1656260, which introduce a guid on Gecko display items and propagate
+it all the way through to where APZ gets the hit-test result. This allows
+answering the question "which nsDisplayCompositorHitTestInfo was responsible
+for this hit-test result?" which is often a very good first step in
+solving the bug. From there, one can determine if there was some other
+display item in front that should have generated a
+nsDisplayCompositorHitTestInfo but didn't, or if display item itself had
+incorrect information. The second patch on that bug further allows exposing
+hand-written debug info to the APZ code, so that the WR hit-testing
+mechanism itself can be more effectively debugged, in case there is a problem
+with the WR display items getting improperly transformed or clipped.
+
+The information returned by WebRender to APZ in response to the hit test
+is enough for APZ to identify a HitTestingTreeNode as the target of the
+event. APZ can then take actions such as scrolling the target node's
+associated APZC, or other appropriate actions (e.g. initiating a scrollbar
+drag if a scrollbar thumb node was targeted by a mouse-down event).
+
+Sampling
+^^^^^^^^
+
+The compositing step needs to read the latest async transforms from APZ
+in order to ensure scrollframes are rendered at the right position. The API for this is
+exposed via the ``APZSampler`` class. When WebRender is ready to do a composite,
+it invokes ``APZSampler::SampleForWebRender``. In here, APZ gathers all async
+transforms that WebRender needs to know about, including transforms to apply
+to scrolled content, fixed and sticky content, and scrollbar thumbs.
+
+Along with sampling the APZ transforms, the compositor also triggers APZ
+animations to advance to the next timestep (usually the next vsync). This
+happens just before reading the APZ transforms.
+
+Fission Integration
+~~~~~~~~~~~~~~~~~~~
+
+This section describes how APZ interacts with the Fission (Site Isolation)
+project.
+
+Introduction
+^^^^^^^^^^^^
+
+Fission is an architectural change motivated by security considerations,
+where web content from each origin is isolated in its own process. Since
+a page can contain a mixture of content from different origins (for
+example, the top level page can be content from origin A, and it can
+contain an iframe with content from origin B), that means that rendering
+and interacting with a page can now involve coordination between APZ and
+multiple content processes.
+
+.. _fission-hit-testing:
+
+Content Process Selection for Input Events
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Input events are initially received in the browser's parent process.
+With Fission, the browser needs to decide which of possibly several
+content processes an event is targeting.
+
+Since process boundaries correspond to iframe (subdocument) boundaries,
+and every (html) document has a root scroll frame, process boundaries are
+therefore also scroll frame boundaries. Since APZ already needs a hit
+test mechanism to be able to determine which scroll frame an event
+targets, this hit test mechanism was a good fit to also use to determine
+which content process an event targets.
+
+APZ's hit test was therefore expanded to serve this purpose as well. This
+mostly required only minor modifications, such as making sure that APZ
+knows about the root scroll frames of iframes even if they're not
+scrollable. Since APZ already needs to process all input events to
+potentially apply :ref:`untransformations <input-event-untransformation>`
+related to async scrolling, as part of this process it now also labels
+input events with information identifying which content process they
+target.
+
+Hit Testing Accuracy
+^^^^^^^^^^^^^^^^^^^^
+
+Prior to Fission, APZ's hit test could afford to be somewhat inaccurate,
+as it could fall back on the dispatch-to-content mechanism to wait for
+a more accurate answer from the main thread if necessary, suffering a
+performance cost only (not a correctness cost).
+
+With Fission, an inaccurate compositor hit test now implies a correctness
+cost, as there is no cross-process main-thread fallback mechanism.
+(Such a mechanism was considered, but judged to require too much
+complexity and IPC traffic to be worth it.)
+
+Luckily, with WebRender the compositor has much more detailed information
+available to use for hit testing than it did with Layers. For example,
+the compositor can perform accurate hit testing even in the presence of
+irregular shapes such as rounded corners.
+
+APZ leverages WebRender's more accurate hit testing ability to aim to
+accurately select the target process (and target scroll frame) for an
+event in general.
+
+One consequence of this is that the dispatch-to-content mechanism is now
+used less often than before (its primary remaining use is handling
+`preventDefault()`).
+
+.. _sending-transforms-to-content-processes:
+
+Sending Transforms To Content Processes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Content processes sometimes need to be able to convert between screen
+coordinates and their local coordinates. To do this, they need to know
+about any transforms that their containing iframe and its ancestors are
+subject to, including async transforms (particularly in cases where the
+async transforms persist for more than just a few frames).
+
+APZ has information about these transforms in its HitTestingTree. With
+Fission, APZ periodically sends content processes information about these
+transforms so that they are kept relatively up to date.
+
+Testing
+-------
+
+APZ makes use of several test frameworks to verify the expected behavior
+is seen.
+
+Mochitest
+~~~~~~~~~
+
+The APZ specific mochitests are useful when specific gestures or events need to be tested
+with specific content. The APZ mochitests are located in `gfx/layers/apz/test/mochitest`_.
+To run all of the APZ mochitests, run something like the following:
+
+::
+
+    ./mach mochitest ./gfx/layers/apz/test/mochitest
+
+The APZ mochitests are often organized as subtests that run in a group. For example,
+the `test_group_hittest-2.html`_ contains >20 subtests like
+`helper_hittest_overscroll.html`_. When working on a specific subtest, it is often
+helpful to use the `apz.subtest` preference to filter the subtests run to just the
+tests you are working on. For example, the following would only run the
+`helper_hittest_overscroll.html`_ subtest of the `test_group_hittest-2.html`_ group.
+
+::
+
+    ./mach mochitest --setpref apz.subtest=helper_hittest_overscroll.html \
+        ./gfx/layers/apz/test/mochitest/test_group_hittest-2.html
+
+For more information on mochitest, see the `Mochitest Documentation`_.
+
+.. _gfx/layers/apz/test/mochitest: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/mochitest
+.. _test_group_hittest-2.html: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/mochitest/test_group_hittest-2.html
+.. _helper_hittest_overscroll.html: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/mochitest/helper_hittest_overscroll.html
+.. _Mochitest Documentation: /testing/mochitest-plain/index.html
+
+GTest
+~~~~~
+
+The APZ specific GTests can be found in `gfx/layers/apz/test/gtest/`_. To run
+these tests, run something like the following:
+
+::
+
+    ./mach gtest "APZ*"
+
+For more information, see the `GTest Documentation`_.
+
+.. _GTest Documentation: /gtest/index.html
+.. _gfx/layers/apz/test/gtest/: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/gtest/
+
+Reftests
+~~~~~~~~
+
+The APZ reftests can be found in `layout/reftests/async-scrolling/`_ and
+`gfx/layers/apz/test/reftest`_. To run the relevant reftests for APZ, run
+a large portion of the APZ reftests, run something like the following:
+
+::
+
+    ./mach reftest ./layout/reftests/async-scrolling/
+
+Useful information about the reftests can be found in the `Reftest Documentation`_.
+
+There is no defined process for choosing which directory the APZ reftests
+should be placed in, but in general reftests should exist where other
+similar tests do.
+
+.. _layout/reftests/async-scrolling/: https://searchfox.org/mozilla-central/source/layout/reftests/async-scrolling/
+.. _gfx/layers/apz/test/reftest: https://searchfox.org/mozilla-central/source/gfx/layers/apz/test/reftest/
+.. _Reftest Documentation: /layout/Reftest.html
+
+Threading / Locking Overview
+----------------------------
+
+Threads
+~~~~~~~
+
+There are three threads relevant to APZ: the **controller thread**,
+the **updater thread**, and the **sampler thread**. This table lists
+which threads play these roles on each platform / configuration:
+
+===================== ============= ============== =============
+APZ Thread Name       Desktop       Desktop+GPU    Android
+===================== ============= ============== =============
+**controller thread** UI main       GPU main       Java UI
+**updater thread**    SceneBuilder  SceneBuilder   SceneBuilder
+**sampler thread**    RenderBackend RenderBackend  RenderBackend
+===================== ============= ============== =============
+
+Locks
+~~~~~
+
+There are also a number of locks used in APZ code:
+
+======================= ==============================
+Lock type               How many instances
+======================= ==============================
+APZ tree lock           one per APZCTreeManager
+APZC map lock           one per APZCTreeManager
+APZC instance lock      one per AsyncPanZoomController
+APZ test lock           one per APZCTreeManager
+Checkerboard event lock one per AsyncPanZoomController
+======================= ==============================
+
+Thread / Lock Ordering
+~~~~~~~~~~~~~~~~~~~~~~
+
+To avoid deadlocks, the threads and locks have a global **ordering**
+which must be respected.
+
+Respecting the ordering means the following:
+
+- Let "A < B" denote that A occurs earlier than B in the ordering
+- Thread T may only acquire lock L, if T < L
+- A thread may only acquire lock L2 while holding lock L1, if L1 < L2
+- A thread may only block on a response from another thread T while holding a lock L, if L < T
+
+**The lock ordering is as follows**:
+
+1. UI main
+2. GPU main (only if GPU process enabled)
+3. Compositor thread
+4. SceneBuilder thread
+5. **APZ tree lock**
+6. RenderBackend thread
+7. **APZC map lock**
+8. **APZC instance lock**
+9. **APZ test lock**
+10. **Checkerboard event lock**
+
+Example workflows
+^^^^^^^^^^^^^^^^^
+
+Here are some example APZ workflows. Observe how they all obey
+the global thread/lock ordering. Feel free to add others:
+
+- **Input handling** (with GPU process): UI main -> GPU main -> APZ tree lock -> RenderBackend thread
+- **Sync messages** in ``PCompositorBridge.ipdl``: UI main thread -> Compositor thread
+- **GetAPZTestData**: Compositor thread -> SceneBuilder thread -> test lock
+- **Scene swap**: SceneBuilder thread -> APZ tree lock -> RenderBackend thread
+- **Updating hit-testing tree**: SceneBuilder thread -> APZ tree lock -> APZC instance lock
+- **Updating APZC map**: SceneBuilder thread -> APZ tree lock -> APZC map lock
+- **Sampling and animation deferred tasks** [1]_: RenderBackend thread -> APZC map lock -> APZC instance lock
+- **Advancing animations**: RenderBackend thread -> APZC instance lock
+
+.. [1] It looks like there are two deferred tasks that actually need the tree lock,
+   ``AsyncPanZoomController::HandleSmoothScrollOverscroll`` and
+   ``AsyncPanZoomController::HandleFlingOverscroll``. We should be able to rewrite
+   these to use the map lock instead of the tree lock.
+   This will allow us to continue running the deferred tasks on the sampler
+   thread rather than having to bounce them to another thread.
author	Daniel Baumann <daniel.baumann@progress-linux.org>	2024-04-19 00:47:55 +0000
committer	Daniel Baumann <daniel.baumann@progress-linux.org>	2024-04-19 00:47:55 +0000
commit	26a029d407be480d791972afb5975cf62c9360a6 (patch)
tree	f435a8308119effd964b339f76abb83a57c29483 /gfx/docs/AsyncPanZoom.rst
parent	Initial commit. (diff)
download	firefox-upstream/124.0.1.tar.xz firefox-upstream/124.0.1.zip