diff options
Diffstat (limited to 'tools/profiler/docs')
-rw-r--r-- | tools/profiler/docs/buffer.rst | 70 | ||||
-rw-r--r-- | tools/profiler/docs/code-overview.rst | 1494 | ||||
-rw-r--r-- | tools/profiler/docs/fissionprofiler-20200424.png | bin | 0 -> 131301 bytes | |||
-rw-r--r-- | tools/profiler/docs/fissionprofiler.umlet.uxf | 546 | ||||
-rw-r--r-- | tools/profiler/docs/index.rst | 37 | ||||
-rw-r--r-- | tools/profiler/docs/instrumenting-javascript.rst | 60 | ||||
-rw-r--r-- | tools/profiler/docs/instrumenting-rust.rst | 433 | ||||
-rw-r--r-- | tools/profiler/docs/markers-guide.rst | 551 | ||||
-rw-r--r-- | tools/profiler/docs/memory.rst | 46 | ||||
-rw-r--r-- | tools/profiler/docs/profilerclasses-20220913.png | bin | 0 -> 727313 bytes | |||
-rw-r--r-- | tools/profiler/docs/profilerclasses.umlet.uxf | 811 | ||||
-rw-r--r-- | tools/profiler/docs/profilerthreadregistration-20220913.png | bin | 0 -> 383738 bytes | |||
-rw-r--r-- | tools/profiler/docs/profilerthreadregistration.umlet.uxf | 710 |
13 files changed, 4758 insertions, 0 deletions
diff --git a/tools/profiler/docs/buffer.rst b/tools/profiler/docs/buffer.rst new file mode 100644 index 0000000000..a97e52136e --- /dev/null +++ b/tools/profiler/docs/buffer.rst @@ -0,0 +1,70 @@ +Buffers and Memory Management +============================= + +In a post-Fission world, precise memory management across many threads and processes is +especially important. In order for the profiler to achieve this, it uses a chunked buffer +strategy. + +The `ProfileBuffer`_ is the overall buffer class that controls the memory and storage +for the profile, it allows allocating objects into it. This can be used freely +by things like markers and samples to store data as entries, without needing to know +about the general strategy for how the memory is managed. + +The `ProfileBuffer`_ is then backed by the `ProfileChunkedBuffer`_. This specialized +buffer grows incrementally, by allocating additional `ProfileBufferChunk`_ objects. +More and more chunks will be allocated until a memory limit is reached, where they will +be released. After releasing, the chunk will either be recycled or freed. + +The limiting of memory usage is coordinated by the `ProfilerParent`_ in the parent +process. The `ProfilerParent`_ and `ProfilerChild`_ exchange IPC messages with information +about how much memory is being used. When the maximum byte threshold is passed, +the ProfileChunkManager in the parent process removes the oldest chunk, and then the +`ProfilerParent`_ sends a `DestroyReleasedChunksAtOrBefore`_ message to all of child +processes so that the oldest chunks in the profile are released. This helps long profiles +to keep having data in a similar time frame. + +Profile Buffer Terminology +########################## + +ProfilerParent + The main profiler machinery is installed in the parent process. It uses IPC to + communicate to the child processes. The PProfiler is the actor which is used + to communicate across processes to coordinate things. See `ProfilerParent.h`_. The + ProfilerParent uses the DestroyReleasedChunksAtOrBefore message to control the + overall chunk limit. + +ProfilerChild + ProfilerChild is installed in every child process, it will receive requests from + DestroyReleasedChunksAtOrBefore. + +Entry + This is an individual entry in the `ProfileBuffer.h`_,. These entry sizes are not + related to the chunks sizes. An individual entry can straddle two different chunks. + An entry can contain various pieces of data, like markers, samples, and stacks. + +Chunk + An arbitrary sized chunk of memory, managed by the `ProfileChunkedBuffer`_, and + IPC calls from the ProfilerParent. + +Unreleased Chunk + This chunk is currently being used to write entries into. + +Released chunk + This chunk is full of data. When memory limits happen, it can either be recycled + or freed. + +Recycled chunk + This is a chunk that was previously written into, and full. When memory limits occur, + rather than freeing the memory, it is re-used as the next chunk. + +.. _ProfileChunkedBuffer: https://searchfox.org/mozilla-central/search?q=ProfileChunkedBuffer&path=&case=true®exp=false +.. _ProfileChunkManager: https://searchfox.org/mozilla-central/search?q=ProfileBufferChunkManager.h&path=&case=true®exp=false +.. _ProfileBufferChunk: https://searchfox.org/mozilla-central/search?q=ProfileBufferChunk&path=&case=true®exp=false +.. _ProfileBufferChunkManagerWithLocalLimit: https://searchfox.org/mozilla-central/search?q=ProfileBufferChunkManagerWithLocalLimit&case=true&path= +.. _ProfilerParent.h: https://searchfox.org/mozilla-central/source/tools/profiler/public/ProfilerParent.h +.. _ProfilerChild.h: https://searchfox.org/mozilla-central/source/tools/profiler/public/ProfilerChild.h +.. _ProfileBuffer.h: https://searchfox.org/mozilla-central/source/tools/profiler/core/ProfileBuffer.h +.. _ProfileBuffer: https://searchfox.org/mozilla-central/search?q=ProfileBuffer&path=&case=true®exp=false +.. _ProfilerParent: https://searchfox.org/mozilla-central/search?q=ProfilerParent&path=&case=true®exp=false +.. _ProfilerChild: https://searchfox.org/mozilla-central/search?q=ProfilerChild&path=&case=true®exp=false +.. _DestroyReleasedChunksAtOrBefore: https://searchfox.org/mozilla-central/search?q=DestroyReleasedChunksAtOrBefore&path=&case=true®exp=false diff --git a/tools/profiler/docs/code-overview.rst b/tools/profiler/docs/code-overview.rst new file mode 100644 index 0000000000..3ca662e141 --- /dev/null +++ b/tools/profiler/docs/code-overview.rst @@ -0,0 +1,1494 @@ +Profiler Code Overview +###################### + +This is an overview of the code that implements the Profiler inside Firefox +with dome details around tricky subjects, or pointers to more detailed +documentation and/or source code. + +It assumes familiarity with Firefox development, including Mercurial (hg), mach, +moz.build files, Try, Phabricator, etc. + +It also assumes knowledge of the user-visible part of the Firefox Profiler, that +is: How to use the Firefox Profiler, and what profiles contain that is shown +when capturing a profile. See the main website https://profiler.firefox.com, and +its `documentation <https://profiler.firefox.com/docs/>`_. + +For just an "overview", it may look like a huge amount of information, but the +Profiler code is indeed quite expansive, so it takes a lot of words to explain +even just a high-level view of it! For on-the-spot needs, it should be possible +to search for some terms here and follow the clues. But for long-term +maintainers, it would be worth skimming this whole document to get a grasp of +the domain, and return to get some more detailed information before diving into +the code. + +WIP note: This document should be correct at the time it is written, but the +profiler code constantly evolves to respond to bugs or to provide new exciting +features, so this document could become obsolete in parts! It should still be +useful as an overview, but its correctness should be verified by looking at the +actual code. If you notice any significant discrepancy or broken links, please +help by +`filing a bug <https://bugzilla.mozilla.org/enter_bug.cgi?product=Core&component=Gecko+Profiler>`_. + +***** +Terms +***** + +This is the common usage for some frequently-used terms, as understood by the +Dev Tools team. But incorrect usage can sometimes happen, context is key! + +* **profiler** (a): Generic name for software that enables the profiling of + code. (`"Profiling" on Wikipedia <https://en.wikipedia.org/wiki/Profiling_(computer_programming)>`_) +* **Profiler** (the): All parts of the profiler code inside Firefox. +* **Base Profiler** (the): Parts of the Profiler that live in + mozglue/baseprofiler, and can be used from anywhere, but has limited + functionality. +* **Gecko Profiler** (the): Parts of the Profiler that live in tools/profiler, + and can only be used from other code in the XUL library. +* **Profilers** (the): Both the Base Profiler and the Gecko Profiler. +* **profiling session**: This is the time during which the profiler is running + and collecting data. +* **profile** (a): The output from a profiling session, either as a file, or a + shared viewable profile on https://profiler.firefox.com +* **Profiler back-end** (the): Other name for the Profiler code inside Firefox, + to distinguish it from... +* **Profiler front-end** (the): The website https://profiler.firefox.com that + displays profiles captured by the back-end. +* **Firefox Profiler** (the): The whole suite comprised of the back-end and front-end. + +****************** +Guiding Principles +****************** + +When working on the profiler, here are some guiding principles to keep in mind: + +* Low profiling overhead in cpu and memory. For the Profiler to provide the best + value, it should stay out of the way and consume as few resources (in time and + memory) as possible, so as not to skew the actual Firefox code too much. + +* Common data structures and code should be in the Base Profiler when possible. + + WIP note: Deduplication is slowly happening, see + `meta bug 1557566 <https://bugzilla.mozilla.org/show_bug.cgi?id=1557566>`_. + This document focuses on the Profiler back-end, and mainly the Gecko Profiler + (because this is where most of the code lives, the Base Profiler is mostly a + subset, originally just a cut-down version of the Gecko Profiler); so unless + specified, descriptions below are about the Gecko Profiler, but know that + there may be some equivalent code in the Base Profiler as well. + +* Use appropriate programming-language features where possible to reduce coding + errors in both our code, and our users' usage of it. In C++, this can be done + by using a specific class/struct types for a given usage, to avoid misuse + (e.g., an generic integer representing a **process** could be incorrectly + given to a function expecting a **thread**; we have specific types for these + instead, more below.) + +* Follow the + `Coding Style <https://firefox-source-docs.mozilla.org/code-quality/coding-style/index.html>`_. + +* Whenever possible, write tests (if not present already) for code you add or + modify -- but this may be too difficult in some case, use good judgement and + at least test manually instead. + +****************** +Profiler Lifecycle +****************** + +Here is a high-level view of the Base **or** Gecko Profiler lifecycle, as part +of a Firefox run. The following sections will go into much more details. + +* Profiler initialization, preparing some common data. +* Threads de/register themselves as they start and stop. +* During each User/test-controlled profiling session: + + * Profiler start, preparing data structures that will store the profiling data. + * Periodic sampling from a separate thread, happening at a user-selected + frequency (usually once every 1-2 ms), and recording snapshots of what + Firefox is doing: + + * CPU sampling, measuring how much time each thread has spent actually + running on the CPU. + * Stack sampling, capturing a stack of functions calls from whichever leaf + function the program is in at this point in time, up to the top-most + caller (i.e., at least the ``main()`` function, or its callers if any). + Note that unlike most external profilers, the Firefox Profiler back-end + is capable or getting more useful information than just native functions + calls (compiled from C++ or Rust): + + * Labels added by Firefox developers along the stack, usually to identify + regions of code that perform "interesting" operations (like layout, file + I/Os, etc.). + * JavaScript function calls, including the level of optimization applied. + * Java function calls. + * At any time, Markers may record more specific details of what is happening, + e.g.: User operations, page rendering steps, garbage collection, etc. + * Optional profiler pause, which stops most recording, usually near the end of + a session so that no data gets recorded past this point. + * Profile JSON output, generated from all the recorded profiling data. + * Profiler stop, tearing down profiling session objects. +* Profiler shutdown. + +Note that the Base Profiler can start earlier, and then the data collected so +far, as well as the responsibility for periodic sampling, is handed over to the +Gecko Profiler: + +#. (Firefox starts) +#. Base Profiler init +#. Base Profiler start +#. (Firefox loads the libxul library and initializes XPCOM) +#. Gecko Profiler init +#. Gecko Profiler start +#. Handover from Base to Gecko +#. Base Profiler stop +#. (Bulk of the profiling session) +#. JSON generation +#. Gecko Profiler stop +#. Gecko Profiler shutdown +#. (Firefox ends XPCOM) +#. Base Profiler shutdown +#. (Firefox exits) + +Base Profiler functions that add data (mostly markers and labels) may be called +from anywhere, and will be recorded by either Profiler. The corresponding +functions in Gecko Profiler can only be called from other libxul code, and can +only be recorded by the Gecko Profiler. + +Whenever possible, Gecko Profiler functions should be preferred if accessible, +as they may provide extended functionality (e.g., better stacks with JS in +markers). Otherwise fallback on Base Profiler functions. + +*********** +Directories +*********** + +* Non-Profiler supporting code + + * `mfbt <https://searchfox.org/mozilla-central/source/mfbt>`_ - Mostly + replacements for C++ std library facilities. + + * `mozglue/misc <https://searchfox.org/mozilla-central/source/mozglue/misc>`_ + + * `PlatformMutex.h <https://searchfox.org/mozilla-central/source/mozglue/misc/PlatformMutex.h>`_ - + Mutex base classes. + * `StackWalk.h <https://searchfox.org/mozilla-central/source/mozglue/misc/StackWalk.h>`_ - + Stack-walking functions. + * `TimeStamp.h <https://searchfox.org/mozilla-central/source/mozglue/misc/TimeStamp.h>`_ - + Timestamps and time durations. + + * `xpcom <https://searchfox.org/mozilla-central/source/xpcom>`_ + + * `ds <https://searchfox.org/mozilla-central/source/xpcom/ds>`_ - + Data structures like arrays, strings. + + * `threads <https://searchfox.org/mozilla-central/source/xpcom/threads>`_ - + Threading functions. + +* Profiler back-end + + * `mozglue/baseprofiler <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler>`_ - + Base Profiler code, usable from anywhere in Firefox. Because it lives in + mozglue, it's loaded right at the beginning, so it's possible to start the + profiler very early, even before Firefox loads its big&heavy "xul" library. + + * `baseprofiler's public <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/public>`_ - + Public headers, may be #included from anywhere. + * `baseprofiler's core <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/core>`_ - + Main implementation code. + * `baseprofiler's lul <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/lul>`_ - + Special stack-walking code for Linux. + * `../tests/TestBaseProfiler.cpp <https://searchfox.org/mozilla-central/source/mozglue/tests/TestBaseProfiler.cpp>`_ - + Unit tests. + + * `tools/profiler <https://searchfox.org/mozilla-central/source/tools/profiler>`_ - + Gecko Profiler code, only usable from the xul library. That library is + loaded a short time after Firefox starts, so the Gecko Profiler is not able + to profile the early phase of the application, Base Profiler handles that, + and can pass its collected data to the Gecko Profiler when the latter + starts. + + * `public <https://searchfox.org/mozilla-central/source/tools/profiler/public>`_ - + Public headers, may be #included from most libxul code. + * `core <https://searchfox.org/mozilla-central/source/tools/profiler/core>`_ - + Main implementation code. + * `gecko <https://searchfox.org/mozilla-central/source/tools/profiler/gecko>`_ - + Control from JS, and multi-process/IPC code. + * `lul <https://searchfox.org/mozilla-central/source/tools/profiler/lul>`_ - + Special stack-walking code for Linux. + * `rust-api <https://searchfox.org/mozilla-central/source/tools/profiler/rust-api>`_, + `rust-helper <https://searchfox.org/mozilla-central/source/tools/profiler/rust-helper>`_ + * `tests <https://searchfox.org/mozilla-central/source/tools/profiler/tests>`_ + + * `devtools/client/performance-new <https://searchfox.org/mozilla-central/source/devtools/client/performance-new>`_, + `devtools/shared/performance-new <https://searchfox.org/mozilla-central/source/devtools/shared/performance-new>`_ - + Middleware code for about:profiling and devtools panel functionality. + + * js, starting with + `js/src/vm/GeckoProfiler.h <https://searchfox.org/mozilla-central/source/js/src/vm/GeckoProfiler.h>`_ - + JavaScript engine support, mostly to capture JS stacks. + + * `toolkit/components/extensions/schemas/geckoProfiler.json <https://searchfox.org/mozilla-central/source/toolkit/components/extensions/schemas/geckoProfiler.json>`_ - + File that needs to be updated when Profiler features change. + +* Profiler front-end + + * Out of scope for this document, but its code and bug repository can be found at: + https://github.com/firefox-devtools/profiler . Sometimes work needs to be + done on both the back-end of the front-end, especially when modifying the + back-end's JSON output format. + +******* +Headers +******* + +The most central public header is +`GeckoProfiler.h <https://searchfox.org/mozilla-central/source/tools/profiler/public/GeckoProfiler.h>`_, +from which almost everything else can be found, it can be a good starting point +for exploration. +It includes other headers, which together contain important top-level macros and +functions. + +WIP note: GeckoProfiler.h used to be the header that contained everything! +To better separate areas of functionality, and to hopefully reduce compilation +times, parts of it have been split into smaller headers, and this work will +continue, see `bug 1681416 <https://bugzilla.mozilla.org/show_bug.cgi?id=1681416>`_. + +MOZ_GECKO_PROFILER and Macros +============================= + +Mozilla officially supports the Profiler on `tier-1 platforms +<https://firefox-source-docs.mozilla.org/contributing/build/supported.html>`_: +Windows, macos, Linux and Android. +There is also some code running on tier 2-3 platforms (e.g., for FreeBSD), but +the team at Mozilla is not obligated to maintain it; we do try to keep it +running, and some external contributors are keeping an eye on it and provide +patches when things do break. + +To reduce the burden on unsupported platforms, a lot of the Profilers code is +only compiled when ``MOZ_GECKO_PROFILER`` is #defined. This means that some +public functions may not always be declared or implemented, and should be +surrounded by guards like ``#ifdef MOZ_GECKO_PROFILER``. + +Some commonly-used functions offer an empty definition in the +non-``MOZ_GECKO_PROFILER`` case, so these functions may be called from anywhere +without guard. + +Other functions have associated macros that can always be used, and resolve to +nothing on unsupported platforms. E.g., +``PROFILER_REGISTER_THREAD`` calls ``profiler_register_thread`` where supported, +otherwise does nothing. + +WIP note: There is an effort to eventually get rid of ``MOZ_GECKO_PROFILER`` and +its associated macros, see +`bug 1635350 <https://bugzilla.mozilla.org/show_bug.cgi?id=1635350>`_. + +RAII "Auto" macros and classes +============================== +A number of functions are intended to be called in pairs, usually to start and +then end some operation. To ease their use, and ensure that both functions are +always called together, they usually have an associated class and/or macro that +may be called only once. This pattern of using an object's destructor to ensure +that some action always eventually happens, is called +`RAII <https://en.cppreference.com/w/cpp/language/raii>`_ in C++, with the +common prefix "auto". + +E.g.: In ``MOZ_GECKO_PROFILER`` builds, +`AUTO_PROFILER_INIT <https://searchfox.org/mozilla-central/search?q=AUTO_PROFILER_INIT>`_ +instantiates an +`AutoProfilerInit <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AAutoProfilerInit>`_ +object, which calls ``profiler_init`` when constructed, and +``profiler_shutdown`` when destroyed. + +********************* +Platform Abstractions +********************* + +This section describes some platform abstractions that are used throughout the +Profilers. (Other platform abstractions will be described where they are used.) + +Process and Thread IDs +====================== + +The Profiler back-end often uses process and thread IDs (aka "pid" and "tid"), +which are commonly just a number. +For better code correctness, and to hide specific platform details, they are +encapsulated in opaque types +`BaseProfilerProcessId <https://searchfox.org/mozilla-central/search?q=BaseProfilerProcessId>`_ +and +`BaseProfilerThreadId <https://searchfox.org/mozilla-central/search?q=BaseProfilerThreadId>`_. +These types should be used wherever possible. +When interfacing with other code, they may be converted using the member +functions ``FromNumber`` and ``ToNumber``. + +To find the current process or thread ID, use +`profiler_current_process_id <https://searchfox.org/mozilla-central/search?q=profiler_current_process_id>`_ +or +`profiler_current_thread_id <https://searchfox.org/mozilla-central/search?q=profiler_current_thread_id>`_. + +The main thread ID is available through +`profiler_main_thread_id <https://searchfox.org/mozilla-central/search?q=profiler_main_thread_id>`_ +(assuming +`profiler_init_main_thread_id <https://searchfox.org/mozilla-central/search?q=profiler_init_main_thread_id>`_ +was called when the application started -- especially important in stand-alone +test programs.) +And +`profiler_is_main_thread <https://searchfox.org/mozilla-central/search?q=profiler_is_main_thread>`_ +is a quick way to find out if the current thread is the main thread. + +Locking +======= +The locking primitives in PlatformMutex.h are not supposed to be used as-is, but +through a user-accessible implementation. For the Profilers, this is in +`BaseProfilerDetail.h <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/public/BaseProfilerDetail.h>`_. + +In addition to the usual ``Lock``, ``TryLock``, and ``Unlock`` functions, +`BaseProfilerMutex <https://searchfox.org/mozilla-central/search?q=BaseProfilerMutex>`_ +objects have a name (which may be helpful when debugging), +they record the thread on which they are locked (making it possible to know if +the mutex is locked on the current thread), and in ``DEBUG`` builds there are +assertions verifying that the mutex is not incorrectly used recursively, to +verify the correct ordering of different Profiler mutexes, and that it is +unlocked before destruction. + +Mutexes should preferably be locked within C++ block scopes, or as class +members, by using +`BaseProfilerAutoLock <https://searchfox.org/mozilla-central/search?q=BaseProfilerAutoLock>`_. + +Some classes give the option to use a mutex or not (so that single-threaded code +can more efficiently bypass locking operations), for these we have +`BaseProfilerMaybeMutex <https://searchfox.org/mozilla-central/search?q=BaseProfilerMaybeMutex>`_ +and +`BaseProfilerMaybeAutoLock <https://searchfox.org/mozilla-central/search?q=BaseProfilerMaybeAutoLock>`_. + +There is also a special type of shared lock (aka RWLock, see +`RWLock on wikipedia <https://en.wikipedia.org/wiki/Readers%E2%80%93writer_lock>`_), +which may be locked in multiple threads (through ``LockShared`` or preferably +`BaseProfilerAutoLockShared <https://searchfox.org/mozilla-central/search?q=BaseProfilerAutoLockShared>`_), +or locked exclusively, preventing any other locking (through ``LockExclusive`` or preferably +`BaseProfilerAutoLockExclusive <https://searchfox.org/mozilla-central/search?q=BaseProfilerAutoLockExclusive>`_). + +********************* +Main Profiler Classes +********************* + +Diagram showing the most important Profiler classes, see details in the +following sections: + +(As noted, the "RegisteredThread" classes are now obsolete in the Gecko +Profiler, see the "Thread Registration" section below for an updated diagram and +description.) + +.. image:: profilerclasses-20220913.png + +*********************** +Profiler Initialization +*********************** + +`profiler_init <https://searchfox.org/mozilla-central/search?q=symbol:_Z13profiler_initPv>`_ +and +`baseprofiler::profiler_init <https://searchfox.org/mozilla-central/search?q=symbol:_ZN7mozilla12baseprofiler13profiler_initEPv>`_ +must be called from the main thread, and are used to prepare important aspects +of the profiler, including: + +* Making sure the main thread ID is recorded. +* Handling ``MOZ_PROFILER_HELP=1 ./mach run`` to display the command-line help. +* Creating the ``CorePS`` instance -- more details below. +* Registering the main thread. +* Initializing some platform-specific code. +* Handling other environment variables that are used to immediately start the + profiler, with optional settings provided in other env-vars. + +CorePS +====== + +The `CorePS class <https://searchfox.org/mozilla-central/search?q=symbol:T_CorePS>`_ +has a single instance that should live for the duration of the Firefox +application, and contains important information that could be needed even when +the Profiler is not running. + +It includes: + +* A static pointer to its single instance. +* The process start time. +* JavaScript-specific data structures. +* A list of registered + `PageInformations <https://searchfox.org/mozilla-central/search?q=symbol:T_PageInformation>`_, + used to keep track of the tabs that this process handles. +* A list of + `BaseProfilerCounts <https://searchfox.org/mozilla-central/search?q=symbol:T_BaseProfilerCount>`_, + used to record things like the process memory usage. +* The process name, and optionally the "eTLD+1" (roughly sub-domain) that this + process handles. +* In the Base Profiler only, a list of + `RegisteredThreads <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%253A%253Abaseprofiler%253A%253ARegisteredThread>`_. + WIP note: This storage has been reworked in the Gecko Profiler (more below), + and in practice the Base Profiler only registers the main thread. This should + eventually disappear as part of the de-duplication work + (`bug 1557566 <https://bugzilla.mozilla.org/show_bug.cgi?id=1557566>`_). + +******************* +Thread Registration +******************* + +Threads need to register themselves in order to get fully profiled. +This section describes the main data structures that record the list of +registered threads and their data. + +WIP note: There is some work happening to add limited profiling of unregistered +threads, with the hope that more and more functionality could be added to +eventually use the same registration data structures. + +Diagram showing the relevant classes, see details in the following sub-sections: + +.. image:: profilerthreadregistration-20220913.png + +ProfilerThreadRegistry +====================== + +The +`static ProfilerThreadRegistry object <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3Aprofiler%3A%3AThreadRegistry>`_ +contains a list of ``OffThreadRef`` objects. + +Each ``OffThreadRef`` points to a ``ProfilerThreadRegistration``, and restricts +access to a safe subset of the thread data, and forces a mutex lock if necessary +(more information under ProfilerThreadRegistrationData below). + +ProfilerThreadRegistration +========================== + +A +`ProfilerThreadRegistration object <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3Aprofiler%3A%3AThreadRegistration>`_ +contains a lot of information relevant to its thread, to help with profiling it. + +This data is accessible from the thread itself through an ``OnThreadRef`` +object, which points to the ``ThreadRegistration``, and restricts access to a +safe subset of thread data, and forces a mutex lock if necessary (more +information under ProfilerThreadRegistrationData below). + +ThreadRegistrationData and accessors +==================================== + +`The ProfilerThreadRegistrationData.h header <https://searchfox.org/mozilla-central/source/tools/profiler/public/ProfilerThreadRegistrationData.h>`_ +contains a hierarchy of classes that encapsulate all the thread-related data. + +``ThreadRegistrationData`` contains all the actual data members, including: + +* Some long-lived + `ThreadRegistrationInfo <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%253A%253Aprofiler%253A%253AThreadRegistrationInfo>`_, + containing the thread name, its registration time, the thread ID, and whether + it's the main thread. +* A ``ProfilingStack`` that gathers developer-provided pseudo-frames, and JS + frames. +* Some platform-specific ``PlatformData`` (usually required to actually record + profiling measurements for that thread). +* A pointer to the top of the stack. +* A shared pointer to the thread's ``nsIThread``. +* A pointer to the ``JSContext``. +* An optional pre-allocated ``JsFrame`` buffer used during stack-sampling. +* Some JS flags. +* Sleep-related data (to avoid costly sampling while the thread is known to not + be doing anything). +* The current ``ThreadProfilingFeatures``, to know what kind of data to record. +* When profiling, a pointer to a ``ProfiledThreadData``, which contains some + more data needed during and just after profiling. + +As described in their respective code comments, each data member is supposed to +be accessed in certain ways, e.g., the ``JSContext`` should only be "written +from thread, read from thread and suspended thread". To enforce these rules, +data members can only be accessed through certain classes, which themselves can +only be instantiated in the correct conditions. + +The accessor classes are, from base to most-derived: + +* ``ThreadRegistrationData``, not an accessor itself, but it's the base class + with all the ``protected`` data. +* ``ThreadRegistrationUnlockedConstReader``, giving unlocked ``const`` access to + the ``ThreadRegistrationInfo``, ``PlatformData``, and stack top. +* ``ThreadRegistrationUnlockedConstReaderAndAtomicRW``, giving unlocked + access to the atomic data members: ``ProfilingStack``, sleep-related data, + ``ThreadProfilingFeatures``. +* ``ThreadRegistrationUnlockedRWForLockedProfiler``, giving access that's + protected by the Profiler's main lock, but doesn't require a + ``ThreadRegistration`` lock, to the ``ProfiledThreadData`` +* ``ThreadRegistrationUnlockedReaderAndAtomicRWOnThread``, giving unlocked + mutable access, but only on the thread itself, to the ``JSContext``. +* ``ThreadRegistrationLockedRWFromAnyThread``, giving locked access from any + thread to mutex-protected data: ``ThreadProfilingFeatures``, ``JsFrame``, + ``nsIThread``, and the JS flags. +* ``ThreadRegistrationLockedRWOnThread``, giving locked access, but only from + the thread itself, to the ``JSContext`` and a JS flag-related operation. +* ``ThreadRegistration::EmbeddedData``, containing all of the above, and stored + as a data member in each ``ThreadRegistration``. + +To recapitulate, if some code needs some data on the thread, it can use +``ThreadRegistration`` functions to request access (with the required rights, +like a mutex lock). +To access data about another thread, use similar functions from +``ThreadRegistry`` instead. +You may find some examples in the implementations of the functions in +ProfilerThreadState.h (see the following section). + +ProfilerThreadState.h functions +=============================== + +The +`ProfilerThreadState.h <https://searchfox.org/mozilla-central/source/tools/profiler/public/ProfilerThreadState.h>`_ +header provides a few helpful functions related to threads, including: + +* ``profiler_is_active_and_thread_is_registered`` +* ``profiler_thread_is_being_profiled`` (for the current thread or another + thread, and for a given set of features) +* ``profiler_thread_is_sleeping`` + +************** +Profiler Start +************** + +There are multiple ways to start the profiler, through command line env-vars, +and programmatically in C++ and JS. + +The main public C++ function is +`profiler_start <https://searchfox.org/mozilla-central/search?q=symbol:_Z14profiler_startN7mozilla10PowerOfTwoIjEEdjPPKcjyRKNS_5MaybeIdEE%2C_Z14profiler_startN7mozilla10PowerOfTwoIjEEdjPPKcjmRKNS_5MaybeIdEE>`_. +It takes all the features specifications, and returns a promise that gets +resolved when the Profiler has fully started in all processes (multi-process +profiling is described later in this document, for now the focus will be on each +process running its instance of the Profiler). It first calls ``profiler_init`` +if needed, and also ``profiler_stop`` if the profiler was already running. + +The main implementation, which can be called from multiple sources, is +`locked_profiler_start <https://searchfox.org/mozilla-central/search?q=locked_profiler_start>`_. +It performs a number of operations to start the profiling session, including: + +* Record the session start time. +* Pre-allocate some work buffer to capture stacks for markers on the main thread. +* In the Gecko Profiler only: If the Base Profiler was running, take ownership + of the data collected so far, and stop the Base Profiler (we don't want both + trying to collect the same data at the same time!) +* Create the ActivePS, which keeps track of most of the profiling session + information, more about it below. +* For each registered thread found in the ``ThreadRegistry``, check if it's one + of the threads to profile, and if yes set the appropriate data into the + corresponding ``ThreadRegistrationData`` (including informing the JS engine to + start recording profiling data). +* On Android, start the Java sampler. +* If native allocations are to be profiled, setup the appropriate hooks. +* Start the audio callback tracing if requested. +* Set the public shared "active" state, used by many functions to quickly assess + whether to actually record profiling data. + +ActivePS +======== + +The `ActivePS class <https://searchfox.org/mozilla-central/search?q=symbol:T_ActivePS>`_ +has a single instance at a time, that should live for the length of the +profiling session. + +It includes: + +* The session start time. +* A way to track "generations" (in case an old ActivePS still lives when the + next one starts, so that in-flight data goes to the correct place.) +* Requested features: Buffer capacity, periodic sampling interval, feature set, + list of threads to profile, optional: specific tab to profile. +* The profile data storage buffer and its chunk manager (see "Storage" section + below for details.) +* More data about live and dead profiled threads. +* Optional counters for per-process CPU usage, and power usage. +* A pointer to the ``SamplerThread`` object (see "Periodic Sampling" section + below for details.) + +******* +Storage +******* + +During a session, the profiling data is serialized into a buffer, which is made +of "chunks", each of which contains "blocks", which have a size and the "entry" +data. + +During a profiling session, there is one main profile buffer, which may be +started by the Base Profiler, and then handed over to the Gecko Profiler when +the latter starts. + +The buffer is divided in chunks of equal size, which are allocated before they +are needed. When the data reaches a user-set limit, the oldest chunk is +recycled. This means that for long-enough profiling sessions, only the most +recent data (that could fit under the limit) is kept. + +Each chunk stores a sequence of blocks of variable length. The chunk itself +only knows where the first full block starts, and where the last block ends, +which is where the next block will be reserved. + +To add an entry to the buffer, a block is reserved, the size is written first +(so that readers can find the start of the next block), and then the entry bytes +are written. + +The following sessions give more technical details. + +leb128iterator.h +================ + +`This utility header <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/public/leb128iterator.h>`_ +contains some functions to read and write unsigned "LEB128" numbers +(`LEB128 on wikipedia <https://en.wikipedia.org/wiki/LEB128>`_). + +They are an efficient way to serialize numbers that are usually small, e.g., +numbers up to 127 only take one byte, two bytes up to 16,383, etc. + +ProfileBufferBlockIndex +======================= + +`A ProfileBufferBlockIndex object <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferBlockIndex>`_ +encapsulates a block index that is known to be the valid start of a block. It is +created when a block is reserved, or when trusted code computes the start of a +block in a chunk. + +The more generic +`ProfileBufferIndex <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferIndex>`_ +type is used when working inside blocks. + +ProfileBufferChunk +================== + +`A ProfileBufferChunk <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferChunk>`_ +is a variable-sized object. It contains: + +* A public copyable header, itself containing: + + * The local offset to the first full block (a chunk may start with the end of + a block that was started at the end of the previous chunk). That offset in + the very first chunk is the natural start to read all the data in the + buffer. + * The local offset past the last reserved block. This is where the next block + should be reserved, unless it points past the end of this chunk size. + * The timestamp when the chunk was first used. + * The timestamp when the chunk became full. + * The number of bytes that may be stored in this chunk. + * The number of reserved blocks. + * The global index where this chunk starts. + * The process ID writing into this chunk. + +* An owning unique pointer to the next chunk. It may be null for the last chunk + in a chain. + +* In ``DEBUG`` builds, a state variable, which is used to ensure that the chunk + goes through a known sequence of states (e.g., Created, then InUse, then + Done, etc.) See the sequence diagram + `where the member variable is defined <https://searchfox.org/mozilla-central/search?q=symbol:F_%3CT_mozilla%3A%3AProfileBufferChunk%3A%3AInternalHeader%3E_mState>`_. + +* The actual buffer data. + +Because a ProfileBufferChunk is variable-size, it must be created through its +static ``Create`` function, which takes care of allocating the correct amount +of bytes, at the correct alignment. + +Chunk Managers +============== + +ProfilerBufferChunkManager +-------------------------- + +`The ProfileBufferChunkManager abstract class <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferChunkManager>`_ +defines the interface of classes that manage chunks. + +Concrete implementations are responsible for: +* Creating chunks for their user, with a mechanism to pre-allocate chunks before they are actually needed. +* Taking back and owning chunks when they are "released" (usually when full). +* Automatically destroying or recycling the oldest released chunks. +* Giving temporary access to extant released chunks. + +ProfileBufferChunkManagerSingle +------------------------------- + +`A ProfileBufferChunkManagerSingle object <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferChunkManagerSingle>`_ +manages a single chunk. + +That chunk is always the same, it is never destroyed. The user may use it and +optionally release it. The manager can then be reset, and that one chunk will +be available again for use. + +A request for a second chunk would always fail. + +This manager is short-lived and not thread-safe. It is useful when there is some +limited data that needs to be captured without blocking the global profiling +buffer, usually one stack sample. This data may then be extracted and quickly +added to the global buffer. + +ProfileBufferChunkManagerWithLocalLimit +--------------------------------------- + +`A ProfileBufferChunkManagerWithLocalLimit object <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferChunkManagerSingle>`_ +implements the ``ProfileBufferChunkManager`` interface fully, managing a number +of chunks, and making sure their total combined size stays under a given limit. +This is the main chunk manager user during a profiling session. + +Note: It also implements the ``ProfileBufferControlledChunkManager`` interface, +this is explained in the later section "Multi-Process Profiling". + +It is thread-safe, and one instance is shared by both Profilers. + +ProfileChunkedBuffer +==================== + +`A ProfileChunkedBuffer object <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileChunkedBuffer>`_ +uses a ``ProfilerBufferChunkManager`` to store data, and handles the different +C++ types of data that the Profilers want to read/write as entries in buffer +chunks. + +Its main function is ``ReserveAndPut``: + +* It takes an invocable object (like a lambda) that should return the size of + the entry to store, this is to potentially avoid costly operations just to + compute a size, when the profiler may not be running. +* It attempts to reserve the space in its chunks, requesting a new chunk if + necessary. +* It then calls a provided invocable object with a + `ProfileBufferEntryWriter <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferEntryWriter>`_, + which offers a range of functions to help serialize C++ objects. The + de/serialization functions are found in specializations of + `ProfileBufferEntryWriter::Serializer <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferEntryWriter%3A%3ASerializer>`_ + and + `ProfileBufferEntryReader::Deserializer <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferEntryReader%3A%3ADeserializer>`_. + +More "put" functions use ``ReserveAndPut`` to more easily serialize blocks of +memory, or C++ objects. + +``ProfileChunkedBuffer`` is optionally thread-safe, using a +``BaseProfilerMaybeMutex``. + +WIP note: Using a mutex makes this storage too noisy for profiling some +real-time (like audio processing). +`Bug 1697953 <https://bugzilla.mozilla.org/show_bug.cgi?id=1697953>`_ will look +at switching to using atomic variables instead. +An alternative would be to use a totally separate non-thread-safe buffers for +each real-time thread that requires it (see +`bug 1754889 <https://bugzilla.mozilla.org/show_bug.cgi?id=1754889>`_). + +ProfileBuffer +============= + +`A ProfileBuffer object <https://searchfox.org/mozilla-central/search?q=symbol:T_ProfileBuffer>`_ +uses a ``ProfileChunkedBuffer`` to store data, and handles the different kinds +of entries that the Profilers want to read/write. + +Each entry starts with a tag identifying a kind. These kinds can be found in +`ProfileBufferEntryKinds.h <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/public/ProfileBufferEntryKinds.h>`_. + +There are "legacy" kinds, which are small fixed-length entries, such as: +Categories, labels, frame information, counters, etc. These can be stored in +`ProfileBufferEntry objects <https://searchfox.org/mozilla-central/search?q=symbol:T_ProfileBufferEntry>`_ + +And there are "modern" kinds, which have variable sizes, such as: Markers, CPU +running times, full stacks, etc. These are more directly handled by code that +can access the underlying ``ProfileChunkedBuffer``. + +The other major responsibility of a ``ProfileChunkedBuffer`` is to read back all +this data, sometimes during profiling (e.g., to duplicate a stack), but mainly +at the end of a session when generating the output JSON profile. + +***************** +Periodic Sampling +***************** + +Probably the most important job of the Profiler is to sample stacks of a number +of running threads, to help developers know which functions get used a lot when +performing some operation on Firefox. + +This is accomplished from a special thread, which regularly springs into action +and captures all this data. + +SamplerThread +============= + +`The SamplerThread object <https://searchfox.org/mozilla-central/search?q=symbol:T_SamplerThread>`_ +manages the information needed during sampling. It is created when the profiler +starts, and is stored inside the ``ActivePS``, see above for details. + +It includes: + +* A ``Sampler`` object that contains platform-specific details, which are + implemented in separate files like platform-win32.cpp, etc. +* The same generation index as its owning ``ActivePS``. +* The requested interval between samples. +* A handle to the thread where the sampling happens, its main function is + `Run() function <https://searchfox.org/mozilla-central/search?q=symbol:_ZN13SamplerThread3RunEv>`_. +* A list of callbacks to invoke after the next sampling. These may be used by + tests to wait for sampling to actually happen. +* The unregistered-thread-spy data, and an optional handle on another thread + that takes care of "spying" on unregistered thread (on platforms where that + operation is too expensive to run directly on the sampling thread). + +The ``Run()`` function takes care of performing the periodic sampling work: +(more details in the following sections) + +* Retrieve the sampling parameters. +* Instantiate a ``ProfileBuffer`` on the stack, to capture samples from other threads. +* Loop until a ``break``: + + * Lock the main profiler mutex, and do: + + * Check if sampling should stop, and break from the loop. + * Clean-up exit profiles (these are profiles sent from dying sub-processes, + and are kept for as long as they overlap with this process' own buffer range). + * Record the CPU utilization of the whole process. + * Record the power consumption. + * Sample each registered counter, including the memory counter. + * For each registered thread to be profiled: + + * Record the CPU utilization. + * If the thread is marked as "still sleeping", record a "same as before" + sample, otherwise suspend the thread and take a full stack sample. + * On some threads, record the event delay to compute the + (un)responsiveness. WIP note: This implementation may change. + + * Record profiling overhead durations. + + * Unlock the main profiler mutex. + * Invoke registered post-sampling callbacks. + * Spy on unregistered threads. + * Based on the requested sampling interval, and how much time this loop took, + compute when the next sampling loop should start, and make the thread sleep + for the appropriate amount of time. The goal is to be as regular as + possible, but if some/all loops take too much time, don't try too hard to + catch up, because the system is probably under stress already. + * Go back to the top of the loop. + +* If we're here, we hit a loop ``break`` above. +* Invoke registered post-sampling callbacks, to let them know that sampling + stopped. + +CPU Utilization +=============== + +CPU Utilization is stored as a number of milliseconds that a thread or process +has spent running on the CPU since the previous sampling. + +Implementations are platform-dependent, and can be found in +`the GetThreadRunningTimesDiff function <https://searchfox.org/mozilla-central/search?q=symbol:_ZL25GetThreadRunningTimesDiffRK10PSAutoLockRN7mozilla8profiler45ThreadRegistrationUnlockedRWForLockedProfilerE>`_ +and +`the GetProcessRunningTimesDiff function <https://searchfox.org/mozilla-central/search?q=symbol:_ZL26GetProcessRunningTimesDiffRK10PSAutoLockR12RunningTimes>`_. + +Power Consumption +================= + +Energy probes added in 2022. + +Stacks +====== + +Stacks are the sequence of calls going from the entry point in the program +(generally ``main()`` and some OS-specific functions above), down to the +function where code is currently being executed. + +Native Frames +------------- + +Compiled code, from C++ and Rust source. + +Label Frames +------------ + +Pseudo-frames with arbitrary text, added from any language, mostly C++. + +JS, Wasm Frames +--------------- + +Frames corresponding to JavaScript functions. + +Java Frames +----------- + +Recorded by the JavaSampler. + +Stack Merging +------------- + +The above types of frames are all captured in different ways, and when finally +taking an actual stack sample (apart from Java), they get merged into one stack. + +All frames have an associated address in the call stack, and can therefore be +merged mostly by ordering them by this stack address. See +`MergeStacks <https://searchfox.org/mozilla-central/search?q=symbol:_ZL11MergeStacksjbRKN7mozilla8profiler51ThreadRegistrationUnlockedReaderAndAtomicRWOnThreadERK9RegistersRK11NativeStackR22ProfilerStackCollectorPN2JS22ProfilingFrameIterator5FrameEj>`_ +for the implementation details. + +Counters +======== + +Counters are a special kind of probe, which can be continuously updated during +profiling, and the ``SamplerThread`` will sample their value at every loop. + +Memory Counter +-------------- + +This is the main counter. During a profiling session, hooks into the memory +manager keep track of each de/allocation, so at each sampling we know how many +operations were performed, and what is the current memory usage compared to the +previous sampling. + +Profiling Overhead +================== + +The ``SamplerThread`` records timestamps between parts of its sampling loop, and +records this as the sampling overhead. This may be useful to determine if the +profiler itself may have used too much of the computer resources, which could +skew the profile and give wrong impressions. + +Unregistered Thread Profiling +============================= + +At some intervals (not necessarily every sampling loop, depending on the OS), +the profiler may attempt to find unregistered threads, and record some +information about them. + +WIP note: This feature is experimental, and data is captured in markers on the +main thread. More work is needed to put this data in tracks like regular +registered threads, and capture more data like stack samples and markers. + +******* +Markers +******* + +Markers are events with a precise timestamp or time range, they have a name, a +category, options (out of a few choices), and optional marker-type-specific +payload data. + +Before describing the implementation, it is useful to be familiar with how +markers are natively added from C++, because this drives how the implementation +takes all this information and eventually outputs it in the final JSON profile. + +Adding Markers from C++ +======================= + +See https://firefox-source-docs.mozilla.org/tools/profiler/markers-guide.html + +Implementation +============== + +The main function that records markers is +`profiler_add_marker <https://searchfox.org/mozilla-central/search?q=symbol:_Z19profiler_add_markerRKN7mozilla18ProfilerStringViewIcEERKNS_14MarkerCategoryEONS_13MarkerOptionsET_DpRKT0_>`_. +It's a variadic templated function that takes the different the expected +arguments, first checks if the marker should actually be recorded (the profiler +should be running, and the target thread should be profiled), and then calls +into the deeper implementation function ``AddMarkerToBuffer`` with a reference +to the main profiler buffer. + +`AddMarkerToBuffer <https://searchfox.org/mozilla-central/search?q=symbol:_Z17AddMarkerToBufferRN7mozilla20ProfileChunkedBufferERKNS_18ProfilerStringViewIcEERKNS_14MarkerCategoryEONS_13MarkerOptionsET_DpRKT0_>`_ +takes the marker type as an object, removes it from the function parameter list, +and calls the next function with the marker type as an explicit template +parameter, and also a pointer to the function that can capture the stack +(because it is different between Base and Gecko Profilers, in particular the +latter one knows about JS). + +From here, we enter the land of +`BaseProfilerMarkersDetail.h <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/public/BaseProfilerMarkersDetail.h>`_, +which employs some heavy template techniques, in order to most efficiently +serialize the given marker payload arguments, in order to make them +deserializable when outputting the final JSON. In previous implementations, for +each new marker type, a new C++ class derived from a payload abstract class was +required, that had to implement all the constructors and virtual functions to: + +* Create the payload object. +* Serialize the payload into the profile buffer. +* Deserialize from the profile buffer to a new payload object. +* Convert the payload into the final output JSON. + +Now, the templated functions automatically take care of serializing all given +function call arguments directly (instead of storing them somewhere first), and +preparing a deserialization function that will recreate them on the stack and +directly call the user-provided JSONification function with these arguments. + +Continuing from the public ``AddMarkerToBuffer``, +`mozilla::base_profiler_markers_detail::AddMarkerToBuffer <https://searchfox.org/mozilla-central/search?q=symbol:_ZN7mozilla28base_profiler_markers_detail17AddMarkerToBufferERNS_20ProfileChunkedBufferERKNS_18ProfilerStringViewIcEERKNS_14MarkerCategoryEONS_13MarkerOptionsEPFbS2_NS_19StackCaptureOptionsEEDpRKT0_>`_ +sets some defaults if not specified by the caller: Target the current thread, +use the current time. + +Then if a stack capture was requested, attempt to do it in +the most efficient way, using a pre-allocated buffer if possible. + +WIP note: This potential allocation should be avoided in time-critical thread. +There is already a buffer for the main thread (because it's the busiest thread), +but there could be more pre-allocated threads, for specific real-time thread +that need it, or picked from a pool of pre-allocated buffers. See +`bug 1578792 <https://bugzilla.mozilla.org/show_bug.cgi?id=1578792>`_. + +From there, `AddMarkerWithOptionalStackToBuffer <https://searchfox.org/mozilla-central/search?q=AddMarkerWithOptionalStackToBuffer>`_ +handles ``NoPayload`` markers (usually added with ``PROFILER_MARKER_UNTYPED``) +in a special way, mostly to avoid the extra work associated with handling +payloads. Otherwise it continues with the following function. + +`MarkerTypeSerialization<MarkerType>::Serialize <symbol:_ZN7mozilla28base_profiler_markers_detail23MarkerTypeSerialization9SerializeERNS_20ProfileChunkedBufferERKNS_18ProfilerStringViewIcEERKNS_14MarkerCategoryEONS_13MarkerOptionsEDpRKTL0__>`_ +retrieves the deserialization tag associated with the marker type. If it's the +first time this marker type is used, +`Streaming::TagForMarkerTypeFunctions <symbol:_ZN7mozilla28base_profiler_markers_detail9Streaming25TagForMarkerTypeFunctionsEPFvRNS_24ProfileBufferEntryReaderERNS_12baseprofiler20SpliceableJSONWriterEEPFNS_4SpanIKcLy18446744073709551615EEEvEPFNS_12MarkerSchemaEvE,_ZN7mozilla28base_profiler_markers_detail9Streaming25TagForMarkerTypeFunctionsEPFvRNS_24ProfileBufferEntryReaderERNS_12baseprofiler20SpliceableJSONWriterEEPFNS_4SpanIKcLm18446744073709551615EEEvEPFNS_12MarkerSchemaEvE,_ZN7mozilla28base_profiler_markers_detail9Streaming25TagForMarkerTypeFunctionsEPFvRNS_24ProfileBufferEntryReaderERNS_12baseprofiler20SpliceableJSONWriterEEPFNS_4SpanIKcLj4294967295EEEvEPFNS_12MarkerSchemaEvE>`_ +adds it to the global list (which stores some function pointers used during +deserialization). + +Then the main serialization happens in +`StreamFunctionTypeHelper<decltype(MarkerType::StreamJSONMarkerData)>::Serialize <symbol:_ZN7mozilla28base_profiler_markers_detail24StreamFunctionTypeHelperIFT_RNS_12baseprofiler20SpliceableJSONWriterEDpT0_EE9SerializeERNS_20ProfileChunkedBufferERKNS_18ProfilerStringViewIcEERKNS_14MarkerCategoryEONS_13MarkerOptionsEhDpRKS6_>`_. +Deconstructing this mouthful of an template: + +* ``MarkerType::StreamJSONMarkerData`` is the user-provided function that will + eventually produce the final JSON, but here it's only used to know the + parameter types that it expects. +* ``StreamFunctionTypeHelper`` takes that function prototype, and can extract + its argument by specializing on ```R(SpliceableJSONWriter&, As...)``, now + ``As...`` is a parameter pack matching the function parameters. +* Note that ``Serialize`` also takes a parameter pack, which contains all the + referenced arguments given to the top ``AddBufferToMarker`` call. These two + packs are supposed to match, at least the given arguments should be + convertible to the target pack parameter types. +* That specialization's ``Serialize`` function calls the buffer's ``PutObjects`` + variadic function to write all the marker data, that is: + + * The entry kind that must be at the beginning of every buffer entry, in this + case `ProfileBufferEntryKind::Marker <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/public/ProfileBufferEntryKinds.h#78>`_. + * The common marker data (options first, name, category, deserialization tag). + * Then all the marker-type-specific arguments. Note that the C++ types + are those extracted from the deserialization function, so we know that + whatever is serialized here can be later deserialized using those same + types. + +The deserialization side is described in the later section "JSON output of +Markers". + +Adding Markers from Rust +======================== + +See https://firefox-source-docs.mozilla.org/tools/profiler/instrumenting-rust.html#adding-markers + +Adding Markers from JS +====================== + +See https://firefox-source-docs.mozilla.org/tools/profiler/instrumenting-javascript.html + +Adding Markers from Java +======================== + +See https://searchfox.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/geckoview/ProfilerController.java + +************* +Profiling Log +************* + +During a profiling session, some profiler-related events may be recorded using +`ProfilingLog::Access <https://searchfox.org/mozilla-central/search?q=symbol:_ZN12ProfilingLog6AccessEOT_>`_. + +The resulting JSON object is added near the end of the process' JSON generation, +in a top-level property named "profilingLog". This object is free-form, and is +not intended to be displayed, or even read by most people. But it may include +interesting information for advanced users, or could be an early temporary +prototyping ground for new features. + +See "profileGatheringLog" for another log related to late events. + +WIP note: This was introduced shortly before this documentation, so at this time +it doesn't do much at all. + +*************** +Profile Capture +*************** + +Usually at the end of a profiling session, a profile is "captured", and either +saved to disk, or sent to the front-end https://profiler.firefox.com for +analysis. This section describes how the captured data is converted to the +Gecko Profiler JSON format. + +FailureLatch +============ + +`The FailureLatch interface <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AFailureLatch>`_ +is used during the JSON generation, in order to catch any unrecoverable error +(such as running Out Of Memory), to exit the process early, and to forward the +error to callers. + +There are two main implementations, suffixed "source" as they are the one source +of failure-handling, which is passed as ``FailureLatch&`` throughout the code: + +* `FailureLatchInfallibleSource <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AFailureLatchInfallibleSource>`_ + is an "infallible" latch, meaning that it doesn't expect any failure. So if + a failure actually happened, the program would immediately terminate! (This + was the default behavior prior to introducing these latches.) +* `FailureLatchSource <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AFailureLatchSource>`_ + is a "fallible" latch, it will record the first failure that happens, and + "latch" into the failure state. The code should regularly examine this state, + and return early when possible. Eventually this failure state may be exposed + to end users. + +ProgressLogger, ProportionValue +=============================== + +`A ProgressLogger object <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProgressLogger>`_ +is used to track the progress of a long operation, in this case the JSON +generation process. + +To match how the JSON generation code works (as a tree of C++ functions calls), +each ``ProgressLogger`` in a function usually records progress from 0 to 100% +locally inside that function. If that function calls a sub-function, it gives it +a sub-logger, which in the caller function is set to represent a local sub-range +(like 20% to 40%), but to the called function it will look like its own local +``ProgressLogger`` that goes from 0 to 100%. The very top ``ProgressLogger`` +converts the deepest local progress value to the corresponding global progress. + +Progress values are recorded in +`ProportionValue objects <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProportionValue>`_, +which effectively record fractional value with no loss of precision. + +This progress is most useful when the parent process is waiting for child +processes to do their work, to make sure progress does happen, otherwise to stop +waiting for frozen processes. More about that in the "Multi-Process Profiling" +section below. + +JSONWriter +========== + +`A JSONWriter object <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AJSONWriter>`_ +offers a simple way to create a JSON stream (start/end collections, add +elements, etc.), and calls back into a provided +`JSONWriteFunc interface <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AJSONWriteFunc>`_ +to output characters. + +While these classes live outside of the Profiler directories, it may sometimes be +worth maintaining and/or modifying them to better serve the Profiler's needs. +But there are other users, so be careful not to break other things! + +SpliceableJSONWriter and SpliceableChunkedJSONWriter +==================================================== + +Because the Profiler deals with large amounts of data (big profiles can take +tens to hundreds of megabytes!), some specialized wrappers add better handling +of these large JSON streams. + +`SpliceableJSONWriter <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3Abaseprofiler%3A%3ASpliceableJSONWriter>`_ +is a subclass of ``JSONWriter``, and allows the "splicing" of JSON strings, +i.e., being able to take a whole well-formed JSON string, and directly inserting +it as a JSON object in the target JSON being streamed. + +It also offers some functions that are often useful for the Profiler, such as: +* Converting a timestamp into a JSON object in the stream, taking care of keeping a nanosecond precision, without unwanted zeroes or nines at the end. +* Adding a number of null elements. +* Adding a unique string index, and add that string to a provided unique-string list if necessary. (More about UniqueStrings below.) + +`SpliceableChunkedJSONWriter <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3Abaseprofiler%3A%3ASpliceableChunkedJSONWriter>`_ +is a subclass of ``SpliceableJSONWriter``. Its main attribute is that it provides its own writer +(`ChunkedJSONWriteFunc <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3Abaseprofiler%3A%3AChunkedJSONWriteFunc>`_), +which stores the stream as a sequence of "chunks" (heap-allocated buffers). +It starts with a chunk of a default size, and writes incoming data into it, +later allocating more chunks as needed. This avoids having massive buffers being +resized all the time. + +It also offers the same splicing abilities as its parent class, but in case an +incoming JSON string comes from another ``SpliceableChunkedJSONWriter``, it's +able to just steal the chunks and add them to its list, thereby avoiding +expensive allocations and copies and destructions. + +UniqueStrings +============= + +Because a lot of strings would be repeated in profiles (e.g., frequent marker +names), such strings are stored in a separate JSON array of strings, and an +index into this list is used instead of that full string object. + +Note that these unique-string indices are currently only located in specific +spots in the JSON tree, they cannot be used just anywhere strings are accepted. + +`The UniqueJSONStrings class <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3Abaseprofiler%3A%3AUniqueJSONStrings>`_ +stores this list of unique strings in a ``SpliceableChunkedJSONWriter``. +Given a string, it takes care of storing it if encountered for the first time, +and inserts the index into a target ``SpliceableJSONWriter``. + +JSON Generation +=============== + +The "Gecko Profile Format" can be found at +https://github.com/firefox-devtools/profiler/blob/main/docs-developer/gecko-profile-format.md . + +The implementation in the back-end is +`locked_profiler_stream_json_for_this_process <https://searchfox.org/mozilla-central/search?q=locked_profiler_stream_json_for_this_process>`_. +It outputs each JSON top-level JSON object, mostly in sequence. See the code for +how each object is output. Note that there is special handling for samples and +markers, as explained in the following section. + +ProcessStreamingContext and ThreadStreamingContext +-------------------------------------------------- + +In JSON profiles, samples and markers are separated by thread and by +samples/markers. Because there are potentially tens to a hundred threads, it +would be very costly to read the full profile buffer once for each of these +groups. So instead the buffer is read once, and all samples and markers are +handled as they are read, and their JSON output is sent to separate JSON +writers. + +`A ProcessStreamingContext object <https://searchfox.org/mozilla-central/search?q=symbol:T_ProcessStreamingContext>`_ +contains all the information to facilitate this output, including a list of +`ThreadStreamingContext's <https://searchfox.org/mozilla-central/search?q=symbol:T_ThreadStreamingContext>`_, +which each contain one ``SpliceableChunkedJSONWriter`` for the samples, and one +for the markers in this thread. + +When reading entries from the profile buffer, samples and markers are found by +their ``ProfileBufferEntryKind``, and as part of deserializing either kind (more +about each below), the thread ID is read, and determines which +``ThreadStreamingContext`` will receive the JSON output. + +At the end of this process, all ``SpliceableChunkedJSONWriters`` are efficiently +spliced (mainly a pointer move) into the final JSON output. + +JSON output of Samples +---------------------- + +This work is done in +`ProfileBuffer::DoStreamSamplesAndMarkersToJSON <https://searchfox.org/mozilla-central/search?q=DoStreamSamplesAndMarkersToJSON>`_. + +From the main ``ProfileChunkedBuffer``, each entry is visited, its +``ProfileBufferEntryKind`` is read first, and for samples all frames from +captured stack are converted to the appropriate JSON. + +`A UniqueStacks object <https://searchfox.org/mozilla-central/search?q=symbol:T_UniqueStacks>`_ +is used to de-duplicate frames and even sub-stacks: + +* Each unique frame string is written into a JSON array inside a + ``SpliceableChunkedJSONWriter``, and its index is the frame identifier. +* Each stack level is also de-duplicated, and identifies the associated frame + string, and points at the calling stack level (i.e., closer to the root). +* Finally, the identifier for the top of the stack is stored, along with a + timestamp (and potentially some more information) as the sample. + +For example, if we have collected the following samples: + +#. A -> B -> C +#. A -> B +#. A -> B -> D + +The frame table would contain each frame name, something like: +``["A", "B", "C", "D"]``. So the frame containing "A" has index 0, "B" is at 1, +etc. + +The stack table would contain each stack level, something like: +``[[0, null], [1, 0], [2, 1], [3, 1]]``. ``[0, null]`` means the frame is 0 +("A"), and it has no caller, it's the root frame. ``[1, 0]`` means the frame is +1 ("B"), and its caller is stack 0, which is just the previous one in this +example. + +And the three samples stored in the thread data would be therefore be: 2, 1, 3 +(E.g.: "2" points in the stack table at the frame [2,1] with "C", and from them +down to "B", then "A"). + +All this contains all the information needed to reconstruct all full stack +samples. + +JSON output of Markers +---------------------- + +This also happens +`inside ProfileBuffer::DoStreamSamplesAndMarkersToJSON <https://searchfox.org/mozilla-central/search?q=DoStreamSamplesAndMarkersToJSON>`_. + +When a ``ProfileBufferEntryKind::Marker`` is encountered, +`the DeserializeAfterKindAndStream function <https://searchfox.org/mozilla-central/search?q=DeserializeAfterKindAndStream>`_ +reads the ``MarkerOptions`` (stored as explained above), which include the +thread ID, identifying which ``ThreadStreamingContext``'s +``SpliceableChunkedJSONWriter`` to use. + +After that, the common marker data (timing, category, etc.) is output. + +Then the ``Streaming::DeserializerTag`` identifies which type of marker this is. +The special case of ``0`` (no payload) means nothing more is output. + +Otherwise some more common data is output as part of the payload if present, in +particular the "inner window id" (used to match markers with specific html +frames), and stack. + +WIP note: Some of these may move around in the future, see +`bug 1774326 <https://bugzilla.mozilla.org/show_bug.cgi?id=1774326>`_, +`bug 1774328 <https://bugzilla.mozilla.org/show_bug.cgi?id=1774328>`_, and +others. + +In case of a C++-written payload, the ``DeserializerTag`` identifies the +``MarkerDataDeserializer`` function to use. This is part of the heavy templated +code in BaseProfilerMarkersDetail.h, the function is defined as +`MarkerTypeSerialization<MarkerType>::Deserialize <https://searchfox.org/mozilla-central/search?q=symbol:_ZN7mozilla28base_profiler_markers_detail23MarkerTypeSerialization11DeserializeERNS_24ProfileBufferEntryReaderERNS_12baseprofiler20SpliceableJSONWriterE>`_, +which outputs the marker type name, and then each marker payload argument. The +latter is done by using the user-defined ``MarkerType::StreamJSONMarkerData`` +parameter list, and recursively deserializing each parameter from the profile +buffer into an on-stack variable of a corresponding type, at the end of which +``MarkerType::StreamJSONMarkerData`` can be called with all of these arguments +at it expects, and that function does the actual JSON streaming as the user +programmed. + +************* +Profiler Stop +************* + +See "Profiler Start" and do the reverse! + +There is some special handling of the ``SampleThread`` object, just to ensure +that it gets deleted outside of the main profiler mutex being locked, otherwise +this could result in a deadlock (because it needs to take the lock before being +able to check the state variable indicating that the sampling loop and thread +should end). + +***************** +Profiler Shutdown +***************** + +See "Profiler Initialization" and do the reverse! + +One additional action is handling the optional ``MOZ_PROFILER_SHUTDOWN`` +environment variable, to output a profile if the profiler was running. + +*********************** +Multi-Process Profiling +*********************** + +All of the above explanations focused on what the profiler is doing is each +process: Starting, running and collecting samples, markers, and more data, +outputting JSON profiles, and stopping. + +But Firefox is a multi-process program, since +`Electrolysis aka e10s <https://wiki.mozilla.org/Electrolysis>`_ introduce child +processes to handle web content and extensions, and especially since +`Fission <https://wiki.mozilla.org/Project_Fission>`_ forced even parts of the +same webpage to run in separate processes, mainly for added security. Since then +Firefox can spawn many processes, sometimes 10 to 20 when visiting busy sites. + +The following sections explains how profiling Firefox as a whole works. + +IPC (Inter-Process Communication) +================================= + +See https://firefox-source-docs.mozilla.org/ipc/. + +As a quick summary, some message-passing function-like declarations live in +`PProfiler.ipdl <https://searchfox.org/mozilla-central/source/tools/profiler/gecko/PProfiler.ipdl>`_, +and corresponding ``SendX`` and ``RecvX`` C++ functions are respectively +generated in +`PProfilerParent.h <https://searchfox.org/mozilla-central/source/__GENERATED__/ipc/ipdl/_ipdlheaders/mozilla/PProfilerParent.h>`_, +and virtually declared (for user implementation) in +`PProfilerChild.h <https://searchfox.org/mozilla-central/source/__GENERATED__/ipc/ipdl/_ipdlheaders/mozilla/PProfilerChild.h>`_. + +During Profiling +================ + +Exit profiles +------------- + +One IPC message that is not in PProfiler.ipdl, is +`ShutdownProfile <https://searchfox.org/mozilla-central/search?q=ShutdownProfile%28&path=&case=false®exp=false>`_ +in +`PContent.ipdl <https://searchfox.org/mozilla-central/source/dom/ipc/PContent.ipdl>`_. + +It's called from +`ContentChild::ShutdownInternal <https://searchfox.org/mozilla-central/search?q=symbol:_ZN7mozilla3dom12ContentChild16ShutdownInternalEv>`_, +just before a child process ends, and if the profiler was running, to ensure +that the profile data is collected and sent to the parent, for storage in its +``ActivePS``. + +See +`ActivePS::AddExitProfile <https://searchfox.org/mozilla-central/search?q=symbol:_ZN8ActivePS14AddExitProfileERK10PSAutoLockRK12nsTSubstringIcE>`_ +for details. Note that the current "buffer position at gathering time" (which is +effectively the largest ``ProfileBufferBlockIndex`` that is present in the +global profile buffer) is recorded. Later, +`ClearExpiredExitProfiles <https://searchfox.org/mozilla-central/search?q=ClearExpiredExitProfiles>`_ +looks at the **smallest** ``ProfileBufferBlockIndex`` still present in the +buffer (because early chunks may have been discarded to limit memory usage), and +discards exit profiles that were recorded before, because their data is now +older than anything stored in the parent. + +Profile Buffer Global Memory Control +------------------------------------ + +Each process runs its own profiler, with each its own profile chunked buffer. To +keep the overall memory usage of all these buffers under the user-picked limit, +processes work together, with the parent process overseeing things. + +Diagram showing the relevant classes, see details in the following sub-sections: + +.. image:: fissionprofiler-20200424.png + +ProfileBufferControlledChunkManager +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +`The ProfileBufferControlledChunkManager interface <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferControlledChunkManager>`_ +allows a controller to get notified about all chunk updates, and to force the +destruction/recycling of old chunks. +`The ProfileBufferChunkManagerWithLocalLimit class <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferChunkManagerWithLocalLimit>`_ +implements it. + +`An Update object <https://searchfox.org/mozilla-central/search?q=symbol:T_mozilla%3A%3AProfileBufferControlledChunkManager%3A%3AUpdate>`_ +contains all information related to chunk changes: How much memory is currently +used by the local chunk manager, how much has been "released" (and therefore +could be destroyed/recycled), and a list of all chunks that were released since +the previous update; it also has a special state meaning that the child is +shutting down so there won't be updates anymore. An ``Update`` may be "folded" +into a previous one, to create a combined update equivalent to the two separate +ones one after the other. + +Update Handling in the ProfilerChild +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When the profiler starts in a child process, the ``ProfilerChild`` +`starts to listen for updates <https://searchfox.org/mozilla-central/search?q=symbol:_ZN7mozilla13ProfilerChild17SetupChunkManagerEv>`_. + +These updates are stored and folded into previous ones (if any). At some point, +`an AwaitNextChunkManagerUpdate message <https://searchfox.org/mozilla-central/search?q=RecvAwaitNextChunkManagerUpdate>`_ +will be received, and any update can be forwarded to the parent. The local +update is cleared, ready to store future updates. + +Update Handling in the ProfilerParent +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When the profiler starts AND when there are child processes, the +`ProfilerParent's ProfilerParentTracker <https://searchfox.org/mozilla-central/search?q=ProfilerParentTracker>`_ +creates +`a ProfileBufferGlobalController <https://searchfox.org/mozilla-central/search?q=ProfileBufferGlobalController>`_, +which starts to listen for updates from the local chunk manager. + +The ``ProfilerParentTracker`` is also responsible for keeping track of child +processes, and to regularly +`send them AwaitNextChunkManagerUpdate messages <https://searchfox.org/mozilla-central/search?q=SendAwaitNextChunkManagerUpdate>`_, +that the child's ``ProfilerChild`` answers to with updates. The update may +indicate that the child is shutting down, in which case the tracker will stop +tracking it. + +All these updates (from the local chunk manager, and from child processes' own +chunk managers) are processed in +`ProfileBufferGlobalController::HandleChunkManagerNonFinalUpdate <https://searchfox.org/mozilla-central/search?q=HandleChunkManagerNonFinalUpdate>`_. +Based on this stream of updates, it is possible to calculate the total memory +used by all profile buffers in all processes, and to keep track of all chunks +that have been "released" (i.e., are full, and can be destroyed). When the total +memory usage reaches the user-selected limit, the controller can lookup the +oldest chunk, and get it destroyed (either a local call for parent chunks, or by +sending +`a DestroyReleasedChunksAtOrBefore message <https://searchfox.org/mozilla-central/search?q=DestroyReleasedChunksAtOrBefore>`_ +to the owning child). + +Historical note: Prior to Fission, the Profiler used to keep one fixed-size +circular buffer in each process, but as Fission made the possible number of +processes unlimited, the memory consumption grew too fast, and required the +implementation of the above system. But there may still be mentions of +"circular buffers" in the code or documents; these have effectively been +replaced by chunked buffers, with centralized chunk control. + +Gathering Child Profiles +======================== + +When it's time to capture a full profile, the parent process performs its own +JSON generation (as described above), and sends +`a GatherProfile message <https://searchfox.org/mozilla-central/search?q=GatherProfile%28>`_ +to all child processes, which will make them generate their JSON profile and +send it back to the parent. + +All child profiles, including the exit profiles collected during profiling, are +stored as elements of a top-level array with property name "processes". + +During the gathering phase, while the parent is waiting for child responses, it +regularly sends +`GetGatherProfileProgress messages <https://searchfox.org/mozilla-central/search?q=GetGatherProfileProgress>`_ +to all child processes that have not sent their profile yet, and the parent +expects responses within a short timeframe. The response carries a progress +value. If at some point two messages went with no progress was made anywhere +(either there was no response, or the progress value didn't change), the parent +assumes that remaining child processes may be frozen indefinitely, stops the +gathering and considers the JSON generation complete. + +During all of the above work, events are logged (especially issues with child +processes), and are added at the end of the JSON profile, in a top-level object +with property name "profileGatheringLog". This object is free-form, and is not +intended to be displayed, or even read by most people. But it may include +interesting information for advanced users regarding the profile-gathering +phase. diff --git a/tools/profiler/docs/fissionprofiler-20200424.png b/tools/profiler/docs/fissionprofiler-20200424.png Binary files differnew file mode 100644 index 0000000000..1602877a5b --- /dev/null +++ b/tools/profiler/docs/fissionprofiler-20200424.png diff --git a/tools/profiler/docs/fissionprofiler.umlet.uxf b/tools/profiler/docs/fissionprofiler.umlet.uxf new file mode 100644 index 0000000000..3325294e25 --- /dev/null +++ b/tools/profiler/docs/fissionprofiler.umlet.uxf @@ -0,0 +1,546 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<diagram program="umlet" version="14.3.0">
+ <zoom_level>10</zoom_level>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>70</x>
+ <y>110</y>
+ <w>300</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>/PProfilerParent/
+bg=light_gray
+--
+*+SendAwaitNextChunkManagerUpdate()*
+*+SendDestroyReleasedChunksAtOrBefore()*</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>470</x>
+ <y>20</y>
+ <w>210</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>*ProfileBufferChunkMetadata*
+bg=light_gray
+--
++doneTimeStamp
++bufferBytes
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>780</x>
+ <y>110</y>
+ <w>330</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>/PProfilerChild/
+bg=light_gray
+--
+*/+RecvAwaitNextChunkManagerUpdate() = 0/*
+*/+RecvDestroyReleasedChunksAtOrBefore() = 0/*
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>110</x>
+ <y>260</y>
+ <w>220</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>ProfilerParent
+--
+*-processId*
+--
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>210</x>
+ <y>170</y>
+ <w>30</w>
+ <h>110</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;90.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>740</x>
+ <y>250</y>
+ <w>410</w>
+ <h>90</h>
+ </coordinates>
+ <panel_attributes>ProfilerChild
+--
+-UpdateStorage: unreleased bytes, released: {pid, rangeStart[ ]}
+--
+*+RecvAwaitNextChunkUpdate()*
+*+RecvDestroyReleasedChunksAtOrBefore()*
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>930</x>
+ <y>170</y>
+ <w>30</w>
+ <h>100</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;80.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>110</x>
+ <y>400</y>
+ <w>220</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>ProfilerParentTracker
+--
+_+Enumerate()_
+_*+ForChild()*_</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>210</x>
+ <y>320</y>
+ <w>190</w>
+ <h>100</h>
+ </coordinates>
+ <panel_attributes>lt=<-
+m1=0..n
+nsTArray<ProfilerParent*></panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;80.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>80</x>
+ <y>1070</y>
+ <w>150</w>
+ <h>30</h>
+ </coordinates>
+ <panel_attributes>ProfileBufferChunk</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>380</x>
+ <y>1070</y>
+ <w>210</w>
+ <h>30</h>
+ </coordinates>
+ <panel_attributes>/ProfileBufferChunkManager/</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>180</x>
+ <y>900</y>
+ <w>700</w>
+ <h>50</h>
+ </coordinates>
+ <panel_attributes>ProfileBufferChunkManagerWithLocalLimit
+--
+-mUpdateCallback</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>480</x>
+ <y>940</y>
+ <w>30</w>
+ <h>150</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;130.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>380</x>
+ <y>1200</y>
+ <w>210</w>
+ <h>30</h>
+ </coordinates>
+ <panel_attributes>ProfileChunkedBuffer</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>410</x>
+ <y>1090</y>
+ <w>140</w>
+ <h>130</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+mChunkManager</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;110.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>960</x>
+ <y>1200</y>
+ <w>100</w>
+ <h>30</h>
+ </coordinates>
+ <panel_attributes>CorePS</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>960</x>
+ <y>1040</y>
+ <w>100</w>
+ <h>30</h>
+ </coordinates>
+ <panel_attributes>ActivePS</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>580</x>
+ <y>1200</y>
+ <w>400</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+mCoreBuffer</panel_attributes>
+ <additional_attributes>10.0;20.0;380.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>870</x>
+ <y>940</y>
+ <w>250</w>
+ <h>120</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+mProfileBufferChunkManager</panel_attributes>
+ <additional_attributes>10.0;10.0;90.0;100.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>830</x>
+ <y>1140</y>
+ <w>100</w>
+ <h>30</h>
+ </coordinates>
+ <panel_attributes>ProfileBuffer</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>920</x>
+ <y>1060</y>
+ <w>130</w>
+ <h>110</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+mProfileBuffer</panel_attributes>
+ <additional_attributes>10.0;90.0;40.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>580</x>
+ <y>1160</y>
+ <w>270</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+mEntries</panel_attributes>
+ <additional_attributes>10.0;50.0;250.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>90</x>
+ <y>1090</y>
+ <w>310</w>
+ <h>150</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+m1=0..1
+mCurrentChunk: UniquePtr<></panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;130.0;290.0;130.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>210</x>
+ <y>1080</y>
+ <w>200</w>
+ <h>150</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+m1=0..N
+mNextChunks: UniquePtr<></panel_attributes>
+ <additional_attributes>20.0;10.0;170.0;130.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>200</x>
+ <y>940</y>
+ <w>230</w>
+ <h>150</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+m1=0..N
+mReleasedChunks: UniquePtr<></panel_attributes>
+ <additional_attributes>10.0;130.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>530</x>
+ <y>1090</y>
+ <w>270</w>
+ <h>130</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+mOwnedChunkManager: UniquePtr<></panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;110.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>480</x>
+ <y>390</y>
+ <w>550</w>
+ <h>150</h>
+ </coordinates>
+ <panel_attributes>*ProfileBufferGlobalController*
+--
+-mMaximumBytes
+-mCurrentUnreleasedBytesTotal
+-mCurrentUnreleasedBytes: {pid, unreleased bytes}[ ] sorted by pid
+-mCurrentReleasedBytes
+-mReleasedChunks: {doneTimeStamp, bytes, pid}[ ] sorted by timestamp
+-mDestructionCallback: function<void(pid, rangeStart)>
+--
++Update(pid, unreleased bytes, released: ProfileBufferChunkMetadata[ ])</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>320</x>
+ <y>420</y>
+ <w>180</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+mController</panel_attributes>
+ <additional_attributes>160.0;20.0;10.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>20</x>
+ <y>400</y>
+ <w>110</w>
+ <h>80</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+_sInstance_</panel_attributes>
+ <additional_attributes>90.0;60.0;10.0;60.0;10.0;10.0;90.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLNote</id>
+ <coordinates>
+ <x>480</x>
+ <y>250</y>
+ <w>220</w>
+ <h>120</h>
+ </coordinates>
+ <panel_attributes>The controller is only needed
+if there *are* child processes,
+so we can create it with the first
+child (at which point the tracker
+can register itself with the local
+profiler), and destroyed with the
+last child.
+bg=blue</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>690</x>
+ <y>330</y>
+ <w>100</w>
+ <h>80</h>
+ </coordinates>
+ <panel_attributes/>
+ <additional_attributes>10.0;10.0;80.0;60.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>130</x>
+ <y>460</y>
+ <w>200</w>
+ <h>380</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+mParentChunkManager</panel_attributes>
+ <additional_attributes>180.0;360.0;10.0;360.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>740</x>
+ <y>330</y>
+ <w>350</w>
+ <h>510</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+mLocalBufferChunkManager</panel_attributes>
+ <additional_attributes>10.0;490.0;330.0;490.0;330.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>470</x>
+ <y>650</y>
+ <w>400</w>
+ <h>100</h>
+ </coordinates>
+ <panel_attributes>*ProfileBufferControlledChunkManager::Update*
+--
+-mUnreleasedBytes
+-mReleasedBytes
+-mOldestDoneTimeStamp
+-mNewReleasedChunks: ChunkMetadata[ ]</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>470</x>
+ <y>560</y>
+ <w>400</w>
+ <h>60</h>
+ </coordinates>
+ <panel_attributes>*ProfileBufferControlledChunkManager::ChunkMetadata*
+--
+-mDoneTimeStamp
+-mBufferBytes</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>670</x>
+ <y>610</y>
+ <w>30</w>
+ <h>60</h>
+ </coordinates>
+ <panel_attributes>lt=<.</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;40.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>670</x>
+ <y>740</y>
+ <w>30</w>
+ <h>60</h>
+ </coordinates>
+ <panel_attributes>lt=<.</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;40.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>670</x>
+ <y>50</y>
+ <w>130</w>
+ <h>110</h>
+ </coordinates>
+ <panel_attributes>lt=<.</panel_attributes>
+ <additional_attributes>10.0;10.0;110.0;90.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>360</x>
+ <y>50</y>
+ <w>130</w>
+ <h>110</h>
+ </coordinates>
+ <panel_attributes>lt=<.</panel_attributes>
+ <additional_attributes>110.0;10.0;10.0;90.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>400</x>
+ <y>130</y>
+ <w>350</w>
+ <h>100</h>
+ </coordinates>
+ <panel_attributes>*ProfileBufferChunkManagerUpdate*
+bg=light_gray
+--
+-unreleasedBytes
+-releasedBytes
+-oldestDoneTimeStamp
+-newlyReleasedChunks: ProfileBufferChunkMetadata[ ]</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>310</x>
+ <y>780</y>
+ <w>440</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>*ProfileBufferControlledChunkManager*
+--
+*/+SetUpdateCallback(function<void(update: Update&&)>)/*
+*/+DestroyChunksAtOrBefore(timeStamp)/*</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>480</x>
+ <y>840</y>
+ <w>30</w>
+ <h>80</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;60.0</additional_attributes>
+ </element>
+</diagram>
diff --git a/tools/profiler/docs/index.rst b/tools/profiler/docs/index.rst new file mode 100644 index 0000000000..53920e7d2f --- /dev/null +++ b/tools/profiler/docs/index.rst @@ -0,0 +1,37 @@ +Gecko Profiler +============== + +The Firefox Profiler is the collection of tools used to profile Firefox. This is backed +by the Gecko Profiler, which is the primarily C++ component that instruments Gecko. It +is configurable, and supports a variety of data sources and recording modes. Primarily, +it is used as a statistical profiler, where the execution of threads that have been +registered with the profile is paused, and a sample is taken. Generally, this includes +a stackwalk with combined native stack frame, JavaScript stack frames, and custom stack +frame labels. + +In addition to the sampling, the profiler can collect markers, which are collected +deterministically (as opposed to statistically, like samples). These include some +kind of text description, and optionally a payload with more information. + +This documentation serves to document the Gecko Profiler and Base Profiler components, +while the profiler.firefox.com interface is documented at `profiler.firefox.com/docs/ <https://profiler.firefox.com/docs/>`_ + +.. toctree:: + :maxdepth: 1 + + code-overview + buffer + instrumenting-javascript + instrumenting-rust + markers-guide + memory + +The following areas still need documentation: + + * LUL + * Instrumenting Java + * Registering Threads + * Samples and Stack Walking + * Triggering Gecko Profiles in Automation + * JS Tracer + * Serialization diff --git a/tools/profiler/docs/instrumenting-javascript.rst b/tools/profiler/docs/instrumenting-javascript.rst new file mode 100644 index 0000000000..928d94781e --- /dev/null +++ b/tools/profiler/docs/instrumenting-javascript.rst @@ -0,0 +1,60 @@ +Instrumenting JavaScript +======================== + +There are multiple ways to use the profiler with JavaScript. There is the "JavaScript" +profiler feature (via about:profiling), which enables stack walking for JavaScript code. +This is most likely turned on already for every profiler preset. + +In addition, markers can be created to specifically marker an instant in time, or a +duration. This can be helpful to make sense of a particular piece of the front-end, +or record events that normally wouldn't show up in samples. + +.. note:: + This guide explains JavaScript markers in depth. To learn more about how to add a + marker in C++ or Rust, please take a look at their documentation + in :doc:`markers-guide` or :doc:`instrumenting-rust` respectively. + +Markers in Browser Chrome +************************* + +If you have access to ChromeUtils, then adding a marker is relatively easily. + +.. code-block:: javascript + + // Add an instant marker, representing a single point in time + ChromeUtils.addProfilerMarker("MarkerName"); + + // Add a duration marker, representing a span of time. + const startTime = Cu.now(); + doWork(); + ChromeUtils.addProfilerMarker("MarkerName", startTime); + + // Add a duration marker, representing a span of time, with some additional tex + const startTime = Cu.now(); + doWork(); + ChromeUtils.addProfilerMarker("MarkerName", startTime, "Details about this event"); + + // Add an instant marker, with some additional tex + const startTime = Cu.now(); + doWork(); + ChromeUtils.addProfilerMarker("MarkerName", undefined, "Details about this event"); + +Markers in Content Code +*********************** + +If instrumenting content code, then the `UserTiming`_ API is the best bet. +:code:`performance.mark` will create an instant marker, and a :code:`performance.measure` +will create a duration marker. These markers will show up under UserTiming in +the profiler UI. + +.. code-block:: javascript + + // Create an instant marker. + performance.mark("InstantMarkerName"); + + doWork(); + + // Measuring with the performance API will also create duration markers. + performance.measure("DurationMarkerName", "InstantMarkerName"); + +.. _UserTiming: https://developer.mozilla.org/en-US/docs/Web/API/User_Timing_API diff --git a/tools/profiler/docs/instrumenting-rust.rst b/tools/profiler/docs/instrumenting-rust.rst new file mode 100644 index 0000000000..c3e12f42dc --- /dev/null +++ b/tools/profiler/docs/instrumenting-rust.rst @@ -0,0 +1,433 @@ +Instrumenting Rust +================== + +There are multiple ways to use the profiler with Rust. Native stack sampling already +includes the Rust frames without special handling. There is the "Native Stacks" +profiler feature (via about:profiling), which enables stack walking for native code. +This is most likely turned on already for every profiler presets. + +In addition to that, there is a profiler Rust API to instrument the Rust code +and add more information to the profile data. There are three main functionalities +to use: + +1. Register Rust threads with the profiler, so the profiler can record these threads. +2. Add stack frame labels to annotate and categorize a part of the stack. +3. Add markers to specifically mark instants in time, or durations. This can be + helpful to make sense of a particular piece of the code, or record events that + normally wouldn't show up in samples. + +Crate to Include as a Dependency +-------------------------------- + +Profiler Rust API is located inside the ``gecko-profiler`` crate. This needs to +be included in the project dependencies before the following functionalities can +be used. + +To be able to include it, a new dependency entry needs to be added to the project's +``Cargo.toml`` file like this: + +.. code-block:: toml + + [dependencies] + gecko-profiler = { path = "../../tools/profiler/rust-api" } + +Note that the relative path needs to be updated depending on the project's location +in mozilla-central. + +Registering Threads +------------------- + +To be able to see the threads in the profile data, they need to be registered +with the profiler. Also, they need to be unregistered when they are exiting. +It's important to give a unique name to the thread, so they can be filtered easily. + +Registering and unregistering a thread is straightforward: + +.. code-block:: rust + + // Register it with a given name. + gecko_profiler::register_thread("Thread Name"); + // After doing some work, and right before exiting the thread, unregister it. + gecko_profiler::unregister_thread(); + +For example, here's how to register and unregister a simple thread: + +.. code-block:: rust + + let thread_name = "New Thread"; + std::thread::Builder::new() + .name(thread_name.into()) + .spawn(move || { + gecko_profiler::register_thread(thread_name); + // DO SOME WORK + gecko_profiler::unregister_thread(); + }) + .unwrap(); + +Or with a thread pool: + +.. code-block:: rust + + let worker = rayon::ThreadPoolBuilder::new() + .thread_name(move |idx| format!("Worker#{}", idx)) + .start_handler(move |idx| { + gecko_profiler::register_thread(&format!("Worker#{}", idx)); + }) + .exit_handler(|_idx| { + gecko_profiler::unregister_thread(); + }) + .build(); + +.. note:: + Registering a thread only will not make it appear in the profile data. In + addition, it needs to be added to the "Threads" filter in about:profiling. + This filter input is a comma-separated list. It matches partial names and + supports the wildcard ``*``. + +Adding Stack Frame Labels +------------------------- + +Stack frame labels are useful for annotating a part of the call stack with a +category. The category will appear in the various places on the Firefox Profiler +analysis page like timeline, call tree tab, flame graph tab, etc. + +``gecko_profiler_label!`` macro is used to add a new label frame. The added label +frame will exist between the call of this macro and the end of the current scope. + +Adding a stack frame label: + +.. code-block:: rust + + // Marking the stack as "Layout" category, no subcategory provided. + gecko_profiler_label!(Layout); + // Marking the stack as "JavaScript" category and "Parsing" subcategory. + gecko_profiler_label!(JavaScript, Parsing); + + // Or the entire function scope can be marked with a procedural macro. This is + // essentially a syntactical sugar and it expands into a function with a + // gecko_profiler_label! call at the very start: + #[gecko_profiler_fn_label(DOM)] + fn foo(bar: u32) -> u32 { + bar + } + +See the list of all profiling categories in the `profiling_categories.yaml`_ file. + +Adding Markers +-------------- + +Markers are packets of arbitrary data that are added to a profile by the Firefox code, +usually to indicate something important happening at a point in time, or during an interval of time. + +Each marker has a name, a category, some common optional information (timing, backtrace, etc.), +and an optional payload of a specific type (containing arbitrary data relevant to that type). + +.. note:: + This guide explains Rust markers in depth. To learn more about how to add a + marker in C++ or JavaScript, please take a look at their documentation + in :doc:`markers-guide` or :doc:`instrumenting-javascript` respectively. + +Examples +^^^^^^^^ + +Short examples, details are below. + +.. code-block:: rust + + // Record a simple marker with the category of Graphics, DisplayListBuilding. + gecko_profiler::add_untyped_marker( + // Name of the marker as a string. + "Marker Name", + // Category with an optional sub-category. + gecko_profiler_category!(Graphics, DisplayListBuilding), + // MarkerOptions that keeps options like marker timing and marker stack. + // It will be a point in type by default. + Default::default(), + ); + +.. code-block:: rust + + // Create a marker with some additional text information. + let info = "info about this marker"; + gecko_profiler::add_text_marker( + // Name of the marker as a string. + "Marker Name", + // Category with an optional sub-category. + gecko_profiler_category!(DOM), + // MarkerOptions that keeps options like marker timing and marker stack. + MarkerOptions { + timing: MarkerTiming::instant_now(), + ..Default::default() + }, + // Additional information as a string. + info, + ); + +.. code-block:: rust + + // Record a custom marker of type `ExampleNumberMarker` (see definition below). + gecko_profiler::add_marker( + // Name of the marker as a string. + "Marker Name", + // Category with an optional sub-category. + gecko_profiler_category!(Graphics, DisplayListBuilding), + // MarkerOptions that keeps options like marker timing and marker stack. + Default::default(), + // Marker payload. + ExampleNumberMarker { number: 5 }, + ); + + .... + + // Marker type definition. It needs to derive Serialize, Deserialize. + #[derive(Serialize, Deserialize, Debug)] + pub struct ExampleNumberMarker { + number: i32, + } + + // Marker payload needs to implement the ProfilerMarker trait. + impl gecko_profiler::ProfilerMarker for ExampleNumberMarker { + // Unique marker type name. + fn marker_type_name() -> &'static str { + "example number" + } + // Data specific to this marker type, serialized to JSON for profiler.firefox.com. + fn stream_json_marker_data(&self, json_writer: &mut gecko_profiler::JSONWriter) { + json_writer.int_property("number", self.number.into()); + } + // Where and how to display the marker and its data. + fn marker_type_display() -> gecko_profiler::MarkerSchema { + use gecko_profiler::marker::schema::*; + let mut schema = MarkerSchema::new(&[Location::MarkerChart]); + schema.set_chart_label("Name: {marker.name}"); + schema.add_key_label_format("number", "Number", Format::Integer); + schema + } + } + +Untyped Markers +^^^^^^^^^^^^^^^ + +Untyped markers don't carry any information apart from common marker data: +Name, category, options. + +.. code-block:: rust + + gecko_profiler::add_untyped_marker( + // Name of the marker as a string. + "Marker Name", + // Category with an optional sub-category. + gecko_profiler_category!(Graphics, DisplayListBuilding), + // MarkerOptions that keeps options like marker timing and marker stack. + MarkerOptions { + timing: MarkerTiming::instant_now(), + ..Default::default() + }, + ); + +1. Marker name + The first argument is the name of this marker. This will be displayed in most places + the marker is shown. It can be a literal string, or any dynamic string. +2. `Profiling category pair`_ + A category + subcategory pair from the `the list of categories`_. + ``gecko_profiler_category!`` macro should be used to create a profiling category + pair since it's easier to use, e.g. ``gecko_profiler_category!(JavaScript, Parsing)``. + Second parameter can be omitted to use the default subcategory directly. + ``gecko_profiler_category!`` macro is encouraged to use, but ``ProfilingCategoryPair`` + enum can also be used if needed. +3. `MarkerOptions`_ + See the options below. It can be omitted if there are no arguments with ``Default::default()``. + Some options can also be omitted, ``MarkerOptions {<options>, ..Default::default()}``, + with one or more of the following options types: + + * `MarkerTiming`_ + This specifies an instant or interval of time. It defaults to the current instant if + left unspecified. Otherwise use ``MarkerTiming::instant_at(ProfilerTime)`` or + ``MarkerTiming::interval(pt1, pt2)``; timestamps are usually captured with + ``ProfilerTime::Now()``. It is also possible to record only the start or the end of an + interval, pairs of start/end markers will be matched by their name. + * `MarkerStack`_ + By default, markers do not record a "stack" (or "backtrace"). To record a stack at + this point, in the most efficient manner, specify ``MarkerStack::Full``. To + capture a stack without native frames for reduced overhead, specify + ``MarkerStack::NonNative``. + + *Note: Currently, all C++ marker options are not present in the Rust side. They will + be added in the future.* + +Text Markers +^^^^^^^^^^^^ + +Text markers are very common, they carry an extra text as a fourth argument, in addition to +the marker name. Use the following macro: + +.. code-block:: rust + + let info = "info about this marker"; + gecko_profiler::add_text_marker( + // Name of the marker as a string. + "Marker Name", + // Category with an optional sub-category. + gecko_profiler_category!(DOM), + // MarkerOptions that keeps options like marker timing and marker stack. + MarkerOptions { + stack: MarkerStack::Full, + ..Default::default() + }, + // Additional information as a string. + info, + ); + +As useful as it is, using an expensive ``format!`` operation to generate a complex text +comes with a variety of issues. It can leak potentially sensitive information +such as URLs during the profile sharing step. profiler.firefox.com cannot +access the information programmatically. It won't get the formatting benefits of the +built-in marker schema. Please consider using a custom marker type to separate and +better present the data. + +Other Typed Markers +^^^^^^^^^^^^^^^^^^^ + +From Rust code, a marker of some type ``YourMarker`` (details about type definition follow) can be +recorded like this: + +.. code-block:: rust + + gecko_profiler::add_marker( + // Name of the marker as a string. + "Marker Name", + // Category with an optional sub-category. + gecko_profiler_category!(JavaScript), + // MarkerOptions that keeps options like marker timing and marker stack. + Default::default(), + // Marker payload. + YourMarker { number: 5, text: "some string".to_string() }, + ); + +After the first three common arguments (like in ``gecko_profiler::add_untyped_marker``), +there is a marker payload struct and it needs to be defined. Let's take a look at +how to define it. + +How to Define New Marker Types +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Each marker type must be defined once and only once. +The definition is a Rust ``struct``, it's constructed when recording markers of +that type in Rust. Each marker struct holds the data that is required for them +to show in the profiler.firefox.com. +By convention, the suffix "Marker" is recommended to better distinguish them +from non-profiler entities in the source. + +Each marker payload must derive ``serde::Serialize`` and ``serde::Deserialize``. +They are also exported from ``gecko-profiler`` crate if a project doesn't have it. +Each marker payload should include its data as its fields like this: + +.. code-block:: rust + + #[derive(Serialize, Deserialize, Debug)] + pub struct YourMarker { + number: i32, + text: String, + } + +Each marker struct must also implement the `ProfilerMarker`_ trait. + +``ProfilerMarker`` trait +************************ + +`ProfilerMarker`_ trait must be implemented for all marker types. Its methods are +similar to C++ counterparts, please refer to :ref:`the C++ markers guide to learn +more about them <how-to-define-new-marker-types>`. It includes three methods that +needs to be implemented: + +1. ``marker_type_name() -> &'static str``: + A marker type must have a unique name, it is used to keep track of the type of + markers in the profiler storage, and to identify them uniquely on profiler.firefox.com. + (It does not need to be the same as the struct's name.) + + E.g.: + + .. code-block:: rust + + fn marker_type_name() -> &'static str { + "your marker type" + } + +2. ``stream_json_marker_data(&self, json_writer: &mut JSONWriter)`` + All markers of any type have some common data: A name, a category, options like + timing, etc. as previously explained. + + In addition, a certain marker type may carry zero of more arbitrary pieces of + information, and they are always the same for all markers of that type. + + These are defined in a special static member function ``stream_json_marker_data``. + + It's a member method and takes a ``&mut JSONWriter`` as a parameter, + it will be used to stream the data as JSON, to later be read by + profiler.firefox.com. See `JSONWriter object and its methods`_. + + E.g.: + + .. code-block:: rust + + fn stream_json_marker_data(&self, json_writer: &mut JSONWriter) { + json_writer.int_property("number", self.number.into()); + json_writer.string_property("text", &self.text); + } + +3. ``marker_type_display() -> schema::MarkerSchema`` + Now that how to stream type-specific data (from Firefox to + profiler.firefox.com) is defined, it needs to be described where and how this + data will be displayed on profiler.firefox.com. + + The static member function ``marker_type_display`` returns an opaque ``MarkerSchema`` + object, which will be forwarded to profiler.firefox.com. + + See the `MarkerSchema::Location enumeration for the full list`_. Also see the + `MarkerSchema struct for its possible methods`_. + + E.g.: + + .. code-block:: rust + + fn marker_type_display() -> schema::MarkerSchema { + // Import MarkerSchema related types for easier use. + use crate::marker::schema::*; + // Create a MarkerSchema struct with a list of locations provided. + // One or more constructor arguments determine where this marker will be displayed in + // the profiler.firefox.com UI. + let mut schema = MarkerSchema::new(&[Location::MarkerChart]); + + // Some labels can optionally be specified, to display certain information in different + // locations: set_chart_label, set_tooltip_label, and set_table_label``; or + // set_all_labels to define all of them the same way. + schema.set_all_labels("{marker.name} - {marker.data.number}); + + // Next, define the main display of marker data, which will appear in the Marker Chart + // tooltips and the Marker Table sidebar. + schema.add_key_label_format("number", "Number", Format::Number); + schema.add_key_label_format("text", "Text", Format::String); + schema.add_static_label_value("Help", "This is my own marker type"); + + // Lastly, return the created schema. + schema + } + + Note that the strings in ``set_all_labels`` may refer to marker data within braces: + + * ``{marker.name}``: Marker name. + * ``{marker.data.X}``: Type-specific data, as streamed with property name "X" + from ``stream_json_marker_data``. + + :ref:`See the C++ markers guide for more details about it <marker-type-display-schema>`. + +.. _profiling_categories.yaml: https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/build/profiling_categories.yaml +.. _Profiling category pair: https://searchfox.org/mozilla-central/source/__GENERATED__/tools/profiler/rust-api/src/gecko_bindings/profiling_categories.rs +.. _the list of categories: https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/build/profiling_categories.yaml +.. _MarkerOptions: https://searchfox.org/mozilla-central/define?q=rust_analyzer::cargo::gecko_profiler::0_1_0::options::marker::MarkerOptions +.. _MarkerTiming: https://searchfox.org/mozilla-central/define?q=rust_analyzer::cargo::gecko_profiler::0_1_0::options::marker::MarkerTiming +.. _MarkerStack: https://searchfox.org/mozilla-central/define?q=rust_analyzer::cargo::gecko_profiler::0_1_0::options::marker::[MarkerStack] +.. _ProfilerMarker: https://searchfox.org/mozilla-central/define?q=rust_analyzer::cargo::gecko_profiler::0_1_0::marker::ProfilerMarker +.. _MarkerSchema::Location enumeration for the full list: https://searchfox.org/mozilla-central/define?q=T_mozilla%3A%3AMarkerSchema%3A%3ALocation +.. _JSONWriter object and its methods: https://searchfox.org/mozilla-central/define?q=rust_analyzer::cargo::gecko_profiler::0_1_0::json_writer::JSONWriter +.. _MarkerSchema struct for its possible methods: https://searchfox.org/mozilla-central/define?q=rust_analyzer::cargo::gecko_profiler::0_1_0::schema::marker::MarkerSchema diff --git a/tools/profiler/docs/markers-guide.rst b/tools/profiler/docs/markers-guide.rst new file mode 100644 index 0000000000..86744c523c --- /dev/null +++ b/tools/profiler/docs/markers-guide.rst @@ -0,0 +1,551 @@ +Markers +======= + +Markers are packets of arbitrary data that are added to a profile by the Firefox code, usually to +indicate something important happening at a point in time, or during an interval of time. + +Each marker has a name, a category, some common optional information (timing, backtrace, etc.), +and an optional payload of a specific type (containing arbitrary data relevant to that type). + +.. note:: + This guide explains C++ markers in depth. To learn more about how to add a + marker in JavaScript or Rust, please take a look at their documentation + in :doc:`instrumenting-javascript` or :doc:`instrumenting-rust` respectively. + +Example +------- + +Short example, details below. + +Note: Most marker-related identifiers are in the ``mozilla`` namespace, to be added where necessary. + +.. code-block:: cpp + + // Record a simple marker with the category of DOM. + PROFILER_MARKER_UNTYPED("Marker Name", DOM); + + // Create a marker with some additional text information. (Be wary of printf!) + PROFILER_MARKER_TEXT("Marker Name", JS, MarkerOptions{}, "Additional text information."); + + // Record a custom marker of type `ExampleNumberMarker` (see definition below). + PROFILER_MARKER("Number", OTHER, MarkerOptions{}, ExampleNumberMarker, 42); + +.. code-block:: cpp + + // Marker type definition. + struct ExampleNumberMarker : public BaseMarkerType<ExampleNumberMarker> { + // Unique marker type name. + static constexpr const char* Name = "number"; + // Marker description. + static constexpr const char* Description = "This is a number marker."; + + // For convenience. + using MS = MarkerSchema; + // Fields of payload for the marker. + static constexpr MS::PayloadField PayloadFields[] = { + {"number", MS::InputType::Uint32t, "Number", MS::Format::Integer}}; + + // Locations this marker should be displayed. + static constexpr MS::Location Locations[] = {MS::Location::MarkerChart, + MS::Location::MarkerTable}; + // Location specific label for this marker. + static constexpr const char* ChartLabel = "Number: {marker.data.number}"; + + // Data specific to this marker type, as passed to PROFILER_MARKER/profiler_add_marker. + static void StreamJSONMarkerData(SpliceableJSONWriter& aWriter, uint32_t a number) { + // Custom writer for marker fields, or using the default parent + // implementation if the function arguments match the schema. + StreamJSONMarkerDataImpl(aWriter, a number); + } + }; + +When adding a marker whose arguments differ from the schema, a translator +function and a custom implementation of StreamJSONMarkerData can be used. + +.. code-block:: c++ + + // Marker type definition. + struct ExampleBooleanMarker : public BaseMarkerType<ExampleBooleanMarker> { + // Unique marker type name. + static constexpr const char* Name = "boolean"; + // Marker description. + static constexpr const char* Description = "This is a boolean marker."; + + // For convenience. + using MS = MarkerSchema; + // Fields of payload for the marker. + static constexpr MS::PayloadField PayloadFields[] = { + {"boolean", MS::InputType::CString, "Boolean"}}; + + // Locations this marker should be displayed. + static constexpr MS::Location Locations[] = {MS::Location::MarkerChart, + MS::Location::MarkerTable}; + // Location specific label for this marker. + static constexpr const char* ChartLabel = "Boolean: {marker.data.boolean}"; + + // Data specific to this marker type, as passed to PROFILER_MARKER/profiler_add_marker. + static void StreamJSONMarkerData(SpliceableJSONWriter& aWriter, bool aBoolean) { + // Note the schema expects a string, we cannot use the default implementation. + if (aBoolean) { + aWriter.StringProperty("boolean", "true"); + } else { + aWriter.StringProperty("boolean", "false"); + } + } + + // The translation to the schema must also be defined in a translator function. + // The argument list should match that to PROFILER_MARKER/profiler_add_marker. + static void TranslateMarkerInputToSchema(void* aContext, bool aBoolean) { + // This should call ETW::OutputMarkerSchema with an argument list matching the schema. + if (aIsStart) { + ETW::OutputMarkerSchema(aContext, ExampleBooleanMarker{}, ProfilerStringView("true")); + } else { + ETW::OutputMarkerSchema(aContext, ExampleBooleanMarker{}, ProfilerStringView("false")); + } + } + }; + +A more detailed description is offered below. + + +How to Record Markers +--------------------- + +Header to Include +^^^^^^^^^^^^^^^^^ + +If the compilation unit only defines and records untyped, text, and/or its own markers, include +`the main profiler markers header <https://searchfox.org/mozilla-central/source/tools/profiler/public/ProfilerMarkers.h>`_: + +.. code-block:: cpp + + #include "mozilla/ProfilerMarkers.h" + +If it also records one of the other common markers defined in +`ProfilerMarkerTypes.h <https://searchfox.org/mozilla-central/source/tools/profiler/public/ProfilerMarkerTypes.h>`_, +include that one instead: + +.. code-block:: cpp + + #include "mozilla/ProfilerMarkerTypes.h" + +And if it uses any other profiler functions (e.g., labels), use +`the main Gecko Profiler header <https://searchfox.org/mozilla-central/source/tools/profiler/public/GeckoProfiler.h>`_ +instead: + +.. code-block:: cpp + + #include "GeckoProfiler.h" + +The above works from source files that end up in libxul, which is true for the majority +of Firefox source code. But some files live outside of libxul, such as mfbt, in which +case the advice is the same but the equivalent headers are from the Base Profiler instead: + +.. code-block:: cpp + + #include "mozilla/BaseProfilerMarkers.h" // Only own/untyped/text markers + #include "mozilla/BaseProfilerMarkerTypes.h" // Only common markers + #include "BaseProfiler.h" // Markers and other profiler functions + +Untyped Markers +^^^^^^^^^^^^^^^ + +Untyped markers don't carry any information apart from common marker data: +Name, category, options. + +.. code-block:: cpp + + PROFILER_MARKER_UNTYPED( + // Name, and category pair. + "Marker Name", OTHER, + // Marker options, may be omitted if all defaults are acceptable. + MarkerOptions(MarkerStack::Capture(), ...)); + +``PROFILER_MARKER_UNTYPED`` is a macro that simplifies the use of the main +``profiler_add_marker`` function, by adding the appropriate namespaces, and a surrounding +``#ifdef MOZ_GECKO_PROFILER`` guard. + +1. Marker name + The first argument is the name of this marker. This will be displayed in most places + the marker is shown. It can be a literal C string, or any dynamic string object. +2. `Category pair name <https://searchfox.org/mozilla-central/source/__GENERATED__/mozglue/baseprofiler/public/ProfilingCategoryList.h>`_ + Choose a category + subcategory from the `the list of categories <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/build/profiling_categories.yaml>`_. + This is the second parameter of each ``SUBCATEGORY`` line, for instance ``LAYOUT_Reflow``. + (Internally, this is really a `MarkerCategory <https://searchfox.org/mozilla-central/define?q=T_mozilla%3A%3AMarkerCategory>`_ + object, in case you need to construct it elsewhere.) +3. `MarkerOptions <https://searchfox.org/mozilla-central/define?q=T_mozilla%3A%3AMarkerOptions>`_ + See the options below. It can be omitted if there are no other arguments, ``{}``, or + ``MarkerOptions()`` (no specified options); only one of the following option types + alone; or ``MarkerOptions(...)`` with one or more of the following options types: + + * `MarkerThreadId <https://searchfox.org/mozilla-central/define?q=T_mozilla%3A%3AMarkerThreadId>`_ + Rarely used, as it defaults to the current thread. Otherwise it specifies the target + "thread id" (aka "track") where the marker should appear; This may be useful when + referring to something that happened on another thread (use ``profiler_current_thread_id()`` + from the original thread to get its id); or for some important markers, they may be + sent to the "main thread", which can be specified with ``MarkerThreadId::MainThread()``. + * `MarkerTiming <https://searchfox.org/mozilla-central/define?q=T_mozilla%3A%3AMarkerTiming>`_ + This specifies an instant or interval of time. It defaults to the current instant if + left unspecified. Otherwise use ``MarkerTiming::InstantAt(timestamp)`` or + ``MarkerTiming::Interval(ts1, ts2)``; timestamps are usually captured with + ``TimeStamp::Now()``. It is also possible to record only the start or the end of an + interval, pairs of start/end markers will be matched by their name. *Note: The + upcoming "marker sets" feature will make this pairing more reliable, and also + allow more than two markers to be connected*. + * `MarkerStack <https://searchfox.org/mozilla-central/define?q=T_mozilla%3A%3AMarkerStack>`_ + By default, markers do not record a "stack" (or "backtrace"). To record a stack at + this point, in the most efficient manner, specify ``MarkerStack::Capture()``. To + record a previously captured stack, first store a stack into a + ``UniquePtr<ProfileChunkedBuffer>`` with ``profiler_capture_backtrace()``, then pass + it to the marker with ``MarkerStack::TakeBacktrace(std::move(stack))``. + * `MarkerInnerWindowId <https://searchfox.org/mozilla-central/define?q=T_mozilla%3A%3AMarkerInnerWindowId>`_ + If you have access to an "inner window id", consider specifying it as an option, to + help profiler.firefox.com to classify them by tab. + +"Auto" Scoped Interval Markers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To capture time intervals around some important operations, it is common to store a timestamp, do the work, +and then record a marker, e.g.: + +.. code-block:: cpp + + void DoTimedWork() { + TimeStamp start = TimeStamp::Now(); + DoWork(); + PROFILER_MARKER_TEXT("Timed work", OTHER, MarkerTiming::IntervalUntilNowFrom(start), "Details"); + } + +`RAII <https://en.cppreference.com/w/cpp/language/raii>`_ objects automate this, by recording the time +when the object is constructed, and later recording the marker when the object is destroyed at the end +of its C++ scope. +This is especially useful if there are multiple scope exit points. + +``AUTO_PROFILER_MARKER_TEXT`` is `the only one implemented <https://searchfox.org/mozilla-central/search?q=id%3AAUTO_PROFILER_MARKER_TEXT`_ at this time. + +.. code-block:: cpp + + void MaybeDoTimedWork(bool aDoIt) { + AUTO_PROFILER_MARKER_TEXT("Timed work", OTHER, "Details"); + if (!aDoIt) { /* Marker recorded here... */ return; } + DoWork(); + /* ... or here. */ + } + +Note that these RAII objects only record one marker. In some situation, a very long +operation could be missed if it hasn't completed by the end of the profiling session. +In this case, consider recording two distinct markers, using +``MarkerTiming::IntervalStart()`` and ``MarkerTiming::IntervalEnd()``. + +Text Markers +^^^^^^^^^^^^ + +Text markers are very common, they carry an extra text as a fourth argument, in addition to +the marker name. Use the following macro: + +.. code-block:: cpp + + PROFILER_MARKER_TEXT( + // Name, category pair, options. + "Marker Name", OTHER, {}, + // Text string. + "Here are some more details." + ); + +As useful as it is, using an expensive ``printf`` operation to generate a complex text +comes with a variety of issues string. It can leak potentially sensitive information +such as URLs can be leaked during the profile sharing step. profiler.firefox.com cannot +access the information programmatically. It won't get the formatting benefits of the +built-in marker schema. Please consider using a custom marker type to separate and +better present the data. + +Other Typed Markers +^^^^^^^^^^^^^^^^^^^ + +From C++ code, a marker of some type ``YourMarker`` (details about type definition follow) can be +recorded like this: + +.. code-block:: cpp + + PROFILER_MARKER( + "YourMarker name", OTHER, + MarkerOptions(MarkerTiming::IntervalUntilNowFrom(someStartTimestamp), + MarkerInnerWindowId(innerWindowId))), + YourMarker, "some string", 12345, "http://example.com", someTimeStamp); + +After the first three common arguments (like in ``PROFILER_MARKER_UNTYPED``), there are: + +4. The marker type, which is the name of the C++ ``struct`` that defines that type. +5. A variadic list of type-specific argument. They must match the number of, and must + be convertible to the types defined in the schema. If they are not, they must match + the number of and be convertible to the types in ``StreamJSONMarkerData`` and + ``TranslateMarkerInputToSchema``. + +Where to Define New Marker Types +-------------------------------- + +The first step is to determine the location of the marker type definition: + +* If this type is only used in one function, or a component, it can be defined in a + local common place relative to its use. +* For a more common type that could be used from multiple locations: + + * If there is no dependency on XUL, it can be defined in the Base Profiler, which can + be used in most locations in the codebase: + `mozglue/baseprofiler/public/BaseProfilerMarkerTypes.h <https://searchfox.org/mozilla-central/source/mozglue/baseprofiler/public/BaseProfilerMarkerTypes.h>`__ + + * However, if there is a XUL dependency, then it needs to be defined in the Gecko Profiler: + `tools/profiler/public/ProfilerMarkerTypes.h <https://searchfox.org/mozilla-central/source/tools/profiler/public/ProfilerMarkerTypes.h>`__ + +.. _how-to-define-new-marker-types: + +How to Define New Marker Types +------------------------------ + +Each marker type must be defined once and only once. +The definition is a C++ ``struct``, that inherits from ``BaseMarkerType``, its identifier is used when recording +markers of that type in C++. +By convention, the suffix "Marker" is recommended to better distinguish them +from non-profiler entities in the source. + +.. code-block:: cpp + + struct YourMarker : BaseMarkerType<YourMarker> { + +Marker Type Name & Description +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +A marker type must have a unique name, it is used to keep track of the type of +markers in the profiler storage, and to identify them uniquely on profiler.firefox.com. +(It does not need to be the same as the ``struct``'s name.) + +This name is defined in a special static data member ``Name``: + +.. code-block:: cpp + + // … + static constexpr const char* Name = "YourMarker"; + +In addition you must add a description of your marker in a special static data member ``Description``: + +.. code-block:: cpp + + // … + static constexpr const char* Description = "This is my marker!"; + +Marker Type Data +^^^^^^^^^^^^^^^^ + +All markers of any type have some common data: A name, a category, options like +timing, etc. as previously explained. + +In addition, a certain marker type may carry zero of more arbitrary pieces of +information, and they are always the same for all markers of that type. + +These are defined in a special static member data array of ``PayloadField`` s. +Each payload field specifies a key, a C++ type description, a label, a format, +and optionally some additional options (see the ``PayloadField`` type). The +most important fields are: + +* Key: Element property name as streamed in ``StreamJSONMarkerData``. +* Type: An enum value describing the C++ type specified to PROFILER_MARKER/profiler_add_marker. +* Label: Prefix to display to label the field. +* Format: How to format the data element value, see `MarkerSchema::Format for details <https://searchfox.org/mozilla-central/define?q=T_mozilla%3A%3AMarkerSchema%3A%3AFormat>`_. + +.. code-block:: cpp + + // … + // This will be used repeatedly and is done for convenience. + using MS = MarkerSchema; + static constexpr MS::PayloadField PayloadFields[] = { + {"number", MS::InputType::Uint32t, "Number", MS::Format::Integer}}; + +In addition, a ``StreamJSONMarkerData`` function must be defined that matches +the C++ argument types to PROFILER_MARKER. + +The first function parameters is always ``SpliceableJSONWriter& aWriter``, +it will be used to stream the data as JSON, to later be read by +profiler.firefox.com. + +.. code-block:: cpp + + // … + static void StreamJSONMarkerData(SpliceableJSONWriter& aWriter, + +The following function parameters is how the data is received as C++ objects +from the call sites. + +* Most C/C++ `POD (Plain Old Data) <https://en.cppreference.com/w/cpp/named_req/PODType>`_ + and `trivially-copyable <https://en.cppreference.com/w/cpp/named_req/TriviallyCopyable>`_ + types should work as-is, including ``TimeStamp``. +* Character strings should be passed using ``const ProfilerString8View&`` (this handles + literal strings, and various ``std::string`` and ``nsCString`` types, and spans with or + without null terminator). Use ``const ProfilerString16View&`` for 16-bit strings such as + ``nsString``. +* Other types can be used if they define specializations for ``ProfileBufferEntryWriter::Serializer`` + and ``ProfileBufferEntryReader::Deserializer``. You should rarely need to define new + ones, but if needed see how existing specializations are written, or contact the + `perf-tools team for help <https://chat.mozilla.org/#/room/#profiler:mozilla.org>`_. + +Passing by value or by reference-to-const is recommended, because arguments are serialized +in binary form (i.e., there are no optimizable ``move`` operations). + +For example, here's how to handle a string, a 64-bit number, another string, and +a timestamp: + +.. code-block:: cpp + + // … + const ProfilerString8View& aString, + const int64_t aBytes, + const ProfilerString8View& aURL, + const TimeStamp& aTime) { + +Then the body of the function turns these parameters into a JSON stream. + +If these parameter types match the types specified in the schema, both in order +and number. It can simply call the default implementation. + +.. code-block:: cpp + + // … + static void StreamJSONMarkerData(SpliceableJSONWriter& aWriter, + const ProfilerString8View& aString, + const int64_t aBytes, + const ProfilerString8View& aURL, + const TimeStamp& aTime) { + StreamJSONMarkerDataImpl(aWrite, aString, aBytes, aURL, aTime); + } + + +If the parameters passed to PROFILER_MARKER do not match the schema, some +additional work is required. + +When this function is called, the writer has just started a JSON object, so +everything that is written should be a named object property. Use +``SpliceableJSONWriter`` functions, in most cases ``...Property`` functions +from its parent class ``JSONWriter``: ``NullProperty``, ``BoolProperty``, +``IntProperty``, ``DoubleProperty``, ``StringProperty``. (Other nested JSON +types like arrays or objects are not supported by the profiler.) + +As a special case, ``TimeStamps`` must be streamed using ``aWriter.TimeProperty(timestamp)``. + +The property names will be used to identify where each piece of data is stored and +how it should be displayed on profiler.firefox.com (see next section). + +Suppose our marker schema defines a string for a boolean, here is how that could be streamed. + +.. code-block:: cpp + + // … + + static void StreamJSONMarkerData(SpliceableJSONWriter& aWriter, + bool aBoolean) { + aWriter.StringProperty("myBoolean", aBoolean ? "true" : "false"); + } + +In addition, a ``TranslateMarkerInputToSchema`` function must be added to +ensure correct output to ETW. + +.. code-block:: c++ + + // The translation to the schema must also be defined in a translator function. + // The argument list should match that to PROFILER_MARKER/profiler_add_marker. + static void TranslateMarkerInputToSchema(void* aContext, bool aBoolean) { + // This should call ETW::OutputMarkerSchema with an argument list matching the schema. + if (aIsStart) { + ETW::OutputMarkerSchema(aContext, YourMarker{}, ProfilerStringView("true")); + } else { + ETW::OutputMarkerSchema(aContext, YourMarker{}, ProfilerStringView("false")); + } + } + +.. _marker-type-display-schema: + +Marker Type Display Schema +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Now that we have defined how to stream type-specific data (from Firefox to +profiler.firefox.com), we need to describe where and how this data will be +displayed on profiler.firefox.com. + +The location data member determines where this marker will be displayed in +the profiler.firefox.com UI. See the `MarkerSchema::Location enumeration for the +full list <https://searchfox.org/mozilla-central/define?q=T_mozilla%3A%3AMarkerSchema%3A%3ALocation>`_. + +Here is the most common set of locations, showing markers of that type in both the +Marker Chart and the Marker Table panels: + +.. code-block:: cpp + + // … + static constexpr MS::Location Locations[] = {MS::Location::MarkerChart, + MS::Location::MarkerTable}; + +Some labels can optionally be specified, to display certain information in different +locations: ``ChartLabel``, ``TooltipLabel``, and ``TableLabel``; or ``AllLabels`` to +define all of them the same way. + +The arguments is a string that may refer to marker data within braces: + +* ``{marker.name}``: Marker name. +* ``{marker.data.X}``: Type-specific data, as streamed with property name "X" from ``StreamJSONMarkerData`` (e.g., ``aWriter.IntProperty("X", a number);`` + +For example, here's how to set the Marker Chart label to show the marker name and the +``myBytes`` number of bytes: + +.. code-block:: cpp + + // … + static constexpr const char* ChartLabel = "{marker.name} – {marker.data.myBytes}"; + +profiler.firefox.com will apply the label with the data in a consistent manner. For +example, with this label definition, it could display marker information like the +following in the Firefox Profiler's Marker Chart: + + * "Marker Name – 10B" + * "Marker Name – 25.204KB" + * "Marker Name – 512.54MB" + +For implementation details on this processing, see `src/profiler-logic/marker-schema.js <https://github.com/firefox-devtools/profiler/blob/main/src/profile-logic/marker-schema.js>`_ +in the profiler's front-end. + +Any other ``struct`` member function is ignored. There could be utility functions used by the above +compulsory functions, to make the code clearer. + +And that is the end of the marker definition ``struct``. + +.. code-block:: cpp + + // … + }; + +Performance Considerations +-------------------------- + +During profiling, it is best to reduce the amount of work spent doing profiler +operations, as they can influence the performance of the code that you want to profile. + +Whenever possible, consider passing simple types to marker functions, such that +``StreamJSONMarkerData`` will do the minimum amount of work necessary to serialize +the marker type-specific arguments to its internal buffer representation. POD types +(numbers) and strings are the easiest and cheapest to serialize. Look at the +corresponding ``ProfileBufferEntryWriter::Serializer`` specializations if you +want to better understand the work done. + +Avoid doing expensive operations when recording markers. E.g.: ``printf`` of +different things into a string, or complex computations; instead pass the +``printf``/computation arguments straight through to the marker function, so that +``StreamJSONMarkerData`` can do the expensive work at the end of the profiling session. + +Marker Architecture Description +------------------------------- + +The above sections should give all the information needed for adding your own marker +types. However, if you are wanting to work on the marker architecture itself, this +section will describe how the system works. + +TODO: + * Briefly describe the buffer and serialization. + * Describe the template strategy for generating marker types + * Describe the serialization and link to profiler front-end docs on marker processing (if they exist) diff --git a/tools/profiler/docs/memory.rst b/tools/profiler/docs/memory.rst new file mode 100644 index 0000000000..347a91f9e7 --- /dev/null +++ b/tools/profiler/docs/memory.rst @@ -0,0 +1,46 @@ +Profiling Memory +================ + +Sampling stacks from native allocations +--------------------------------------- + +The profiler can sample allocations and de-allocations from malloc using the +"Native Allocations" feature. This can be enabled by going to `about:profiling` and +enabling the "Native Allocations" checkbox. It is only available in Nightly, as it +uses a technique of hooking into malloc that could be a little more risky to apply to +the broader population of Firefox users. + +This implementation is located in: `tools/profiler/core/memory_hooks.cpp +<https://searchfox.org/mozilla-central/source/tools/profiler/core/memory_hooks.cpp>`_ + +It works by hooking into all of the malloc calls. When the profiler is running, it +performs a `Bernoulli trial`_, that will pass for a given probability of per-byte +allocated. What this means is that larger allocations have a higher chance of being +recorded compared to smaller allocations. Currently, there is no way to configure +the per-byte probability. This means that sampled allocation sizes will be closer +to the actual allocated bytes. + +This infrastructure is quite similar to DMD, but with the additional motiviations of +making it easy to turn on and use with the profiler. The overhead is quite high, +especially on systems with more expensive stack walking, like Linux. Turning off +thee "Native Stacks" feature can help lower overhead, but will give less information. + +For more information on analyzing these profiles, see the `Firefox Profiler docs`_. + +Memory counters +--------------- + +Similar to the Native Allocations feature, memory counters use the malloc memory hook +that is only available in Nightly. When it's available, the memory counters are always +turned on. This is a lightweight way to count in a very granular fashion how much +memory is being allocated and deallocated during the profiling session. + +This information is then visualized in the `Firefox Profiler memory track`_. + +This feature uses the `Profiler Counters`_, which can be used to create other types +of cheap counting instrumentation. + +.. _Bernoulli trial: https://en.wikipedia.org/wiki/Bernoulli_trial +.. _Firefox Profiler docs: https://profiler.firefox.com/docs/#/./memory-allocations +.. _Firefox Profiler memory track: https://profiler.firefox.com/docs/#/./memory-allocations?id=memory-track +.. _Profiler Counters: https://searchfox.org/mozilla-central/source/tools/profiler/public/ProfilerCounts.h diff --git a/tools/profiler/docs/profilerclasses-20220913.png b/tools/profiler/docs/profilerclasses-20220913.png Binary files differnew file mode 100644 index 0000000000..a5ba265407 --- /dev/null +++ b/tools/profiler/docs/profilerclasses-20220913.png diff --git a/tools/profiler/docs/profilerclasses.umlet.uxf b/tools/profiler/docs/profilerclasses.umlet.uxf new file mode 100644 index 0000000000..c807853401 --- /dev/null +++ b/tools/profiler/docs/profilerclasses.umlet.uxf @@ -0,0 +1,811 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<diagram program="umlet" version="15.0.0">
+ <zoom_level>10</zoom_level>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>80</x>
+ <y>370</y>
+ <w>340</w>
+ <h>190</h>
+ </coordinates>
+ <panel_attributes>ThreadInfo
+--
+-mName: nsCString
+-mRegisterTime: TimeStamp
+-mThreadId: int
+-mIsMainThread: bool
+--
+NS_INLINE_DECL_THREADSAFE_REFCOUNTING
++Name()
++RegisterTime()
++ThreadId()
++IsMainThread()
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>470</x>
+ <y>300</y>
+ <w>600</w>
+ <h>260</h>
+ </coordinates>
+ <panel_attributes>RacyRegisteredThread
+--
+-mProfilingStackOwner: NotNull<RefPtr<ProfilingStackOwner>>
+-mThreadId
+-mSleep: Atomic<int> /* AWAKE, SLEEPING_NOT_OBSERVED, SLEEPING_OBSERVED */
+-mIsBeingProfiled: Atomic<bool, Relaxed>
+--
++SetIsBeingProfiled()
++IsBeingProfiled()
++ReinitializeOnResume()
++CanDuplicateLastSampleDueToSleep()
++SetSleeping()
++SetAwake()
++IsSleeping()
++ThreadId()
++ProfilingStack()
++ProfilingStackOwner()</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>470</x>
+ <y>650</y>
+ <w>350</w>
+ <h>360</h>
+ </coordinates>
+ <panel_attributes>RegisteredThread
+--
+-mPlatformData: UniquePlatformData
+-mStackTop: const void*
+-mThread: nsCOMPtr<nsIThread>
+-mContext: JSContext*
+-mJSSampling: enum {INACTIVE, ACTIVE_REQUESTED, ACTIVE, INACTIVE_REQUESTED}
+-mmJSFlags: uint32_t
+--
++RacyRegisteredThread()
++GetPlatformData()
++StackTop()
++GetRunningEventDelay()
++SizeOfIncludingThis()
++SetJSContext()
++ClearJSContext()
++GetJSContext()
++Info(): RefPtr<ThreadInfo>
++GetEventTarget(): nsCOMPtr<nsIEventTarget>
++ResetMainThread(nsIThread*)
++StartJSSampling()
++StopJSSampling()
++PollJSSampling()
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>750</x>
+ <y>550</y>
+ <w>180</w>
+ <h>120</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<<-
+mRacyRegisteredThread</panel_attributes>
+ <additional_attributes>10.0;100.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>290</x>
+ <y>550</y>
+ <w>230</w>
+ <h>120</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<-
+mThreadInfo: RefPtr<></panel_attributes>
+ <additional_attributes>210.0;100.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>70</x>
+ <y>660</y>
+ <w>340</w>
+ <h>190</h>
+ </coordinates>
+ <panel_attributes>PageInformation
+--
+-mBrowsingContextID: uint64_t
+-mInnerWindowID: uint64_t
+-mUrl: nsCString
+-mEmbedderInnerWindowID: uint64_t
+--
+NS_INLINE_DECL_THREADSAFE_REFCOUNTING
++SizeOfIncludingThis(MallocSizeOf)
++Equals(PageInformation*)
++StreamJSON(SpliceableJSONWriter&)
++InnerWindowID()
++BrowsingContextID()
++Url()
++EmbedderInnerWindowID()
++BufferPositionWhenUnregistered(): Maybe<uint64_t>
++NotifyUnregistered(aBufferPosition: uint64_t)</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>760</x>
+ <y>1890</y>
+ <w>570</w>
+ <h>120</h>
+ </coordinates>
+ <panel_attributes>ProfilerBacktrace
+--
+-mName: UniqueFreePtr<char>
+-mThreadId: int
+-mProfileChunkedBuffer: UniquePtr<ProfileChunkedBuffer>
+-mProfileBuffer: UniquePtr<ProfileBuffer>
+--
++StreamJSON(SpliceableJSONWriter&, aProcessStartTime: TimeStamp, UniqueStacks&)
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>20</x>
+ <y>2140</y>
+ <w>620</w>
+ <h>580</h>
+ </coordinates>
+ <panel_attributes>ProfileChunkedBuffer
+--
+-mMutex: BaseProfilerMaybeMutex
+-mChunkManager: ProfileBufferChunkManager*
+-mOwnedChunkManager: UniquePtr<ProfileBufferChunkManager>
+-mCurrentChunk: UniquePtr<ProfileBufferChunk>
+-mNextChunks: UniquePtr<ProfileBufferChunk>
+-mRequestedChunkHolder: RefPtr<RequestedChunkRefCountedHolder>
+-mNextChunkRangeStart: ProfileBufferIndex
+-mRangeStart: Atomic<ProfileBufferIndex, ReleaseAcquire>
+-mRangeEnd: ProfileBufferIndex
+-mPushedBlockCount: uint64_t
+-mClearedBlockCount: Atomic<uint64_t, ReleaseAcquire>
+--
++Byte = ProfileBufferChunk::Byte
++Length = ProfileBufferChunk::Length
++IsThreadSafe()
++IsInSession()
++ResetChunkManager()
++SetChunkManager()
++Clear()
++BufferLength(): Maybe<size_t>
++SizeOfExcludingThis(MallocSizeOf)
++SizeOfIncludingThis(MallocSizeOf)
++GetState()
++IsThreadSafeAndLockedOnCurrentThread(): bool
++LockAndRun(Callback&&)
++ReserveAndPut(CallbackEntryBytes&&, Callback<auto(Maybe<ProfileBufferEntryWriter>&)>&&)
++Put(aEntryBytes: Length, Callback<auto(Maybe<ProfileBufferEntryWriter>&)>&&)
++PutFrom(const void*, Length)
++PutObjects(const Ts&...)
++PutObject(const T&)
++GetAllChunks()
++Read(Callback<void(Reader&)>&&): bool
++ReadEach(Callback<void(ProfileBufferEntryReader& [, ProfileBufferBlockIndex])>&&)
++ReadAt(ProfileBufferBlockIndex, Callback<void(Maybe<ProfileBufferEntryReader>&&)>&&)
++AppendContents</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>810</x>
+ <y>2100</y>
+ <w>500</w>
+ <h>620</h>
+ </coordinates>
+ <panel_attributes>ProfileBufferChunk
+--
++Header: {
+ mOffsetFirstBlock; mOffsetPastLastBlock; mDoneTimeStamp;
+ mBufferBytes; mBlockCount; mRangeStart; mProcessId;
+ }
+-InternalHeader: { mHeader: Header; mNext: UniquePtr<ProfileBufferChunk>; }
+--
+-mInternalHeader: InternalHeader
+-mBuffer: Byte /* First byte */
+--
++Byte = uint8_t
++Length = uint32_t
++SpanOfBytes = Span<Byte>
+/+Create(aMinBufferBytes: Length): UniquePtr<ProfileBufferChunk>/
++ReserveInitialBlockAsTail(Length): SpanOfBytes
++ReserveBlock(Length): { SpanOfBytes, ProfileBufferBlockIndex }
++MarkDone()
++MarkRecycled()
++ChunkHeader()
++BufferBytes()
++ChunkBytes()
++SizeOfExcludingThis(MallocSizeOf)
++SizeOfIncludingThis(MallocSizeOf)
++RemainingBytes(): Length
++OffsetFirstBlock(): Length
++OffsetPastLastBlock(): Length
++BlockCount(): Length
++ProcessId(): int
++SetProcessId(int)
++RangeStart(): ProfileBufferIndex
++SetRangeStart(ProfileBufferIndex)
++BufferSpan(): Span<const Byte>
++ByteAt(aOffset: Length)
++GetNext(): maybe-const ProfileBufferChunk*
++ReleaseNext(): UniquePtr<ProfileBufferChunk>
++InsertNext(UniquePtr<ProfileBufferChunk>&&)
++Last(): const ProfileBufferChunk*
++SetLast(UniquePtr<ProfileBufferChunk>&&)
+/+Join(UniquePtr<ProfileBufferChunk>&&, UniquePtr<ProfileBufferChunk>&&)/
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>120</x>
+ <y>2850</y>
+ <w>570</w>
+ <h>350</h>
+ </coordinates>
+ <panel_attributes>ProfileBufferEntryReader
+--
+-mCurrentSpan: SpanOfConstBytes
+-mNextSpanOrEmpty: SpanOfConstBytes
+-mCurrentBlockIndex: ProfileBufferBlockIndex
+-mNextBlockIndex: ProfileBufferBlockIndex
+--
++RemainingBytes(): Length
++SetRemainingBytes(Length)
++CurrentBlockIndex(): ProfileBufferBlockIndex
++NextBlockIndex(): ProfileBufferBlockIndex
++EmptyIteratorAtOffset(Length): ProfileBufferEntryReader
++operator*(): const Byte&
++operator++(): ProfileBufferEntryReader&
++operator+=(Length): ProfileBufferEntryReader&
++operator==(const ProfileBufferEntryReader&)
++operator!=(const ProfileBufferEntryReader&)
++ReadULEB128<T>(): T
++ReadBytes(void*, Length)
++ReadIntoObject(T&)
++ReadIntoObjects(Ts&...)
++ReadObject<T>(): T</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>740</x>
+ <y>2850</y>
+ <w>570</w>
+ <h>300</h>
+ </coordinates>
+ <panel_attributes>ProfileBufferEntryWriter
+--
+-mCurrentSpan: SpanOfBytes
+-mNextSpanOrEmpty: SpanOfBytes
+-mCurrentBlockIndex: ProfileBufferBlockIndex
+-mNextBlockIndex: ProfileBufferBlockIndex
+--
++RemainingBytes(): Length
++CurrentBlockIndex(): ProfileBufferBlockIndex
++NextBlockIndex(): ProfileBufferBlockIndex
++operator*(): Byte&
++operator++(): ProfileBufferEntryReader&
++operator+=(Length): ProfileBufferEntryReader&
+/+ULEB128Size(T): unsigned/
++WriteULEB128(T)
+/+SumBytes(const Ts&...): Length/
++WriteFromReader(ProfileBufferEntryReader&, Length)
++WriteObject(const T&)
++WriteObjects(const T&)</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>120</x>
+ <y>3270</y>
+ <w>570</w>
+ <h>80</h>
+ </coordinates>
+ <panel_attributes>ProfileBufferEntryReader::Deserializer<T>
+/to be specialized for all types read from ProfileBufferEntryReader/
+--
+/+ReadInto(ProfileBufferEntryReader&, T&)/
+/+Read<T>(ProfileBufferEntryReader&): T/</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>740</x>
+ <y>3270</y>
+ <w>570</w>
+ <h>80</h>
+ </coordinates>
+ <panel_attributes>ProfileBufferEntryWriter::Serializer<T>
+/to be specialized for all types written into ProfileBufferEntryWriter/
+--
+/+Bytes(const T&): Length/
+/+Write(ProfileBufferEntryWriter&, const T&)/</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>330</x>
+ <y>2710</y>
+ <w>110</w>
+ <h>160</h>
+ </coordinates>
+ <panel_attributes>lt=.>
+<<creates>></panel_attributes>
+ <additional_attributes>10.0;10.0;60.0;140.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>430</x>
+ <y>2710</y>
+ <w>360</w>
+ <h>160</h>
+ </coordinates>
+ <panel_attributes>lt=.>
+<<creates>></panel_attributes>
+ <additional_attributes>10.0;10.0;340.0;140.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>660</x>
+ <y>2710</y>
+ <w>260</w>
+ <h>160</h>
+ </coordinates>
+ <panel_attributes>lt=.>
+<<points into>></panel_attributes>
+ <additional_attributes>10.0;140.0;240.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>870</x>
+ <y>2710</y>
+ <w>140</w>
+ <h>160</h>
+ </coordinates>
+ <panel_attributes>lt=.>
+<<points into>></panel_attributes>
+ <additional_attributes>10.0;140.0;80.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>630</x>
+ <y>2170</y>
+ <w>200</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<-
+mCurrentChunk</panel_attributes>
+ <additional_attributes>10.0;20.0;180.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>630</x>
+ <y>2230</y>
+ <w>200</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<-
+mNextChunks</panel_attributes>
+ <additional_attributes>10.0;20.0;180.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>1100</x>
+ <y>2030</y>
+ <w>170</w>
+ <h>90</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<-
+mInternalHeader.mNext</panel_attributes>
+ <additional_attributes>10.0;70.0;10.0;20.0;150.0;20.0;150.0;70.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>490</x>
+ <y>3190</y>
+ <w>70</w>
+ <h>100</h>
+ </coordinates>
+ <panel_attributes>lt=.>
+<<uses>></panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;80.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>580</x>
+ <y>3190</y>
+ <w>230</w>
+ <h>100</h>
+ </coordinates>
+ <panel_attributes>lt=.>
+<<uses>></panel_attributes>
+ <additional_attributes>10.0;10.0;210.0;80.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>50</x>
+ <y>1620</y>
+ <w>570</w>
+ <h>410</h>
+ </coordinates>
+ <panel_attributes>ProfileBuffer
+--
+-mFirstSamplingTimeNs: double
+-mLastSamplingTimeNs: double
+-mIntervalNs, etc.: ProfilerStats
+--
++IsThreadSafe(): bool
++AddEntry(const ProfileBufferEntry&): uint64_t
++AddThreadIdEntry(int): uint64_t
++PutObjects(Kind, const Ts&...): ProfileBufferBlockIndex
++CollectCodeLocation(...)
++AddJITInfoForRange(...)
++StreamSamplesToJSON(SpliceableJSONWriter&, aThreadId: int, aSinceTime: double, UniqueStacks&)
++StreamMarkersToJSON(SpliceableJSONWriter&, ...)
++StreamPausedRangesToJSON(SpliceableJSONWriter&, aSinceTime: double)
++StreamProfilerOverheadToJSON(SpliceableJSONWriter&, ...)
++StreamCountersToJSON(SpliceableJSONWriter&, ...)
++DuplicateLsstSample
++DiscardSamplesBeforeTime(aTime: double)
++GetEntry(aPosition: uint64_t): ProfileBufferEntry
++SizeOfExcludingThis(MallocSizeOf)
++SizeOfIncludingThis(MallocSizeOf)
++CollectOverheadStats(...)
++GetProfilerBufferInfo(): ProfilerBufferInfo
++BufferRangeStart(): uint64_t
++BufferRangeEnd(): uint64_t</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>690</x>
+ <y>1620</y>
+ <w>230</w>
+ <h>60</h>
+ </coordinates>
+ <panel_attributes>ProfileBufferEntry
+--
++mKind: Kind
++mStorage: uint8_t[kNumChars=8]</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>930</x>
+ <y>1620</y>
+ <w>440</w>
+ <h>130</h>
+ </coordinates>
+ <panel_attributes>UniqueJSONStrings
+--
+-mStringTableWriter: SpliceableChunkedJSONWriter
+-mStringHashToIndexMap: HashMap<HashNumber, uint32_t>
+--
++SpliceStringTableElements(SpliceableJSONWriter&)
++WriteProperty(JSONWriter&, aName: const char*, aStr: const char*)
++WriteElement(JSONWriter&, aStr: const char*)
++GetOrAddIndex(const char*): uint32_t</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>680</x>
+ <y>1760</y>
+ <w>470</w>
+ <h>110</h>
+ </coordinates>
+ <panel_attributes>UniqueStack
+--
+-mFrameTableWriter: SpliceableChunkedJSONWriter
+-mFrameToIndexMap: HashMap<FrameKey, uint32_t, FrameKeyHasher>
+-mStackTableWriter: SpliceableChunkedJSONWriter
+-mStackToIndexMap: HashMap<StackKey, uint32_t, StackKeyHasher>
+-mJITInfoRanges: Vector<JITFrameInfoForBufferRange></panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>320</x>
+ <y>2020</y>
+ <w>230</w>
+ <h>140</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<-
+mEntries: ProfileChunkedBuffer&</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;120.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>610</x>
+ <y>1640</y>
+ <w>100</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=.>
+<<uses>></panel_attributes>
+ <additional_attributes>10.0;20.0;80.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>610</x>
+ <y>1710</y>
+ <w>340</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=.>
+<<uses>></panel_attributes>
+ <additional_attributes>10.0;20.0;320.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>610</x>
+ <y>1800</y>
+ <w>90</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=.>
+<<uses>></panel_attributes>
+ <additional_attributes>10.0;20.0;70.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>610</x>
+ <y>1900</y>
+ <w>170</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<-
+mProfileBuffer</panel_attributes>
+ <additional_attributes>150.0;20.0;10.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>590</x>
+ <y>1940</y>
+ <w>250</w>
+ <h>220</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<-
+mProfileChunkedBuffer</panel_attributes>
+ <additional_attributes>170.0;10.0;10.0;200.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>20</x>
+ <y>1030</y>
+ <w>490</w>
+ <h>550</h>
+ </coordinates>
+ <panel_attributes>CorePS
+--
+/-sInstance: CorePS*/
+-mMainThreadId: int
+-mProcessStartTime: TimeStamp
+-mCoreBuffer: ProfileChunkedBuffer
+-mRegisteredThreads: Vector<UniquePtr<RegisteredThread>>
+-mRegisteredPages: Vector<RefPtr<PageInformation>>
+-mCounters: Vector<BaseProfilerCount*>
+-mLul: UniquePtr<lul::LUL> /* linux only */
+-mProcessName: nsAutoCString
+-mJsFrames: JsFrameBuffer
+--
++Create
++Destroy
++Exists(): bool
++AddSizeOf(...)
++MainThreadId()
++ProcessStartTime()
++CoreBuffer()
++RegisteredThreads(PSLockRef)
++JsFrames(PSLockRef)
+/+AppendRegisteredThread(PSLockRef, UniquePtr<RegisteredThread>)/
+/+RemoveRegisteredThread(PSLockRef, RegisteredThread*)/
++RegisteredPages(PSLockRef)
+/+AppendRegisteredPage(PSLockRef, RefPtr<PageInformation>)/
+/+RemoveRegisteredPage(PSLockRef, aRegisteredInnerWindowID: uint64_t)/
+/+ClearRegisteredPages(PSLockRef)/
++Counters(PSLockRef)
++AppendCounter
++RemoveCounter
++Lul(PSLockRef)
++SetLul(PSLockRef, UniquePtr<lul::LUL>)
++ProcessName(PSLockRef)
++SetProcessName(PSLockRef, const nsACString&)
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>20</x>
+ <y>1570</y>
+ <w>110</w>
+ <h>590</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<<-
+mCoreBuffer</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;570.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>160</x>
+ <y>840</y>
+ <w>150</w>
+ <h>210</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<-
+mRegisteredPages</panel_attributes>
+ <additional_attributes>10.0;190.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>250</x>
+ <y>840</y>
+ <w>240</w>
+ <h>210</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<-
+mRegisteredThreads</panel_attributes>
+ <additional_attributes>10.0;190.0;220.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>920</x>
+ <y>860</y>
+ <w>340</w>
+ <h>190</h>
+ </coordinates>
+ <panel_attributes>SamplerThread
+--
+-mSampler: Sampler
+-mActivityGeneration: uint32_t
+-mIntervalMicroseconds: int
+-mThread /* OS-specific */
+-mPostSamplingCallbackList: UniquePtr<PostSamplingCallbackListItem>
+--
++Run()
++Stop(PSLockRef)
++AppendPostSamplingCallback(PSLockRef, PostSamplingCallback&&)</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>1060</x>
+ <y>600</y>
+ <w>340</w>
+ <h>190</h>
+ </coordinates>
+ <panel_attributes>Sampler
+--
+-mOldSigprofHandler: sigaction
+-mMyPid: int
+-mSamplerTid: int
++sSigHandlerCoordinator
+--
++Disable(PSLockRef)
++SuspendAndSampleAndResumeThread(PSLockRef, const RegisteredThread&, aNow: TimeStamp, const Func&)
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>1190</x>
+ <y>780</y>
+ <w>90</w>
+ <h>100</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<<-
+mSampler</panel_attributes>
+ <additional_attributes>10.0;80.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>610</x>
+ <y>1130</y>
+ <w>470</w>
+ <h>400</h>
+ </coordinates>
+ <panel_attributes>ActivePS
+--
+/-sInstance: ActivePS*/
+-mGeneration: const uint32_t
+/-sNextGeneration: uint32_t/
+-mCapacity: const PowerOfTwo
+-mDuration: const Maybe<double>
+-mInterval: const double /* milliseconds */
+-mFeatures: const uint32_t
+-mFilters: Vector<std::string>
+-mActiveBrowsingContextID: uint64_t
+-mProfileBufferChunkManager: ProfileBufferChunkManagerWithLocalLimit
+-mProfileBuffer: ProfileBuffer
+-mLiveProfiledThreads: Vector<LiveProfiledThreadData>
+-mDeadProfiledThreads: Vector<UniquePtr<ProfiledThreadData>>
+-mDeadProfiledPages: Vector<RefPtr<PageInformation>>
+-mSamplerThread: SamplerThread* const
+-mInterposeObserver: RefPtr<ProfilerIOInterposeObserver>
+-mPaused: bool
+-mWasPaused: bool /* linux */
+-mBaseProfileThreads: UniquePtr<char[]>
+-mGeckoIndexWhenBaseProfileAdded: ProfileBufferBlockIndex
+-mExitProfiles: Vector<ExitProfile>
+--
++</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>970</x>
+ <y>1040</y>
+ <w>140</w>
+ <h>110</h>
+ </coordinates>
+ <panel_attributes>lt=<<<<-
+mSamplerThread</panel_attributes>
+ <additional_attributes>10.0;90.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLNote</id>
+ <coordinates>
+ <x>500</x>
+ <y>160</y>
+ <w>510</w>
+ <h>100</h>
+ </coordinates>
+ <panel_attributes>bg=red
+This document pre-dates the generated image profilerclasses-20220913.png!
+Unfortunately, the changes to make the image were lost.
+
+This previous version may still be useful to start reconstructing the image,
+if there is a need to update it.</panel_attributes>
+ <additional_attributes/>
+ </element>
+</diagram>
diff --git a/tools/profiler/docs/profilerthreadregistration-20220913.png b/tools/profiler/docs/profilerthreadregistration-20220913.png Binary files differnew file mode 100644 index 0000000000..8f7049d743 --- /dev/null +++ b/tools/profiler/docs/profilerthreadregistration-20220913.png diff --git a/tools/profiler/docs/profilerthreadregistration.umlet.uxf b/tools/profiler/docs/profilerthreadregistration.umlet.uxf new file mode 100644 index 0000000000..3e07215db4 --- /dev/null +++ b/tools/profiler/docs/profilerthreadregistration.umlet.uxf @@ -0,0 +1,710 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<diagram program="umlet" version="15.0.0">
+ <zoom_level>10</zoom_level>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>200</x>
+ <y>330</y>
+ <w>370</w>
+ <h>250</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistry::OffThreadRef
+--
++UnlockedConstReaderCRef() const
++WithUnlockedConstReader(F&& aF) const
++UnlockedConstReaderAndAtomicRWCRef() const
++WithUnlockedConstReaderAndAtomicRW(F&& aF) const
++UnlockedConstReaderAndAtomicRWRef()
++WithUnlockedConstReaderAndAtomicRW(F&& aF)
++UnlockedRWForLockedProfilerCRef()
++WithUnlockedRWForLockedProfiler(F&& aF)
++UnlockedRWForLockedProfilerRef()
++WithUnlockedRWForLockedProfiler(F&& aF)
++ConstLockedRWFromAnyThread()
++WithConstLockedRWFromAnyThread(F&& aF)
++LockedRWFromAnyThread()
++WithLockedRWFromAnyThread(F&& aF)</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>310</x>
+ <y>80</y>
+ <w>560</w>
+ <h>160</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistry
+--
+-sRegistryMutex: RegistryMutex (aka BaseProfilerSharedMutex)
+/exclusive lock used during un/registration, shared lock for other accesses/
+--
+friend class ThreadRegistration
+-Register(ThreadRegistration::OnThreadRef)
+-Unregister(ThreadRegistration::OnThreadRef)
+--
++WithOffThreadRef(ProfilerThreadId, auto&& aF) static
++WithOffThreadRefOr(ProfilerThreadId, auto&& aF, auto&& aFallbackReturn) static: auto</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>310</x>
+ <y>630</y>
+ <w>530</w>
+ <h>260</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistration
+--
+-mDataMutex: DataMutex (aka BaseProfilerMutex)
+-mIsOnHeap: bool
+-mIsRegistryLockedSharedOnThisThread: bool
+-tlsThreadRegistration: MOZ_THREAD_LOCAL(ThreadRegistration*)
+-GetTLS() static: tlsThreadRegistration*
+-GetFromTLS() static: ThreadRegistration*
+--
++ThreadRegistration(const char* aName, const void* aStackTop)
++~ThreadRegistration()
++RegisterThread(const char* aName, const void* aStackTop) static: ProfilingStack*
++UnregisterThread() static
++IsRegistered() static: bool
++GetOnThreadPtr() static OnThreadPtr
++WithOnThreadRefOr(auto&& aF, auto&& aFallbackReturn) static: auto
++IsDataMutexLockedOnCurrentThread() static: bool</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>880</x>
+ <y>620</y>
+ <w>450</w>
+ <h>290</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistration::OnThreadRef
+--
++UnlockedConstReaderCRef() const
++WithUnlockedConstReader(auto&& aF) const: auto
++UnlockedConstReaderAndAtomicRWCRef() const
++WithUnlockedConstReaderAndAtomicRW(auto&& aF) const: auto
++UnlockedConstReaderAndAtomicRWRef()
++WithUnlockedConstReaderAndAtomicRW(auto&& aF): auto
++UnlockedRWForLockedProfilerCRef() const
++WithUnlockedRWForLockedProfiler(auto&& aF) const: auto
++UnlockedRWForLockedProfilerRef()
++WithUnlockedRWForLockedProfiler(auto&& aF): auto
++UnlockedReaderAndAtomicRWOnThreadCRef() const
++WithUnlockedReaderAndAtomicRWOnThread(auto&& aF) const: auto
++UnlockedReaderAndAtomicRWOnThreadRef()
++WithUnlockedReaderAndAtomicRWOnThread(auto&& aF): auto
++RWOnThreadWithLock LockedRWOnThread()
++WithLockedRWOnThread(auto&& aF): auto</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>1040</x>
+ <y>440</y>
+ <w>230</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistration::OnThreadPtr
+--
++operator*(): OnThreadRef
++operator->(): OnThreadRef</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>450</x>
+ <y>940</y>
+ <w>350</w>
+ <h>240</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistrationData
+--
+-mProfilingStack: ProfilingStack
+-mStackTop: const void* const
+-mThread: nsCOMPtr<nsIThread>
+-mJSContext: JSContext*
+-mJsFrameBuffer: JsFrame*
+-mJSFlags: uint32_t
+-Sleep: Atomic<int>
+-mThreadCpuTimeInNsAtLastSleep: Atomic<uint64_t>
+-mWakeCount: Atomic<uint64_t, Relaxed>
+-mRecordWakeCountMutex: BaseProfilerMutex
+-mAlreadyRecordedWakeCount: uint64_t
+-mAlreadyRecordedCpuTimeInMs: uin64_t
+-mThreadProfilingFeatures: ThreadProfilingFeatures</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>460</x>
+ <y>1220</y>
+ <w>330</w>
+ <h>80</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistrationUnlockedConstReader
+--
++Info() const: const ThreadRegistrationInfo&
++PlatformDataCRef() const: const PlatformData&
++StackTop() const: const void*</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>440</x>
+ <y>1340</y>
+ <w>370</w>
+ <h>190</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistrationUnlockedConstReaderAndAtomicRW
+--
++ProfilingStackCRef() const: const ProfilingStack&
++ProfilingStackRef(): ProfilingStack&
++ProfilingFeatures() const: ThreadProfilingFeatures
++SetSleeping()
++SetAwake()
++GetNewCpuTimeInNs(): uint64_t
++RecordWakeCount() const
++ReinitializeOnResume()
++CanDuplicateLastSampleDueToSleep(): bool
++IsSleeping(): bool</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>460</x>
+ <y>1570</y>
+ <w>330</w>
+ <h>60</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistrationUnlockedRWForLockedProfiler
+--
++GetProfiledThreadData(): const ProfiledThreadData*
++GetProfiliedThreadData(): ProfiledThreadData*</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>430</x>
+ <y>1670</y>
+ <w>390</w>
+ <h>50</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistrationUnlockedReaderAndAtomicRWOnThread
+--
++GetJSContext(): JSContext*</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>380</x>
+ <y>1840</y>
+ <w>490</w>
+ <h>190</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistrationLockedRWFromAnyThread
+--
++SetProfilingFeaturesAndData(
+ ThreadProfilingFeatures, ProfiledThreadData*, const PSAutoLock&)
++ClearProfilingFeaturesAndData(const PSAutoLock&)
++GetJsFrameBuffer() const JsFrame*
++GetEventTarget() const: const nsCOMPtr<nsIEventTarget>
++ResetMainThread()
++GetRunningEventDelay(const TimeStamp&, TimeDuration&, TimeDuration&)
++StartJSSampling(uint32_t)
++StopJSSampling()</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>490</x>
+ <y>2070</y>
+ <w>260</w>
+ <h>80</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistrationLockedRWOnThread
+--
++SetJSContext(JSContext*)
++ClearJSContext()
++PollJSSampling()</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>610</x>
+ <y>1170</y>
+ <w>30</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;50.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>500</x>
+ <y>2190</y>
+ <w>240</w>
+ <h>60</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistration::EmbeddedData
+--</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>610</x>
+ <y>1290</y>
+ <w>30</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;50.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>610</x>
+ <y>1520</y>
+ <w>30</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;50.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>610</x>
+ <y>1620</y>
+ <w>30</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;50.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>650</x>
+ <y>1710</y>
+ <w>30</w>
+ <h>150</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;130.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>610</x>
+ <y>2020</y>
+ <w>30</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;50.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>610</x>
+ <y>2140</y>
+ <w>30</w>
+ <h>70</h>
+ </coordinates>
+ <panel_attributes>lt=<<-</panel_attributes>
+ <additional_attributes>10.0;10.0;10.0;50.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>340</x>
+ <y>880</y>
+ <w>180</w>
+ <h>1370</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+mData</panel_attributes>
+ <additional_attributes>160.0;1350.0;10.0;1350.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>990</x>
+ <y>930</y>
+ <w>210</w>
+ <h>100</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistrationInfo
+--
++Name(): const char*
++RegisterTime(): const TimeStamp&
++ThreadId(): ProfilerThreadId
++IsMainThread(): bool</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>790</x>
+ <y>980</y>
+ <w>220</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+mInfo</panel_attributes>
+ <additional_attributes>200.0;20.0;10.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>990</x>
+ <y>1040</y>
+ <w>210</w>
+ <h>50</h>
+ </coordinates>
+ <panel_attributes>PlatformData
+--
+</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>790</x>
+ <y>1040</y>
+ <w>220</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>>
+mPlatformData</panel_attributes>
+ <additional_attributes>200.0;20.0;10.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>990</x>
+ <y>1100</y>
+ <w>210</w>
+ <h>60</h>
+ </coordinates>
+ <panel_attributes>ProfiledThreadData
+--</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>790</x>
+ <y>1100</y>
+ <w>220</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+mProfiledThreadData: *</panel_attributes>
+ <additional_attributes>200.0;20.0;10.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>710</x>
+ <y>480</y>
+ <w>350</w>
+ <h>170</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+m1=0..1
+mThreadRegistration: *</panel_attributes>
+ <additional_attributes>10.0;150.0;330.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>830</x>
+ <y>580</y>
+ <w>260</w>
+ <h>130</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+m1=1
+mThreadRegistration: *</panel_attributes>
+ <additional_attributes>10.0;110.0;40.0;20.0;220.0;20.0;240.0;40.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>1140</x>
+ <y>500</y>
+ <w>90</w>
+ <h>140</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<creates></panel_attributes>
+ <additional_attributes>10.0;120.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>780</x>
+ <y>900</y>
+ <w>450</w>
+ <h>380</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<accesses></panel_attributes>
+ <additional_attributes>10.0;360.0;430.0;360.0;430.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>800</x>
+ <y>900</y>
+ <w>510</w>
+ <h>560</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<accesses></panel_attributes>
+ <additional_attributes>10.0;540.0;420.0;540.0;420.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>780</x>
+ <y>900</y>
+ <w>540</w>
+ <h>720</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<accesses></panel_attributes>
+ <additional_attributes>10.0;700.0;450.0;700.0;450.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>810</x>
+ <y>900</y>
+ <w>520</w>
+ <h>820</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<accesses></panel_attributes>
+ <additional_attributes>10.0;800.0;430.0;800.0;430.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>900</x>
+ <y>2070</y>
+ <w>410</w>
+ <h>80</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistration::OnThreadRef::ConstRWOnThreadWithLock
+--
+-mDataLock: BaseProfilerAutoLock
+--
++DataCRef() const: ThreadRegistrationLockedRWOnThread&
++operator->() const: ThreadRegistrationLockedRWOnThread&</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>740</x>
+ <y>2100</y>
+ <w>180</w>
+ <h>40</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+mLockedRWOnThread</panel_attributes>
+ <additional_attributes>10.0;20.0;160.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>1250</x>
+ <y>900</y>
+ <w>90</w>
+ <h>1190</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<creates></panel_attributes>
+ <additional_attributes>10.0;1170.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>660</x>
+ <y>440</y>
+ <w>400</w>
+ <h>210</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<creates></panel_attributes>
+ <additional_attributes>380.0;10.0;10.0;190.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>740</x>
+ <y>880</y>
+ <w>160</w>
+ <h>50</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<creates></panel_attributes>
+ <additional_attributes>140.0;30.0;50.0;30.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>460</x>
+ <y>230</y>
+ <w>150</w>
+ <h>120</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+m1=0..N
+sRegistryContainer:
+static Vector<></panel_attributes>
+ <additional_attributes>10.0;100.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>800</x>
+ <y>250</y>
+ <w>470</w>
+ <h>150</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistry::LockedRegistry
+--
+-mRegistryLock: RegistryLockShared (aka BaseProfilerAutoLockShared)
+--
++LockedRegistry()
++~LockedRegistry()
++begin() const: const OffThreadRef*
++end() const: const OffThreadRef*
++begin(): OffThreadRef*
++end(): OffThreadRef*</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>560</x>
+ <y>350</y>
+ <w>260</w>
+ <h>50</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<accesses with
+shared lock></panel_attributes>
+ <additional_attributes>10.0;20.0;240.0;20.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>550</x>
+ <y>390</y>
+ <w>330</w>
+ <h>260</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<updates
+mIsRegistryLockedSharedOnThisThread></panel_attributes>
+ <additional_attributes>10.0;240.0;310.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>330</x>
+ <y>570</y>
+ <w>170</w>
+ <h>80</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+m1=1
+mThreadRegistration: *</panel_attributes>
+ <additional_attributes>120.0;60.0;40.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>280</x>
+ <y>570</y>
+ <w>200</w>
+ <h>710</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<accesses></panel_attributes>
+ <additional_attributes>180.0;690.0;10.0;690.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>270</x>
+ <y>570</y>
+ <w>190</w>
+ <h>890</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<accesses></panel_attributes>
+ <additional_attributes>170.0;870.0;10.0;870.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>UMLClass</id>
+ <coordinates>
+ <x>200</x>
+ <y>1740</y>
+ <w>440</w>
+ <h>80</h>
+ </coordinates>
+ <panel_attributes>ThreadRegistry::OffThreadRef::{,Const}RWFromAnyThreadWithLock
+--
+-mDataLock: BaseProfilerAutoLock
+--
++DataCRef() {,const}: ThreadRegistrationLockedRWOnThread&
++operator->() {,const}: ThreadRegistrationLockedRWOnThread&</panel_attributes>
+ <additional_attributes/>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>250</x>
+ <y>570</y>
+ <w>90</w>
+ <h>1190</h>
+ </coordinates>
+ <panel_attributes>lt=<.
+<creates></panel_attributes>
+ <additional_attributes>10.0;1170.0;10.0;10.0</additional_attributes>
+ </element>
+ <element>
+ <id>Relation</id>
+ <coordinates>
+ <x>180</x>
+ <y>1810</y>
+ <w>220</w>
+ <h>120</h>
+ </coordinates>
+ <panel_attributes>lt=->>>>
+mLockedRWFromAnyThread</panel_attributes>
+ <additional_attributes>200.0;100.0;80.0;100.0;80.0;10.0</additional_attributes>
+ </element>
+</diagram>
|