Diffstat (limited to 'toolkit/components/telemetry/docs/collection')
13 files changed, 2189 insertions, 0 deletions
diff --git a/toolkit/components/telemetry/docs/collection/custom-pings.rst b/toolkit/components/telemetry/docs/collection/custom-pings.rst new file mode 100644 index 0000000000..395a23aace --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/custom-pings.rst @@ -0,0 +1,80 @@ +.. _submitting-customping: + +======================= +Submitting custom pings +======================= + +Custom pings can be submitted from JavaScript using: + +.. code-block:: js + + TelemetryController.submitExternalPing(type, payload, options) + +- ``type`` - a ``string`` that is the type of the ping, limited to ``/^[a-z0-9][a-z0-9-]+[a-z0-9]$/i``. +- ``payload`` - the actual payload data for the ping; it has to be a JSON-style object. +- ``options`` - optional, an object containing additional options: + - ``addClientId`` - whether to add the client id to the ping, defaults to ``false`` + - ``addEnvironment`` - whether to add the environment data to the ping, defaults to ``false`` + - ``overrideEnvironment`` - a JSON style object that overrides the environment data + +``TelemetryController`` will assemble a ping with the passed payload and the specified options. +That ping will be archived locally for use with Shield and inspection in ``about:telemetry``. +If preferences allow the upload of Telemetry pings, the ping will be uploaded at the next opportunity (this is subject to throttling, retry-on-failure, etc.). + +.. important:: + + Every new or changed data collection in Firefox needs a `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`__ from a Data Steward. + +Submission constraints +---------------------- + +When submitting pings on shutdown, they should not be submitted after Telemetry shutdown. 
+Pings should be submitted at the latest within: + +- the `observer notification <https://developer.mozilla.org/docs/Observer_Notifications#Application_shutdown>`_ ``"profile-before-change"`` +- the :ref:`AsyncShutdown phase <AsyncShutdown_phases>` ``sendTelemetry`` + +There are other constraints that can lead to a ping submission getting dropped: + +- invalid ping type strings. +- invalid payload types: E.g. strings instead of objects. +- oversized payloads: We currently only drop pings >1MB, but targeting sizes of <=10KB is recommended. + +Tools +===== + +Helpful tools for designing new pings include: + +- `gzipServer <https://github.com/mozilla/gzipServer>`_ - a Python script that can run locally and receives and saves Telemetry pings. Making Firefox send to it allows inspecting outgoing pings easily. +- ``about:telemetry`` - allows inspecting submitted pings from the local archive, including all custom ones. + +Designing custom pings +====================== + +In general, creating a new custom ping means you don't benefit automatically from the existing tooling. Further work is needed to make data show up in re:dash or other analysis tools. + +In addition to the `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`__, questions to guide a new ping design are: + +- Submission interval & triggers: + - What events trigger ping submission? + - What interval is the ping submitted in? + - Is there a throttling mechanism? + - What is the desired latency? (submitting "at least daily" still leads to certain latency tails) + - Are pings submitted on a clock schedule? Or based on "time since session start", "time since last ping" etc.? (I.e. will we get sharp spikes in submission volume?) +- Size and volume: + - What’s the size of the submitted payload? + - What's the full ping size including metadata in the pipeline? + - What’s the target population? + - What's the overall estimated volume? +- Dataset: + - Is it opt-out? 
+ - Does it need to be opt-out? + - Does it need to be in a separate ping? (why can’t the data live in probes?) +- Privacy: + - Is there a risk of leaking PII? + - How is that risk mitigated? +- Data contents: + - Does the submitted data answer the posed product questions? + - Does the shape of the data allow the questions to be answered efficiently? + - Is the data limited to what's needed to answer the questions? + - Does the data use common formats? (i.e. can we re-use tooling or analysis know-how) diff --git a/toolkit/components/telemetry/docs/collection/events.rst b/toolkit/components/telemetry/docs/collection/events.rst new file mode 100644 index 0000000000..831c40a8bc --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/events.rst @@ -0,0 +1,349 @@ +.. _eventtelemetry: + +====== +Events +====== + +Across the different Firefox initiatives, there is a common need for a mechanism for recording, storing, sending & analysing application usage in an event-oriented format. +*Event Telemetry* specifies a common events data format, which allows for broader, shared usage of data processing tools. +Adding events is supported in artifact builds and build-faster workflows. + +For events recorded into Firefox Telemetry we also provide an API that opaquely handles storage and submission to our servers. + +.. important:: + + Every new or changed data collection in Firefox needs a `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`__ from a Data Steward. + +.. _events.serializationformat: + +Serialization format +==================== + +Events are submitted in an :doc:`../data/event-ping` as an array, e.g.: + +.. code-block:: js + + [ + [2147, "ui", "click", "back_button"], + [2213, "ui", "search", "search_bar", "google"], + [2892, "ui", "completion", "search_bar", "yahoo", + {"querylen": "7", "results": "23"}], + [5434, "dom", "load", "frame", null, + {"prot": "https", "src": "script"}], + // ... + ] + +Each event is of the form: + +.. 
code-block:: js + + [timestamp, category, method, object, value, extra] + +Where the individual fields are: + +- ``timestamp``: ``Number``, positive integer. This is the time in ms when the event was recorded, relative to the main process start time. +- ``category``: ``String``, identifier. The category is a group name for events and helps to avoid name conflicts. +- ``method``: ``String``, identifier. This describes the type of event that occurred, e.g. ``click``, ``keydown`` or ``focus``. +- ``object``: ``String``, identifier. This is the object the event occurred on, e.g. ``reload_button`` or ``urlbar``. +- ``value``: ``String``, optional, may be ``null``. This is a user defined value, providing context for the event. +- ``extra``: ``Object``, optional, may be ``null``. This is an object of the form ``{"key": "value", ...}``, both keys and values need to be strings, keys are identifiers. This is used for events where additional richer context is needed. + +.. _eventlimits: + +Limits +------ + +Each ``String`` marked as an identifier (the event ``name``, ``category``, ``method``, +``object``, and the keys of ``extra``) is restricted to be composed of alphanumeric ASCII +characters ([a-zA-Z0-9]) plus infix underscores ('_' characters that aren't the first or last). +``category`` is also permitted infix periods ('.' characters, so long as they aren't the +first or last character). + +For the Firefox Telemetry implementation, several fields are subject to length limits: + +- ``category``: Max. byte length is ``30``. +- ``method``: Max. byte length is ``20``. +- ``object``: Max. byte length is ``20``. +- ``value``: Max. byte length is ``80``. +- ``extra``: Max. number of keys is ``10``. + + - Each extra key name: Max. string length is ``15``. + - Each extra value: Max. byte length is ``80``. + +Only ``value`` and the values of ``extra`` will be truncated if over the specified length. 
+Any other ``String`` going over its limit will be reported as an error and the operation +aborted. + +.. _eventdefinition: + +The YAML definition file +======================== + +Any event recorded into Firefox Telemetry must be registered before it can be recorded. +For any code that ships as part of Firefox, that happens in `Events.yaml <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/Events.yaml>`_. + +The probes in the definition file are represented in a fixed-depth, three-level structure. The first level contains *category* names (grouping multiple events together), the second level contains *event* names, under which the event's properties are listed. E.g.: + +.. code-block:: yaml + + # The following is a category of events named "browser.ui". + browser.ui: + click: # This is the event named "click". + objects: ["reload_btn"] # List the objects for this event. + description: > + Describes this event in detail, potentially over + multiple lines. + # ... and more event properties. + # ... and more events. + # This is the "search" category. + search: + # And the "completion" event. + completion: + # ... + description: Recorded when a search completion suggestion was clicked. + extra_keys: + distance: The edit distance to the current search query input. + loadtime: How long it took to load this completion entry. + # ... + +Category and event names are subject to the limits :ref:`specified above <eventlimits>`. + +The following event properties are valid: + +- ``methods`` *(optional, list of strings)*: The valid event methods. If not set this defaults to ``[eventName]``. +- ``objects`` *(required, list of strings)*: The valid event objects. +- ``description`` *(required, string)*: Description of the event and its semantics. +- ``release_channel_collection`` *(optional, string)*: This can be set to ``opt-in`` (default) or ``opt-out``. +- ``record_in_processes`` *(required, list of strings)*: A list of processes the event can be recorded in. 
Currently supported values are: + + - ``main`` + - ``content`` + - ``gpu`` + - ``all_children`` (record in all the child processes) + - ``all`` (record in all the processes). + +- ``bug_numbers`` *(required, list of numbers)*: A list of Bugzilla bug numbers that are relevant to this event. +- ``notification_emails`` *(required, list of strings)*: A list of emails of owners for this event. This is used for contact for data reviews and potentially to email alerts. +- expiry: The following property specifies when the event expires: + + - ``expiry_version`` *(required, string)*: The version number in which the event expires, e.g. ``"50"``, or ``"never"``. A version number of type "N" is automatically converted to "N.0a1" in order to expire the event also in the development channels. For events that never expire the value ``never`` can be used. + +- ``extra_keys`` *(optional, object)*: An object that specifies valid keys for the ``extra`` argument and a description - see the example above. +- ``products`` *(required, list of strings)*: A list of products the event can be recorded on. Currently supported values are: + + - ``firefox`` - Collected in Firefox Desktop for submission via Firefox Telemetry. + - ``thunderbird`` - Collected in Thunderbird for submission via Thunderbird Telemetry. + +- ``operating_systems`` *(optional, list of strings)*: This field restricts recording to certain operating systems only. It defaults to ``all``. Currently supported values are: + + - ``mac`` + - ``linux`` + - ``windows`` + - ``android`` + - ``unix`` + - ``all`` (record on all operating systems) + +.. note:: + + Combinations of ``category``, ``method``, and ``object`` defined in the file must be unique. + +The API +======= + +Public JS API +------------- + +``recordEvent()`` +~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + Services.telemetry.recordEvent(category, method, object, value, extra); + +Record a registered event. + +* ``value``: Optional, may be ``null``. 
A string value, limited to 80 bytes. +* ``extra``: Optional. An object with string keys & values. Key strings are limited to what was registered. Value strings are limited to 80 bytes. + +Throws if the combination of ``category``, ``method`` and ``object`` is unknown. +Recording an expired event will not throw, but prints a warning into the browser console. + +.. note:: + + Each ``recordEvent`` of a known non-expired combination of ``category``, ``method``, and + ``object``, will be :ref:`summarized <events.event-summary>`. + +.. warning:: + + Event Telemetry recording is designed to be cheap, not free. If you wish to record events in a performance-sensitive piece of code, store the events locally and record them only after the performance-sensitive piece ("hot path") has completed. + +Example: + +.. code-block:: js + + Services.telemetry.recordEvent("ui", "click", "reload_btn"); + // event: [543345, "ui", "click", "reload_btn"] + Services.telemetry.recordEvent("ui", "search", "search_bar", "google"); + // event: [89438, "ui", "search", "search_bar", "google"] + Services.telemetry.recordEvent("ui", "completion", "search_bar", "yahoo", + {"querylen": "7", "results": "23"}); + // event: [982134, "ui", "completion", "search_bar", "yahoo", + // {"querylen": "7", "results": "23"}] + +``setEventRecordingEnabled()`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + Services.telemetry.setEventRecordingEnabled(category, enabled); + +Event recording is currently disabled by default for events registered in Events.yaml. +Dynamically-registered events (those registered using ``registerEvents()``) are enabled by default, and cannot be disabled. +Privileged add-ons and Firefox code can enable & disable recording events for specific categories using this function. + +Example: + +.. code-block:: js + + Services.telemetry.setEventRecordingEnabled("ui", true); + // ... now events in the "ui" category will be recorded. 
+ Services.telemetry.setEventRecordingEnabled("ui", false); + // ... now "ui" events will not be recorded anymore. + +.. note:: + + Even if your event category isn't enabled, counts of events that attempted to be recorded will + be :ref:`summarized <events.event-summary>`. + +.. _registerevents: + +``registerEvents()`` +~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + Services.telemetry.registerEvents(category, eventData); + +Register new events from add-ons. + +* ``category`` - *(required, string)* The category the events are in. +* ``eventData`` - *(required, object)* An object of the form ``{eventName1: event1Data, ...}``, where each event's data is an object with the entries: + + * ``methods`` - *(required, list of strings)* The valid event methods. + * ``objects`` - *(required, list of strings)* The valid event objects. + * ``extra_keys`` - *(optional, list of strings)* The valid extra keys for the event. + * ``record_on_release`` - *(optional, bool)* + * ``expired`` - *(optional, bool)* Whether this event entry is expired. This allows recording it without error, but it will be discarded. Defaults to false. + +For events recorded from add-ons, registration happens at runtime. Any new events must first be registered through this function before they can be recorded. +The registered categories will automatically be enabled for recording, and cannot be disabled. +If a dynamic event uses the same category as a static event, the category will also be enabled upon registration. + +After registration, the events can be recorded through the ``recordEvent()`` function. They will be submitted in event pings like static events are, under the ``dynamic`` process. + +New events registered here are subject to the same limitations as the ones registered through ``Events.yaml``, although the naming was partly updated to reflect recent policy changes. + +When add-ons are updated, they may re-register all of their events. 
In that case, any changes to events that are already registered are ignored. The only exception is expiry; an event that is re-registered with ``expired: true`` will not be recorded anymore. + +Example: + +.. code-block:: js + + Services.telemetry.registerEvents("myAddon.interaction", { + "click": { + methods: ["click"], + objects: ["red_button", "blue_button"], + } + }); + // Now events can be recorded. + Services.telemetry.recordEvent("myAddon.interaction", "click", "red_button"); + +Internal API +------------ + +.. code-block:: js + + Services.telemetry.snapshotEvents(dataset, clear, eventLimit); + Services.telemetry.clearEvents(); + +These functions are only supposed to be used by Telemetry internally or in tests. + +Also, the ``event-telemetry-storage-limit-reached`` topic is notified when the event ping event +limit is reached (1000 event records). +This is intended only for use internally or in tests. + +.. _events.event-summary: + +Event Summary +============= + +Calling ``recordEvent`` on any non-expired registered event will accumulate to a +:doc:`Scalar <scalars>` for ease of analysing uptake and usage patterns, even if the event category +isn't enabled. + +The scalar is ``telemetry.event_counts`` for statically-registered events (the ones in +``Events.yaml``) and ``telemetry.dynamic_event_counts`` for dynamically-registered events (the ones +registered via ``registerEvents``). These are :ref:`keyed scalars <scalars.keyed-scalars>` where +the keys are of the form ``category#method#object`` and the values are counts of the number of +times ``recordEvent`` was called with that combination of ``category``, ``method``, and ``object``. + +These two scalars have a default maximum key limit of 500 per process. + +Example: + +.. 
code-block:: js + + // telemetry.event_counts summarizes in the same process the events were recorded + + // Let us suppose in the parent process this happens: + Services.telemetry.recordEvent("interaction", "click", "document", "xuldoc"); + Services.telemetry.recordEvent("interaction", "click", "document", "xuldoc-neighbour"); + + // And in each of child processes 1 through 4, this happens: + Services.telemetry.recordEvent("interaction", "click", "document", "htmldoc"); + +In the case that ``interaction.click.document`` is statically-registered, this will result in the +parent-process scalar ``telemetry.event_counts`` having a key ``interaction#click#document`` with +value ``2`` and the content-process scalar ``telemetry.event_counts`` having a key +``interaction#click#document`` with the value ``4``. + +All dynamically-registered events end up in the dynamic-process ``telemetry.dynamic_event_counts`` +(notice the different name) regardless of in which process the events were recorded. From the +example above, if ``interaction.click.document`` was registered with ``registerEvents`` then +the dynamic-process scalar ``telemetry.dynamic_event_counts`` would have a key +``interaction#click#document`` with the value ``6``. + +Testing +======= + +Tests involving Event Telemetry often follow this four-step form: + +1. ``Services.telemetry.clearEvents();`` To minimize the effects of prior code and tests. +2. ``Services.telemetry.setEventRecordingEnabled(myCategory, true);`` To enable the collection of + your events. (May or may not be relevant in your case) +3. ``runTheCode();`` This is part of the test where you call the code that's supposed to collect + Event Telemetry. +4. ``TelemetryTestUtils.assertEvents(expected, filter, options);`` This will check the + events recorded by Event Telemetry against your provided list of expected events. 
+ If you only need to check the number of events recorded, you can use + ``TelemetryTestUtils.assertNumberOfEvents(expectedNum, filter, options);``. + Both utilities have :searchfox:`helpful inline documentation <toolkit/components/telemetry/tests/utils/TelemetryTestUtils.sys.mjs>`. + + +Version History +=============== + +- Firefox 79: ``geckoview`` support removed (see `bug 1620395 <https://bugzilla.mozilla.org/show_bug.cgi?id=1620395>`__). +- Firefox 52: Initial event support (`bug 1302663 <https://bugzilla.mozilla.org/show_bug.cgi?id=1302663>`_). +- Firefox 53: Event recording disabled by default (`bug 1329139 <https://bugzilla.mozilla.org/show_bug.cgi?id=1329139>`_). +- Firefox 54: Added child process events (`bug 1313326 <https://bugzilla.mozilla.org/show_bug.cgi?id=1313326>`_). +- Firefox 56: Added support for recording new probes from add-ons (`bug 1302681 <https://bugzilla.mozilla.org/show_bug.cgi?id=1302681>`_). +- Firefox 58: + + - Ignore re-registering existing events for a category instead of failing (`bug 1408975 <https://bugzilla.mozilla.org/show_bug.cgi?id=1408975>`_). + - Removed support for the ``expiry_date`` property, as it was unused (`bug 1414638 <https://bugzilla.mozilla.org/show_bug.cgi?id=1414638>`_). +- Firefox 61: + + - Enabled support for adding events in artifact builds and build-faster workflows (`bug 1448945 <https://bugzilla.mozilla.org/show_bug.cgi?id=1448945>`_). + - Added summarization of events (`bug 1440673 <https://bugzilla.mozilla.org/show_bug.cgi?id=1440673>`_). 
+- Firefox 66: Replace ``cpp_guard`` with ``operating_systems`` (`bug 1482912 <https://bugzilla.mozilla.org/show_bug.cgi?id=1482912>`_) diff --git a/toolkit/components/telemetry/docs/collection/experiments.rst b/toolkit/components/telemetry/docs/collection/experiments.rst new file mode 100644 index 0000000000..d9c926fcea --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/experiments.rst @@ -0,0 +1,41 @@ +===================== +Experiment Annotation +===================== +This API allows privileged JavaScript to annotate the :doc:`../data/environment` with any experiments a client is participating in. + +The experiment annotations are sent with any ping that includes the :doc:`../data/environment` data. + +The JS API +========== +Privileged JavaScript code can annotate experiments using the functions exposed by ``TelemetryEnvironment.sys.mjs``. + +The following function adds an annotation to the environment for the provided ``id``, ``branch`` and ``options``. Calling this function repeatedly with the same ``id`` will overwrite the state and trigger new subsessions (subject to throttling). +``options`` is an object that may contain ``type`` to tag the experiment with a specific type or ``enrollmentId`` to tag the enrollment in this experiment with an identifier. + +.. code-block:: js + + TelemetryEnvironment.setExperimentActive(id, branch, [options={}]) + +This removes the annotation for the experiment with the provided ``id``. + +.. code-block:: js + + TelemetryEnvironment.setExperimentInactive(id) + +This synchronously returns a dictionary containing the information for each active experiment. + +.. code-block:: js + + TelemetryEnvironment.getActiveExperiments() + +.. note:: + + Both ``setExperimentActive`` and ``setExperimentInactive`` trigger a new subsession. However + the latter only does so if there was an active experiment with the provided ``id``. 
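As a rough illustration of the annotation behaviour described above, the following standalone sketch models the three functions with a plain in-memory map. It is not the code in ``TelemetryEnvironment.sys.mjs``: the ``createExperimentAnnotations`` helper, the boolean returned by its ``setExperimentInactive``, and the exact shape of the returned dictionary are assumptions made for this example, and the sketch ignores subsessions and throttling entirely.

```javascript
// Hypothetical stand-in for illustration only; not the Firefox implementation.
function createExperimentAnnotations() {
  const active = new Map();
  return {
    // Adds the annotation for `id`, overwriting any previous state for it.
    setExperimentActive(id, branch, options = {}) {
      const entry = { branch };
      if (options.type !== undefined) {
        entry.type = options.type;
      }
      if (options.enrollmentId !== undefined) {
        entry.enrollmentId = options.enrollmentId;
      }
      active.set(id, entry);
    },
    // Removes the annotation for `id`. The boolean result (an assumption of
    // this sketch) says whether an active experiment with that id existed.
    setExperimentInactive(id) {
      return active.delete(id);
    },
    // Returns a dictionary keyed by experiment id (assumed shape).
    getActiveExperiments() {
      return Object.fromEntries(active);
    },
  };
}

const annotations = createExperimentAnnotations();
annotations.setExperimentActive("example-experiment", "control");
// Same id again: the earlier annotation is overwritten, not duplicated.
annotations.setExperimentActive("example-experiment", "treatment", { type: "demo" });
const snapshot = annotations.getActiveExperiments();
const removed = annotations.setExperimentInactive("example-experiment");
const removedAgain = annotations.setExperimentInactive("example-experiment");
```

In this model the second ``setExperimentInactive`` call finds nothing to remove, which corresponds to the case in the note where no new subsession would be triggered.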
+ +Limits and restrictions +----------------------- +To prevent abuses, the content of the experiment ``id`` and ``branch`` is limited to +100 characters in length. +``type`` is limited to a length of 20 characters. +``enrollmentId`` is limited to 40 characters (chosen to be just a little longer than the 36-character long GUID text representation). diff --git a/toolkit/components/telemetry/docs/collection/histograms.rst b/toolkit/components/telemetry/docs/collection/histograms.rst new file mode 100644 index 0000000000..1998c6062e --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/histograms.rst @@ -0,0 +1,411 @@ +========== +Histograms +========== + +In Firefox, the Telemetry system collects various measures of Firefox performance, hardware, usage and customizations and submits it to Mozilla. The Telemetry data collected by a single client can be examined from the integrated ``about:telemetry`` browser page, while the aggregated reports across entire user populations are publicly available at `telemetry.mozilla.org <https://telemetry.mozilla.org>`_. + +.. important:: + + Every new or changed data collection in Firefox needs a `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`__ from a Data Steward. + +The following sections explain how to add a new measurement to Telemetry. + +Overview +======== + +Telemetry histograms are an efficient way to collect numeric measurements like multiple counts or timings. +They are collected through a common API and automatically submitted with the :doc:`main ping <../data/main-ping>`. + +.. hint:: + + Before adding a new histogram, you should consider using other collection mechanisms. For example, if the need is to track a single scalar value (e.g. number, boolean or string), you should use :doc:`scalars`. + +The histogram below is taken from Firefox's ``about:telemetry`` page. It shows a histogram used for tracking plugin shutdown times and the data collected over a single Firefox session. 
The timing data is grouped into buckets where the height of the blue bars represents the number of items in each bucket. The tallest bar, for example, indicates that there were 63 plugin shutdowns lasting between 129ms and 204ms. + +.. image:: sampleHistogram.png + +The histograms on the ``about:telemetry`` page only show the non-empty buckets in a histogram, except for the bucket to the left of the first non-empty bucket and the bucket to the right of the last non-empty bucket. + +.. _choosing-histogram-type: + +Choosing a Histogram Type +========================= + +The first step to adding a new histogram is to choose the histogram type that best represents the data being measured. The sample histogram used above is an "exponential" histogram. + +.. note:: + + Only ``flag`` and ``count`` histograms have default values. All other histograms start out empty and are only submitted if a value is recorded. + +``boolean`` +----------- +These histograms only record boolean values. Multiple boolean entries can be recorded in the same histogram during a single browsing session, e.g. if a histogram is measuring user choices in a dialog box with options "Yes" or "No", a new boolean value is added every time the dialog is displayed. + +``linear`` +---------- +Linear histograms are similar to enumerated histograms, except each bucket is associated with a range of values instead of a single enum value. The range of values covered by each bucket increases linearly from the previous bucket, e.g. one bucket might count the number of occurrences of values between 0 and 9, the next bucket would cover values 10-19, the next 20-29, etc. This bucket type is useful if there aren't orders of magnitude differences between the minimum and maximum values stored in the histogram, e.g. if the values you are storing are percentages 0-100%. + +.. note:: + + If you need a linear histogram with buckets < 0, 1, 2 ... N >, then you should declare an enumerated histogram. 
This restriction was added to prevent developers from making a common off-by-one mistake when specifying the number of buckets in a linear histogram. + +``exponential`` +--------------- +Exponential histograms are similar to linear histograms but the range of values covered by each bucket increases exponentially. As an example of its use, consider the timings of an I/O operation whose duration might normally fall in the range of 0ms-50ms but extreme cases might have durations in seconds or minutes. For such measurements, you would want finer-grained bucketing in the normal range but coarser-grained bucketing for the extremely large values. An exponential histogram fits this requirement since it has "narrow" buckets near the minimum value and significantly "wider" buckets near the maximum value. + +``categorical`` +--------------- +Categorical histograms are similar to enumerated histograms. However, instead of specifying ``n_buckets``, you specify an array of strings in the ``labels`` field. From JavaScript, the label values or their indices can be passed as strings to ``histogram.add()``. From C++ you can use ``AccumulateCategorical``, passing a value from the corresponding ``Telemetry::LABEL_*`` enum or, in exceptional cases, the string values. + +.. note:: + + You can add new labels to a categorical histogram later on, + up to the configured maximum. + Categorical histograms by default support up to 50 labels, + but you can set it higher using the ``n_values`` property. + If you need to add labels beyond the maximum later, + you need to use a new histogram name. + See `Changing a Histogram`_ for details. + +``enumerated`` +-------------- +This histogram type is intended for storing "enum" values, when you can't specify labels and thus cannot use ``categorical`` histograms. 
An enumerated histogram consists of a fixed number of *buckets* (specified by ``n_values``), each of which is associated with a consecutive integer value (the bucket's *label*), `0` to `n_values`. Each bucket corresponds to an enum value and counts the number of times its particular enum value was recorded; except for the `n_values` bucket, which counts all values greater than or equal to n_values. + +You might use this type of histogram if, for example, you wanted to track the relative popularity of SSL handshake types. Whenever the browser started an SSL handshake, it would record one of a limited number of enum values which uniquely identifies the handshake type. + +.. note:: + + Set ``n_values`` to a slightly larger value than needed to allow for new enum values in the future. See `Changing a histogram`_ if you need to add more enums later. + +``flag`` +-------- +*Deprecated* (please use boolean :doc:`scalars`). + +This histogram type allows you to record a single value (`0` or `1`, default `0`). This type is useful if you need to track whether a feature was ever used during a Firefox session. You only need to add a single line of code which sets the flag when the feature is used because the histogram is initialized with a default value of `0`/`false` (flag not set). Thus, recording a value of `0` is not allowed and asserts. + +Flag histograms will ignore any changes after the flag is set, so once the flag is set, it cannot be unset. + +``count`` +--------- +*Deprecated* (please use uint :doc:`scalars`). + +This histogram type is used when you want to record a count of something. It only stores a single value and defaults to `0`. + +.. _histogram-type-keyed: + +Keyed Histograms +---------------- + +Keyed histograms are collections of one of the histogram types above, indexed by a string key. This is for example useful when you want to break down certain counts by a name, like how often searches happen with which search engine. 
+Note that when you need to record for a small set of known keys, using separate plain histograms is more efficient. + +.. warning:: + + Keyed histograms are currently not supported in the `histogram change detector <https://alerts.telemetry.mozilla.org/index.html>`_. + +Declaring a Histogram +===================== + +Histograms should be declared in the `Histograms.json <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/Histograms.json>`_ file. These declarations are checked for correctness at `compile time <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/gen_histogram_data.py>`_ and used to generate C++ code. + +The following is a sample histogram declaration from ``Histograms.json`` for a histogram named ``MEMORY_RESIDENT`` which tracks the amount of resident memory used by a process: + + +.. code-block:: json + + { + "MEMORY_RESIDENT": { + "record_in_processes": ["main", "content"], + "alert_emails": ["team@mozilla.xyz"], + "expires_in_version": "never", + "kind": "exponential", + "low": 32768, + "high": 1048576, + "n_buckets": 50, + "bug_numbers": [12345], + "description": "Resident memory size (KB)" + } + } + +Histograms which track timings in milliseconds or microseconds should suffix their names with ``"_MS"`` and ``"_US"`` respectively. Flag-type histograms should have the suffix ``"_FLAG"`` in their name. + +The possible fields in a histogram declaration are listed below. + +``record_in_processes`` +----------------------- +Required. This field is a list of processes this histogram can be recorded in. Currently-supported values are: + +- ``main`` +- ``content`` +- ``gpu`` +- ``all_childs`` (record in all child processes) +- ``all`` (record in all processes) + +``alert_emails`` +---------------- +Required. This field is a list of e-mail addresses that should be notified when the distribution of the histogram changes significantly from one build-id to the other. This can be useful to detect regressions. 
Note that all alerts will be sent automatically to mozilla.dev.telemetry-alerts. + +``expires_in_version`` +---------------------- +Required. The version number in which the histogram expires; e.g. a value of ``"30"`` means that the histogram stops recording from Firefox 30 on. A version number of type ``"N"`` is automatically converted to ``"N.0a1"`` in order to expire the histogram also in the development channels. For histograms that never expire, the value ``"never"`` can be used as in the example above. Accumulating data into an expired histogram is effectively a no-op and will not record anything. + +``kind`` +-------- +Required. One of the histogram types described in the previous section. Different histogram types require different fields to be present in the declaration. + +``keyed`` +--------- +Optional, boolean, defaults to ``false``. Determines whether this is a *keyed histogram*. + +``keys`` +-------- +Optional, list of strings. Only valid for *keyed histograms*. Defines a case-sensitive list of allowed keys that can be used for this histogram. The list is limited to 30 keys with a maximum length of 20 characters. When using a key that is not in the list, the accumulation is discarded and a warning is printed to the browser console. + +``low`` +------- +Optional, the default value is ``1``. This field represents the minimum value expected in the histogram. Note that all histograms automatically get a bucket with label ``0`` for counting values below the ``low`` value. If a histogram does not specify a ``low`` value, it will always have a ``"0"`` bucket (for negative or zero values) and a ``"1"`` bucket (for values between ``1`` and the next bucket). + + +``high`` +-------- +Required for linear and exponential histograms. The maximum value to be stored in a linear or exponential histogram. Any recorded values greater than this maximum will be counted in the last bucket. + +``n_buckets`` +------------- +Required for linear and exponential histograms.
The number of buckets in a linear or exponential histogram. + +.. note:: + + The maximum value for ``n_buckets`` is 100. The more buckets, the larger the storage and transfer costs borne by our users and our pipeline. + +``n_values`` +------------ +Required for enumerated histograms. Similar to ``n_buckets``, it represents the number of elements in the enum. + +.. note:: + + The maximum value for ``n_values`` is 100. The more values, the larger the storage and transfer costs borne by our users and our pipeline. + +``labels`` +---------- +Required for categorical histograms. This is an array of strings which are the labels for different values in this histogram. The labels are restricted to a C++-friendly subset of characters (``^[a-z][a-z0-9_]+[a-z0-9]$``). This field is limited to 100 strings, each with a maximum length of 20 characters. + +``bug_numbers`` +--------------- +Required for all new histograms. This is an array of integers and should at least contain the bug number that added the probe and additionally other bug numbers that affected its behavior. + +``description`` +--------------- +Required. A description of the data tracked by the histogram, e.g. *"Resident memory size"*. + +``cpp_guard`` (obsolete, use ``operating_systems``) +--------------------------------------------------- +Optional. This field inserts an ``#ifdef`` directive around the histogram's C++ declaration. This is typically used for platform-specific histograms, e.g. ``"cpp_guard": "ANDROID"``. + +``operating_systems`` +--------------------- +Optional. This field restricts recording to certain operating systems only. Use it in place of the previous ``cpp_guard`` to avoid inclusion on operating systems that are not specified. +Currently supported values are: + +- ``mac`` +- ``linux`` +- ``windows`` +- ``android`` +- ``unix`` +- ``all`` (record on all operating systems) + +If this field is left out it defaults to ``all``. + +``releaseChannelCollection`` +---------------------------- +Optional.
This is one of: + +* ``"opt-in"``: (default value) This histogram is submitted by default on pre-release channels, unless the user opts out. +* ``"opt-out"``: This histogram is submitted by default on release and pre-release channels, unless the user opts out. + +.. warning:: + + Because they are collected by default, opt-out probes need to meet a higher "user benefit" threshold than opt-in probes during data collection review. + + + Every new or changed data collection in Firefox needs a `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`__ from a Data Steward. + +.. _histogram-products: + +``products`` +------------- +Required. This field is a list of products this histogram can be recorded on. Currently-supported values are: + +- ``firefox`` - Collected in Firefox Desktop for submission via Firefox Telemetry. +- ``geckoview_streaming`` - See :doc:`this guide <../start/report-gecko-telemetry-in-glean>` for how to stream data through geckoview to the Glean SDK. +- ``thunderbird`` - Collected in Thunderbird for submission via Thunderbird Telemetry. + +``record_into_store`` +--------------------- + +Optional. This field is a list of stores this histogram should be recorded into. +If this field is left out it defaults to ``[main]``. + +Changing a histogram +==================== + +Changing a histogram declaration after the histogram has been released is tricky. +Many tools +(like `the aggregator <https://github.com/mozilla/python_mozaggregator>`_) +assume histograms don't change. +The current recommended procedure is to change the name of the histogram. + +* When changing existing histograms, the recommended pattern is to use a versioned name (``PROBE``, ``PROBE_2``, ``PROBE_3``, ...). +* For enum histograms, it's recommended to set ``n_values`` to a slightly larger value than needed since new elements may be added to the enum in the future. + +The one exception is `Categorical`_ histograms.
+They can be changed by adding labels until it reaches the configured maximum +(default of 50, or the value of ``n_values``). +If you need to change the configured maximum, +then you must change the histogram name as mentioned above. + +Histogram values +================ + +The values you can accumulate to Histograms are limited by their internal representation. + +Telemetry Histograms do not record negative values, instead clamping them to 0 before recording. + +Telemetry Histograms do not record values greater than 2^31, instead clamping them to INT_MAX before recording. + +Adding a JavaScript Probe +========================= + +A Telemetry probe is the code that measures and stores values in a histogram. Probes in privileged JavaScript code can make use of the `nsITelemetry <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/nsITelemetry.idl>`_ interface to get references to histogram objects. A new value is recorded in the histogram by calling ``add`` on the histogram object: + +.. code-block:: js + + let histogram = Services.telemetry.getHistogramById("PLACES_AUTOCOMPLETE_1ST_RESULT_TIME_MS"); + histogram.add(measuredDuration); + + let keyed = Services.telemetry.getKeyedHistogramById("TAG_SEEN_COUNTS"); + keyed.add("blink"); + +Note that ``nsITelemetry.getHistogramById()`` will throw an ``NS_ERROR_FAILURE`` JavaScript exception if it is called with an invalid histogram ID. The ``add()`` function will not throw if it fails, instead it prints an error in the browser console. + +.. warning:: + + Adding a new Telemetry probe is not possible with Artifact builds. A full build is needed. + +For histograms measuring time, TelemetryStopwatch can be used to avoid working with Dates manually: + +.. 
code-block:: js + + TelemetryStopwatch.start("SEARCH_SERVICE_INIT_MS"); + TelemetryStopwatch.finish("SEARCH_SERVICE_INIT_MS"); + + TelemetryStopwatch.start("FX_TAB_SWITCH_TOTAL_MS"); + TelemetryStopwatch.cancel("FX_TAB_SWITCH_TOTAL_MS"); + +Adding a C++ Probe +================== + +Probes in native code can also use the `nsITelemetry <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/nsITelemetry.idl>`_ interface, but the helper functions declared in `Telemetry.h <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/Telemetry.h>`_ are more convenient: + +.. code-block:: cpp + + #include "mozilla/Telemetry.h" + + /** + * Adds sample to a histogram defined in Histograms.json + * + * @param id - histogram id + * @param sample - value to record. + */ + void Accumulate(HistogramID id, uint32_t sample); + + /** + * Adds samples to a histogram defined in Histograms.json + * + * @param id - histogram id + * @param samples - values to record. + */ + void Accumulate(HistogramID id, const nsTArray<uint32_t>& samples); + + /** + * Adds sample to a keyed histogram defined in Histograms.json + * + * @param id - keyed histogram id + * @param key - the string key + * @param sample - (optional) value to record, defaults to 1. + */ + void Accumulate(HistogramID id, const nsCString& key, uint32_t sample = 1); + + /** + * Adds time delta in milliseconds to a histogram defined in Histograms.json + * + * @param id - histogram id + * @param start - start time + * @param end - (optional) end time, defaults to TimeStamp::Now(). + */ + void AccumulateTimeDelta(HistogramID id, TimeStamp start, TimeStamp end = TimeStamp::Now()); + + /** + * Adds time delta in milliseconds to a keyed histogram defined in Histograms.json + * + * @param id - histogram id + * @param key - the string key + * @param start - start time + * @param end - (optional) end time, defaults to TimeStamp::Now().
+ */ + void AccumulateTimeDelta(HistogramID id, const nsCString& key, TimeStamp start, TimeStamp end = TimeStamp::Now()); + +The histogram names declared in ``Histograms.json`` are translated into constants in the ``mozilla::Telemetry`` namespace: + +.. code-block:: cpp + + mozilla::Telemetry::Accumulate(mozilla::Telemetry::STARTUP_CRASH_DETECTED, true); + +.. warning:: + + Telemetry accumulations are designed to be cheap, not free. If you wish to accumulate values in a performance-sensitive piece of code, store the accumulations locally and accumulate after the performance-sensitive piece ("hot path") has completed. + +The ``Telemetry.h`` header also declares the helper classes ``AutoTimer`` and ``AutoCounter``. Objects of these types automatically record a histogram value when they go out of scope: + +.. code-block:: cpp + + nsresult + nsPluginHost::StopPluginInstance(nsNPAPIPluginInstance* aInstance) + { + Telemetry::AutoTimer<Telemetry::PLUGIN_SHUTDOWN_MS> timer; + ... + return NS_OK; + } + +If the ``HistogramID`` is not known at compile time, one can use the ``RuntimeAutoTimer`` and ``RuntimeAutoCounter`` classes, which behave like the template-parameterized ``AutoTimer`` and ``AutoCounter`` ones. + +.. code-block:: cpp + + void + FunctionWithTiming(Telemetry::HistogramID aTelemetryID) + { + ... + Telemetry::RuntimeAutoTimer timer(aTelemetryID); + ... + } + + int32_t + FunctionWithCounter(Telemetry::HistogramID aTelemetryID) + { + ... + Telemetry::RuntimeAutoCounter myCounter(aTelemetryID); + ++myCounter; + myCounter += 42; + ... 
+ } + +Prefer using the template parameterized ``AutoTimer`` and ``AutoCounter`` on hot paths, if possible. diff --git a/toolkit/components/telemetry/docs/collection/index.rst b/toolkit/components/telemetry/docs/collection/index.rst new file mode 100644 index 0000000000..9b8c938b7a --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/index.rst @@ -0,0 +1,50 @@ +=============== +Data collection +=============== + +There are different APIs and formats to collect data in Firefox, all suiting different use cases. + +In general, we aim to submit data in a common format where possible. This has several advantages; from common code and tooling to sharing analysis know-how. + +In cases where this isn't possible and more flexibility is needed, we can submit custom pings or consider adding different data formats to existing pings. + +*Note:* Every new data collection must go through a `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`_. + +The current data collection possibilities include: + +* :doc:`scalars` allow recording of a single value (string, boolean, a number) +* :doc:`histograms` can efficiently record multiple data points +* ``environment`` data records information about the system and settings a session occurs in +* :doc:`events` can record richer data on individual occurrences of specific actions +* :doc:`Measuring elapsed time <measuring-time>` +* :doc:`Custom pings <custom-pings>` +* :doc:`Use counters <use-counters>` measure the usage of web platform features +* :doc:`Experiment annotations <experiments>` +* :doc:`Remote content uptake <uptake>` +* :doc:`WebExtension API <webextension-api>` can be used in privileged webextensions +* :doc:`Origin Telemetry <origin>` Experimental prototype. For use by Content Blocking only for now. +* :doc:`User Interactions <user-interactions>` allow annotating hang report pings with information on what the user was interacting with at the time + +.. 
toctree:: + :maxdepth: 2 + :titlesonly: + :hidden: + :glob: + + scalars + histograms + events + measuring-time + custom-pings + experiments + uptake + * + +Browser Usage Telemetry +~~~~~~~~~~~~~~~~~~~~~~~ +For more information, see :ref:`browserusagetelemetry`. + +Version History +~~~~~~~~~~~~~~~ + +- Firefox 61: Stopped reporting Telemetry Log items (`bug 1443614 <https://bugzilla.mozilla.org/show_bug.cgi?id=1443614>`_). diff --git a/toolkit/components/telemetry/docs/collection/measuring-time.rst b/toolkit/components/telemetry/docs/collection/measuring-time.rst new file mode 100644 index 0000000000..c2d972b378 --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/measuring-time.rst @@ -0,0 +1,116 @@ +====================== +Measuring elapsed time +====================== + +To make it easier to measure how long operations take, we have helpers for both JavaScript and C++. +These helpers record the elapsed time into histograms, so you have to create suitable :doc:`histograms` for them first. + +From JavaScript +=============== +JavaScript can measure elapsed time using TelemetryStopwatch. + +``TelemetryStopwatch`` is a helper that simplifies recording elapsed time (in milliseconds) into histograms (plain or keyed). + +API: + +.. code-block:: js + + TelemetryStopwatch = { + // Start, check if running, cancel & finish recording elapsed time into a + // histogram. + // |aObject| is optional. If specified, the timer is associated with this + // object, so multiple time measurements can be done concurrently. + start(histogramId, aObject); + running(histogramId, aObject); + cancel(histogramId, aObject); + finish(histogramId, aObject); + // Start, check if running, cancel & finish recording elapsed time into a + // keyed histogram. + // |key| specifies the key to record into. + // |aObject| is optional and used as above. 
+ startKeyed(histogramId, key, aObject); + runningKeyed(histogramId, key, aObject); + cancelKeyed(histogramId, key, aObject); + finishKeyed(histogramId, key, aObject); + }; + +Example: + +.. code-block:: js + + TelemetryStopwatch.start("SAMPLE_FILE_LOAD_TIME_MS"); + // ... start loading file. + if (failedToOpenFile) { + // Cancel this if the operation failed early etc. + TelemetryStopwatch.cancel("SAMPLE_FILE_LOAD_TIME_MS"); + return; + } + // ... do more work. + TelemetryStopwatch.finish("SAMPLE_FILE_LOAD_TIME_MS"); + + // Another loading attempt? Start stopwatch again if + // not already running. + if (!TelemetryStopwatch.running("SAMPLE_FILE_LOAD_TIME_MS")) { + TelemetryStopwatch.start("SAMPLE_FILE_LOAD_TIME_MS"); + } + + // Periodically, it's necessary to attempt to finish a + // TelemetryStopwatch that's already been canceled or + // finished. Normally, that throws a warning to the + // console. If the TelemetryStopwatch being possibly + // canceled or finished is expected behaviour, the + // warning can be suppressed by passing the optional + // aCanceledOkay argument. + + // ... suppress warning on a previously finished + // TelemetryStopwatch + TelemetryStopwatch.finish("SAMPLE_FILE_LOAD_TIME_MS", null, + true /* aCanceledOkay */); + +From C++ +======== + +API: + +.. code-block:: cpp + + // This helper class is the preferred way to record elapsed time. + template<HistogramID id> + class AutoTimer { + // Record into a plain histogram. + explicit AutoTimer(TimeStamp aStart = TimeStamp::Now()); + // Record into a keyed histogram, with key |aKey|. + explicit AutoTimer(const nsCString& aKey, + TimeStamp aStart = TimeStamp::Now()); + }; + + // If the Histogram id is not known at compile time: + class RuntimeAutoTimer { + // Record into a plain histogram. + explicit RuntimeAutoTimer(Telemetry::HistogramID aId, + TimeStamp aStart = TimeStamp::Now()); + // Record into a keyed histogram, with key |aKey|. 
+ explicit RuntimeAutoTimer(Telemetry::HistogramID aId, + const nsCString& aKey, + TimeStamp aStart = TimeStamp::Now()); + }; + + void AccumulateTimeDelta(HistogramID id, TimeStamp start, TimeStamp end = TimeStamp::Now()); + void AccumulateTimeDelta(HistogramID id, const nsCString& key, TimeStamp start, TimeStamp end = TimeStamp::Now()); + +Example: + +.. code-block:: cpp + + { + Telemetry::AutoTimer<Telemetry::FIND_PLUGINS> telemetry; + // ... scan disk for plugins. + } + // When leaving the scope, AutoTimer's destructor will record the time that passed. + + // If the histogram id is not known at compile time. + { + Telemetry::RuntimeAutoTimer telemetry(Telemetry::FIND_PLUGINS); + // ... scan disk for plugins. + } + // When leaving the scope, RuntimeAutoTimer's destructor will record the time that passed. diff --git a/toolkit/components/telemetry/docs/collection/origin.rst b/toolkit/components/telemetry/docs/collection/origin.rst new file mode 100644 index 0000000000..0d0b211c71 --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/origin.rst @@ -0,0 +1,166 @@ +.. _origintelemetry: + +================ +Origin Telemetry +================ + +*Origin Telemetry* is an experimental Firefox Telemetry mechanism that allows us to privately report origin-specific information in aggregate. +In short, it allows us to get exact counts of how *many* Firefox clients do certain things on specific origins without us being able to know *which* clients were doing which things on which origins. + +As an example, Content Blocking would like to know which trackers Firefox blocked most frequently. +Origin Telemetry allows us to count how many times a given tracker is blocked without being able to find out which clients were visiting pages that had those trackers on them. + +.. important:: + + This mechanism is experimental and is a prototype.
+ Please do not try to use this without explicit permission from the Firefox Telemetry Team, as it's really only been designed to work for Content Blocking right now. + +Adding or removing Origins or Metrics is not supported in artifact builds and build faster workflows. A non-artifact Firefox build is necessary to change these lists. + +This mechanism is enabled on Firefox Nightly only at present. + +.. important:: + + Every new or changed data collection in Firefox needs a `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`__ from a Data Steward. + +Privacy +======= + +To achieve the necessary goal of getting accurate counts without being able to learn which clients contributed to the counts we use a mechanism called `Prio (pdf) <https://www.usenix.org/system/files/conference/nsdi17/nsdi17-corrigan-gibbs.pdf>`_. + +Prio uses cryptographic techniques to encrypt information and a proof that the information is correct, only sending the encrypted information on to be aggregated. +Only after aggregation do we learn the information we want (aggregated counts), and at no point do we learn the information we don't want (which clients contributed to the counts). + +.. _origin.usage: + +Using Origin Telemetry +====================== + +To record that something happened on a given origin, three things must happen: + +1. The origin must be one of the fixed, known list of origins. ("Where" something happened) +2. The metric must be one of the fixed, known list of metrics. ("What" happened) +3. A call must be made to the Origin Telemetry API. (To let Origin Telemetry know "that" happened "there") + +At present the lists of origins and metrics are hardcoded in C++. +Please consult the Firefox Telemetry Team before changing these lists. + +Origins can be arbitrary byte sequences of any length. +Do not add duplicate origins to the list. + +If an attempt is made to record to an unknown origin, a meta-origin ``__UNKNOWN__`` captures that it happened. 
+Unlike other origins, where multiple recordings are considered additive, ``__UNKNOWN__`` only accumulates a single value. +This is to avoid inflating the ping size in case the caller submits a lot of unknown origins for a given unit (e.g. pageload). + +Metrics should be of the form ``categoryname.metric_name``. +Neither ``categoryname`` nor ``metric_name`` should exceed 40 bytes (UTF-8 encoded) in length, and both should only contain alphanumeric characters and infix underscores. + +.. _origin.API: + +API +=== + +Origin Telemetry supplies APIs for recording information into and snapshotting information out of storage. + +Recording +--------- + +``Telemetry::RecordOrigin(aOriginMetricID, aOrigin);`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This C++ API records that a metric was true for a given origin. +For instance, maybe the user visited a page in which content from ``example.net`` was blocked. +That call might look like ``Telemetry::RecordOrigin(OriginMetricID::ContentBlocking_Blocked, "example.net"_ns)``. + +Snapshotting +------------ + +``let snapshot = await Telemetry.getEncodedOriginSnapshot(aClear);`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This JS API provides a snapshot of the prio-encoded payload and is intended to only be used to assemble the :doc:`"prio" ping's <../data/prio-ping>` payload. +It returns a Promise which resolves to an object of the form: + +.. code-block:: js + + { + a: <base64-encoded, prio-encoded data>, + b: <base64-encoded, prio-encoded data>, + } + +``let snapshot = Telemetry.getOriginSnapshot(aClear);`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This JS API provides a snapshot of the unencrypted storage of unsent Origin Telemetry, optionally clearing that storage. +It returns a structure of the form: + +.. code-block:: js + + { + "categoryname.metric_name": { + "origin1": count1, + "origin2": count2, + ... + }, + ... + } + +.. 
important:: + + This API is only intended to be used by ``about:telemetry`` and tests. + +.. _origin.example: + +Example +======= + +Firefox Content Blocking blocks web content from certain origins present on a list. +Users can exempt certain origins from being blocked. +To improve Content Blocking's effectiveness we need to know two "whats" (whether an origin was blocked, and whether it was exempted) about that list of "wheres". + +This means we need two metrics ``contentblocking.blocked`` and ``contentblocking.exempt`` (the "whats"), and a list of origins (the "wheres"). + +Say "example.net" was blocked and "example.com" was exempted from blocking. +Content Blocking calls ``Telemetry::RecordOrigin(OriginMetricID::ContentBlocking_Blocked, "example.net"_ns)`` and ``Telemetry::RecordOrigin(OriginMetricID::ContentBlocking_Exempt, "example.com"_ns)``. + +At this time a call to ``Telemetry.getOriginSnapshot()`` would return: + +.. code-block:: js + + { + "contentblocking.blocked": {"example.net": 1}, + "contentblocking.exempt": {"example.com": 1}, + } + +Later, Origin Telemetry will get the encoded snapshot (clearing the storage) and assemble it with other information into a :doc:`"prio" ping <../data/prio-ping>` which will then be submitted. + +.. _origin.encoding: + +Encoding +======== + +.. note:: + + This section is provided to help you understand the client implementation's architecture. + If how we arranged our code doesn't matter to you, feel free to skip it. + +There are three levels of encoding in Origin Telemetry: App Encoding, Prio Encoding, and Base64 Encoding. + +*App Encoding* is the process by which we turn the Metrics and Origins into data structures that Prio can encrypt for us. +Prio, at time of writing, only supports counting up to 2046 "true/false" values at a time. +Thus, from the example, we need to turn "example.net was blocked" into "the boolean at index 11 of chunk 2 is true".
+This encoding can be done any way we like so long as we don't change it without informing the aggregation servers (by sending them a new :ref:`encoding name <prio-ping.encoding>`). +This encoding provides no privacy benefit and is just a matter of transforming the data into a format Prio can process. + +*Prio Encoding* is the process by which those ordered true/false values that result from App Encoding are turned into an encrypted series of bytes. +You can `read the paper (pdf) <https://www.usenix.org/system/files/conference/nsdi17/nsdi17-corrigan-gibbs.pdf>`_ to learn more about that. +This encoding, together with the overall system architecture, is what provides the privacy quality to Origin Telemetry. + +*Base64 Encoding* is how we turn those encrypted bytes into a string of characters we can send over the network. +You can learn more about Base64 encoding `on Wikipedia <https://wikipedia.org/wiki/Base64>`_. +This encoding provides no privacy benefit and is just used to make Data Engineers' lives a little easier. + +Version History +=============== + +- Firefox 68: Initial Origin Telemetry support (Nightly Only) (`bug 1536565 <https://bugzilla.mozilla.org/show_bug.cgi?id=1536565>`_). diff --git a/toolkit/components/telemetry/docs/collection/sampleHistogram.png b/toolkit/components/telemetry/docs/collection/sampleHistogram.png Binary files differ new file mode 100644 index 0000000000..8bb185930a --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/sampleHistogram.png diff --git a/toolkit/components/telemetry/docs/collection/scalars.rst b/toolkit/components/telemetry/docs/collection/scalars.rst new file mode 100644 index 0000000000..e1efd734c6 --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/scalars.rst @@ -0,0 +1,327 @@ +======= +Scalars +======= + +A *scalar* metric can be used to track a single value.
Unlike +histograms, which collect every measurement taken, a scalar only +tracks a single value, with later values completely replacing earlier +ones. + +Historically we started to overload our histogram mechanism to also collect scalar data, +such as flag values, counts, labels and others. +The scalar measurement types are the suggested way to collect that kind of scalar data. +The serialized scalar data is submitted with the :doc:`main pings <../data/main-ping>`. Adding scalars is supported in artifact builds and build faster workflows. + +.. important:: + + Every new or changed data collection in Firefox needs a `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`__ from a Data Steward. + +The API +======= +Scalar probes can be managed either through the `nsITelemetry interface <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/core/nsITelemetry.idl>`_ +or the `C++ API <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/core/Telemetry.h>`_. + +JS API +------ +Probes in privileged JavaScript code can use the following functions to manipulate scalars: + +.. code-block:: js + + Services.telemetry.scalarAdd(aName, aValue); + Services.telemetry.scalarSet(aName, aValue); + Services.telemetry.scalarSetMaximum(aName, aValue); + + Services.telemetry.keyedScalarAdd(aName, aKey, aValue); + Services.telemetry.keyedScalarSet(aName, aKey, aValue); + Services.telemetry.keyedScalarSetMaximum(aName, aKey, aValue); + +These functions can throw if, for example, an operation is performed on a scalar type that doesn't support it +(e.g. calling ``scalarSetMaximum`` on a scalar of the string kind). Please look at the `code documentation <https://searchfox.org/mozilla-central/search?q=TelemetryScalar%3A%3A%28Set%7CAdd%29&path=TelemetryScalar.cpp&case=false&regexp=true>`_ for +additional information. + +.. _registerscalars: + +``registerScalars()`` +~~~~~~~~~~~~~~~~~~~~~ + +.. 
code-block:: js + + Services.telemetry.registerScalars(category, scalarData); + +Register new scalars from add-ons. + +* ``category`` - *(required, string)* The unique category the scalars are registered in (see :ref:`limitations <scalar-limitations>`). +* ``scalarData`` - *(required, object)* An object of the form ``{scalarName1: scalar1Data, ...}`` that contains registration data for multiple scalars; ``scalarName1`` is subject to :ref:`limitations <scalar-limitations>`; each scalar is an object with the following properties: + + * ``kind`` - *(required, uint)* One of the scalar types (nsITelemetry::SCALAR_TYPE_*). + * ``keyed`` - *(optional, bool)* Whether this is a keyed scalar or not. Defaults to false. + * ``record_on_release`` - *(optional, bool)* Whether to record this data on release. Defaults to false. + * ``expired`` - *(optional, bool)* Whether this scalar entry is expired. This allows recording it without error, but it will be discarded. Defaults to false. + +For scalars recorded from add-ons, registration happens at runtime. Any new scalar must first be registered through this function before it can be recorded. + +After registration, the scalars can be recorded through the usual scalar JS API. If the accumulation happens in a content process right after the registration and the definition still has to reach this process, it will be discarded: one way to work around the problem is to send an IPC message to the content process and start accumulating data once this message has been received. The accumulated data will be submitted in the main ping's payload under ``processes.dynamic.scalars``. + +.. note:: + + Accumulating in dynamic scalars only works in content child processes and in the parent process. All the accumulations (parent and content children) are aggregated together. + +New scalars registered here are subject to the same :ref:`limitations <scalar-limitations>` as the ones registered through ``Scalars.yaml``, e.g.
the length of the category name or the allowed characters. + +When add-ons are updated, they may re-register all of their scalars. In that case, any changes to scalars that are already registered are ignored. The only exception is expiry; a scalar that is re-registered with ``expired: true`` will not be recorded anymore. + +Example: + +.. code-block:: js + + Services.telemetry.registerScalars("myAddon.category", { + "counter_scalar": { + kind: Ci.nsITelemetry.SCALAR_TYPE_COUNT, + keyed: false, + record_on_release: false + }, + }); + // Now scalars can be recorded. + Services.telemetry.scalarSet("myAddon.category.counter_scalar", 37); + + +.. _scalars-c++-API: + +C++ API +------- +Probes in native code can use the more convenient helper functions declared in `Telemetry.h <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/core/Telemetry.h>`_: + +.. code-block:: cpp + + void ScalarAdd(mozilla::Telemetry::ScalarID aId, uint32_t aValue); + void ScalarSet(mozilla::Telemetry::ScalarID aId, uint32_t aValue); + void ScalarSet(mozilla::Telemetry::ScalarID aId, const nsAString& aValue); + void ScalarSet(mozilla::Telemetry::ScalarID aId, bool aValue); + void ScalarSetMaximum(mozilla::Telemetry::ScalarID aId, uint32_t aValue); + + void ScalarAdd(mozilla::Telemetry::ScalarID aId, const nsAString& aKey, uint32_t aValue); + void ScalarSet(mozilla::Telemetry::ScalarID aId, const nsAString& aKey, uint32_t aValue); + void ScalarSet(mozilla::Telemetry::ScalarID aId, const nsAString& aKey, bool aValue); + void ScalarSetMaximum(mozilla::Telemetry::ScalarID aId, const nsAString& aKey, uint32_t aValue); + +.. warning:: + + Scalar operations are designed to be cheap, not free. If you wish to manipulate Scalars in a performance-sensitive piece of code, store the operations locally and change the Scalar only after the performance-sensitive piece ("hot path") has completed. 
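The advice in the warning above can be illustrated with a minimal, self-contained sketch. It is not Firefox code: the ``sketch::ScalarAdd`` function, the ``sketch::ScalarID`` enum and the ``MY_EVENT_COUNT`` name are hypothetical stand-ins, defined locally so that the example compiles outside the tree. Only the pattern matters: keep the bookkeeping local while the hot path runs, then issue a single scalar operation afterwards.

.. code-block:: cpp

    #include <cstdint>
    #include <map>
    #include <vector>

    // Hypothetical stand-ins for the real Telemetry API, defined here only
    // so that the sketch is self-contained.
    namespace sketch {
    enum class ScalarID { MY_EVENT_COUNT };

    struct FakeTelemetry {
      std::map<ScalarID, uint32_t> values;  // accumulated scalar values
      uint32_t calls = 0;                   // number of ScalarAdd calls made
    };

    inline FakeTelemetry& Store() {
      static FakeTelemetry sStore;
      return sStore;
    }

    inline void ScalarAdd(ScalarID aId, uint32_t aValue) {
      FakeTelemetry& store = Store();
      store.values[aId] += aValue;
      ++store.calls;
    }
    }  // namespace sketch

    // Counts the even items in a performance-sensitive loop and touches the
    // scalar only once, after the loop has finished.
    inline uint32_t ProcessItems(const std::vector<int>& aItems) {
      uint32_t localCount = 0;
      for (int item : aItems) {
        if (item % 2 == 0) {
          ++localCount;  // Cheap local bookkeeping inside the hot path.
        }
      }
      // One scalar operation after the hot path has completed.
      sketch::ScalarAdd(sketch::ScalarID::MY_EVENT_COUNT, localCount);
      return localCount;
    }

In real code the final call would presumably be one of the ``ScalarAdd`` overloads listed above with a ``mozilla::Telemetry::ScalarID`` generated from ``Scalars.yaml``; the stand-in merely makes the number of scalar operations observable.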
+ +The YAML definition file +======================== +Scalar probes are required to be registered, both for validation and transparency reasons, +in the `Scalars.yaml <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/Scalars.yaml>`_ +definition file. + +The probes in the definition file are represented in a fixed-depth, two-level structure: + +.. code-block:: yaml + + # The following is a category. + a.category.hierarchy: + a_probe_name: + kind: uint + ... + another_probe: + kind: string + ... + ... + category2: + probe: + kind: int + ... + +.. _scalar-limitations: + +Category and probe names need to follow a few rules: + +- they cannot exceed 40 characters each; +- category names must be alphanumeric + ``.``, with no leading/trailing digit or ``.``; +- probe names must be alphanumeric + ``_``, with no leading/trailing digit or ``_``. + +A probe can be defined as follows: + +.. code-block:: yaml + + a.category.hierarchy: + a_scalar: + bug_numbers: + - 1276190 + description: A nice one-line description. + expires: never + kind: uint + notification_emails: + - telemetry-client-dev@mozilla.com + +.. _scalars-required-fields: + +Required Fields +--------------- + +- ``bug_numbers``: A list of unsigned integers representing the number of the bugs the probe was introduced in. +- ``description``: A single or multi-line string describing what data the probe collects and when it gets collected. +- ``expires``: The version number in which the scalar expires, e.g. "30"; a version number of type "N" is automatically converted to "N.0a1" in order to expire the scalar also in the development channels. A telemetry probe acting on an expired scalar will print a warning into the browser console. For scalars that never expire the value ``never`` can be used. +- ``kind``: A string representing the scalar type. Allowed values are ``uint``, ``string`` and ``boolean``. 
+- ``notification_emails``: A list of email addresses to notify with alerts of expiring probes. More importantly, these are used by the data steward to verify that the probe is still useful. +- ``products``: A list of products the scalar can be recorded on. Currently supported values are: + + - ``firefox`` - Collected in Firefox Desktop for submission via Firefox Telemetry. + - ``geckoview_streaming`` - See :doc:`this guide <../start/report-gecko-telemetry-in-glean>` for how to stream data through geckoview to the Glean SDK. + - ``thunderbird`` - Collected in Thunderbird for submission via Thunderbird Telemetry. + +- ``record_in_processes``: A list of processes the scalar is allowed to record in. Currently supported values are: + + - ``main``; + - ``content``; + - ``gpu``; + - ``all_children`` (record in all the child processes); + - ``all`` (record in all the processes). + +Optional Fields +--------------- + +- ``release_channel_collection``: This can be either ``opt-in`` (default) or ``opt-out``. With the former the scalar is submitted by default on pre-release channels, unless the user has opted out. With the latter the scalar is submitted by default on release and pre-release channels, unless the user has opted out. +- ``keyed``: A boolean that determines whether this is a keyed scalar. It defaults to ``false``. +- ``keys``: A string list. Only valid for *keyed scalars*. Defines a case insensitive list of allowed keys that can be used for this scalar. The list is limited to 100 keys with a maximum length of 72 characters each. When using a key that is not in the list, an error is returned. +- ``record_into_store``: A list of stores this scalar should be recorded into. It defaults to ``[main]``. +- ``operating_systems``: This field restricts recording to certain operating systems only. Use that in-place of previous ``cpp_guards`` to avoid inclusion on not-specified operating systems. It defaults to ``all``. 
Currently supported values are: + + - ``mac`` + - ``linux`` + - ``windows`` + - ``android`` + - ``unix`` + - ``all`` (record on all operating systems) + +String type restrictions +------------------------ +To prevent abuse, the content of a string scalar is limited to 50 characters in length. Trying +to set a longer string will result in an error and no string being set. + +.. _scalars-keyed-scalars: + +Keyed Scalars +------------- +Keyed scalars are collections of ``uint`` or ``boolean`` scalar types, indexed by a string key that can contain UTF8 characters and cannot be longer than 72 characters. Keyed scalars can contain up to 100 keys. This scalar type is useful, for example, when you want to break down certain counts by a name, like how often searches happen with which search engine. + +Keyed ``string`` scalars are not supported. + +Keyed scalars should only be used if the set of keys is not known beforehand. If the keys are from a known set of strings, other options are preferred if suitable, like categorical histograms or splitting measurements up into separate scalars. + +Multiple processes caveats +-------------------------- +When recording data in different processes of the same type (e.g. multiple content processes), the user is responsible for preventing races between the operations on the scalars. +Races can happen because scalar changes are sent from each child process to the parent process, and then merged into the final storage location. Since there's no synchronization between the processes, operations like ``setMaximum`` can potentially produce different results if sent from more than one child process. + +The processor scripts +===================== +The scalar definition file is processed and checked for correctness at compile time. If it +conforms to the specification, the processor scripts generate two C++ header files, included +by the Telemetry C++ core.
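As a rough illustration of the kind of correctness check involved, the sketch below re-implements the category and probe naming rules listed earlier in this document. It is not the actual processor script; the function names ``isValidName``, ``isValidCategoryName`` and ``isValidProbeName`` are hypothetical, and only the rules stated above (length limit of 40, allowed characters, no leading or trailing digit or separator) are covered.

```javascript
const MAX_NAME_LENGTH = 40; // "they cannot exceed 40 characters each"

// Shared check: `separator` is "." for categories and "_" for probes.
function isValidName(name, separator) {
  if (typeof name !== "string" || name.length === 0) {
    return false;
  }
  if (name.length > MAX_NAME_LENGTH) {
    return false;
  }
  // No leading/trailing digit or separator: both ends must be letters.
  const isLetter = (c) => /^[A-Za-z]$/.test(c);
  if (!isLetter(name[0]) || !isLetter(name[name.length - 1])) {
    return false;
  }
  // Every character must be alphanumeric or the allowed separator.
  for (const c of name) {
    if (!/^[A-Za-z0-9]$/.test(c) && c !== separator) {
      return false;
    }
  }
  return true;
}

function isValidCategoryName(name) {
  return isValidName(name, ".");
}

function isValidProbeName(name) {
  return isValidName(name, "_");
}
```

With these rules, ``a.category.hierarchy`` and ``a_probe_name`` from the examples above are accepted, while a name that starts with a digit or ends with its separator is rejected.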
+ +gen_scalar_data.py +------------------ +This script is called by the build system to generate the ``TelemetryScalarData.h`` C++ header +file out of the scalar definitions. +This header file contains an array holding the scalar names and version strings, in addition +to an array of ``ScalarInfo`` structures representing all the scalars. + +gen_scalar_enum.py +------------------ +This script is called by the build system to generate the ``TelemetryScalarEnums.h`` C++ header +file out of the scalar definitions. +This header file contains an enum class with all the scalar identifiers used to access them +from code through the C++ API. + +Adding a new probe +================== +Making a scalar measurement is a two step process: + +1. add the probe definition to the scalar registry; +2. record into the scalar using the API. + +Registering the scalar +---------------------- +Let's start by registering two probes in the `Scalars.yaml <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/Scalars.yaml>`_ definition file: a simple boolean scalar and a keyed unsigned scalar. + +.. code-block:: yaml + + # The following section contains the demo scalars. + profile: + was_reset: + bug_numbers: + - 1301364 + description: True if the profile was reset. + expires: "60" + kind: boolean + notification_emails: + - change-me@allizom.com + release_channel_collection: opt-out + record_in_processes: + - 'main' + + ui: + download_button_activated: + bug_numbers: + - 1301364 + description: > + The number of times the download button was activated, per + input type (e.g. 'mouse_click', 'touchscreen', ...). + expires: "60" + kind: uint + keyed: true + notification_emails: + - change-me@allizom.com + release_channel_collection: opt-in + record_in_processes: + - 'main' + +These two scalars have different collection policies and are both constrained to recording only in the main process. 
+For example, the ``ui.download_button_activated`` scalar can be recorded only by users running pre-release builds of Firefox. + +Using the JS API +---------------- +Changing the demo scalars from privileged JavaScript code is straightforward: + +.. code-block:: js + + // Set the scalar value: trying to use a non-boolean value doesn't throw + // but rather prints a warning to the browser console + Services.telemetry.scalarSet("profile.was_reset", true); + + // This call increments the value stored in "mouse_click" within the + // "ui.download_button_activated" scalar, by 1. + Services.telemetry.keyedScalarAdd("ui.download_button_activated", "mouse_click", 1); + +More usage examples can be found in the tests covering the `JS Scalars API <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/tests/unit/test_TelemetryScalars.js>`_ and `child processes scalars <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/tests/unit/test_ChildScalars.js>`_. + +Using the C++ API +----------------- +Native code can take advantage of Scalars as well, by including the ``Telemetry.h`` header file. + +.. code-block:: cpp + + Telemetry::ScalarSet(Telemetry::ScalarID::PROFILE_WAS_RESET, false); + + Telemetry::ScalarAdd(Telemetry::ScalarID::UI_DOWNLOAD_BUTTON_ACTIVATED, + u"touchscreen"_ns, 1); + +The ``ScalarID`` enum is automatically generated by the build process, with an example being available `here <https://searchfox.org/mozilla-central/search?q=path%3ATelemetryScalarEnums.h&redirect=false>`_ . + +Other examples can be found in the `test coverage <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/tests/gtest/TestScalars.cpp>`_ for the scalars C++ API. + +Version History +=============== + +- Firefox 79: ``geckoview`` support removed (see `bug 1620395 <https://bugzilla.mozilla.org/show_bug.cgi?id=1620395>`__). +- Firefox 50: Initial scalar support (`bug 1276195 <https://bugzilla.mozilla.org/show_bug.cgi?id=1276195>`_).
+- Firefox 51: Added keyed scalars (`bug 1277806 <https://bugzilla.mozilla.org/show_bug.cgi?id=1277806>`_). +- Firefox 53: Added child process scalars (`bug 1278556 <https://bugzilla.mozilla.org/show_bug.cgi?id=1278556>`_). +- Firefox 58 + + - Added support for recording new scalars from add-ons (`bug 1393801 <https://bugzilla.mozilla.org/show_bug.cgi?id=1393801>`_). + - Ignore re-registering existing scalars for a category instead of failing (`bug 1409323 <https://bugzilla.mozilla.org/show_bug.cgi?id=1409323>`_). + +- Firefox 60: Enabled support for adding scalars in artifact builds and build-faster workflows (`bug 1425909 <https://bugzilla.mozilla.org/show_bug.cgi?id=1425909>`_). +- Firefox 66: Replace ``cpp_guard`` with ``operating_systems`` (`bug 1482912 <https://bugzilla.mozilla.org/show_bug.cgi?id=1482912>`_). diff --git a/toolkit/components/telemetry/docs/collection/uptake.rst b/toolkit/components/telemetry/docs/collection/uptake.rst new file mode 100644 index 0000000000..b9ee803c30 --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/uptake.rst @@ -0,0 +1,114 @@ +.. _telemetry/collection/uptake: + +================ +Uptake Telemetry +================ + +Firefox continuously pulls data from different remote sources (eg. settings, system add-ons, …). In order to have consistent insights about the *uptake rate* of these *update sources*, our clients can use a unified Telemetry helper to report their *update status*. + +The helper, described below, reports a predefined update status, which eventually gives a unified way to obtain: + +* the proportion of success among clients; +* its evolution over time; +* the distribution of error causes. + +..
note:: + + Examples of update sources: *remote settings, add-ons update, add-ons, gfx, and plugins blocklists, certificate revocation, certificate pinning, system add-ons delivery…* + + Examples of update status: *up-to-date, success, network error, server error, signature error, server backoff, unknown error…* + +Every call to the UptakeTelemetry helper may send a :ref:`Telemetry Event <eventtelemetry>`. Because events are expensive, we take some measures to avoid overwhelming Mozilla systems with the flood of data that this produces. We always send events when not on release channel. On release channel, we only send events from 1% of clients. + +Usage +----- + +.. code-block:: js + + const { UptakeTelemetry } = ChromeUtils.import("resource://services-common/uptake-telemetry.js", {}); + + UptakeTelemetry.report(component, status, { source }); + +- ``component``, a ``string`` that identifies the calling component (eg. ``"remotesettings"``, ``"normandy"``). Arbitrary components have to be previously declared in the :ref:`Telemetry Events definition file <eventdefinition>`. +- ``source``, a ``string`` to distinguish what is being pulled or updated in the component (eg. ``"blocklists/addons"``, ``"recipes/33"``) +- ``status``, one of the following status constants: + + - ``UptakeTelemetry.STATUS.UP_TO_DATE``: Local content was already up-to-date with remote content. + - ``UptakeTelemetry.STATUS.SUCCESS``: Local content was updated successfully. + - ``UptakeTelemetry.STATUS.BACKOFF``: Remote server asked clients to backoff. + - ``UptakeTelemetry.STATUS.PREF_DISABLED``: Update is disabled in user preferences. + - ``UptakeTelemetry.STATUS.PARSE_ERROR``: Parsing server response has failed. + - ``UptakeTelemetry.STATUS.CONTENT_ERROR``: Server response has unexpected content. + - ``UptakeTelemetry.STATUS.CORRUPTION_ERROR``: Error related to corrupted local data. + - ``UptakeTelemetry.STATUS.SIGNATURE_ERROR``: Signature verification after diff-based sync has failed. 
+ - ``UptakeTelemetry.STATUS.SIGNATURE_RETRY_ERROR``: Signature verification after full fetch has failed. + - ``UptakeTelemetry.STATUS.CONFLICT_ERROR``: Some remote changes are in conflict with local changes. + - ``UptakeTelemetry.STATUS.SYNC_ERROR``: Synchronization of remote changes has failed. + - ``UptakeTelemetry.STATUS.APPLY_ERROR``: Application of changes locally has failed. + - ``UptakeTelemetry.STATUS.SERVER_ERROR``: Server failed to respond. + - ``UptakeTelemetry.STATUS.CERTIFICATE_ERROR``: Server certificate verification has failed. + - ``UptakeTelemetry.STATUS.DOWNLOAD_ERROR``: Data could not be fully retrieved. + - ``UptakeTelemetry.STATUS.TIMEOUT_ERROR``: Server response has timed out. + - ``UptakeTelemetry.STATUS.NETWORK_ERROR``: Communication with server has failed. + - ``UptakeTelemetry.STATUS.NETWORK_OFFLINE_ERROR``: Network not available. + - ``UptakeTelemetry.STATUS.CLEANUP_ERROR``: Clean-up of temporary files has failed. + - ``UptakeTelemetry.STATUS.SHUTDOWN_ERROR``: Error occurring during shutdown. + - ``UptakeTelemetry.STATUS.UNKNOWN_ERROR``: Uncategorized error. + - ``UptakeTelemetry.STATUS.CUSTOM_1_ERROR``: Error #1 specific to this update source. + - ``UptakeTelemetry.STATUS.CUSTOM_2_ERROR``: Error #2 specific to this update source. + - ``UptakeTelemetry.STATUS.CUSTOM_3_ERROR``: Error #3 specific to this update source. + - ``UptakeTelemetry.STATUS.CUSTOM_4_ERROR``: Error #4 specific to this update source. + - ``UptakeTelemetry.STATUS.CUSTOM_5_ERROR``: Error #5 specific to this update source. + +Example: + +.. code-block:: js + + const COMPONENT = "normandy"; + const UPDATE_SOURCE = "update-monitoring"; + + let status; + try { + const data = await fetch(uri); + status = UptakeTelemetry.STATUS.SUCCESS; + } catch (e) { + status = /NetworkError/.test(e) ? 
+ UptakeTelemetry.STATUS.NETWORK_ERROR : + UptakeTelemetry.STATUS.SERVER_ERROR ; + } + UptakeTelemetry.report(COMPONENT, status, { source: UPDATE_SOURCE }); + + +Additional Event Info +''''''''''''''''''''' + +Events sent using the telemetry events API can contain additional information. Uptake Telemetry allows you to add the following extra fields to events by adding them to the ``options`` argument: + +- ``trigger``: A label to distinguish what triggered the polling/fetching of remote content (eg. ``"broadcast"``, ``"timer"``, ``"forced"``, ``"manual"``) +- ``age``: The age of pulled data in seconds (ie. difference between publication time and fetch time). +- ``duration``: The duration of the synchronization process in milliseconds. + +.. code-block:: js + + UptakeTelemetry.report(component, status, { source, trigger: "timer", age: 138 }); + +Remember that events are sampled on release channel. Those calls to uptake telemetry that do not produce events will ignore these extra fields. + + +Use-cases +--------- + +The following remote data sources are already using this unified histogram. + +* remote settings changes monitoring +* add-ons blocklist +* gfx blocklist +* plugins blocklist +* certificate revocation +* certificate pinning +* :ref:`Normandy Recipe client <components/normandy>` + +Obviously, the goal is to eventually converge and avoid ad-hoc Telemetry probes for measuring uptake of remote content. 
Some notable potential use-cases are: + +* nsUpdateService +* mozapps extensions update diff --git a/toolkit/components/telemetry/docs/collection/use-counters.rst b/toolkit/components/telemetry/docs/collection/use-counters.rst new file mode 100644 index 0000000000..cfa2c749c3 --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/use-counters.rst @@ -0,0 +1,105 @@ +============ +Use Counters +============ + +Use counters are used to report Telemetry statistics on whether individual documents +use a given WebIDL method or attribute (getters and setters are reported separately), CSS +property, or deprecated DOM operation. Custom use counters can also be +defined to test frequency of things that don't fall into one of those +categories. + +As of Firefox 65 the collection of Use Counters is enabled on all channels. + +The API +======= +The process to add a new use counter is different depending on the type of feature that needs +to be measured. In general, for each defined use counter, two separate boolean histograms are generated: + +- one describes the use of the tracked feature for individual documents and has the ``_DOCUMENT`` suffix; +- the other describes the use of the same thing for top-level pages (basically what we think of as a *web page*) and has the ``_PAGE`` suffix. + +Using two histograms is particularly suited to measure how many sites would be affected by +removing the tracked feature. + +Example scenarios: + +- Site *X* triggers use counter *Y*. We report "used" (true) in both the ``_DOCUMENT`` and ``_PAGE`` histograms. +- Site *X* does not trigger use counter *Y*. We report "unused" (false) in both the ``_DOCUMENT`` and ``_PAGE`` histograms. +- Site *X* has an iframe for site *W*. Site *W* triggers use counter *Y*, but site *X* does not. We report one "used" and one "unused" in the individual ``_DOCUMENT`` histogram and one "used" in the top-level ``_PAGE`` histogram.
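The three scenarios above can be modelled with a small sketch. It only mirrors the description in this section and is not how Gecko accumulates the histograms; the document shape (``used`` and ``frames``) and the function names are hypothetical.

```javascript
// Each document is modelled as { used: boolean, frames: [documents] }.
// `used` says whether that document itself triggered the use counter.

// One boolean per document in the tree (the _DOCUMENT view).
function collectDocumentReports(doc) {
  const reports = [doc.used];
  for (const frame of doc.frames || []) {
    reports.push(...collectDocumentReports(frame));
  }
  return reports;
}

// The _PAGE view: a top-level page counts as "used" if the top-level
// document or any of its nested documents triggered the use counter.
function reportUseCounter(topLevelDoc) {
  const documentReports = collectDocumentReports(topLevelDoc);
  return {
    document: documentReports,
    page: documentReports.some(Boolean),
  };
}

// Third scenario: site X embeds site W, and only W triggers the counter.
const siteX = { used: false, frames: [{ used: true, frames: [] }] };
const report = reportUseCounter(siteX);
// report.document holds one "unused" (X) and one "used" (W);
// report.page is "used" because something on the page triggered it.
```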
+ +Deprecated DOM operations +------------------------- +Use counters for deprecated DOM operations are declared in the `nsDeprecatedOperationList.h <https://searchfox.org/mozilla-central/source/dom/base/nsDeprecatedOperationList.h>`_ file. The counters are +registered through the ``DEPRECATED_OPERATION(DeprecationReference)`` macro. The provided +parameter must have the same value as the deprecation note added to the *IDL* file. + +See this `changeset <https://hg.mozilla.org/mozilla-central/rev/e30a357b25f1>`_ for a sample +deprecated operation. + +CSS Properties +~~~~~~~~~~~~~~ + +Use counters for CSS properties are generated automatically for every property Gecko supports, and are counted via StyleUseCounters (`Rust code <https://searchfox.org/mozilla-central/rev/7ed8e2d3d1d7a1464ba42763a33fd2e60efcaedc/servo/components/style/use_counters/mod.rs>`_, `C++ code <https://searchfox.org/mozilla-central/rev/7ed8e2d3d1d7a1464ba42763a33fd2e60efcaedc/dom/base/Document.h#5077>`_). + +The UseCounters registry +------------------------ +Use counters for WebIDL methods/attributes are registered in the `UseCounters.conf <https://searchfox.org/mozilla-central/source/dom/base/UseCounters.conf>`_ file. The format of this file is very strict. Each line can be: + +1. a blank line +2. a comment, which is a line that begins with ``//`` +3. one of three possible use counter declarations: + + * ``method <IDL interface name>.<IDL operation name>`` + * ``attribute <IDL interface name>.<IDL attribute name>`` + * ``custom <any valid identifier> <description>`` + +Custom use counters +~~~~~~~~~~~~~~~~~~~ +The <description> for custom counters will be appended to "When a document " or "When a page ", so phrase it appropriately. For instance, "constructs a Foo object" or "calls Document.bar('some value')". It may contain any character (including whitespace). Custom counters are incremented when SetUseCounter(eUseCounter_custom_MyName) is called on a Document object.
+ +WebIDL methods and attributes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +In addition to having a new entry added to the `UseCounters.conf <https://searchfox.org/mozilla-central/source/dom/base/UseCounters.conf>`_ file, WebIDL methods and attributes must have a ``[UseCounter]`` extended attribute in the Web IDL file in order for the counters to be incremented. + +Both additions are required because generating things from bindings codegen and ensuring all the dependencies are correct would have been rather difficult. + +The processor script +==================== +The definition files are processed twice: + +- once to generate two C++ header files, included by the web platform components (e.g. DOM) that own the features to be tracked; +- the other time by the Telemetry component, to generate the histogram definitions that make the collection system work. + +.. note:: + + The histograms that are generated out of use counters are set to *never* expire and are collected from Firefox release. Note that before Firefox 65 they were only collected on pre-release. + +gen-usecounters.py +------------------ +This script is called by the build system to generate: + +- the ``UseCounterList.h`` header for the WebIDL, out of the definition files. + +Interpreting the data +===================== +The histogram as accumulated on the client only puts values into the 1 bucket, meaning that +the use counter directly reports if a feature was used but it does not directly report if +it isn't used. +The values accumulated within a use counter should be considered proportional to +``CONTENT_DOCUMENTS_DESTROYED`` and ``TOP_LEVEL_CONTENT_DOCUMENTS_DESTROYED`` (see +`here <https://searchfox.org/mozilla-central/rev/1a973762afcbc5066f73f1508b0c846872fe3952/dom/base/Document.cpp#15059-15081>`__). The difference between the values of these two histograms +and the related use counters below tells us how many pages did *not* use the feature in question.
+For instance, if we see that a given session has destroyed 30 content documents, but a +particular use counter shows only a count of 5, we can infer that the use counter was *not* +used in 25 of those 30 documents. + +Things are done this way, rather than accumulating a boolean flag for each use counter, +to avoid sending histograms for features that don't get widely used. Doing things in this +fashion means smaller telemetry payloads and faster processing on the server side. + +Version History +--------------- + +- Firefox 65: + + - Enable Use Counters on release channel (`bug 1477433 <https://bugzilla.mozilla.org/show_bug.cgi?id=1477433>`_) diff --git a/toolkit/components/telemetry/docs/collection/user-interactions.rst b/toolkit/components/telemetry/docs/collection/user-interactions.rst new file mode 100644 index 0000000000..16d1d53977 --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/user-interactions.rst @@ -0,0 +1,272 @@ +.. _userinteractionstelemetry: + +================= +User Interactions +================= + +The Background Hang Reporter is a tool that collects stacks during hangs on pre-release channels. +User Interactions are a way of annotating Background Hang Reports with additional information about what the user was doing when a hang occurs. +This allows for grouping and prioritization of hangs based on the user interactions that they occur during. + +Since the built-in profiler is often the first tool that developers reach for to debug performance issues, +User Interactions also will add profiler markers for each recording. + +.. important:: + + Every new or changed data collection in Firefox needs a `data collection review <https://wiki.mozilla.org/Firefox/Data_Collection>`__ from a Data Steward. + +.. _userinteractionsserializationformat: + +Serialization format +==================== + +User Interactions are submitted in a :doc:`../data/backgroundhangmonitor-ping` as a property under the `annotations` for a hang, e.g.: + +.. 
code-block:: js + + ... + { + "duration": 105.547582, + // ... + "annotations": [ + ["UserInteracting", "true"], + ["browser.tabs.opening", "animated"] + ], + "stack": [ + "XREMain::XRE_main", + "-[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:]", + "nsAppShell::ProcessGeckoEvents", + "nsRefreshDriver::Tick", + "EventDispatcher::Dispatch", + "EventDispatcher::Dispatch", + "", + "browser/content/tabbrowser-tabs.js:1699", + "browser/content/tabbrowser-tabs.js:1725", + "browser/content/tabbrowser-tabs.js:142", + "browser/content/tabbrowser-tabs.js:153", + "(jit frame)", + "(unresolved)", + [ + 1, + "418de17" + ], + [ + 1, + "418de91" + ], + [ + 1, + "4382e56" + ], + [ + 8, + "108e3" + ], + [ + 9, + "2624" + ], + [ + 9, + "129f" + ] + ] + // ... + }, + +Each User Interaction is of the form: + +.. code-block:: js + + ["User Interaction ID", "value"] + +A `User Interaction ID` is its category concatenated with its name. +For example, a User Interaction with category `browser.tabs` and name `opening` has an ID of `browser.tabs.opening`. + +.. _userinteractionslimits: + +Limits +------ + +Each ``String`` marked as an identifier (the User Interaction ``name``, ``category``, ``value``) is restricted to be composed of alphanumeric ASCII characters ([a-zA-Z0-9]) plus infix underscores ('_' characters that aren't the first or last). +``category`` is also permitted infix periods ('.' characters, so long as they aren't the first or last character). + +Several fields are subject to length limits: + +- ``category``: Max. byte length is ``40``. +- ``User Interaction`` name: Max. byte length is ``40``. +- ``value``: A UTF-8 string with max. byte length of ``50``. + +Any ``String`` going over its limit will be reported as an error and the operation aborted. + + +.. _userinteractionsdefinition: + +The YAML definition file +======================== + +Any User Interaction recorded into Firefox Telemetry must be registered before it can be recorded.
+For any code that ships as part of Firefox that happens in `UserInteractions.yaml <https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/UserInteractions.yaml>`_. + +The User Interactions in the definition file are represented in a fixed-depth, three-level structure. +The first level contains *category* names (grouping multiple User Interactions together), +the second level contains User Interaction IDs, under which the User Interaction properties are listed. E.g.: + +.. code-block:: yaml + + # The following is a category of User Interactions named "browser.tabs". + browser.tabs: + opening: # This is the name of the User Interaction. The ID for the + # User Interaction is browser.tabs.opening + description: > + Describes this User Interaction in detail, potentially over + multiple lines. + # ... and more User Interaction properties. + # ... and more User Interactions. + # This is the "browser.places" category. + browser.places: + # And the "history" search User Interaction. Its User Interaction ID is + # browser.places.history_async + history_async: + # ... + description: Session History is searched asynchronously. + # ... and more User Interaction properties. + # ... + +Category and User Interaction names are subject to the limits :ref:`specified above <userinteractionslimits>`. + + +Profiler markers +================ + +The profiler markers automatically added for each User Interaction will have a starting point and ending point corresponding with the recording of the User Interaction. +The name of the marker will be the User Interaction category plus the User Interaction ID. +The value of the marker will be the value passed through the `UserInteraction` API, plus any additional text that is optionally added when the recording is finished. + +Further details on what the profiler is and what profiler markers are can be found `here <https://profiler.firefox.com/docs/#/>`_. 
+ + +The API +======= + +Public JS API +------------- + +This API is main-thread only, and all functions will return `false` if accessed off of the main thread. + +``start()`` +~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + UserInteraction.start(id, value, object); + +Starts recording a User Interaction. +Any hangs that occur on the main thread while recording this User Interaction result in an annotation being added to the background hang report. + +If a pre-existing UserInteraction already exists with the same ``id`` and the same ``object``, that pre-existing UserInteraction will be overwritten. +The newly created UserInteraction will include a "(clobbered)" suffix on its BHR annotation name. + +* ``id``: Required. A string value, limited to 80 characters. This is the category name concatenated with the User Interaction name. +* ``value``: Required. A string value, limited to 50 characters. +* ``object``: Optional. If specified, the User Interaction is associated with this object, so multiple recordings can be done concurrently. + +Example: + +.. code-block:: js + + UserInteraction.start("browser.tabs.opening", "animated", window1); + UserInteraction.start("browser.tabs.opening", "animated", window2); + +Returns `false` and logs a message to the browser console if the recording does not start for some reason. + +Example: + +.. code-block:: js + + UserInteraction.start("browser.tabs.opening", "animated", window); + UserInteraction.start("browser.places.history_search", "synchronous"); + +``update()`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + UserInteraction.update(id, value, object); + +Updates a User Interaction that's already being recorded with a new value. +Any hangs that occur on the main thread will be annotated using the new value. +Updating only works for User Interactions that are already being recorded. + +* ``id``: Required. A string value, limited to 80 characters. This is the category name concatenated with the User Interaction name. 
+* ``value``: Required. The new string value, limited to 50 characters. +* ``object``: Optional. If specified, the User Interaction is associated with this object, so multiple recordings can be done concurrently. + +Returns `false` and logs a message to the browser console if the update cannot be done for some reason. + + +Example: + +.. code-block:: js + + // At this point, we don't know if the tab will open with animation + // or not. + UserInteraction.start("browser.tabs.opening", "initting", window); + // ... + if (animating) { + UserInteraction.update("browser.tabs.opening", "animating", window); + } else { + UserInteraction.update("browser.tabs.opening", "not-animating", window); + } + +``cancel()`` +~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + UserInteraction.cancel(id, object); + +Cancels a User Interaction that is currently being recorded. +No profiler marker will be added in this case, and no further hangs will be annotated. +Hangs that occurred before the User Interaction was cancelled will not, however, be expunged. + +* ``id``: Required. A string value, limited to 80 characters. This is the category name concatenated with the User Interaction name. +* ``object``: Optional. If specified, the User Interaction is associated with this object, so multiple recordings can be done concurrently. + +Returns `false` and logs a message to the browser console if the cancellation cannot be completed for some reason. + +``running()`` +~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + UserInteraction.running(id, object); + +Checks to see if a UserInteraction is already running. + +* ``id``: Required. A string value, limited to 80 characters. This is the category name concatenated with the User Interaction name. +* ``object``: Optional. If specified, the User Interaction is associated with this object, so multiple recordings can be done concurrently. If you're checking for a running User Interaction that was started with an object, you'll need to pass in that same object here to check its running state.
+ +Returns `true` if a UserInteraction is already running. + +``finish()`` +~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + UserInteraction.finish(id, object, additionalText); + +Finishes recording the User Interaction. +Any hangs that occur on the main thread will no longer be annotated with this User Interaction. +A profiler marker will also be added, starting at the `UserInteraction.start` point and ending at the `UserInteraction.finish` point, along with any additional text that the author wants to include. + +* ``id``: Required. A string value, limited to 80 characters. This is the category name concatenated with the User Interaction name. +* ``object``: Optional. If specified, the User Interaction is associated with this object, so multiple recordings can be done concurrently. +* ``additionalText``: Optional. If specified, the profiler marker will have this text appended to the `value`, separated with a comma. + +Returns `false` and logs a message to the browser console if finishing cannot be completed for some reason. + +Version History +=============== + +- Firefox 84: Initial User Interaction support (see `bug 1661304 <https://bugzilla.mozilla.org/show_bug.cgi?id=1661304>`__). diff --git a/toolkit/components/telemetry/docs/collection/webextension-api.rst b/toolkit/components/telemetry/docs/collection/webextension-api.rst new file mode 100644 index 0000000000..a3e73e11fa --- /dev/null +++ b/toolkit/components/telemetry/docs/collection/webextension-api.rst @@ -0,0 +1,158 @@ +.. _webextension-telemetry: + +============================== +WebExtension API for Telemetry +============================== + +Use the ``browser.telemetry`` API to send telemetry data to the Mozilla Telemetry service. This API is restricted to Mozilla privileged WebExtensions. + +Types +----- + +``ScalarType`` +~~~~~~~~~~~~~~ + +Type of scalar: 'count' for numeric values, 'string' for string values, 'boolean' for boolean values. Maps to ``nsITelemetry.SCALAR_TYPE_*``.
+ +``ScalarData`` +~~~~~~~~~~~~~~ + +Represents registration data for a Telemetry scalar. + +Properties: + +* ``kind`` - See ScalarType_. +* ``keyed`` - *(optional, boolean)* True if this is a keyed scalar. Defaults to ``false``. +* ``record_on_release`` - *(optional, boolean)* True if this data should be recorded on release. Defaults to ``false``. +* ``expired`` - *(optional, boolean)* True if this scalar entry is expired. Operations on an expired scalar don't error (operations on an undefined scalar do), but the operations are no-ops. No data will be recorded. Defaults to ``false``. + +``EventData`` +~~~~~~~~~~~~~ + +Represents registration data for a Telemetry event. + +Properties: + +* ``methods`` - *(array)* List of methods for this event entry. +* ``objects`` - *(array)* List of objects for this event entry. +* ``extra_keys`` - *(array)* List of allowed extra keys for this event entry. +* ``record_on_release`` - *(optional, boolean)* True if this data should be recorded on release. Defaults to ``false``. +* ``expired`` - *(optional, boolean)* True if this event entry is expired. Recording an expired event doesn't error (operations on undefined events do). No data will be recorded. Defaults to ``false``. + +Functions +--------- + +``submitPing`` +~~~~~~~~~~~~~~ + +.. code-block:: js + + browser.telemetry.submitPing(type, message, options); + +Submits a custom ping to the Telemetry backend. See :ref:`submitting-customping`. + +* ``type`` - *(string)* The type of the ping. +* ``message`` - *(object)* The data payload for the ping. +* ``options`` - *(optional, object)* Options object. + + * ``addClientId`` - *(optional, boolean)* True if the ping should contain the client id. Defaults to ``false``. + * ``addEnvironment`` - *(optional, boolean)* True if the ping should contain the environment data. Defaults to ``false``. + * ``overrideEnvironment`` - *(optional, object)* Set to override the environment data. Default: not set. 
+ * ``usePingSender`` - *(optional, boolean)* If true, send the ping using the PingSender. Defaults to ``false``. + + +``canUpload`` +~~~~~~~~~~~~~ + +.. code-block:: js + + browser.telemetry.canUpload(); + +Checks if Telemetry upload is enabled. + +``scalarAdd`` +~~~~~~~~~~~~~ + +.. code-block:: js + + browser.telemetry.scalarAdd(name, value); + +Adds the value to the given scalar. + +* ``name`` - *(string)* The scalar name. +* ``value`` - *(integer)* The numeric value to add to the scalar. Only unsigned integers are supported. + +``scalarSet`` +~~~~~~~~~~~~~ + +.. code-block:: js + + browser.telemetry.scalarSet(name, value); + +Sets the named scalar to the given value. Throws if the value type doesn't match the scalar type. + +* ``name`` - *(string)* The scalar name. +* ``value`` - *(string|boolean|integer|object)* The value to set the scalar to. + +``scalarSetMaximum`` +~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + browser.telemetry.scalarSetMaximum(name, value); + +Sets the scalar to the maximum of the current value and the passed value. + +* ``name`` - *(string)* The scalar name. +* ``value`` - *(integer)* The numeric value to set the scalar to. Only unsigned integers are supported. + +``recordEvent`` +~~~~~~~~~~~~~~~ + +.. code-block:: js + + browser.telemetry.recordEvent(category, method, object, value, extra); + +Records an event in Telemetry. Throws when trying to record an unknown event. + +* ``category`` - *(string)* The category name. +* ``method`` - *(string)* The method name. +* ``object`` - *(string)* The object name. +* ``value`` - *(optional, string)* An optional string value to record. +* ``extra`` - *(optional, object)* An optional object of the form (string -> string). It should only contain registered extra keys. + +``registerScalars`` +~~~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + browser.telemetry.registerScalars(category, data); + +Registers new scalars so that add-ons can record them. See :ref:`registerscalars` for more details.
+ +* ``category`` - *(string)* The unique category the scalars are registered in. +* ``data`` - *(object)* An object that contains registration data for multiple scalars. Each property name is the scalar name, and the corresponding property value is an object of ScalarData_ type. + +``registerEvents`` +~~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + browser.telemetry.registerEvents(category, data); + +Registers new events so that add-ons can record them. See :ref:`registerevents` for more details. + +* ``category`` - *(string)* The unique category the events are registered in. +* ``data`` - *(object)* An object that contains registration data for one or more events. Each property name is the event name, and the corresponding property value is an object of EventData_ type. + +``setEventRecordingEnabled`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: js + + browser.telemetry.setEventRecordingEnabled(category, enabled); + +Enables or disables recording of events in a category. Recording is disabled by default, so this call toggles recording for all events in the specified category. + +* ``category`` - *(string)* The category name. +* ``enabled`` - *(boolean)* Whether recording is enabled for events in that category.
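The registration and recording functions above are typically used together: register the probes, enable event recording for the category, then record. The following is a minimal sketch of that flow, not a definitive implementation. It makes several assumptions: the category ``myextension.example``, the scalar ``clicks`` and the event ``click`` are hypothetical names; the scalar is addressed as ``category.name`` when recording; and the API calls return promises. ``makeTelemetryStub`` is a hypothetical stand-in that only logs calls so the sketch also runs outside Firefox. In a privileged extension the real ``browser`` global would be passed instead.

```javascript
// Hypothetical stand-in for the privileged `browser.telemetry` API.
// It only logs each call; it does not validate or submit anything.
function makeTelemetryStub() {
  const calls = [];
  const log = (name) => async (...args) => {
    calls.push([name, ...args]);
  };
  return {
    calls,
    telemetry: {
      registerScalars: log("registerScalars"),
      registerEvents: log("registerEvents"),
      setEventRecordingEnabled: log("setEventRecordingEnabled"),
      scalarAdd: log("scalarAdd"),
      recordEvent: log("recordEvent"),
    },
  };
}

// Hypothetical category name, used for both scalars and events.
const CATEGORY = "myextension.example";

// Registers one scalar and one event, enables event recording for the
// category, then records one value for each probe.
async function setUpAndRecord(browser) {
  // Each property name is a scalar name; each value is a ScalarData object.
  await browser.telemetry.registerScalars(CATEGORY, {
    clicks: { kind: "count", keyed: false, record_on_release: false },
  });

  // Each value is an EventData object.
  await browser.telemetry.registerEvents(CATEGORY, {
    click: {
      methods: ["click"],
      objects: ["toolbar_button"],
      extra_keys: ["source"],
      record_on_release: false,
    },
  });

  // Events default to recording disabled, so enable the category first.
  await browser.telemetry.setEventRecordingEnabled(CATEGORY, true);

  // Assumption: the scalar is addressed by "<category>.<scalar name>".
  await browser.telemetry.scalarAdd(`${CATEGORY}.clicks`, 1);

  // `extra` may only contain the registered extra keys, as string values.
  await browser.telemetry.recordEvent(
    CATEGORY, "click", "toolbar_button", "primary", { source: "keyboard" }
  );
}

// Usage outside Firefox: run against the stub and list the logged calls.
const demo = makeTelemetryStub();
setUpAndRecord(demo).then(() => {
  console.log(demo.calls.map(([name]) => name).join(", "));
});
```

Passing the ``browser`` object in as a parameter keeps the flow testable without a privileged extension context; the order of calls matters only in that registration and enabling must happen before recording.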