author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-28 14:29:10 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-28 14:29:10 +0000
commit: 2aa4a82499d4becd2284cdb482213d541b8804dd
tree: b80bf8bf13c3766139fbacc530efd0dd9d54394c /toolkit/components/telemetry/docs/concepts
parent: Initial commit.
Adding upstream version 86.0.1. (refs: upstream/86.0.1, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to '')
-rw-r--r-- toolkit/components/telemetry/docs/concepts/archiving.rst 23
-rw-r--r-- toolkit/components/telemetry/docs/concepts/crashes.rst 25
-rw-r--r-- toolkit/components/telemetry/docs/concepts/index.rst 23
-rw-r--r-- toolkit/components/telemetry/docs/concepts/pings.rst 29
-rw-r--r-- toolkit/components/telemetry/docs/concepts/sessions.rst 37
-rw-r--r-- toolkit/components/telemetry/docs/concepts/submission.rst 42
-rw-r--r-- toolkit/components/telemetry/docs/concepts/subsession_triggers.png bin 0 -> 857375 bytes
7 files changed, 179 insertions, 0 deletions
diff --git a/toolkit/components/telemetry/docs/concepts/archiving.rst b/toolkit/components/telemetry/docs/concepts/archiving.rst
new file mode 100644
index 0000000000..0466f13769
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/archiving.rst
@@ -0,0 +1,23 @@
+=========
+Archiving
+=========
+
+When archiving is enabled through the relevant pref (``toolkit.telemetry.archive.enabled``), pings submitted to ``TelemetryController`` are also stored locally in the user profile directory, in ``<profile-dir>/datareporting/archived``.
+
+To allow for cheaper lookup of archived pings, storage follows a specific naming scheme for both the directory and the ping file name: ``<YYYY-MM>/<timestamp>.<UUID>.<type>.jsonlz4``.
+
+* ``<YYYY-MM>`` - The subdirectory name, generated from the ping creation date.
+* ``<timestamp>`` - Timestamp of the ping creation date.
+* ``<UUID>`` - The ping identifier.
+* ``<type>`` - The ping type.
+
+Archived data can be viewed on ``about:telemetry``.
+
+Cleanup
+-------
+
+Archived pings are not kept around forever.
+After startup of Firefox and initialization of Telemetry, the archive is cleaned up if necessary.
+
+* Old ping data is removed by month if it is older than 60 days.
+* If the total size of the archive exceeds the quota of 120 MB, pings are removed to reduce the size of the archive again.
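
A minimal sketch of the archive naming scheme described in archiving.rst, written in Python rather than taken from the Firefox source. The function names are hypothetical. It assumes that ``<timestamp>`` is milliseconds since the Unix epoch and that ping types contain no dots; the document states neither.

```python
from datetime import datetime, timezone

ARCHIVE_SUFFIX = ".jsonlz4"


def archived_ping_path(creation_date, ping_id, ping_type):
    """Build <YYYY-MM>/<timestamp>.<UUID>.<type>.jsonlz4 for one ping."""
    # The subdirectory is derived from the ping creation date.
    subdir = creation_date.strftime("%Y-%m")
    # Assumption: the timestamp is expressed in milliseconds since the epoch.
    timestamp = int(creation_date.timestamp() * 1000)
    return f"{subdir}/{timestamp}.{ping_id}.{ping_type}{ARCHIVE_SUFFIX}"


def parse_archived_ping_name(file_name):
    """Split a file name back into (timestamp, ping id, ping type)."""
    if not file_name.endswith(ARCHIVE_SUFFIX):
        raise ValueError(f"not an archived ping file: {file_name}")
    stem = file_name[: -len(ARCHIVE_SUFFIX)]
    # UUIDs contain hyphens but no dots, so two splits are enough
    # (assuming the ping type itself has no dots).
    timestamp, ping_id, ping_type = stem.split(".", 2)
    return int(timestamp), ping_id, ping_type


if __name__ == "__main__":
    created = datetime(2021, 2, 3, 4, 5, 6, tzinfo=timezone.utc)
    print(archived_ping_path(created, "2b7bc2a1-0d4c-4a3e-9c5f-1f2e3d4c5b6a", "main"))
```

Because the month and the timestamp are both encoded in the path, a lookup by date range or ping type can be answered from directory listings alone, without decompressing any ping.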
diff --git a/toolkit/components/telemetry/docs/concepts/crashes.rst b/toolkit/components/telemetry/docs/concepts/crashes.rst
new file mode 100644
index 0000000000..4ea40f89c4
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/crashes.rst
@@ -0,0 +1,25 @@
+=======
+Crashes
+=======
+
+Firefox can crash in many different ways, and no single system records all of them.
+
+Main process crashes
+====================
+
+If the Firefox main process dies, that should be recorded as an aborted session. We would submit a :doc:`main ping <../data/main-ping>` with the reason ``aborted-session``.
+If we have a crash dump for that crash, we should also submit a :doc:`crash ping <../data/crash-ping>`.
+
+The ``aborted-session`` information is first written to disk 60 seconds after startup, so any earlier crash will not trigger an ``aborted-session`` ping.
+Also, the ``aborted-session`` data is updated at least every 5 minutes, so it may lag behind the last session state.
+
+Crashes during startup should be recorded in the next session's main ping in the ``STARTUP_CRASH_DETECTED`` histogram.
+
+Child process crashes
+=====================
+
+If a Firefox plugin, content, gmplugin, or any other type of child process dies unexpectedly, this is recorded in the main ping's ``SUBPROCESS_ABNORMAL_ABORT`` keyed histogram.
+
+If we catch a crash report for this, then additionally the ``SUBPROCESS_CRASHES_WITH_DUMP`` keyed histogram is incremented.
+
+Some processes also generate :doc:`crash pings <../data/crash-ping>` when they crash and generate a crash dump. See `bug 1352496 <https://bugzilla.mozilla.org/show_bug.cgi?id=1352496>`_ for an example of how to allow crash pings for new process types.
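
A small sketch of the main-process crash rules from crashes.rst, in Python for illustration only. The function name and the tuple representation of a ping are hypothetical, and treating a crash at exactly 60 seconds as already covered by the first write is an assumption.

```python
# Seconds after startup at which aborted-session data is first written to disk.
ABORTED_SESSION_FIRST_WRITE_S = 60


def pings_after_main_process_crash(uptime_s, has_crash_dump):
    """Return the (ping type, reason) pairs expected after a main process crash."""
    pings = []
    # Crashes before the first write leave no aborted-session data on disk.
    if uptime_s >= ABORTED_SESSION_FIRST_WRITE_S:
        pings.append(("main", "aborted-session"))
    # A crash ping is only submitted when a crash dump is available.
    if has_crash_dump:
        pings.append(("crash", None))
    return pings
```

For example, a crash 10 seconds after startup with a dump available would yield only the crash ping under these assumptions.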
diff --git a/toolkit/components/telemetry/docs/concepts/index.rst b/toolkit/components/telemetry/docs/concepts/index.rst
new file mode 100644
index 0000000000..a49466f8d0
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/index.rst
@@ -0,0 +1,23 @@
+========
+Concepts
+========
+
+There are common concepts used throughout Telemetry:
+
+* :doc:`pings <pings>` - the packets we use to submit data
+* :doc:`sessions & subsessions <sessions>` - how we slice a user's time in the browser
+* *measurements* - how we :doc:`collect data <../collection/index>`
+* *opt-in* & *opt-out* - the different sets of data we collect
+* :doc:`submission <submission>` - how we send data to the servers
+* :doc:`archiving <archiving>` - retaining ping data locally
+* :doc:`crashes <crashes>` - the different data crashes generate
+
+.. toctree::
+ :maxdepth: 2
+ :titlesonly:
+ :glob:
+ :hidden:
+
+ pings
+ crashes
+ *
diff --git a/toolkit/components/telemetry/docs/concepts/pings.rst b/toolkit/components/telemetry/docs/concepts/pings.rst
new file mode 100644
index 0000000000..092f7b13fa
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/pings.rst
@@ -0,0 +1,29 @@
+.. _telemetry_pings:
+
+=====================
+Telemetry pings
+=====================
+
+A *Telemetry ping* is the data that we send to Mozilla's Telemetry servers.
+
+The top-level structure is defined by the :doc:`common ping format <../data/common-ping>`. This is a JSON object which contains:
+
+* some basic information shared between different ping types
+* the :doc:`environment data <../data/environment>` (optional)
+* the data specific to the *ping type*, the *payload*.
+
+Ping types
+==========
+
+We send Telemetry with different ping types. The :doc:`main <../data/main-ping>` ping is the ping that contains the bulk of the Telemetry measurements for Firefox. For more specific use-cases, we send other ping types.
+
+Pings sent from code that ships with Firefox are listed in the :doc:`data documentation <../data/index>`.
+
+Important examples are:
+
+* :doc:`main <../data/main-ping>` - contains the information collected by Telemetry (Histograms, Scalars, ...)
+* :doc:`saved-session <../data/main-ping>` - has the same format as a main ping, but it contains the *"classic"* Telemetry payload with measurements covering the whole browser session. This is only a separate type to make storage of saved-session easier server-side. As of Firefox 61 this is sent on Android only.
+* :doc:`crash <../data/crash-ping>` - a ping that is captured and sent after a Firefox process crashes.
+* :doc:`new-profile <../data/new-profile-ping>` - sent on the first run of a new profile.
+* :doc:`update <../data/update-ping>` - sent right after an update is downloaded.
+* :doc:`deletion-request <../data/deletion-request-ping>` - sent when FHR upload is disabled
diff --git a/toolkit/components/telemetry/docs/concepts/sessions.rst b/toolkit/components/telemetry/docs/concepts/sessions.rst
new file mode 100644
index 0000000000..39d6df961d
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/sessions.rst
@@ -0,0 +1,37 @@
+========
+Sessions
+========
+
+A *session* is the time from when Firefox starts until it shuts down.
+A session can be very long-running. For example, for users who always put their computers into sleep mode, Firefox may run for weeks.
+We slice the sessions into smaller logical units called *subsessions*.
+
+Subsessions
+===========
+
+The first subsession starts when the browser starts. After that, we split the subsession for different reasons:
+
+* ``daily``, when crossing local midnight. This keeps latency acceptable by triggering a ping at least daily for most active users.
+* ``environment-change``, when a change to the *environment* happens. This happens for important changes to the Firefox settings and when add-ons activate or deactivate.
+
+On a subsession split, a :doc:`main ping <../data/main-ping>` with that reason will be submitted. We store the reason in the ping's payload, to see what triggered it.
+
+A session always ends with a subsession with one of two reasons:
+
+* ``shutdown``, when the browser was cleanly shut down. To avoid delaying shutdown, we only save this ping to disk and send it at the next opportunity (typically the next browsing session).
+* ``aborted-session``, when the browser crashed. While Firefox is active, we write the current ``main`` ping data to disk every 5 minutes. If the browser crashes, we find this data on disk on the next start and send it with this reason.
+
+.. image:: subsession_triggers.png
+
+Subsession data
+===============
+
+A subsession's data consists of:
+
+* general information: the date the subsession started, how long it lasted, etc.
+* specific measurements: histogram & scalar data, etc.
+
+Splitting sessions into subsessions has some advantages:
+
+* Latency - Sending a ping with all the data of a subsession immediately after it ends means we get the data from installs faster. For ``main`` pings, we aim to send a ping at least daily by starting a new subsession at local midnight.
+* Correlation - By starting new subsessions when fundamental settings change (i.e. changes to the *environment*), we can better correlate a subsession's data to those settings.
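
The ``daily`` split described in sessions.rst can be sketched as follows. This is an illustrative Python model, not Firefox code: the function name is hypothetical, times are naive local datetimes, ``environment-change`` splits are ignored, and the session is assumed to end with a clean shutdown.

```python
from datetime import datetime, time, timedelta


def split_at_local_midnight(session_start, session_end):
    """Split one session into (start, end, reason) subsessions at local midnight."""
    subsessions = []
    start = session_start
    while True:
        # First local midnight strictly after the current subsession start.
        next_midnight = datetime.combine(start.date() + timedelta(days=1), time.min)
        if next_midnight >= session_end:
            # The last subsession ends with the session itself.
            subsessions.append((start, session_end, "shutdown"))
            return subsessions
        subsessions.append((start, next_midnight, "daily"))
        start = next_midnight
```

A session running from 22:00 on one day to 01:00 two days later produces two ``daily`` subsessions followed by one ``shutdown`` subsession, so data arrives at least once per day instead of once at the end.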
diff --git a/toolkit/components/telemetry/docs/concepts/submission.rst b/toolkit/components/telemetry/docs/concepts/submission.rst
new file mode 100644
index 0000000000..6f3ba3b0f0
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/submission.rst
@@ -0,0 +1,42 @@
+==========
+Submission
+==========
+
+*Note:* The server-side behaviour is documented in the `HTTP Edge Server specification <https://wiki.mozilla.org/CloudServices/DataPipeline/HTTPEdgeServerSpecification>`_.
+
+Pings are submitted via a common API on ``TelemetryController``.
+If a ping cannot be submitted to the server immediately (e.g. because there is
+no internet connection), Telemetry stores it on disk and retries sending it
+until the maximum ping age (14 days) is exceeded.
+
+.. note::
+
+ The :doc:`main pings <../data/main-ping>` are kept locally even after successful submission to enable the HealthReport feature. They will be deleted after their retention period of 180 days.
+
+Submission logic
+================
+
+Sending of pending pings starts as soon as the delayed startup is finished. They are sent in batches, newest-first, with up
+to 10 persisted pings per batch plus all unpersisted pings.
+The send logic then waits for each batch to complete.
+
+If a batch succeeds, we trigger the next send of a ping batch. This is delayed as needed so that at most one batch send is triggered per minute.
+
+If ping sending encounters an error that requires retrying later, a backoff timeout is
+triggered, exponentially increasing the timeout for the next try from 1 minute up to a limit of 120 minutes.
+Any new ping submissions and "idle-daily" events reset this behavior as a safety mechanism and trigger immediate ping sending.
+
+Pingsender
+==========
+Some pings (e.g. :doc:`crash pings <../data/crash-ping>` and :doc:`main pings <../data/main-ping>` with reason ``shutdown``) are submitted using the :doc:`../internals/pingsender`.
+
+The pingsender tries to send each ping once and, if it fails, no additional attempt is performed: ``TelemetrySend`` will take care of retrying using the previously described submission logic.
+
+Status codes
+============
+
+The telemetry server team is working towards `the common services status codes <https://wiki.mozilla.org/CloudServices/DataPipeline/HTTPEdgeServerSpecification#Server_Responses>`_, but for now the following logic is sufficient for Telemetry:
+
+* ``2XX`` - success, don't resubmit
+* ``4XX`` - there was some problem with the request; the client should not try to resubmit, as it would just receive the same response
+* ``5XX`` - there was a server-side error; the client should try to resubmit later
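
The backoff and status-code rules in submission.rst can be sketched like this. It is a hedged Python illustration, not the ``TelemetrySend`` implementation. The function names are hypothetical, the doubling factor is an assumption (the document only says the timeout increases exponentially), and raising an error for status codes outside the three listed classes is a choice made here, since the document does not cover them.

```python
BACKOFF_BASE_MIN = 1    # first retry timeout, in minutes
BACKOFF_MAX_MIN = 120   # upper limit for the retry timeout, in minutes


def backoff_minutes(consecutive_failures):
    """Timeout before the next try after a number of consecutive failures."""
    if consecutive_failures <= 0:
        # No failure yet, or the backoff was reset by a new ping submission.
        return 0
    # Assumption: the timeout doubles on every failure until it hits the cap.
    return min(BACKOFF_BASE_MIN * 2 ** (consecutive_failures - 1), BACKOFF_MAX_MIN)


def should_resubmit(status_code):
    """Decide whether a ping should be sent again, based on the HTTP status."""
    if 200 <= status_code < 300:
        return False  # success, don't resubmit
    if 400 <= status_code < 500:
        return False  # the same request would fail the same way
    if 500 <= status_code < 600:
        return True   # server-side error, try again later
    raise ValueError(f"status code not covered by these rules: {status_code}")
```

With a doubling factor, the 120-minute limit is reached on the eighth consecutive failure (1, 2, 4, 8, 16, 32, 64, then capped at 120).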
diff --git a/toolkit/components/telemetry/docs/concepts/subsession_triggers.png b/toolkit/components/telemetry/docs/concepts/subsession_triggers.png
new file mode 100644
index 0000000000..0a1dae2c23
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/subsession_triggers.png
Binary files differ