Diffstat (limited to 'toolkit/components/telemetry/docs/concepts')
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/archiving.rst | 23
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/crashes.rst | 25
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/index.rst | 23
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/pings.rst | 29
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/sessions.rst | 37
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/submission.rst | 42
-rw-r--r-- | toolkit/components/telemetry/docs/concepts/subsession_triggers.png | bin | 0 -> 857375 bytes
7 files changed, 179 insertions, 0 deletions
diff --git a/toolkit/components/telemetry/docs/concepts/archiving.rst b/toolkit/components/telemetry/docs/concepts/archiving.rst
new file mode 100644
index 0000000000..0466f13769
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/archiving.rst
@@ -0,0 +1,23 @@
+=========
+Archiving
+=========
+
+When archiving is enabled through the relevant pref (``toolkit.telemetry.archive.enabled``), pings submitted to ``TelemetryController`` are also stored locally in the user profile directory, in ``<profile-dir>/datareporting/archived``.
+
+To allow for cheaper lookup of archived pings, storage follows a specific naming scheme for both the directory and the ping file name: ``<YYYY-MM>/<timestamp>.<UUID>.<type>.jsonlz4``.
+
+* ``<YYYY-MM>`` - The subdirectory name, generated from the ping creation date.
+* ``<timestamp>`` - Timestamp of the ping creation date.
+* ``<UUID>`` - The ping identifier.
+* ``<type>`` - The ping type.
+
+Archived data can be viewed on ``about:telemetry``.
+
+Cleanup
+-------
+
+Archived pings are not kept around forever.
+After startup of Firefox and initialization of Telemetry, the archive is cleaned up if necessary.
+
+* Old ping data is removed by month if it is older than 60 days.
+* If the total size of the archive exceeds the quota of 120 MB, pings are removed to reduce the size of the archive again.
diff --git a/toolkit/components/telemetry/docs/concepts/crashes.rst b/toolkit/components/telemetry/docs/concepts/crashes.rst
new file mode 100644
index 0000000000..4ea40f89c4
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/crashes.rst
@@ -0,0 +1,25 @@
+=======
+Crashes
+=======
+
+There are many different kinds of crashes for Firefox, and there is not a single system used to record all of them.
+
+Main process crashes
+====================
+
+If the Firefox main process dies, that should be recorded as an aborted session. We would submit a :doc:`main ping <../data/main-ping>` with the reason ``aborted-session``.
+If we have a crash dump for that crash, we should also submit a :doc:`crash ping <../data/crash-ping>`.
+
+The ``aborted-session`` information is first written to disk 60 seconds after startup; any earlier crashes will not trigger an ``aborted-session`` ping.
+Also, the ``aborted-session`` information is updated at least every 5 minutes, so it may lag behind the last session state.
+
+Crashes during startup should be recorded in the next session's main ping in the ``STARTUP_CRASH_DETECTED`` histogram.
+
+Child process crashes
+=====================
+
+If a Firefox plugin, content, gmplugin, or any other type of child process dies unexpectedly, this is recorded in the main ping's ``SUBPROCESS_ABNORMAL_ABORT`` keyed histogram.
+
+If we catch a crash report for this, then additionally the ``SUBPROCESS_CRASHES_WITH_DUMP`` keyed histogram is incremented.
+
+Some processes also generate :doc:`crash pings <../data/crash-ping>` when they crash and generate a crash dump. See `bug 1352496 <https://bugzilla.mozilla.org/show_bug.cgi?id=1352496>`_ for an example of how to allow crash pings for new process types.
diff --git a/toolkit/components/telemetry/docs/concepts/index.rst b/toolkit/components/telemetry/docs/concepts/index.rst
new file mode 100644
index 0000000000..a49466f8d0
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/index.rst
@@ -0,0 +1,23 @@
+========
+Concepts
+========
+
+There are common concepts used throughout Telemetry:
+
+* :doc:`pings <pings>` - the packets we use to submit data
+* :doc:`sessions & subsessions <sessions>` - how we slice a user's time in the browser
+* *measurements* - how we :doc:`collect data <../collection/index>`
+* *opt-in* & *opt-out* - the different sets of data we collect
+* :doc:`submission <submission>` - how we send data to the servers
+* :doc:`archiving <archiving>` - retaining ping data locally
+* :doc:`crashes <crashes>` - the different data crashes generate
+
+.. toctree::
+   :maxdepth: 2
+   :titlesonly:
+   :glob:
+   :hidden:
+
+   pings
+   crashes
+   *
diff --git a/toolkit/components/telemetry/docs/concepts/pings.rst b/toolkit/components/telemetry/docs/concepts/pings.rst
new file mode 100644
index 0000000000..092f7b13fa
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/pings.rst
@@ -0,0 +1,29 @@
+.. _telemetry_pings:
+
+=====================
+Telemetry pings
+=====================
+
+A *Telemetry ping* is the data that we send to Mozilla's Telemetry servers.
+
+The top-level structure is defined by the :doc:`common ping format <../data/common-ping>`. This is a JSON object which contains:
+
+* some basic information shared between different ping types
+* the :doc:`environment data <../data/environment>` (optional)
+* the data specific to the *ping type*, the *payload*.
+
+Ping types
+==========
+
+We send Telemetry with different ping types. The :doc:`main <../data/main-ping>` ping is the ping that contains the bulk of the Telemetry measurements for Firefox. For more specific use-cases, we send other ping types.
+
+Pings sent from code that ships with Firefox are listed in the :doc:`data documentation <../data/index>`.
+
+Important examples are:
+
+* :doc:`main <../data/main-ping>` - contains the information collected by Telemetry (Histograms, Scalars, ...)
+* :doc:`saved-session <../data/main-ping>` - has the same format as a main ping, but it contains the *"classic"* Telemetry payload with measurements covering the whole browser session. This is only a separate type to make storage of saved-session easier server-side. As of Firefox 61 this is sent on Android only.
+* :doc:`crash <../data/crash-ping>` - a ping that is captured and sent after a Firefox process crashes.
+* :doc:`new-profile <../data/new-profile-ping>` - sent on the first run of a new profile.
+* :doc:`update <../data/update-ping>` - sent right after an update is downloaded.
+* :doc:`deletion-request <../data/deletion-request-ping>` - sent when FHR upload is disabled
diff --git a/toolkit/components/telemetry/docs/concepts/sessions.rst b/toolkit/components/telemetry/docs/concepts/sessions.rst
new file mode 100644
index 0000000000..39d6df961d
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/sessions.rst
@@ -0,0 +1,37 @@
+========
+Sessions
+========
+
+A *session* is the time from when Firefox starts until it shuts down.
+A session can be very long-running. E.g. for users that always put their computers into sleep-mode, Firefox may run for weeks.
+We slice the sessions into smaller logical units called *subsessions*.
+
+Subsessions
+===========
+
+The first subsession starts when the browser starts. After that, we split the subsession for different reasons:
+
+* ``daily``, when crossing local midnight. This keeps latency acceptable by triggering a ping at least daily for most active users.
+* ``environment-change``, when a change to the *environment* happens. This happens for important changes to the Firefox settings and when add-ons activate or deactivate.
+
+On a subsession split, a :doc:`main ping <../data/main-ping>` with that reason will be submitted. We store the reason in the ping's payload, to see what triggered it.
+
+A session always ends with a subsession with one of two reasons:
+
+* ``shutdown``, when the browser was cleanly shut down. To avoid delaying shutdown, we only save this ping to disk and send it at the next opportunity (typically the next browsing session).
+* ``aborted-session``, when the browser crashed. While Firefox is active, we write the current ``main`` ping data to disk every 5 minutes. If the browser crashes, we find this data on disk on the next start and send it with this reason.
+
+.. image:: subsession_triggers.png
+
+Subsession data
+===============
+
+A subsession's data consists of:
+
+* general information: the date the subsession started, how long it lasted, etc.
+* specific measurements: histogram & scalar data, etc.
+
+This has some advantages:
+
+* Latency - Sending a ping with all the data of a subsession immediately after it ends means we get the data from installs faster. For ``main`` pings, we aim to send a ping at least daily by starting a new subsession at local midnight.
+* Correlation - By starting new subsessions when fundamental settings change (i.e. changes to the *environment*), we can better correlate a subsession's data to those settings.
diff --git a/toolkit/components/telemetry/docs/concepts/submission.rst b/toolkit/components/telemetry/docs/concepts/submission.rst
new file mode 100644
index 0000000000..6f3ba3b0f0
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/submission.rst
@@ -0,0 +1,42 @@
+==========
+Submission
+==========
+
+*Note:* The server-side behaviour is documented in the `HTTP Edge Server specification <https://wiki.mozilla.org/CloudServices/DataPipeline/HTTPEdgeServerSpecification>`_.
+
+Pings are submitted via a common API on ``TelemetryController``.
+If a ping fails to successfully submit to the server immediately (e.g. because
+of missing internet connection), Telemetry will store it on disk and retry
+sending it until the maximum ping age is exceeded (14 days).
+
+.. note::
+
+    The :doc:`main pings <../data/main-ping>` are kept locally even after successful submission to enable the HealthReport feature. They will be deleted after their retention period of 180 days.
+
+Submission logic
+================
+
+Sending of pending pings starts as soon as the delayed startup is finished. They are sent in batches, newest-first, with up
+to 10 persisted pings per batch plus all unpersisted pings.
+The send logic then waits for each batch to complete.
+
+If it succeeds, we trigger the next send of a ping batch. This is delayed as needed to only trigger one batch send per minute.
+
+If ping sending encounters an error that means retrying later, a backoff timeout behavior is
+triggered, exponentially increasing the timeout for the next try from 1 minute up to a limit of 120 minutes.
+Any new ping submissions and "idle-daily" events reset this behavior as a safety mechanism and trigger immediate ping sending.
+
+Pingsender
+==========
+Some pings (e.g. :doc:`crash pings <../data/crash-ping>` and :doc:`main pings <../data/main-ping>` with reason ``shutdown``) are submitted using the :doc:`../internals/pingsender`.
+
+The pingsender tries to send each ping once and, if it fails, no additional attempt is performed: ``TelemetrySend`` will take care of retrying using the previously described submission logic.
+
+Status codes
+============
+
+The telemetry server team is working towards `the common services status codes <https://wiki.mozilla.org/CloudServices/DataPipeline/HTTPEdgeServerSpecification#Server_Responses>`_, but for now the following logic is sufficient for Telemetry:
+
+* ``2XX`` - success, don't resubmit
+* ``4XX`` - there was some problem with the request - the client should not try to resubmit as it would just receive the same response
+* ``5XX`` - there was a server-side error, the client should try to resubmit later
diff --git a/toolkit/components/telemetry/docs/concepts/subsession_triggers.png b/toolkit/components/telemetry/docs/concepts/subsession_triggers.png
Binary files differ
new file mode 100644
index 0000000000..0a1dae2c23
--- /dev/null
+++ b/toolkit/components/telemetry/docs/concepts/subsession_triggers.png
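
Reviewer note: the retry rules that submission.rst describes (status-code handling plus the 1 to 120 minute backoff) can be illustrated with a small sketch. This is not the Telemetry client code. The names ``Action``, ``classify_status`` and ``backoff_delay_minutes`` are hypothetical, the doubling factor is an assumption (the text only says the timeout grows "exponentially"), and rejecting status codes outside 2XX/4XX/5XX is also an assumption because the text does not cover them.

```python
from enum import Enum

# Values taken from the submission.rst text above.
INITIAL_DELAY_MINUTES = 1
MAX_DELAY_MINUTES = 120


class Action(Enum):
    """Hypothetical labels for what a client does after a server response."""
    DONE = "done"    # 2XX: success, don't resubmit
    DROP = "drop"    # 4XX: request problem, resubmitting would not help
    RETRY = "retry"  # 5XX: server-side error, resubmit later


def classify_status(status: int) -> Action:
    """Map an HTTP status code to the action described in 'Status codes'."""
    if 200 <= status < 300:
        return Action.DONE
    if 400 <= status < 500:
        return Action.DROP
    if 500 <= status < 600:
        return Action.RETRY
    # Assumption: the text does not say how other codes are handled.
    raise ValueError(f"status {status} is not covered by the documented rules")


def backoff_delay_minutes(failures: int, factor: int = 2) -> int:
    """Delay before the next try after `failures` consecutive failed sends.

    Assumption: the delay doubles per failure (factor=2). It starts at
    1 minute and is capped at 120 minutes, as stated in the text.
    """
    if failures < 1:
        raise ValueError("failures must be at least 1")
    return min(INITIAL_DELAY_MINUTES * factor ** (failures - 1), MAX_DELAY_MINUTES)


if __name__ == "__main__":
    for code in (200, 404, 503):
        print(code, classify_status(code).value)
    print([backoff_delay_minutes(n) for n in range(1, 10)])
```

Under these assumptions, a new ping submission or an "idle-daily" event would correspond to resetting the failure count to zero, so the next send happens immediately.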