author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 00:47:55 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 00:47:55 +0000 |
commit | 26a029d407be480d791972afb5975cf62c9360a6 (patch) | |
tree | f435a8308119effd964b339f76abb83a57c29483 /toolkit/components/glean/docs/user | |
parent | Initial commit. (diff) | |
Adding upstream version 124.0.1. (upstream/124.0.1)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'toolkit/components/glean/docs/user')
-rw-r--r-- | toolkit/components/glean/docs/user/geckoview_streaming_migration.md | 262 | ||||
-rw-r--r-- | toolkit/components/glean/docs/user/getting_started.md | 97 | ||||
-rw-r--r-- | toolkit/components/glean/docs/user/gifft.md | 241 | ||||
-rw-r--r-- | toolkit/components/glean/docs/user/index.md | 17 | ||||
-rw-r--r-- | toolkit/components/glean/docs/user/instrumentation_tests.md | 278 | ||||
-rw-r--r-- | toolkit/components/glean/docs/user/migration.md | 909 | ||||
-rw-r--r-- | toolkit/components/glean/docs/user/new_definitions_file.md | 116 |
7 files changed, 1920 insertions, 0 deletions
diff --git a/toolkit/components/glean/docs/user/geckoview_streaming_migration.md b/toolkit/components/glean/docs/user/geckoview_streaming_migration.md new file mode 100644 index 0000000000..6dede3e90e --- /dev/null +++ b/toolkit/components/glean/docs/user/geckoview_streaming_migration.md @@ -0,0 +1,262 @@ +# Migrating Telemetry Collected via Geckoview Streaming to Glean + +With Geckoview Streaming (GVST) having been deprecated, +this is a guide to migrating collections to [Glean][book-of-glean] +via [Firefox on Glean](../index.md). + +```{contents} +``` + +## Before we Begin + +You should familiarize yourself with the guide on +[Adding New Metrics to Firefox Desktop](./new_definitions_file.md). + +You should also read through the guide for +[Migrating Metrics from Legacy Telemetry to Glean](./migration.md). + +This guide assumes some basic familiarity with the above. +The [Glean book][book-of-glean] has a full API reference, as well. + +## Process + +There are 3 main steps: + +1. Move the metric definition and make necessary updates +2. Update the call to use the Glean API +3. Update the tests to use the Glean test API + +### Move and Update the Metric Definition + +Existing metrics that make use of the GVST will already have a fully specified YAML +entry that we will use as a starting point. +This is convenient, but we want to make some minor updates rather than take it fully as is. +At a minimum we need to move the definition out of the +[Geckoview metrics.yaml][gv-metrics-yaml] file and change from GVST to [GIFFT](./gifft.md). +It can go into whichever metrics.yaml file you feel is most appropriate. +If an appropriate one does not exist, create a new one [following this guide][new-yaml]. +Completely remove the metric definition from the Geckoview `metrics.yaml`. + +For all metric types other than `labeled counters` the first step is to change the key +of the `gecko_datapoint` entry to `telemetry_mirror`. 
+Next, update the value as per the rules outlined in the [GIFFT guide][telemetry-mirror-doc]. +This change is required to keep data flowing in the Legacy Telemetry version of the metric. +Doing so will ensure that downstream analyses do not break unintentionally. +It is not necessary to modify the [Histograms.json][histograms-json] or +[Scalars.yaml][scalars-yaml] file. + +To migrate `labeled counters`, instead fully remove the `gecko_datapoint` entry. +Note that our overall treatment of this type is slightly different. + +Next, add the bug that covers the migration to the `bugs` field. +Update the `description` field as well to indicate the metric used to be collected +via the Geckoview Streaming API. Other fields should be updated as makes sense. +For example, you may need to update the field for `notification_emails`. +Since you are not changing the collection, a new data review is not necessary. +However, if you notice a metric is set to expire soon and it should continue to be collected, +complete a [data review renewal][dr-renewal] form. + +Do not change the name or metric type. +**If you need to change one or both, you are creating a new collection.** + +### Update Calls to Use the Glean API + +The next step is to update the metric calls to the Glean API. +Fortunately, for the vast majority of metrics this is a 1:1 swapout, +or for `labeled counters` (which are `categorical histograms` in legacy) we add a second call. +We identify the Glean API, remove the old call, and put in its place the Glean call. +You can find a full API reference in the [Glean book][book-of-glean], but we'll look at how to record +values for the types that have existing GVST metrics. + +One way to mentally organize the metrics is to break them into two groups, those that are set, +and those that are accumulated to. As when you use the Legacy Telemetry API for GVST, +they are invoked slightly differently. + +To record in C++, you need to include `#include "mozilla/glean/GleanMetrics.h"`. 
+In Javascript, it is extremely unlikely that you will not already have access to `Glean`. +If you do not, please reach out to a Data Collection Tools team member on +[the #glean:mozilla.org Matrix channel](https://chat.mozilla.org/#/room/#glean:mozilla.org). + +Let's try a few examples. + +#### Migrating a Set Value (string) in C++ + +Let's look at the case of the Gecko engine version. +This is a String, +and it's recorded via GVST in C++. + +GVST YAML entry (slightly truncated): + +```YAML +geckoview: + version: + description: > + The version of the Gecko engine, example: 74.0a1 + type: string + gecko_datapoint: gecko.version +``` + +And this is recorded: + +```CPP +Telemetry::ScalarSet(Telemetry::ScalarID::GECKO_VERSION, + NS_ConvertASCIItoUTF16(gAppData->version)); +``` + +To migrate this, let's update our YAML entry, again moving it out of the GVST +metrics.yaml into the most appropriate one: + +```YAML +geckoview: + version: + description: > + The version of the Gecko engine, example: 74.0a1 + (Migrated from the Geckoview Streaming metric geckoview.version) + type: string + telemetry_mirror: GECKO_VERSION +``` + +Notice how we've checked all of our boxes: + +* Made sure our category makes sense +* Changed gecko_datapoint to telemetry_mirror +* Updated the description to note that it was migrated from another collection +* Kept the type identical. + +Now we can update our call: + +```CPP +mozilla::glean::geckoview::version.Set(nsDependentCString(gAppData->version)); +``` + +#### Migrating a Labeled Counter in C++ + +Let's look at probably the most complicated scenario, one where we need to accumulate +to a labeled collection. Because the Glean `labeled counter` and legacy `categorical histogram` +types do not support GIFFT, we will add a second call. 
+Let's take a look at an elided version of how this would be done with GVST: + +```CPP +switch (aResult.as<NonDecoderResult>()) { + case NonDecoderResult::SizeOverflow: + AccumulateCategorical(LABELS_AVIF_DECODE_RESULT::size_overflow); + return; + case NonDecoderResult::OutOfMemory: + AccumulateCategorical(LABELS_AVIF_DECODE_RESULT::out_of_memory); + return; + case NonDecoderResult::PipeInitError: + AccumulateCategorical(LABELS_AVIF_DECODE_RESULT::pipe_init_error); + return; +} +``` + +And we update it by adding a call with the FOG API: + +```CPP +switch (aResult.as<NonDecoderResult>()) { + case NonDecoderResult::SizeOverflow: + AccumulateCategorical(LABELS_AVIF_DECODE_RESULT::size_overflow); + mozilla::glean::avif::decode_result.EnumGet(avif::DecodeResult::eSizeOverflow).Add(); + return; + case NonDecoderResult::OutOfMemory: + AccumulateCategorical(LABELS_AVIF_DECODE_RESULT::out_of_memory); + mozilla::glean::avif::decode_result.EnumGet(avif::DecodeResult::eOutOfMemory).Add(); + return; + case NonDecoderResult::PipeInitError: + AccumulateCategorical(LABELS_AVIF_DECODE_RESULT::pipe_init_error); + mozilla::glean::avif::decode_result.EnumGet(avif::DecodeResult::ePipeInitError).Add(); + return; +} +``` + +#### Migrating an Accumulated Value (Histogram) in Javascript + +Javascript follows the same pattern. Consider the case when we want to record the number +of unique site origins.
Here's the original JS implementation: + +```Javascript +let originCount = this.computeSiteOriginCount(aWindows, aIsGeckoView); +let histogram = Services.telemetry.getHistogramById( + "FX_NUMBER_OF_UNIQUE_SITE_ORIGINS_ALL_TABS", +); + +if (!this._lastRecordSiteOrigin) { + this._lastRecordSiteOrigin = currentTime; +} else if (currentTime >= this._lastRecordSiteOrigin + this.min_interval) { + this._lastRecordSiteOrigin = currentTime; + + histogram.add(originCount); +} +``` + +And here is the direct Glean version: + +```Javascript +let originCount = this.computeSiteOriginCount(aWindows, aIsGeckoView); + +if (!this._lastRecordSiteOrigin) { + this._lastRecordSiteOrigin = currentTime; +} else if (currentTime >= this._lastRecordSiteOrigin + this.min_interval) { + this._lastRecordSiteOrigin = currentTime; + + Glean.tabs.uniqueSiteOriginsAllTabs.accumulateSamples([originCount]); +} + +``` + +Note that we don't have to call into Services to get the histogram object. + +### Update the tests to use the Glean test API + +The last piece is updating tests. If tests don't exist +(which is often the case since testing metrics collected via GVST can be challenging), +we recommend that you write them, as +[the Glean test API is quite straightforward](./instrumentation_tests.md). + +The main test method is `testGetValue()`. 
Returning to our earlier example of +Number of Unique Site Origins, in a Javascript test we can invoke: + +```Javascript +let result = Glean.tabs.uniqueSiteOriginsAllTabs.testGetValue(); + +// This collection is a histogram, we can check the sum for this test +Assert.equal(result.sum, 144); +``` + +If your collection is in a child process, it can be helpful to invoke +`await Services.fog.testFlushAllChildren();` + +If you wish to write a C++ test, `testGetValue()` is also our main method: + +```CPP +#include "mozilla/glean/GleanMetrics.h" + +ASSERT_EQ(1, + mozilla::glean::avif::image_decode_result + .EnumGet(avif::DecodeResult::eSizeOverflow) + .TestGetValue() + .unwrap() + .ref()); + +ASSERT_EQ(3, + mozilla::glean::avif::image_decode_result + .EnumGet(avif::DecodeResult::eOutOfMemory) + .TestGetValue() + .unwrap() + .ref()); + +ASSERT_EQ(0, + mozilla::glean::avif::image_decode_result + .EnumGet(avif::DecodeResult::ePipeInitError) + .TestGetValue() + .unwrap() + .ref()); +``` + +[book-of-glean]: https://mozilla.github.io/glean/book/index.html +[gv-metrics-yaml]: https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/geckoview/streaming/metrics.yaml +[histograms-json]: https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/Histograms.json +[scalars-yaml]: https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/Scalars.yaml +[new-yaml]: ./new_definitions_file.md#where-do-i-define-new-metrics-and-pings +[dr-renewal]: https://github.com/mozilla/data-review/blob/main/renewal_request.md +[telemetry-mirror-doc]: https://firefox-source-docs.mozilla.org/toolkit/components/glean/user/gifft.html#the-telemetry-mirror-property-in-metrics-yaml diff --git a/toolkit/components/glean/docs/user/getting_started.md b/toolkit/components/glean/docs/user/getting_started.md new file mode 100644 index 0000000000..50b955bced --- /dev/null +++ b/toolkit/components/glean/docs/user/getting_started.md @@ -0,0 +1,97 @@ +# Getting Started 
with Firefox on Glean (FOG) + +This documentation is designed to be helpful to those who are +* New to data collection in Firefox Desktop, +* Experienced with data collection in Firefox Desktop, but not the Glean kind +* Those who are just interested in a refresher. + +## What is FOG? + +Firefox on Glean (FOG) is the library that brings +[the Glean SDK](https://mozilla.github.io/glean/book/index.html), +Mozilla's modern data collection system, +to Firefox Desktop. + +FOG's code is in `toolkit/components/glean` and is considered part of the +`Toolkit :: Telemetry` [module][modules]. +Bugs against FOG can be [filed][file-fog-bugs] +in Bugzilla in the `Toolkit` product and the `Telemetry` component. +(No bugs about adding new instrumentation, please. +You can file those in the components that you want instrumented.) +You can find folks who can help answer your questions about FOG in +* [#glean:mozilla.org](https://chat.mozilla.org/#/room/#glean:mozilla.org) +* [#telemetry:mozilla.org](https://chat.mozilla.org/#/room/#telemetry:mozilla.org) +* Slack#data-help + +On top of the usual things Glean embedders supply +(user engagement monitoring, network upload configuration, data upload preference watching, ...) +FOG supplies Firefox-Desktop-specific things: +* Privileged JS API +* C++ API +* IPC +* Test Preferences +* Support for `xpcshell`, browser-chrome mochitests, GTests, and rusttests +* `about:glean` +* ...and more. + +## What do I need to know about Glean? + +You use the APIs supplied by the Glean SDK to instrument Mozilla projects. + +The unit of instrumentation is the **metric**. +Recording the number of times a user opens a new tab? That's a metric. +Timing how long each JS garbage collector pass takes? Also a metric. + +Glean has documentation about +[how to add a new metric][add-a-metric] +that you should follow to learn how to add a metric to instrument Firefox Desktop. 
+There are some [peculiarities specific to Firefox Desktop](new_definitions_file) +that you'll wish to review as well. +Don't forget to get [Data Collection Review][data-review] +for any new or expanded data collections in mozilla projects. + +By adding a metric you've told the Glean SDK what shape of instrumentation you want. +And by using the metric's APIs to instrument your code, +you've put your interesting data into that metric. +But how does the data leave Firefox Desktop and make it to Mozilla's Data Pipeline? + +Batches of related metrics are collected into **pings** +which are submitted according to their specific schedules. +If you don't say otherwise, any non-`event`-metric will be sent in the +[built-in Glean "metrics" ping][metrics-ping] about once a day. +(`event` metrics are sent in [the "events" ping][events-ping] +more frequently than that). + +With data being sent to Mozilla's Data Pipeline, how do you analyse it? + +That's an impossible question to answer completely without knowing a _lot_ about what questions you want to answer. +However, in general, if you want to see what data is being collected by your instrumentation, +[go to its page in Glean Dictionary][glean-dictionary] +and you'll find links and information there about how to proceed. + +## Where do I learn more? + +Here in the [FOG User Documentation](./index) you will find FOG-specific details like +[how to write instrumentation tests](instrumentation_tests), or +[how to use Glean APIs to mirror data to Telemetry](gifft). + +Most of what you should have to concern yourself with, as an instrumentor, +is documented in [the Book of Glean](https://mozilla.github.io/glean/book/index.html). +Such as its [illuminating glossary][glean-glossary], +the [list of all metric types][metrics-types], +or the index of our long-running blog series [This Week in Glean][twig-index]. 
+ +And for anything else you need help with, please find us in +[#glean:mozilla.org](https://chat.mozilla.org/#/room/#glean:mozilla.org). +We'll be happy to help you learn more about FOG and Glean. + +[add-a-metric]: https://mozilla.github.io/glean/book/user/metrics/adding-new-metrics.html +[metrics-ping]: https://mozilla.github.io/glean/book/user/pings/metrics.html +[events-ping]: https://mozilla.github.io/glean/book/user/pings/events.html +[modules]: https://wiki.mozilla.org/Modules/All +[data-review]: https://wiki.mozilla.org/Data_Collection +[glean-dictionary]: https://dictionary.telemetry.mozilla.org/ +[glean-glossary]: https://mozilla.github.io/glean/book/appendix/glossary.html +[twig-index]: https://mozilla.github.io/glean/book/appendix/twig.html +[metrics-types]: https://mozilla.github.io/glean/book/reference/metrics/index.html +[file-fog-bugs]: https://bugzilla.mozilla.org/enter_bug.cgi?product=Toolkit&component=Telemetry diff --git a/toolkit/components/glean/docs/user/gifft.md b/toolkit/components/glean/docs/user/gifft.md new file mode 100644 index 0000000000..4c173884ce --- /dev/null +++ b/toolkit/components/glean/docs/user/gifft.md @@ -0,0 +1,241 @@ +# Glean Interface For Firefox Telemetry (GIFFT) + +To make Migration from Firefox Telemetry to Glean easier, +the C++ and JS Glean API can be configured +(on a metric-by-metric basis) +to mirror data collection to both the Glean metric and a Telemetry probe. + +GIFFT should ideally be used only when the data you require for analysis still mostly lives in Telemetry, +and should be removed promptly when no longer needed. +Instrumentors are encouraged to have the Telemetry mirror probe expire within six versions. +(As always you can renew an expiring probe if you're still using it, +but this will help us get closer to the time when we eventually turn Telemetry off.) + +**Note:** GIFFT only works for data provided via C++ or JS. 
+Rust Glean metrics APIs will not mirror to Telemetry as Telemetry does not have a Rust API. + +**Note:** Using the Glean API replaces the Telemetry API. +Do not use any mix of the two APIs for the same probe. + +## How to Mirror a Glean Metric to a Firefox Telemetry Probe + +For the mirror to work, you need three things: +* A compatible Glean metric (defined in a `metrics.yaml`) +* A compatible Telemetry probe + (defined in `Histograms.json`, `Scalars.yaml`, or `Events.yaml`) +* A `telemetry_mirror` property on the Glean metric definition identifying the Telemetry probe + +### Compatibility + +This compatibility table explains which Telemetry probe types can be mirrors for which Glean metric types: + +| Glean Metric Type | Telemetry Probe Type | +| ----------------- | --------------------- | +| [boolean](https://mozilla.github.io/glean/book/reference/metrics/boolean.html) | [Scalar of kind: boolean](/toolkit/components/telemetry/collection/scalars.html) | +| [labeled_boolean](https://mozilla.github.io/glean/book/reference/metrics/labeled_booleans.html) | [Keyed Scalar of kind: boolean](/toolkit/components/telemetry/collection/scalars.html) | +| [counter](https://mozilla.github.io/glean/book/reference/metrics/counter.html) | [Scalar of kind: uint](/toolkit/components/telemetry/collection/scalars.html) | +| [labeled_counter](https://mozilla.github.io/glean/book/reference/metrics/labeled_counters.html) | [Keyed Scalar of kind: uint](/toolkit/components/telemetry/collection/scalars.html) | +| [string](https://mozilla.github.io/glean/book/reference/metrics/string.html) | [Scalar of kind: string](/toolkit/components/telemetry/collection/scalars.html) | +| [labeled_string](https://mozilla.github.io/glean/book/reference/metrics/labeled_strings.html) | *No Supported Telemetry Type* | +| [string_list](https://mozilla.github.io/glean/book/reference/metrics/string_list.html) | [Keyed Scalar of kind: boolean](/toolkit/components/telemetry/collection/scalars.html). 
The keys are the strings. The values are all `true`. Calling `Set` on the `string_list` is not mirrored (since there's no way to remove keys from a keyed scalar of kind boolean). Doing so will log a warning. | +| [timespan](https://mozilla.github.io/glean/book/reference/metrics/timespan.html) | [Scalar of kind: uint](/toolkit/components/telemetry/collection/scalars.html). The value is in units of milliseconds. | +| [timing_distribution](https://mozilla.github.io/glean/book/reference/metrics/timing_distribution.html) | [Histogram of kind "linear" or "exponential"](/toolkit/components/telemetry/collection/histograms.html#exponential). Samples will be in units of milliseconds. | +| [memory_distribution](https://mozilla.github.io/glean/book/reference/metrics/memory_distribution.html) | [Histogram of kind "linear" or "exponential"](/toolkit/components/telemetry/collection/histograms.html#exponential). Samples will be in `memory_unit` units. | +| [custom_distribution](https://mozilla.github.io/glean/book/reference/metrics/custom_distribution.html) | [Histogram of kind "linear" or "exponential"](/toolkit/components/telemetry/collection/histograms.html#exponential). Samples will be used as is. Ensure the bucket count and range match. | +| [uuid](https://mozilla.github.io/glean/book/reference/metrics/uuid.html) | [Scalar of kind: string](/toolkit/components/telemetry/collection/scalars.html). Value will be in canonical 8-4-4-4-12 format. Value is not guaranteed to be valid, and invalid values may be present in the mirrored scalar while the uuid metric remains empty. Calling `GenerateAndSet` on the uuid is not mirrored, and will log a warning. | +| [url](https://mozilla.github.io/glean/book/reference/metrics/url.html) | [Scalar of kind: string](/toolkit/components/telemetry/collection/scalars.html). The stringified URL will be cropped to the maximum length allowed by the legacy type. 
| +| [datetime](https://mozilla.github.io/glean/book/reference/metrics/datetime.html) | [Scalar of kind: string](/toolkit/components/telemetry/collection/scalars.html). Value will be in ISO8601 format. | +| [events](https://mozilla.github.io/glean/book/reference/metrics/event.html) | [Events](/toolkit/components/telemetry/collection/events.html). The `value` field will be left empty. | +| [quantity](https://mozilla.github.io/glean/book/reference/metrics/quantity.html) | [Scalar of kind: uint](/toolkit/components/telemetry/collection/scalars.html) | +| [rate](https://mozilla.github.io/glean/book/reference/metrics/rate.html) | [Keyed Scalar of kind: uint](/toolkit/components/telemetry/collection/scalars.html). The keys are "numerator" and "denominator". Does not work for `rate` metrics with external denominators. | +| [text](https://mozilla.github.io/glean/book/reference/metrics/text.html) | *No Supported Telemetry Type* | + +### The `telemetry_mirror` property in `metrics.yaml` + +You must use the C++ enum identifier of the Histogram, Scalar, or Event being mirrored to: +* For Histograms, the Telemetry C++ enum identifier is the histogram's name + * e.g. The C++ enum identifier for `WR_RENDERER_TIME` is + `WR_RENDERER_TIME` (see {searchfox}`gfx/metrics.yaml`) +* For Scalars, the Telemetry C++ enum identifier is the Scalar category and name in + `SCREAMING_SNAKE_CASE` with any `.` replaced with `_` + * e.g. The enum identifier for `extensions.startupCache.load_time` is + `EXTENSIONS_STARTUPCACHE_LOAD_TIME` (see {searchfox}`toolkit/components/extensions/metrics.yaml`) +* For Events, the Telemetry C++ enum identifier is the Event category, method, and object + rendered in `Snakey_CamelCase`. + * e.g. The enum identifier for `page_load.toplevel#content` is + `Page_load_Toplevel_Content` (see {searchfox}`dom/metrics.yaml`) + +If you use the wrong enum identifier, this will manifest as a build error. 
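
As a rough illustration of the naming rules above, the sketch below derives an enum identifier from a probe's name. It is not the code the build uses: the function names `scalarEnumId` and `eventEnumId` are hypothetical, and the event rule is only the simplest one that reproduces the example given here (capitalize the first letter of the category, method, and object, then join them with `_`). The generated `Telemetry*Enums.h` headers remain the authority.

```javascript
// Hypothetical helpers; they only mirror the naming rules described in this section.

// Scalars: category and name in SCREAMING_SNAKE_CASE, with every `.` replaced by `_`.
function scalarEnumId(scalarName) {
  return scalarName.replace(/\./g, "_").toUpperCase();
}

// Events: category, method, and object joined by `_`,
// each with its first letter capitalized (an assumption based on the example above).
function eventEnumId(category, method, object) {
  const capitalize = s => s.charAt(0).toUpperCase() + s.slice(1);
  return [category, method, object].map(capitalize).join("_");
}

console.log(scalarEnumId("extensions.startupCache.load_time"));
console.log(eventEnumId("page_load", "toplevel", "content"));
```

Names with more complex structure may be rendered differently by the real generator, so treat the output as a first guess to check against the build.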
+ +If you are having trouble finding the correct conjugation for the mirror Telemetry probe, +you can find the specific value in the list of all Telemetry C++ enum identifiers in +`<objdir>/toolkit/components/telemetry/Telemetry{Histogram|Scalar|Event}Enums.h`. +(Choose the file appropriate to the type of the Telemetry mirror.) + +## Artifact Build Support + +Sadly, GIFFT does not support Artifact builds. +You must build Firefox when you add the mirrored metric so the C++ enum value is present, +even if you only use the metric from Javascript. + +## Analysis Gotchas + +Firefox Telemetry and the Glean SDK are very different. +Though GIFFT bridges the differences as best it can, +there are many things it cannot account for. + +These are a few of the ways that differences between Firefox Telemetry and the Glean SDK might manifest as anomalies during analysis. + +### Processes, Products, and Channels + +Like Firefox on Glean itself, +GIFFT doesn't know what process, product, or channel it is recording in. +Telemetry does, and imposes restrictions on which probes can be recorded to and when. + +Ensure that the following fields in any Telemetry mirror's definition aren't too restrictive for your use: +* `record_in_processes` +* `products` +* `release_channel_collection`/`releaseChannelCollection` + +A mismatch won't result in an error. +If you, for example, +record to a Glean metric in a release channel that the Telemetry mirror probe doesn't permit, +then the Glean metric will have a value and the Telemetry mirror probe won't. + +Also recall that Telemetry probes split their values across processes. +[Glean metrics do not](../dev/ipc.md). +This may manifest as curious anomalies when comparing the Glean metric to its Telemetry mirror probe. +Ensure your analyses are aggregating Telemetry values from all processes, +or define and use process-specific Glean metrics and Telemetry mirror probes to keep things separate. 
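
When comparing a Glean metric with its Telemetry mirror, one way to account for the per-process split is to sum the Telemetry values over every process first. The sketch below assumes a snapshot object keyed by process name and then by probe name, like the per-process scalar snapshots Telemetry exposes in tests. The helper name `sumScalarAcrossProcesses`, the probe name, and the sample data are all hypothetical.

```javascript
// Hypothetical helper: sums one uint scalar over every process in a snapshot.
// `snapshot` is assumed to look like { parent: { "probe.name": 3 }, content: { ... } }.
function sumScalarAcrossProcesses(snapshot, scalarName) {
  let total = 0;
  for (const processData of Object.values(snapshot)) {
    // A process that never recorded the scalar simply has no entry for it.
    total += processData[scalarName] ?? 0;
  }
  return total;
}

// Made-up sample data standing in for a real snapshot.
const sampleSnapshot = {
  parent: { "example.category.some_count": 2 },
  content: { "example.category.some_count": 5 },
  gpu: {},
};

console.log(sumScalarAcrossProcesses(sampleSnapshot, "example.category.some_count"));
```

The Glean side needs no equivalent step, since Glean metrics are not split by process.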
+ +### Pings + +Glean and Telemetry both send their built-in pings on their own schedules. +This means the values present in these pings may not agree since they reflect state at different times. + +For example, if you are measuring "Number of Monitors" with a +[`quantity`](https://mozilla.github.io/glean/book/reference/metrics/quantity.html) +sent by default in the Glean "metrics" ping mirrored to a +[Scalar of kind: uint](/toolkit/components/telemetry/collection/scalars.rst) +sent by default in the Telemetry "main" ping, +then if the user plugs in a second monitor between midnight +(when Telemetry "main" pings with reason "daily" are sent) and 4AM +(when Glean "metrics" pings with reason "today" are sent), +the value in the `quantity` will be `2` +while the value in the Scalar of kind: uint will be `1`. + +If the metric or mirrored probe is sent in Custom pings, +the schedules could line up exactly or be entirely unrelated. + +### Labels + +Labeled metrics supported by GIFFT +(`labeled_boolean` and `labeled_counter`) +adhere to the Glean SDK's +[label format](https://mozilla.github.io/glean/book/reference/metrics/index.html#label-format). + +Keyed Scalars, on the other hand, do not have a concept of an "Invalid key". +Firefox Telemetry will accept just about any sequence of bytes as a key. + +This means that a label deemed invalid by the Glean SDK may appear in the mirrored probe's data. +For example, using 72 "1" characters as a label that doesn't conform to the format +(it is longer than 71 printable ASCII characters). 
+See that the `labeled_boolean` metric +[correctly ascribes it to `__other__`](https://mozilla.github.io/glean/book/reference/metrics/index.html#labeled-metrics) +whereas the mirrored Keyed Scalar with kind boolean stores and retrieves it without change: +```js +Glean.testOnly.mirrorsForLabeledBools["1".repeat(72)].set(true); +Assert.equal(true, Glean.testOnly.mirrorsForLabeledBools.__other__.testGetValue()); +// The above actually throws NS_ERROR_LOSS_OF_SIGNIFICANT_DATA because it also records +// an invalid_label error. But you get the idea. +let snapshot = Services.telemetry.getSnapshotForKeyedScalars().parent; +Assert.equal(true, snapshot["telemetry.test.mirror_for_labeled_bool"]["1".repeat(72)]); +``` + +### Telemetry Events + +A Glean event can be mirrored to a Telemetry Event. +Telemetry Events must be enabled before they can be recorded to via the API +`Telemetry.setEventRecordingEnabled(category, enable);`. +If the Telemetry Event isn't enabled, +recording to the Glean event will still work, +and the event will be Summarized in Telemetry as all disabled events are. + +See +[the Telemetry Event docs](/toolkit/components/telemetry/collection/events.rst) +for details on how disabled Telemetry Events behave. + +### Numeric Values + +The arguments and storage formats for Glean's numeric types +(`counter`, `labeled_counter`, `quantity`, `rate`, and `timespan`) +are different from Telemetry's numeric type +(Scalar of kind `uint`). + +This results in a few notable differences. + +#### Saturation and Overflow + +`counter`, `labeled_counter`, and `rate` metrics are stored as 32-bit signed values. +`quantity` metrics are stored as 64-bit signed values. +`timing_distribution` samples can be 64-bit signed values. +All of these Glean numeric metric types saturate at their maximum representable value, +or according to the Limits section of the Glean metric type documentation. + +Scalars of kind `uint` are stored as 32-bit unsigned values. 
+They will overflow if they exceed the value $2^{32} - 1$. + +If a Glean numeric type saturates, it will record an error of type `invalid_overflow`. +In your analyses, please check for these errors. + +#### Quantity Value Over-size + +Values greater than $2^{32} - 1$ passed to a `quantity` metric's +`set()` method will be clamped to $2^{32} - 1$ before being passed to the metric's Telemetry mirror. + +#### Negative Values + +Values less than 0 passed to any numeric metric type's API will not be passed on to the Telemetry mirror. +This avoids small negative numbers being cast into stunningly large numbers, +and keeps the Telemetry mirror's value closer to that of the Glean metric. + +#### Long Time Spans + +If the number of milliseconds between calls to a +`timespan` metric's `start()` and `stop()` methods exceeds $2^{32} - 1$, +the value passed to the metric's Telemetry mirror will be clamped to $2^{32} - 1$. + +The same happens for samples in `timing_distribution` metrics: +values passed to the Telemetry mirror histogram will saturate at $2^{32} - 1$ +until they get past $2^{64}$ when they'll overflow. + +#### `timing_distribution` mirrors: Samples and Sums might be Different + +A specific value in a `timing_distribution` metric will not always agree with +the corresponding value in its mirrored-to histogram. +Though the calls to the clock are very close together in the code in Telemetry and Glean, +Telemetry's are not on the exact same instruction as Glean's _and_ +Telemetry uses a different clock source (`TimeStamp::Now()`) than Glean (`time::precise_time_ns()`). + +Also, if these slight drifts happen to cross the boundary of a bucket in either system, +samples might end up looking more different than you'd expect. + +This shouldn't affect analysis, but it can affect testing, so please +[bear this difference in mind](./instrumentation_tests.md#general-things-to-bear-in-mind) +in testing. 
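
The clamping rules from the Numeric Values subsections above, as they apply to a value set on a `quantity` or measured by a `timespan`, can be summarized in a small model. This is only an illustration of the documented behaviour, not GIFFT's implementation. The function name `valueSentToUintMirror` is hypothetical, and returning `null` is just this sketch's way of saying that nothing is passed to the mirror.

```javascript
// Largest value a Scalar of kind `uint` can hold: 2^32 - 1.
const UINT32_MAX = 2 ** 32 - 1;

// Hypothetical model of what reaches a `uint` Telemetry mirror for a given Glean value.
function valueSentToUintMirror(value) {
  if (value < 0) {
    // Negative values are not passed on to the Telemetry mirror at all.
    return null;
  }
  // Over-size values are clamped to the mirror's maximum.
  return Math.min(value, UINT32_MAX);
}

console.log(valueSentToUintMirror(-1));
console.log(valueSentToUintMirror(42));
console.log(valueSentToUintMirror(2 ** 40));
```

A test that compares a Glean value with its mirror can use a model like this to decide what the mirror is expected to hold, rather than asserting strict equality.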
+ +### App Shutdown + +Telemetry only works up to +[`ShutdownPhase::AppShutdownTelemetry` aka `profile-before-change-telemetry`][app-shutdown]. +Telemetry data recorded after that phase just aren't persisted. + +FOG _presently_ shuts down Glean in a later phase, +and so is able to collect data deeper into shutdown. +(The particular phase is not presently something anyone's asked us to guarantee, +so that's why I'm not being precise.) + +What this means is that, for data recorded later in shutdown, +Glean will report more complete information than Telemetry will. + +[app-shutdown]: https://searchfox.org/mozilla-central/source/xpcom/base/AppShutdown.cpp#57 diff --git a/toolkit/components/glean/docs/user/index.md b/toolkit/components/glean/docs/user/index.md new file mode 100644 index 0000000000..f5ddf6d9e3 --- /dev/null +++ b/toolkit/components/glean/docs/user/index.md @@ -0,0 +1,17 @@ +# Using Firefox on Glean + +This section of docs is designed to be helpful to people instrumenting Firefox Desktop. +You may wish to begin with the [Getting Started](getting_started.md) docs. +Or, if you're already acquainted with Glean concepts and what FOG is, +you might want to know [how to migrate a piece of Firefox Telemetry to Glean](migration.md). + +```{toctree} +:titlesonly: +:maxdepth: 1 +:glob: + +getting_started +new_definitions_file +* +Glean SDK Documentation <https://mozilla.github.io/glean/book/index.html> +``` diff --git a/toolkit/components/glean/docs/user/instrumentation_tests.md b/toolkit/components/glean/docs/user/instrumentation_tests.md new file mode 100644 index 0000000000..e457e9fd39 --- /dev/null +++ b/toolkit/components/glean/docs/user/instrumentation_tests.md @@ -0,0 +1,278 @@ +# Writing Instrumentation Tests + +```{admonition} Old Glean Proverb +If it's important enough to be instrumented, it's important enough to be tested. +``` + +All metrics and pings in the Glean SDK have [well-documented APIs for testing][glean-metrics-apis]. 
+You'll want to familiarize yourself with `TestGetValue()` +(here's [an example JS (xpcshell) test of some metrics][metrics-xpcshell-test]) +for metrics and +[`TestBeforeNextSubmit()`][test-before-next-submit] +(here's [an example C++ (gtest) test of a custom ping][ping-gtest]) +for pings. + +All test APIs are available in all three of FOG's supported languages: +Rust, C++, and JavaScript. + +But how do you get into a position where you can even call these test APIs? +How do they fit in with Firefox Desktop's testing frameworks? + +## Manual Testing and Debugging + +The Glean SDK has [debugging capabilities][glean-debug] +for manually verifying that instrumentation makes it to Mozilla's Data Pipeline. +Firefox Desktop supports these via environment variables _and_ +via the interface on `about:glean`. + +This is all well and good for getting a good sense check that things are going well _now_, +but in order to check that everything stays good through the future, +you're going to want to write some automated tests. + +## General Things To Bear In Mind + +* You may see values from previous tests persist across tests because the profile directory was shared between test cases. + * You can reset Glean before your test by calling + `Services.fog.testResetFOG()` (in JS). + * If your instrumentation isn't on the parent process, + you should call `await Services.fog.testFlushAllChildren()` before `testResetFOG`. + That will ensure all pending data makes it to the parent process to be cleared. + * You shouldn't have to do this in C++ or Rust since there you should use the + `FOGFixture` test fixture. +* If your metric is based on timing (`timespan`, `timing_distribution`), + do not expect to be able to assert the correct timing value. + Glean does a lot of timing for you deep in the SDK, so unless you mock the system's monotonic clock, + do not expect the values to be predictable. + * Instead, check that a value is `> 0` or that the number of samples is expected. 
+ * You might be able to assert that the value is at least as much as a known, timed value, + but beware of rounding. + * If your metric is a `timing_distribution` mirroring to a Telemetry probe via [GIFFT](./gifft.md), + there may be [small observed differences between systems](./gifft.md#timing-distribution-mirrors-samples-and-sums-might-be-different) + that can cause equality assertions to fail. +* Errors in instrumentation APIs do not panic, throw, or crash. + But Glean remembers that the errors happened. + * Test APIs, on the other hand, are permitted + (some may say "encouraged") + to panic, throw, or crash on bad behaviour. + * If you call a test API and it panics, throws, or crashes, + that means your instrumentation did something wrong. + Check your test logs for details about what went awry. + +### Tests and Artifact Builds + +Artifact build support is provided by [the JOG subsystem](../dev/jog). +It is able to register the latest versions of all metrics and pings at runtime. +However, the compiled code is still running against the +version of those metrics and pings that was current at the time the artifacts were compiled. + +This isn't a problem unless: +* You are changing a metric or ping that is used in instrumentation in the compiled code, or +* You are using `testBeforeNextSubmit` in JavaScript for a ping submitted in the compiled code. + +When in doubt, simply test your new test in artifact mode +(by e.g. passing `--enable-artifact-builds` to `mach try`) +before submitting it. +If it doesn't pass in artifact mode because of one of these two cases, +you may need to skip your test whenever FOG's artifact build support is enabled: +* xpcshell: +```js +add_task( + { skip_if: () => Services.prefs.getBoolPref("telemetry.fog.artifact_build", false) }, + function () { + // ... your test ... 
+ } +); +``` +* mochitest: +```js +add_task(function () { + if (Services.prefs.getBoolPref("telemetry.fog.artifact_build", false)) { + Assert.ok(true, "Test skipped in artifact mode."); + return; + } + // ... your test ... +}); +``` + +## The Usual Test Format + +Instrumentation tests tend to follow the same three-part format: +1) Assert no value in the metric +2) Express behaviour +3) Assert correct value in the metric + +Your choice of test suite will depend on how the instrumented behaviour can be expressed. + + +## `xpcshell` Tests + +If the instrumented behaviour is on the main or content process and can be called from privileged JS, +`xpcshell` is an excellent choice. + +`xpcshell` is so minimal an environment, however, that +(pending [bug 1756055](https://bugzilla.mozilla.org/show_bug.cgi?id=1756055)) +you'll need to manually tell it you need two things: +1) A profile directory +2) An initialized FOG + +```js +/* Any copyright is dedicated to the Public Domain. + http://creativecommons.org/publicdomain/zero/1.0/ */ + +"use strict"; + +add_setup(function test_setup() { + // FOG needs a profile directory to put its data in. + do_get_profile(); + + // FOG needs to be initialized in order for data to flow. + Services.fog.initializeFOG(); +}); +``` + +From there, just follow The Usual Test Format: + +```js +add_task(function test_instrumentation() { + // 1) Assert no value + Assert.equal(undefined, Glean.myMetricCategory.myMetricName.testGetValue()); + + // 2) Express behaviour + // ...<left as an exercise to the reader>... + + // 3) Assert correct value + Assert.equal(kValue, Glean.myMetricCategory.myMetricName.testGetValue()); +}); +``` + +If your new instrumentation includes a new custom ping, +there are two small additions to The Usual Test Format: + +* 1.1) Call `testBeforeNextSubmit` _before_ your ping is submitted. + The callback you register in `testBeforeNextSubmit` + is called synchronously with the call to the ping's `submit()`. 
+* 3.1) Check that the ping actually was submitted. + If all your Asserts are inside `testBeforeNextSubmit`'s closure, + another way this test could pass is by not running any of them. + +```js +add_task(function test_custom_ping() { + // 1) Assert no value + Assert.equal(undefined, Glean.myMetricCategory.myMetricName.testGetValue()); + + // 1.1) Set up Step 3. + let submitted = false; + GleanPings.myPing.testBeforeNextSubmit(reason => { + submitted = true; + // 3) Assert correct value + Assert.equal(kExpectedReason, reason, "Reason of submitted ping must match."); + Assert.equal(kExpectedMetricValue, Glean.myMetricCategory.myMetricName.testGetValue()); + }); + + // 2) Express behaviour that sends a ping with expected reason and contents + // ...<left as an exercise to the reader>... + + // 3.1) Check that the ping actually was submitted. + Assert.ok(submitted, "Ping was submitted, callback was called."); +}); +``` + +(( We acknowledge that this isn't the most ergonomic form. +Please follow +[bug 1756637](https://bugzilla.mozilla.org/show_bug.cgi?id=1756637) +for updates on a better design and implementation for ping tests. )) + +## mochitest + +`browser-chrome`-flavoured mochitests can be tested very similarly to `xpcshell`, +though you do not need to request a profile or initialize FOG. +`plain`-flavoured mochitests aren't yet supported (follow +[bug 1799977](https://bugzilla.mozilla.org/show_bug.cgi?id=1799977) +for updates and a workaround). + +If you're testing in `mochitest`, your instrumentation (or your test) +might not be running in the parent process. +This means you get to learn the IPC test APIs. + +### IPC + +All test APIs must be called on the main process +(they'll assert otherwise). +But your instrumentation might be on any process, so how do you test it? 
+ +In this case there's a slight addition to the Usual Test Format: +1) Assert no value in the metric +2) Express behaviour +3) _Flush all pending FOG IPC operations with `await Services.fog.testFlushAllChildren()`_ +4) Assert correct value in the metric. + +**NOTE:** We learned in +[bug 1843178](https://bugzilla.mozilla.org/show_bug.cgi?id=1843178) +that the list of all content processes that `Services.fog.testFlushAllChildren()` +uses is very quickly updated after the end of a call to `BrowserUtils.withNewTab(...)`. +If you are using `withNewTab`, you should consider calling `testFlushAllChildren()` +_within_ the callback. + +## GTests/Google Tests + +Please make use of the `FOGFixture` fixture when writing your tests, like: + +```cpp +TEST_F(FOGFixture, MyTestCase) { + // 1) Assert no value + ASSERT_EQ(mozilla::Nothing(), + my_metric_category::my_metric_name.TestGetValue()); + + // 2) Express behaviour + // ...<left as an exercise to the reader>... + + // 3) Assert correct value + ASSERT_EQ(kValue, + my_metric_category::my_metric_name.TestGetValue().unwrap().ref()); +} +``` + +The fixture will take care of ensuring storage is reset between tests. + +## Rust `rusttests` + +The general-purpose +[Testing & Debugging Rust Code in Firefox](/testing-rust-code/index) +is a good thing to review first. + +Unfortunately, FOG requires gecko +(to tell it where the profile dir is, and other things), +which means we need to use the +[GTest + FFI approach](/testing-rust-code/index.md#gtests) +where GTest is the runner and Rust is just the language the test is written in. 
+ +This means your test will look like a GTest like this: + +```cpp +extern "C" void Rust_MyRustTest(); +TEST_F(FOGFixture, MyRustTest) { Rust_MyRustTest(); } +``` + +Plus a Rust test like this: + +```rust +#[no_mangle] +pub extern "C" fn Rust_MyRustTest() { + // 1) Assert no value + assert_eq!(None, + fog::metrics::my_metric_category::my_metric_name.test_get_value(None)); + + // 2) Express behaviour + // ...<left as an exercise to the reader>... + + // 3) Assert correct value + assert_eq!(Some(value), + fog::metrics::my_metric_category::my_metric_name.test_get_value(None)); +} +``` + +[glean-metrics-apis]: https://mozilla.github.io/glean/book/reference/metrics/index.html +[metrics-xpcshell-test]: https://searchfox.org/mozilla-central/rev/66e59131c1c76fe486424dc37f0a8a399ca874d4/toolkit/mozapps/update/tests/unit_background_update/test_backgroundupdate_glean.js#28 +[ping-gtest]: https://searchfox.org/mozilla-central/rev/66e59131c1c76fe486424dc37f0a8a399ca874d4/toolkit/components/glean/tests/gtest/TestFog.cpp#232 +[test-before-next-submit]: https://mozilla.github.io/glean/book/reference/pings/index.html#testbeforenextsubmit +[glean-debug]: https://mozilla.github.io/glean/book/reference/debug/index.html diff --git a/toolkit/components/glean/docs/user/migration.md b/toolkit/components/glean/docs/user/migration.md new file mode 100644 index 0000000000..f3c86a182b --- /dev/null +++ b/toolkit/components/glean/docs/user/migration.md @@ -0,0 +1,909 @@ +# Migrating Firefox Telemetry to Glean + +This guide aims to help you migrate individual data collections from +[Firefox Telemetry](/toolkit/components/telemetry/index.rst) +to +[Glean][book-of-glean] via [Firefox on Glean](../index.md). + +This is intended to be a reference to help you fill out your +[migration worksheet][migration-worksheet], +or for mentally translating Telemetry concepts to Glean ones. 
+ +```{contents} +``` + +## General Things To Bear In Mind + +You should familiarize yourself with +[the guide on adding new metrics to Firefox Desktop](new_definitions_file.md). +Its advice stacks with the advice included in this guide as +(once you've figured out what kind) you will indeed be adding new metrics. + +There are some other broad topics specific to migrating Firefox Telemetry stuff to Glean stuff: + +### Process-Agnosticism: No more `record_in_processes` field + +Glean (and thus FOG) [doesn't know anything about processes][ipc-dev-doc] +except what it has to in order to ensure all the data makes it to the parent process. +Firefox Telemetry cared very much about which process was collecting which specific data, +keeping them separate. + +If you collect data in multiple processes and wish to keep data from each process type separate, +you will need to provide this separation yourself. + +Please see [this dev doc][ipc-dev-doc] for an example of how to do that. + +### Channel-Agnosticism: No more `release_channel_collection: opt-out` + +FOG doesn't make a differentiation between pre-release Firefox and release Firefox, +except inasmuch as is necessary to put the correct channel in `client_info.app_channel`. + +This means all data is collected in all build configurations. + +If you wish or are required to only collect your data in pre-release Firefox, +please avail yourself of the `EARLY_BETA_OR_EARLIER` `#define` or `AppConstant`. + +### File-level Product Inclusion/Exclusion: No more `products` field + +Glean determines which metrics are recorded in which products via +[a dependency tree][repositories-yaml]. +This means FOG doesn't distinguish between products at the per-product level. + +If some of your metrics are recorded in different sets of products +(e.g. 
some of your metrics are collected in both Firefox Desktop _and_ Firefox for Android, +but others are Firefox Desktop-specific) +you must separate them into separate [definitions files](new_definitions_file.md). + +### Many Definitions Files + +Each component is expected to own and care for its own +[metrics definitions files](new_definitions_file.md). +There is no centralized `Histograms.json` or `Scalars.yaml` or `Events.yaml`. + +Instead the component being instrumented will have its own `metrics.yaml` +(and `pings.yaml` for any [Custom Pings][custom-pings]) +in which you will define the data. + +See [this guide](new_definitions_file.md) for details. + +### Testing + +Firefox Telemetry had very uneven support for testing instrumentation code. +FOG has much better support. Anywhere you can instrument is someplace you can test. + +It's as simple as calling `testGetValue`. + +All migrated collections are expected to be tested. +If you can't test them, then you'd better have an exceptionally good reason why not. + +For more details, please peruse the +[instrumentation testing docs](instrumentation_tests). + +## Which Glean Metric Type Should I Use? + +Glean uses higher-level metric types than Firefox Telemetry does. +This complicates migration as something that is "just a number" +in Firefox Telemetry might map to any number of Glean metric types. + +Please choose the most specific metric type that solves your problem. +This'll make analysis easier as +1. Others will know more about how to analyse the metric from more specific types. +2. Tooling will be able to present only relevant operations for more specific types. + +Example: +> In Firefox Telemetry I record the number of monitors attached to the computer that Firefox Desktop is running on. +> I could record this number as a [`string`][string-metric], a [`counter`][counter-metric], +> or a [`quantity`][quantity-metric]. +> The `string` is an obvious trap. 
It doesn't even have the correct data type (string vs number). +> But is it a `counter` or `quantity`? +> If you pay attention to this guide you'll learn that `counter`s are used to accumulate sums of information, +> whereas `quantity` metrics are used to record specific values. +> The "sum" of monitors over time doesn't make sense, so `counter` is out. +> `quantity` is the correct choice. + +## Histograms + +[Histograms][telemetry-histograms] +are the oldest Firefox Telemetry data type, and as such they've accumulated +([ha!][histogram-accumulate]) the most ways of being used. + +### Scalar Values in Histograms: kind `flag` and `count` + +If you have a Histogram that records exactly one value, +please scroll down and look at the migration guide for the relevant Scalar: +* For Histograms of kind `flag` see "Scalars of kind `bool`" +* For Histograms of kind `count` see "Scalars of kind `uint`" + +### Continuous Distributions: kind `linear` and `exponential` + +If the Histogram you wish to migrate is formed of multiple buckets that together form a single continuous range +(like you have buckets 1-5, 6-10, 11-19, and 20-50 - they form the range 1-50), +then you will want a "distribution" metric type in Glean. +Which kind of "distribution" metric type depends on what the samples are. + +#### Timing samples - Use Glean's `timing_distribution` + +The most common type of continuous distribution in Firefox Telemetry is a histogram of timing samples like +[`GC_MS`][gc-ms]. + +In Glean this sort of data is recorded using a +[`timing-distribution`][timing-distribution-metric] metric type. + +You will no longer need to worry about the range of values or number or distribution of buckets +(represented by the `low`, `high`, `n_buckets`, or `kind` in your Histogram's definition). +Glean uses a [clever automatic bucketing algorithm][timing-distribution-metric] instead. 
+ +So for a Histogram that records timing samples like this: + +``` + "GC_MS": { + "record_in_processes": ["main", "content"], + "products": ["firefox", "geckoview_streaming"], + "alert_emails": ["dev-telemetry-gc-alerts@mozilla.org", "jcoppeard@mozilla.com"], + "expires_in_version": "never", + "releaseChannelCollection": "opt-out", + "kind": "exponential", + "high": 10000, + "n_buckets": 50, + "bug_numbers": [1636419], + "description": "Time spent running JS GC (ms)" + }, +``` + +You will migrate to a `timing_distribution` metric type like this: + +```yaml +js: + gc: + type: timing_distribution + time_unit: millisecond + description: | + Time spent running the Javascript Garbage Collector. + Migrated from Firefox Telemetry's `GC_MS`. + bugs: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1636419 + data_reviews: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1636419#c8 + data_sensitivity: + - technical + notification_emails: + - dev-telemetry-gc-alerts@mozilla.org + - jcoppeard@mozilla.com + expires: never +``` + +**GIFFT:** This type of collection is mirrorable back to Firefox Telemetry via the +[Glean Interface For Firefox Telemetry][gifft]. +See [the guide][gifft] for instructions. + +#### Memory Samples - Use Glean's `memory_distribution` + +Another common content of `linear` or `exponential` +Histograms in Firefox Telemetry is memory samples. +For example, [`MEMORY_TOTAL`][memory-total]'s samples are in kilobytes. + +In Glean this sort of data is recorded using a +[`memory-distribution`][memory-distribution-metric] metric type. + +You will no longer need to worry about the range of values or number or distribution of buckets +(represented by the `low`, `high`, `n_buckets`, or `kind` in your Histogram's definition). +Glean uses a [clever automatic bucketing algorithm][memory-distribution-metric] instead.
+ +So for a Histogram that records memory samples like this: + +``` + "MEMORY_TOTAL": { + "record_in_processes": ["main"], + "products": ["firefox", "thunderbird"], + "alert_emails": ["memshrink-telemetry-alerts@mozilla.com", "amccreight@mozilla.com"], + "bug_numbers": [1198209, 1511918], + "expires_in_version": "never", + "kind": "exponential", + "low": 32768, + "high": 16777216, + "n_buckets": 100, + "description": "Total Memory Across All Processes (KB)", + "releaseChannelCollection": "opt-out" + }, +``` + +You will migrate to a `memory_distribution` metric type like this: + +```yaml +memory: + total: + type: memory_distribution + memory_unit: kilobyte + description: | + The total memory allocated across all processes. + Migrated from Telemetry's `MEMORY_TOTAL`. + bugs: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1198209 + - https://bugzilla.mozilla.org/show_bug.cgi?id=1511918 + data_reviews: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1511918#c9 + data_sensitivity: + - technical + notification_emails: + - memshrink-telemetry-alerts@mozilla.com + - amccreight@mozilla.com + expires: never +``` + +**GIFFT:** This type of collection is mirrorable back to Firefox Telemetry via the +[Glean Interface For Firefox Telemetry][gifft]. +See [the guide][gifft] for instructions. + +#### Percentage Samples - Comment on bug 1657467 + +A very common Histogram in Firefox Desktop is a distribution of percentage samples. +[For example, `GC_SLICE_DURING_IDLE`][gc-idle]. + +Glean doesn't currently have a good metric type for this. +But we [intend to add one][new-metric-percent]. +If you are migrating a collection of this type, +please add a comment to the bug detailing which probe you are migrating, +and when you need it migrated by. +We'll prioritize adding this metric type accordingly. + +#### Other - Use Glean's `custom_distribution` + +Continuous Distribution Histograms have been around long enough to have gotten weird. 
+If you're migrating one of those histograms with units like +["square root of pixels times milliseconds"][checkerboard-severity], +we have a "catch all" metric type for you: [Custom Distribution][custom-distribution-metric]. + +Sadly, you'll have to care about the bucketing algorithm and bucket ranges for this one. +So for a Histogram with artisanal samples like: + +``` + "CHECKERBOARD_SEVERITY": { + "record_in_processes": ["main", "content", "gpu"], + "products": ["firefox", "fennec", "geckoview_streaming"], + "alert_emails": ["gfx-telemetry-alerts@mozilla.com", "botond@mozilla.com"], + "bug_numbers": [1238040, 1539309, 1584109], + "releaseChannelCollection": "opt-out", + "expires_in_version": "never", + "kind": "exponential", + "high": 1073741824, + "n_buckets": 50, + "description": "Opaque measure of the severity of a checkerboard event" + }, +``` + +You will migrate it to a `custom_distribution` like: + +```yaml +gfx.checkerboard: + severity: + type: custom_distribution + range_max: 1073741824 + bucket_count: 50 + histogram_type: exponential + unit: Opaque unit + description: > + An opaque measurement of the severity of a checkerboard event. + This doesn't have units, it's just useful for comparing two checkerboard + events to see which one is worse, for some implementation-specific + definition of "worse". The larger the value, the worse the + checkerboarding. + Migrated from Telemetry's `CHECKERBOARD_SEVERITY`. + bugs: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1238040 + - https://bugzilla.mozilla.org/show_bug.cgi?id=1539309 + - https://bugzilla.mozilla.org/show_bug.cgi?id=1584109 + data_reviews: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1584109#c1 + notification_emails: + - gfx-telemetry-alerts@mozilla.com + - botond@mozilla.com + data_sensitivity: + - technical + expires: never +``` + +**TODO [Bug 1677447](https://bugzilla.mozilla.org/show_bug.cgi?id=1677447):** +Custom Distributions aren't yet implemented in FOG. We're working on it.
+When they're done we'll see if they'll support GIFFT like the other distributions. + +#### Keyed Histograms with Continuous Sample Distributions - Ask on #glean:mozilla.org for assistance + +Glean doesn't currently have a good metric type for keyed continuous distributions +like video play time keyed by codec. +Please [reach out to us][glean-matrix] to explain your use-case. +We will help you either work within what Glean currently affords or +[design a new metric type for you][new-metric-type]. + +### Discrete Distributions: kind `categorical`, `enumerated`, or `boolean` - Use Glean's `labeled_counter` + +If the samples don't fall in a continuous range and instead fall into a known number of buckets, +Glean provides the [Labeled Counter][labeled-counter-metric] for these cases. + +Simply enumerate the discrete categories as `labels` in the `labeled_counter`. + +For example, for a Histogram of kind `categorical` like: + +``` + "AVIF_DECODE_RESULT": { + "record_in_processes": ["main", "content"], + "products": ["firefox", "geckoview_streaming"], + "alert_emails": ["cchang@mozilla.com", "jbauman@mozilla.com"], + "expires_in_version": "never", + "releaseChannelCollection": "opt-out", + "kind": "categorical", + "labels": [ + "success", + "parse_error", + "no_primary_item", + "decode_error", + "size_overflow", + "out_of_memory", + "pipe_init_error", + "write_buffer_error", + "alpha_y_sz_mismatch", + "alpha_y_bpc_mismatch" + ], + "description": "Decode result of AVIF image", + "bug_numbers": [1670827] + }, +``` + +You would migrate to a `labeled_counter` like: + +```yaml +avif: + decode_result: + type: labeled_counter + description: | + Each AVIF image's decode result. + Migrated from Telemetry's `AVIF_DECODE_RESULT`.
+ labels: + - success + - parse_error + - no_primary_item + - decode_error + - size_overflow + - out_of_memory + - pipe_init_error + - write_buffer_error + - alpha_y_sz_mismatch + - alpha_y_bpc_mismatch + bugs: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1670827 + data_reviews: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1670827#c9 + data_sensitivity: + - technical + notification_emails: + - cchang@mozilla.com + - jbauman@mozilla.com + expires: never +``` + +**N.B:** Glean Labels have a strict regex. +You may have to transform some categories to +`snake_case` so that they're safe for the data pipeline. + +**GIFFT:** This type of collection is mirrorable back to Firefox Telemetry via the +[Glean Interface For Firefox Telemetry][gifft]. +See [the guide][gifft] for instructions. +**N.B.:** This will mirror back as a Keyed Scalar of kind `uint`, +not as any kind of Histogram, +so your original un-migrated histogram cannot be used as the mirror. + +#### Keyed Histograms with Discrete Sample Distributions: `"keyed": true` and kind `categorical`, `enumerated`, or `boolean` - Comment on bug 1657470 + +Glean doesn't currently have a good metric type for this. +But we [intend to add one][new-metric-keyed-categorical]. +If you are migrating a collection of this type, +please add a comment to the bug detailing which probe you are migrating, +and when you need it migrated by. +We'll prioritize adding this metric type accordingly. + +## Scalars + +[Scalars][telemetry-scalars] are low-level individual data collections with a variety of uses. + +### Scalars of `kind: uint` that you call `scalarAdd` on - Use Glean's `counter` + +The most common kind of Scalar is of `kind: uint`. +The most common use of such a scalar is to repeatedly call `scalarAdd` +on it as countable things happen. + +The Glean metric type for countable things is [the `counter` metric type][counter-metric]. 
+ +So for a Scalar like this: + +```yaml +script.preloader: + mainthread_recompile: + bug_numbers: + - 1364235 + description: + How many times we ended up recompiling a script from the script preloader + on the main thread. + expires: "100" + keyed: false + kind: uint + notification_emails: + - dothayer@mozilla.com + - plawless@mozilla.com + release_channel_collection: opt-out + products: + - 'firefox' + - 'fennec' + record_in_processes: + - 'main' + - 'content' +``` + +You will migrate to a `counter` metric type like this: + +```yaml +script.preloader: + mainthread_recompile: + type: counter + description: | + How many times we ended up recompiling a script from the script preloader + on the main thread. + Migrated from Telemetry's `script.preloader.mainthread_recompile`. + bugs: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1364235 + data_reviews: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1364235#c25 + data_sensitivity: + - technical + notification_emails: + - dothayer@mozilla.com + - plawless@mozilla.com + expires: "100" +``` + +**GIFFT:** This type of collection is mirrorable back to Firefox Telemetry via the +[Glean Interface For Firefox Telemetry][gifft]. +See [the guide][gifft] for instructions. + +#### Keyed Scalars of `kind: uint` that you call `scalarAdd` on - Use Glean's `labeled_counter` + +Another very common use of Scalars is to have a Keyed Scalar of +`kind: uint`. This was often used to track UI usage. + +This is supported by the [Glean `labeled_counter` metric type][labeled-counter-metric]. + +So for a Keyed Scalar of `kind: uint` like this: + +```yaml +urlbar: + tips: + bug_numbers: + - 1608461 + description: > + A keyed uint recording how many times particular tips are shown in the + Urlbar and how often their confirm and help buttons are pressed. 
+ expires: never + kind: uint + keyed: true + notification_emails: + - email@example.com + release_channel_collection: opt-out + products: + - 'firefox' + record_in_processes: + - main +``` + +You would migrate it to a `labeled_counter` like this: + +```yaml +urlbar: + tips: + type: labeled_counter + description: > + A keyed uint recording how many times particular tips are shown in the + Urlbar and how often their confirm and help buttons are pressed. + Migrated from Telemetry's `urlbar.tips`. + bugs: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1608461 + data_reviews: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1608461#c42 + data_sensitivity: + - interaction + expires: never + notification_emails: + - email@example.com +``` + +Now, if your Keyed Scalar has a list of known keys, +you should provide it to the `labeled_counter` using the `labels` property like so: + +```yaml +urlbar: + tips: + type: labeled_counter + labels: + - tabtosearch_onboard_shown + - tabtosearch_shown + - searchtip_onboard_shown + ... +``` + +**N.B:** Glean Labels have a strict regex. +You may have to transform some categories to +`snake_case` so that they're safe for the data pipeline. + +**GIFFT:** This type of collection is mirrorable back to Firefox Telemetry via the +[Glean Interface For Firefox Telemetry][gifft]. +See [the guide][gifft] for instructions. + +### Scalars of `kind: uint` that you call `scalarSet` on - Use Glean's `quantity` + +Distinct from counts which are partial sums, +Scalars of `kind: uint` that you _set_ could contain just about anything. +The best metric type depends on the type of data you're setting +(See "Other Scalar-ish types" for some possibilities). + +If it's a numerical value you are setting, chances are you will be best served by +[Glean's `quantity` metric type][quantity-metric]. 
+ +For such a quantitative Scalar like: + +```yaml +gfx.display: + primary_height: + bug_numbers: + - 1594145 + description: > + Height of the primary display, takes device rotation into account. + expires: never + kind: uint + notification_emails: + - gfx-telemetry-alerts@mozilla.com + - ktaeleman@mozilla.com + products: + - 'geckoview_streaming' + record_in_processes: + - 'main' + release_channel_collection: opt-out +``` + +You would migrate it to a `quantity` like: + +```yaml +gfx.display: + primary_height: + type: quantity + unit: pixels + description: > + Height of the primary display, takes device rotation into account. + Migrated from Telemetry's `gfx.display.primary_height`. + bugs: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1594145 + - https://bugzilla.mozilla.org/show_bug.cgi?id=1687219 + data_reviews: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1594145#c4 + data_sensitivity: + - technical + notification_emails: + - gfx-telemetry-alerts@mozilla.com + expires: never +``` + +Note the required `unit` property. + +**GIFFT:** This type of collection is mirrorable back to Firefox Telemetry via the +[Glean Interface For Firefox Telemetry][gifft]. +See [the guide][gifft] for instructions. + +**IPC Note:** Due to `set` not being a [commutative operation][ipc-docs], using `quantity` +on non-parent processes is forbidden. +This is a restriction that favours correctness over friendliness, +which we may revisit if enough use cases require it. +Please [contact us][glean-matrix] if you'd like us to do so. + +#### Keyed Scalars of `kind: uint` that you call `scalarSet` on - Ask on #glean:mozilla.org for assistance + +Glean doesn't currently have a good metric type for keyed quantities. +Please [reach out to us][glean-matrix] to explain your use-case. +We will help you either work within what Glean currently affords or +[design a new metric type for you][new-metric-type].
+ +### Scalars of `kind: uint` that you call `scalarSetMaximum` or some combination of operations on - Ask on #glean:mozilla.org for assistance + +Glean doesn't currently have a good metric type for dealing with maximums, +or for dealing with values you both count and set. +Please [reach out to us][glean-matrix] to explain your use-case. +We will help you either work within what Glean currently affords or +[design a new metric type for you][new-metric-type]. + +### Scalars of `kind: string` - Use Glean's `string` + +If your string value is a unique identifier, then consider +[Glean's `uuid` metric type][uuid-metric] first. + +If the string scalar value doesn't fit that or any other more specific metric type, +then [Glean's `string` metric type][string-metric] will do. + +For a Scalar of `kind: string` like: + +```yaml +widget: + gtk_version: + bug_numbers: + - 1670145 + description: > + The version of Gtk 3 in use. + kind: string + expires: never + notification_emails: + - layout-telemetry-alerts@mozilla.com + release_channel_collection: opt-out + products: + - 'firefox' + record_in_processes: + - 'main' +``` + +You will migrate it to a `string` metric like: + +```yaml +widget: + gtk_version: + type: string + description: > + The version of Gtk 3 in use. + Migrated from Telemetry's `widget.gtk_version`. + bugs: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1670145 + data_reviews: + - https://bugzilla.mozilla.org/show_bug.cgi?id=1670145#c7 + data_sensitivity: + - technical + notification_emails: + - layout-telemetry-alerts@mozilla.com + expires: never +``` + +**GIFFT:** This type of collection is mirrorable back to Firefox Telemetry via the +[Glean Interface For Firefox Telemetry][gifft]. +See [the guide][gifft] for instructions. + +**IPC Note:** Due to `set` not being a [commutative operation][ipc-docs], using `string` +on non-parent processes is forbidden. 
+This is a restriction that favours correctness over friendliness,
+which we may revisit if enough use cases require it.
+Please [contact us][glean-matrix] if you'd like us to do so.
+
+### Scalars of `kind: boolean` - Use Glean's `boolean`
+
+If you need to store a simple true/false,
+[Glean's `boolean` metric type][boolean-metric] is likely best.
+
+If you have more than just `true` and `false` to store,
+you may prefer a `labeled_counter`.
+
+For a Scalar of `kind: boolean` like:
+
+```yaml
+widget:
+  dark_mode:
+    bug_numbers:
+      - 1601846
+    description: >
+      Whether the OS theme is dark.
+    expires: never
+    kind: boolean
+    notification_emails:
+      - layout-telemetry-alerts@mozilla.com
+      - cmccormack@mozilla.com
+    release_channel_collection: opt-out
+    products:
+      - 'firefox'
+      - 'fennec'
+    record_in_processes:
+      - 'main'
+```
+
+You would migrate to a `boolean` metric type like:
+
+```yaml
+widget:
+  dark_mode:
+    type: boolean
+    description: >
+      Whether the OS theme is dark.
+      Migrated from Telemetry's `widget.dark_mode`.
+    bugs:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1601846
+    data_reviews:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1601846#c5
+    data_sensitivity:
+      - technical
+    notification_emails:
+      - layout-telemetry-alerts@mozilla.com
+      - cmccormack@mozilla.com
+    expires: never
+```
+
+**GIFFT:** This type of collection is mirrorable back to Firefox Telemetry via the
+[Glean Interface For Firefox Telemetry][gifft].
+See [the guide][gifft] for instructions.
+
+**IPC Note:** Due to `set` not being a [commutative operation][ipc-docs], using `boolean`
+on non-parent processes is forbidden.
+This is a restriction that favours correctness over friendliness,
+which we may revisit if enough use cases require it.
+Please [contact us][glean-matrix] if you'd like us to do so.
+
+#### Keyed Scalars of `kind: boolean` - Use Glean's `labeled_boolean`
+
+If you have multiple related true/false values, you may have put them in a
+Keyed Scalar of `kind: boolean`.
+
+The best match for this is
+[Glean's `labeled_boolean` metric type][labeled-boolean-metric].
+
+For a Keyed Scalar of `kind: boolean` like:
+
+```yaml
+devtools.tool:
+  registered:
+    bug_numbers:
+      - 1447302
+      - 1503568
+      - 1587985
+    description: >
+      Recorded on enable tool checkbox check/uncheck in Developer Tools options
+      panel. Boolean stating if the tool was enabled or disabled by the user.
+      Keyed by tool id. Current default tools with their id's are defined in
+      https://searchfox.org/mozilla-central/source/devtools/client/definitions.js
+    expires: never
+    kind: boolean
+    keyed: true
+    notification_emails:
+      - dev-developer-tools@lists.mozilla.org
+      - accessibility@mozilla.com
+    release_channel_collection: opt-out
+    products:
+      - 'firefox'
+      - 'fennec'
+    record_in_processes:
+      - 'main'
+```
+
+You would migrate to a `labeled_boolean` like:
+
+```yaml
+devtools.tool:
+  registered:
+    type: labeled_boolean
+    description: >
+      Recorded on enable tool checkbox check/uncheck in Developer Tools options
+      panel. Boolean stating if the tool was enabled or disabled by the user.
+      Migrated from Telemetry's `devtools.tool`.
+    labels:
+      - options
+      - inspector
+      - webconsole
+      - jsdebugger
+      - styleeditor
+      - performance
+      - memory
+      - netmonitor
+      - storage
+      - dom
+      - accessibility
+      - application
+      - dark
+      - light
+    bugs:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1447302
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1503568
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1587985
+    data_reviews:
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1447302#c17
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1503568#c3
+      - https://bugzilla.mozilla.org/show_bug.cgi?id=1587985#c5
+    data_sensitivity:
+      - interaction
+    notification_emails:
+      - dev-developer-tools@lists.mozilla.org
+      - accessibility@mozilla.com
+    expires: never
+```
+
+**N.B:** Glean Labels have a strict regex.
+You may have to transform some categories to
+`snake_case` so that they're safe for the data pipeline.
+
+**GIFFT:** This type of collection is mirrorable back to Firefox Telemetry via the
+[Glean Interface For Firefox Telemetry][gifft].
+See [the guide][gifft] for instructions.
+
+**IPC Note:** Due to `set` not being a [commutative operation][ipc-docs], using `labeled_boolean`
+on non-parent processes is forbidden.
+This is a restriction that favours correctness over friendliness,
+which we may revisit if enough use cases require it.
+Please [contact us][glean-matrix] if you'd like us to do so.
+
+### Other Scalar-ish types: `rate`, `timespan`, `datetime`, `uuid`
+
+The Glean SDK provides some very handy higher-level metric types for specific data.
+If your data
+* Is two or more numbers that are related (like failure count vs total count),
+  then consider the [Glean `rate` metric type][rate-metric].
+* Is a single duration or span of time (like how long Firefox takes to start),
+  then consider the [Glean `timespan` metric type][timespan-metric].
+* Is a single point in time (like the most recent sync time),
+  then consider the [Glean `datetime` metric type][datetime-metric].
+* Is a unique identifier (like a session id),
+  then consider the [Glean `uuid` metric type][uuid-metric].
+
+**GIFFT:** These types of collection are mirrorable back to Firefox Telemetry via the
+[Glean Interface For Firefox Telemetry][gifft].
+See [the guide][gifft] for instructions.
+
+## Events - Use Glean's `event`
+
+[Telemetry Events][telemetry-events]
+are a lesser-used form of data collection in Firefox Desktop.
+Glean aimed to remove some of the stumbling blocks facing instrumentors when using events
+in the [Glean `event` metric type][event-metric]:
+
+* Don't worry about enabling event categories.
+  In Glean all `events` are always on.
+* No more event `name`.
+  Events in Glean follow the same `category.name.metric_name`
+  naming structure that other metrics do.
+* No more `method`/`object`/`value`.
+  Events in Glean are just their identifier and an `extras` key/value dictionary.
+
+Since the two Event types aren't that analogous you will need to decide if your event
+* Prefers to put its `method`/`object`/`value` in the `extras` dictionary
+* Prefers to fold its `method`/`object`/`value` into its identifier
+
+**GIFFT:** Events are mirrorable back to Firefox Telemetry via the
+[Glean Interface For Firefox Telemetry][gifft].
+See [the guide][gifft] for instructions.
+
+## Other: Environment, Crash Annotations, Use Counters, Etc - Ask on #glean:mozilla.org for assistance
+
+Telemetry has a lot of collection subsystems built adjacent to those already mentioned.
+We have solutions for the common ones,
+but they are entirely dependent on the specific use case.
+Please [reach out to us][glean-matrix] to explain your use case so we can help you either
+work within what Glean currently affords or
+[design a new metric type for you][new-metric-type].
+
+[book-of-glean]: https://mozilla.github.io/glean/book/index.html
+[gc-ms]: https://glam.telemetry.mozilla.org/firefox/probe/gc_ms/explore
+[histogram-accumulate]: https://searchfox.org/mozilla-central/rev/d59bdea4956040e16113b05296c56867f761735b/toolkit/components/telemetry/core/Telemetry.h#61
+[ipc-docs]: ../dev/ipc.md
+[gifft]: gifft.md
+[memory-total]: https://glam.telemetry.mozilla.org/firefox/probe/memory_total/explore
+[migration-worksheet]: https://docs.google.com/spreadsheets/d/1uEK7zSIJDcGGmof9NywP5AwaovVQCv_Bm3iNqibtESI/edit#gid=0
+[boolean-metric]: https://mozilla.github.io/glean/book/reference/metrics/boolean.html
+[labeled-boolean-metric]: https://mozilla.github.io/glean/book/reference/metrics/labeled_booleans.html
+[counter-metric]: https://mozilla.github.io/glean/book/reference/metrics/counter.html
+[labeled-counter-metric]: https://mozilla.github.io/glean/book/reference/metrics/labeled_counters.html
+[string-metric]: https://mozilla.github.io/glean/book/reference/metrics/string.html
+[labeled-string-metric]: https://mozilla.github.io/glean/book/reference/metrics/labeled_strings.html
+[timespan-metric]: https://mozilla.github.io/glean/book/reference/metrics/timespan.html
+[timing-distribution-metric]: https://mozilla.github.io/glean/book/reference/metrics/timing_distribution.html
+[memory-distribution-metric]: https://mozilla.github.io/glean/book/reference/metrics/memory_distribution.html
+[uuid-metric]: https://mozilla.github.io/glean/book/reference/metrics/uuid.html
+[datetime-metric]: https://mozilla.github.io/glean/book/reference/metrics/datetime.html
+[event-metric]: https://mozilla.github.io/glean/book/reference/metrics/event.html
+[custom-distribution-metric]: https://mozilla.github.io/glean/book/reference/metrics/custom_distribution.html
+[quantity-metric]: https://mozilla.github.io/glean/book/reference/metrics/quantity.html
+[rate-metric]: https://mozilla.github.io/glean/book/reference/metrics/rate.html
+[ipc-dev-doc]: ../dev/ipc.md
+[gc-idle]: https://glam.telemetry.mozilla.org/firefox/probe/gc_slice_during_idle/explore
+[new-metric-keyed-categorical]: https://bugzilla.mozilla.org/show_bug.cgi?id=1657470
+[new-metric-percent]: https://bugzilla.mozilla.org/show_bug.cgi?id=1657467
+[new-metric-type]: https://wiki.mozilla.org/Glean/Adding_or_changing_Glean_metric_types
+[glean-matrix]: https://chat.mozilla.org/#/room/#glean:mozilla.org
+[checkerboard-severity]: https://searchfox.org/mozilla-central/rev/d59bdea4956040e16113b05296c56867f761735b/gfx/layers/apz/src/CheckerboardEvent.cpp#44
+[telemetry-events]: /toolkit/components/telemetry/collection/events.rst
+[telemetry-scalars]: /toolkit/components/telemetry/collection/scalars.rst
+[telemetry-histograms]: /toolkit/components/telemetry/collection/histograms.rst
+[repositories-yaml]: https://github.com/mozilla/probe-scraper/blob/main/repositories.yaml
diff --git a/toolkit/components/glean/docs/user/new_definitions_file.md b/toolkit/components/glean/docs/user/new_definitions_file.md
new file mode 100644
index 0000000000..b9d1ada8f3
--- /dev/null
+++ b/toolkit/components/glean/docs/user/new_definitions_file.md
@@ -0,0 +1,116 @@
+# New Metrics and Pings
+
+To add a new metric or ping to Firefox Desktop you should follow the
+[Glean SDK documentation on the subject](https://mozilla.github.io/glean/book/user/adding-new-metrics.html),
+with a few twists we detail herein:
+
+## Testing
+
+Instrumentation, being code, should be tested.
+Firefox on Glean [supports a wide variety of Firefox Desktop test suites][instrumentation-tests]
+in addition to [Glean's own debugging mechanisms][glean-debug].
+
+## IPC
+
+Firefox Desktop is made of multiple processes.
+You can record data from any process in Firefox Desktop
+[subject to certain conditions](../dev/ipc.md).
+
+If you will be recording data to this metric in multiple processes,
+you should make yourself aware of those conditions.
+
+## Where do I Define new Metrics and Pings?
+
+Metrics and pings are defined in their definitions files
+(`metrics.yaml` or `pings.yaml`, respectively).
+But where can you find `metrics.yaml` or `pings.yaml`?
+
+If you're not the first person in your component to ask that question,
+the answer is likely "in the root of your component".
+Look for the definitions files near to where you are instrumenting your code.
+Or you can look in
+`toolkit/components/glean/metrics_index.py`
+to see the list of all currently-known definitions files.
+
+If you _are_ the first person in your component to ask that question,
+you get to choose where to start them!
+We recommend adding them in the root of your component, next to a `moz.build`.
+Be sure to link to this document at the top of the file!
+It contains many useful tidbits of information that anyone adding new metrics should know.
+Preferably, use this blank template to get started,
+substituting your component's `product :: component` tag from
+[the list](https://searchfox.org/mozilla-central/source/toolkit/components/glean/tags.yaml):
+
+```yaml
+# This Source Code Form is subject to the terms of the Mozilla Public
+# License, v. 2.0. If a copy of the MPL was not distributed with this
+# file, You can obtain one at http://mozilla.org/MPL/2.0/.
+
+# Adding a new metric? We have docs for that!
+# https://firefox-source-docs.mozilla.org/toolkit/components/glean/user/new_definitions_file.html
+
+---
+$schema: moz://mozilla.org/schemas/glean/metrics/2-0-0
+$tags:
+  - 'Your Product :: Your Component'
+
+```
+
+If you add a new definitions file, be sure to edit
+`toolkit/components/glean/metrics_index.py`,
+adding your definitions files to the Python lists therein.
+If you don't, no API will be generated for your metrics and your build will fail.
+You will have to decide which products your metrics will be used in.
+For code that's also used in other Gecko-based products (Firefox Desktop, Firefox for Android, Focus for Android), use `gecko_metrics`.
+For Desktop-only instrumentation use `firefox_desktop_metrics`.
+For other products use their respective lists.
+
+Changes to `metrics_index.py` are automatically reflected in the data pipeline once a day
+using the [fog-updater automation in probe-scraper](https://github.com/mozilla/probe-scraper/tree/main/fog-updater).
+Data will not show up in datasets and tools until this happens.
+If something is unclear or data is not showing up in time you will need to file a bug in
+`Data Platform and Tools :: General`.
+
+If you have any questions, be sure to ask on
+[the #glean channel](https://chat.mozilla.org/#/room/#glean:mozilla.org).
+
+**Note:** Do _not_ use `toolkit/components/glean/metrics.yaml`
+or `toolkit/components/glean/pings.yaml`.
+These are for metrics instrumenting the code under `toolkit/components/glean`
+and are not general-purpose locations for adding metrics and pings.
+
+## How does Expiry Work?
+
+In FOG,
+unlike in other Glean-SDK-using projects,
+metrics expire based on Firefox application version.
+This is to allow metrics to be valid over the entire life of an application version,
+whether that is the 4-6 weeks of usual releases or the 13 months of ESR releases.
+
+There are three values accepted in the `expires` field of `metrics.yaml`s for FOG:
+* `"X"` (where `X` is the major portion of a Firefox Desktop version) -
+  The metric will be expired when the `MOZ_APP_VERSION` reaches or exceeds `X`.
+  (For example, when the Firefox Version is `88.0a1`,
+  all metrics marked with `expires: "88"` or lower will be expired.)
+  This is the recommended form for all new metrics to ensure they stop recording when they stop being relevant.
+* `expired` - For marking a metric as manually expired.
+  Not usually used, but sometimes helpful for internal tests.
+* `never` - For marking a metric as part of a permanent data collection.
+  Metrics marked with `never` must have
+  [instrumentation tests](instrumentation_tests).
+
+For more information on what expiry means and the
+`metrics.yaml` format, see
+[the Glean SDK docs](https://mozilla.github.io/glean/book/user/metric-parameters.html)
+on this subject. Some quick facts:
+
+* Data collected to expired metrics is not recorded or sent.
+* Recording to expired metrics is not an error at runtime.
+* Expired metrics being in a `metrics.yaml` is a linting error in `glean_parser`.
+* Expired (and non-expired) metrics that are no longer useful should be promptly removed from your `metrics.yaml`.
+  This reduces the size and improves the performance of Firefox
+  (and speeds up the Firefox build process)
+  by decreasing the amount of code that needs to be generated.
+
+[instrumentation-tests]: ./instrumentation_tests
+[glean-debug]: https://mozilla.github.io/glean/book/reference/debug/index.html
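
The three accepted `expires` values described under "How does Expiry Work?" can be illustrated with a small Python sketch. It is not FOG's or `glean_parser`'s actual implementation: the helper name `is_expired` and its signature are hypothetical, and it assumes the application version string begins with the major version followed by a dot (as in `88.0a1`).

```python
def is_expired(expires: str, app_version: str) -> bool:
    """Hypothetical helper mirroring the documented expiry rules.

    `expires` is the value from a metric's `expires` field;
    `app_version` stands in for MOZ_APP_VERSION (e.g. "88.0a1").
    """
    if expires == "never":
        # Permanent data collection: never expires.
        return False
    if expires == "expired":
        # Manually expired.
        return True
    # Otherwise `expires` is a major version such as "88".
    # Assumption: the major version is everything before the first dot.
    app_major = int(app_version.split(".", 1)[0])
    # Expired once the application version reaches or exceeds `expires`.
    return app_major >= int(expires)


if __name__ == "__main__":
    for value in ("87", "88", "89", "never", "expired"):
        print(value, is_expired(value, "88.0a1"))
```

With `app_version` set to `88.0a1`, this treats `expires: "88"` and anything lower as expired, which matches the example given in the document.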