diff options
Diffstat (limited to 'toolkit/components/glean/docs/dev/ipc.md')
-rw-r--r-- | toolkit/components/glean/docs/dev/ipc.md | 183 |
1 files changed, 183 insertions, 0 deletions
diff --git a/toolkit/components/glean/docs/dev/ipc.md b/toolkit/components/glean/docs/dev/ipc.md new file mode 100644 index 0000000000..e008235c05 --- /dev/null +++ b/toolkit/components/glean/docs/dev/ipc.md @@ -0,0 +1,183 @@ +# Inter-process Communication (IPC) + +Firefox Desktop is a multi-process desktop application. +Code requiring instrumentation may be on any of its processes, +so FOG provide facilities to do just that. + +## Design + +The IPC Design of FOG was worked out in +[bug 1618253](https://bugzilla.mozilla.org/show_bug.cgi?id=1618253). + +It centred around a few specific concepts: + +### Forbidding Non-Commutative Operations + +Because we cannot nicely impose a canonical ordering of metric operations across all processes, +FOG forbids non-[commutative](https://en.wikipedia.org/wiki/Commutative_property) +metric operations in some circumstances. + +For example, +`Add()`-ing to a Counter metric works from multiple processes because the order doesn't matter. +However, given a String metric being `Set()` from multiple processes simultaneously, +which value should it take? + +This ambiguity is not a good foundation to build trust on, +so we forbid setting a String metric from multiple processes. + +#### List of Forbidden Operations + +* Boolean's `set` (this is the metric type's only operation) +* Labeled Boolean's `set` (this is the metric type's only operation) +* String's `set` (this is the metric type's only operation) +* Labeled String's `set` (this is the metric type's only operation) +* String List's `set` + * `add` is permitted (order and uniqueness are not guaranteed) +* Timespan's `start`, `stop`, and `cancel` (these are the metric type's only operations) +* UUID's `set` and `generateAndSet` (these are the metric type's only operations) +* Datetime's `set` (this is the metric type's only operation) +* Quantity's `set` (this is the metric type's only operation) + +This list may grow over time as new metric types are added. +If there's an operation/metric type on this list that you need to use in a non-parent process, +please reach out +[on the #glean channel](https://chat.mozilla.org/#/room/#glean:mozilla.org) +and we'll help you out. + +### Process Agnosticism + +For metric types that can be used cross-process, +FOG provides no facility for identifying which process the instrumentation is on. + +What this means is that if you accumulate to a +[Timing Distribution](https://mozilla.github.io/glean/book/user/metrics/timing_distribution.html) +in multiple processes, +all the samples from all the processes will be combined in the same metric. + +If you wish to distinguish samples from different process types, +you will need multiple metrics and inline code to select the proper one for the given process. +For example: + +```C++ +if (XRE_GetProcessType() == GeckoProcessType_Default) { + mozilla::glean::performance::cache_size.Accumulate(numBytes / 1024); +} else { + mozilla::glean::performance::non_main_process_cache_size.Accumulate(numBytes / 1024); +} +``` + +### Scheduling + +FOG makes no guarantee about when non-main-process metric values are sent across IPC. +FOG will try its best to schedule opportunistically in idle moments, +and during orderly shutdowns. + +There are a few cases where we provide more firm guarantees: + +#### Tests + +There are test-only APIs in Rust, C++, and Javascript. +These do not await a flush of child process metric values. +You can use the test-only method `testFlushAllChildren` on the `FOG` +XPCOM component to await child data's arrival: +```js +await Services.fog.testFlushAllChildren(); +``` +See [the test documentation](testing.md) for more details on testing FOG. +For writing tests about instrumentation, see +[the instrumentation test documentation](../user/instrumentation_tests). + +#### Pings + +We do not guarantee that non-main-process data has made it into a specific ping. + +[Built-in pings](https://mozilla.github.io/glean/book/user/pings/index.html) +are submitted by the Rust Glean SDK at times FOG doesn't directly control, +so there may be data not present in the parent process when a built-in ping is submitted. +We don't anticipate this causing a problem since child-process data that +"misses" a given ping will be included in the next one. + +At this time, +[Custom Pings](https://mozilla.github.io/glean/book/user/pings/custom.html) +must be sent in the parent process and have no mechanism +to schedule their submission for after child-process data arrives in the parent process. +[bug 1732118](https://bugzilla.mozilla.org/show_bug.cgi?id=1732118) +tracks the addition of such a mechanism or guarantee. + +#### Shutdown + +We will make a best effort during an orderly shutdown to flush all pending data in child processes. +This means a disorderly shutdown (usually a crash) +may result in child process data being lost. + +#### Size + +We don't measure or keep an up-to-date calculation of the size of the IPC Payload. +We do, however, keep a count of the number of times the IPC Payload has been accessed. +This is used as a (very) conservative estimate of the size of the IPC Payload so we do not exceed the +[IPC message size limit](https://searchfox.org/mozilla-central/search?q=kMaximumMessageSize). + +See [bug 1745660](https://bugzilla.mozilla.org/show_bug.cgi?id=1745660). + +### Mechanics + +The rough design is that the parent process can request an immediate flush of pending data, +and each child process can decide to flush its pending data whenever it wishes. +The former is via `FlushFOGData() returns (ByteBuf)` and the latter via `FOGData(ByteBuf)`. + +Pending Data is a buffer of bytes generated by `bincode` in Rust in the Child, +handed off to C++, passed over IPC, +then given back to `bincode` in Rust on the Parent. + +Rust is then responsible for turning the pending data into +[metrics API][glean-metrics] calls on the metrics in the parent process. + +#### Supported Process Types + +FOG supports messaging between the following types of child process and the parent process: +* content children (via `PContent` + (for now. See [bug 1641989](https://bugzilla.mozilla.org/show_bug.cgi?id=1641989)) +* gmp children (via `PGMP`) +* gpu children (via `PGPU`) +* rdd children (via `PRDD`) +* socket children (via `PSocketProcess`) + +See +[the process model docs](/dom/ipc/process_model.rst) +for more information about what that means. + +### Adding Support for a new Process Type + +Adding support for a new process type is a matter of extending the two messages +mentioned above in "Mechanics" to another process type's protocol (ipdl file). + +1. Add two messages to the appropriate sections in `P<ProcessType>.ipdl` + * (( **Note:** `PGPU` _should_ be the only ipdl where `parent` + means the non-parent/-main/-UI process, + but double-check that you get this correct.)) + * Add `async FOGData(ByteBuf&& aBuf);` to the parent/main/UI process side of things + (most often `parent:`). + * Add `async FlushFOGData() returns (ByteBuf buf);` to the non-parent/-main/-UI side + (most often `child:`). +2. Implement the protocol endpoints in `P<ProcessType>{Child|Parent}.{h|cpp}` + * The message added to the `parent: ` section goes in + `P<ProcessType>Parent.{h|cpp}` and vice versa. +3. Add to `FOGIPC.cpp`'s `FlushAllChildData` code that + 1. Enumerates all processes of the newly-supported type (there may only be one), + 2. Calls `SendFlushFOGData on each, and + 3. Adds the resulting promise to the array. +4. Add to `FOGIPC.cpp`'s `SendFOGData` the correct `GeckoProcessType_*` + enum value, and appropriate code for getting the parent process singleton and calling + `SendFOGData` on it. +5. Add to the fog crate's `register_process_shutdown` function + handling for at-shutdown flushing of IPC data. + If this isn't added, we will log (but not panic) + on the first use of Glean APIs on an unsupported process type. + * "Handling" might be an empty block with a comment explaining where to find it + (like how `PROCESS_TYPE_DEFAULT` is handled) + * Or it might be custom code + (like `PROCESS_TYPE_CONTENT`'s) +6. Add to the documented [list of supported process types](#supported-process-types) + the process type you added support for. + +[glean-metrics]: https://mozilla.github.io/glean/book/reference/metrics/index.html |