author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-07 17:32:43 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-07 17:32:43 +0000 |
commit | 6bf0a5cb5034a7e684dcc3500e841785237ce2dd (patch) | |
tree | a68f146d7fa01f0134297619fbe7e33db084e0aa /ipc/docs | |
parent | Initial commit. (diff) | |
Adding upstream version 1:115.7.0. (refs: upstream/1%115.7.0, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'ipc/docs')
-rw-r--r-- | ipc/docs/index.rst | 20 | ||||
-rw-r--r-- | ipc/docs/ipdl.rst | 1766 | ||||
-rw-r--r-- | ipc/docs/processes.rst | 1252 | ||||
-rw-r--r-- | ipc/docs/utility_process.rst | 69 |
4 files changed, 3107 insertions, 0 deletions
diff --git a/ipc/docs/index.rst b/ipc/docs/index.rst new file mode 100644 index 0000000000..0485fdd768 --- /dev/null +++ b/ipc/docs/index.rst @@ -0,0 +1,20 @@ +Processes, Threads and IPC +========================== + +These pages contain the documentation for Gecko's architecture for platform +process and thread creation, communication and synchronization. They live +in mozilla-central in the 'ipc/docs' directory. + +.. toctree:: + :maxdepth: 3 + + ipdl + processes + utility_process + +For inter-process communication involving Javascript, see `JSActors`_. They +are a very limited case, used for communication between elements in the DOM, +which may exist in separate processes. They only involve the main process and +content processes -- no other processes run Javascript. + +.. _JSActors: /dom/ipc/jsactors.html diff --git a/ipc/docs/ipdl.rst b/ipc/docs/ipdl.rst new file mode 100644 index 0000000000..3e8ace542b --- /dev/null +++ b/ipc/docs/ipdl.rst @@ -0,0 +1,1766 @@ +IPDL: Inter-Thread and Inter-Process Message Passing +==================================================== + +The Idea +-------- + +**IPDL**, the "Inter-[thread|process] Protocol Definition Language", is the +Mozilla-specific language that allows code to communicate between system +threads or processes in a standardized, efficient, safe, secure and +platform-agnostic way. IPDL communications take place between *parent* and +*child* objects called *actors*. The architecture is inspired by the `actor +model <https://en.wikipedia.org/wiki/Actor_model>`_. + +.. note:: + IPDL actors differ from the actor model in one significant way -- all + IPDL communications are *only* between a parent and its only child. + +The actors that constitute a parent/child pair are called **peers**. Peer +actors communicate through an **endpoint**, which is an end of a message pipe. +An actor is explicitly bound to its endpoint, which in turn is bound to a +particular thread soon after it is constructed. 
An actor never changes its +endpoint and may only send and receive predeclared **messages** from/to that +endpoint, on that thread. Violations result in runtime errors. A thread may +be bound to many otherwise unrelated actors but an endpoint supports +**top-level** actors and any actors they **manage** (see below). + +.. note:: + More precisely, endpoints can be bound to any ``nsISerialEventTarget``, + which are themselves associated with a specific thread. By default, + IPDL will bind to the current thread's "main" serial event target, + which, if it exists, is retrieved with ``GetCurrentSerialEventTarget``. + For the sake of clarity, this document will frequently refer to actors + as bound to threads, although the more precise interpretation of serial + event targets is also always valid. + +.. note:: + Internally, we use the "Ports" component of the `Chromium Mojo`_ library + to *multiplex* multiple endpoints (and, therefore, multiple top-level + actors). This means that the endpoints communicate over the same native + pipe, which conserves limited OS resources. The implications of this are + discussed in `IPDL Best Practices`_. + +Parent and child actors may be bound to threads in different processes, in +different threads in the same process, or even in the same thread in the same +process. That last option may seem unreasonable but actors are versatile and +their layout can be established at run-time so this could theoretically arise +as the result of run-time choices. One large example of this versatility is +``PCompositorBridge`` actors, which in different cases connect endpoints in the +main process and the GPU process (for UI rendering on Windows), in a content +process and the GPU process (for content rendering on Windows), in the main +process and the content process (for content rendering on Mac, where there is +no GPU process), or between threads on the main process (UI rendering on Mac). 
+For the most part, this does not require elaborate or redundant coding; it +just needs endpoints to be bound judiciously at runtime. The example in +:ref:`Connecting With Other Processes` shows one way this can be done. It +also shows that, without proper plain-language documentation of *all* of the +ways endpoints are configured, this can quickly lead to unmaintainable code. +Be sure to document your endpoint bindings thoroughly! + +.. _Chromium Mojo: https://chromium.googlesource.com/chromium/src/+/refs/heads/main/mojo/core/README.md#Port + +The Approach +------------ + +The actor framework will schedule tasks to run on its associated event target, +in response to messages it receives. Messages are specified in an IPDL +**protocol** file and the response handler tasks are defined per-message by C++ +methods. As actors only communicate in pairs, and each is bound to one thread, +sending is always done sequentially, never concurrently (same for receiving). +This means that IPDL can, and does, guarantee that an actor will always receive +messages in the same order they were sent by its related actor -- and that this +order is well defined since the related actor can only send from one thread. + +.. warning:: + There are a few (rare) exceptions to the message order guarantee. They + include `synchronous nested`_ messages, `interrupt`_ messages, and + messages with a ``[Priority]`` or ``[Compress]`` annotation. + +An IPDL protocol file specifies the messages that may be sent between parent +and child actors, as well as the direction and payload of those messages. +Messages look like function calls but, from the standpoint of their caller, +they may start and end at any time in the future -- they are *asynchronous*, +so they won't block their sending actors or any other components that may be +running in the actor's thread's ``MessageLoop``. + +.. note:: + Not all IPDL messages are asynchronous. 
Again, we run into exceptions for + messages that are synchronous, `synchronous nested`_ or `interrupt`_. Use + of synchronous and nested messages is strongly discouraged but may not + always be avoidable. They will be defined later, along with superior + alternatives to both that should work in nearly all cases. Interrupt + messages were prone to misuse and are deprecated, with removal expected in + the near future + (`Bug 1729044 <https://bugzilla.mozilla.org/show_bug.cgi?id=1729044>`_). + +Protocol files are compiled by the *IPDL compiler* in an early stage of the +build process. The compiler generates C++ code that reflects the protocol. +Specifically, it creates one C++ class that represents the parent actor and one +that represents the child. The generated files are then automatically included +in the C++ build process. The generated classes contain public methods for +sending the protocol messages, which client code will use as the entry-point to +IPC communication. The generated methods are built atop our IPC framework, +defined in `/ipc <https://searchfox.org/mozilla-central/source/ipc>`_, that +standardizes the safe and secure use of sockets, pipes, shared memory, etc on +all supported platforms. See `Using The IPDL compiler`_ for more on +integration with the build process. + +Client code must be written that subclasses these generated classes, in order +to add handlers for the tasks generated to respond to each message. It must +also add routines (``ParamTraits``) that define serialization and +deserialization for any types used in the payload of a message that aren't +already known to the IPDL system. Primitive types, and a bunch of Mozilla +types, have predefined ``ParamTraits`` (`here +<https://searchfox.org/mozilla-central/source/ipc/glue/IPCMessageUtils.h>`__ +and `here +<https://searchfox.org/mozilla-central/source/ipc/glue/IPCMessageUtilsSpecializations.h>`__). + +.. 
note:: + Among other things, client code that uses the generated code must include + ``chromium-config.mozbuild`` in its ``moz.build`` file. See `Using The + IPDL compiler`_ for a complete list of required build changes. + +.. _interrupt: `The Old Ways`_ +.. _synchronous nested: `The Rest`_ + +The Steps To Making A New Actor +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +#. Decide what folder you will work in and create: + + #. An IPDL protocol file, named for your actor (e.g. ``PMyActor.ipdl`` -- + actor protocols must begin with a ``P``). See `The Protocol Language`_. + #. Properly-named source files for your actor's parent and child + implementations (e.g. ``MyActorParent.h``, ``MyActorChild.h`` and, + optionally, adjacent .cpp files). See `The C++ Interface`_. + #. IPDL-specific updates to the ``moz.build`` file. See `Using The IPDL + compiler`_. +#. Write your actor protocol (.ipdl) file: + + #. Decide whether you need a top-level actor or a managed actor. See + `Top Level Actors`_. + #. Find/write the IPDL and C++ data types you will use in communication. + Write ``ParamTraits`` for C++ data types that don't have them. See + `Generating IPDL-Aware C++ Data Types: IPDL Structs and Unions`_ for IPDL + structures. See `Referencing Externally Defined Data Types: IPDL + Includes`_ and `ParamTraits`_ for C++ data types. + #. Write your actor and its messages. See `Defining Actors`_. +#. Write C++ code to create and destroy instances of your actor at runtime. + + * For managed actors, see `Actor Lifetimes in C++`_. + * For top-level actors, see `Creating Top Level Actors From Other Actors`_. + The first actor in a process is a very special exception -- see `Creating + First Top Level Actors`_. +#. Write handlers for your actor's messages. See `Actors and Messages in + C++`_. +#. Start sending messages through your actors! Again, see `Actors and Messages + in C++`_. 
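The last two steps above (writing handlers, then sending messages) can be pictured with a toy model. The sketch below is *not* IPDL-generated code and does not use any Gecko headers: ``ToyPipe``, ``PToyParent``, ``PToyChild``, ``ToyParent`` and ``RunToyActorDemo`` are hypothetical names invented for illustration. It only assumes what the text states: a generated base class exposes ``Send`` methods, client code subclasses it to supply ``Recv`` handlers, sending does not block, and messages are handled in the order they were sent. Real handlers return ``mozilla::ipc::IPCResult`` and run on the actor's event target; both details are omitted here.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the message pipe between two peers: the sender
// queues tasks and the receiver's thread runs them later, in order.
class ToyPipe {
 public:
  void Post(std::function<void()> aTask) { mTasks.push(std::move(aTask)); }

  // Runs every queued task in the order it was posted.
  void Drain() {
    while (!mTasks.empty()) {
      std::function<void()> task = std::move(mTasks.front());
      mTasks.pop();
      task();
    }
  }

 private:
  std::queue<std::function<void()>> mTasks;
};

// Plays the role of a generated parent base class: one Recv handler is
// declared per message that the parent can receive.
class PToyParent {
 public:
  virtual ~PToyParent() = default;
  virtual void RecvGreeting(const std::string& aText) = 0;
};

// Plays the role of a generated child base class: one Send method is offered
// per message that the child can send. Sending only queues work.
class PToyChild {
 public:
  PToyChild(ToyPipe& aPipe, PToyParent& aPeer) : mPipe(aPipe), mPeer(aPeer) {}

  void SendGreeting(const std::string& aText) {
    PToyParent* peer = &mPeer;
    mPipe.Post([peer, aText] { peer->RecvGreeting(aText); });
  }

 private:
  ToyPipe& mPipe;
  PToyParent& mPeer;
};

// Client code: the subclass that supplies the message handler.
class ToyParent final : public PToyParent {
 public:
  void RecvGreeting(const std::string& aText) override {
    mLog.push_back(aText);
  }
  std::vector<std::string> mLog;
};

// Sends two messages, then lets the "parent thread" process its queue.
std::vector<std::string> RunToyActorDemo() {
  ToyPipe pipe;
  ToyParent parent;
  PToyChild child(pipe, parent);
  child.SendGreeting("first");
  child.SendGreeting("second");
  assert(parent.mLog.empty());  // nothing is handled until the queue runs
  pipe.Drain();
  return parent.mLog;
}
```

In this model ``RunToyActorDemo()`` returns the two greetings in the order they were sent. The point of the split between ``PToyChild`` and ``ToyParent`` is the same as in the steps above: the sending API comes from the (here simulated) generated class, while the behavior lives in the handler that client code writes.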
+ +The Protocol Language +--------------------- + +This document will follow the integration of two actors into Firefox -- +``PMyManager`` and ``PMyManaged``. ``PMyManager`` will manage ``PMyManaged``. +A good place to start is with the IPDL actor definitions. These are files +that are named for the actor (e.g. ``PMyManager.ipdl``) and that declare the +messages that a protocol understands. These actors are for demonstration +purposes and involve quite a bit of functionality. Most actors will use a very +small fraction of these features. + +.. literalinclude:: _static/PMyManager.ipdl + :language: c++ + :name: PMyManager.ipdl + +.. literalinclude:: _static/PMyManaged.ipdl + :language: c++ + :name: PMyManaged.ipdl + +These files reference three additional files. ``MyTypes.ipdlh`` is an "IPDL +header" that can be included into ``.ipdl`` files as if it were inline, except +that it also needs to include any external actors and data types it uses: + +.. literalinclude:: _static/MyTypes.ipdlh + :language: c++ + :name: MyTypes.ipdlh + +``MyActorUtils.h`` and ``MyDataTypes.h`` are normal C++ header files that +contain definitions for types passed by these messages, as well as instructions +for serializing them. They will be covered in `The C++ Interface`_. + +Using The IPDL compiler +~~~~~~~~~~~~~~~~~~~~~~~ + +To build IPDL files, list them (alphabetically sorted) in a ``moz.build`` file. +In this example, the ``.ipdl`` and ``.ipdlh`` files would be alongside a +``moz.build`` containing: + +.. code-block:: c++ + + IPDL_SOURCES += [ + "MyTypes.ipdlh", + "PMyManaged.ipdl", + "PMyManager.ipdl", + ] + + UNIFIED_SOURCES += [ + "MyManagedChild.cpp", + "MyManagedParent.cpp", + "MyManagerChild.cpp", + "MyManagerParent.cpp", + ] + + include("/ipc/chromium/chromium-config.mozbuild") + +``chromium-config.mozbuild`` sets up paths so that generated IPDL header files +are in the proper scope. 
If it isn't included, the build will fail with +``#include`` errors in both your actor code and some internal ipc headers. For +example: + +.. code-block:: c++ + + c:/mozilla-src/mozilla-unified/obj-64/dist/include\ipc/IPCMessageUtils.h(13,10): fatal error: 'build/build_config.h' file not found + +``.ipdl`` files are compiled to C++ files as one of the earliest post-configure +build steps. Those files are, in turn, referenced throughout the source code +and build process. From ``PMyManager.ipdl`` the compiler generates two header +files added to the build context and exported globally: +``mozilla/myns/PMyManagerParent.h`` and ``mozilla/myns/PMyManagerChild.h``, as +discussed in `Namespaces`_ below. These files contain the base classes for the +actors. It also makes several other files, including C++ source files and +another header, that are automatically included into the build and should not +require attention. + +C++ definitions of the actors are required for IPDL. They define the actions +that are taken in response to messages -- without them, the actors would have no +value. There will be much more on this when we discuss `Actors and Messages in +C++`_ but note here that C++ header files named for the actor are required by +the IPDL `compiler`. The example would expect +``mozilla/myns/MyManagedChild.h``, ``mozilla/myns/MyManagedParent.h``, +``mozilla/myns/MyManagerChild.h`` and ``mozilla/myns/MyManagerParent.h`` and +will not build without them. + +Referencing Externally Defined Data Types: IPDL Includes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Let's begin with ``PMyManager.ipdl``. It starts by including types that it +will need from other places: + +.. 
code-block:: c++ + + include protocol PMyManaged; + include MyTypes; // for MyActorPair + + using MyActorEnum from "mozilla/myns/MyActorUtils.h"; + using struct mozilla::myns::MyData from "mozilla/MyDataTypes.h"; + [MoveOnly] using mozilla::myns::MyOtherData from "mozilla/MyDataTypes.h"; + [RefCounted] using class mozilla::myns::MyThirdData from "mozilla/MyDataTypes.h"; + +The first line includes a protocol that PMyManager will manage. That protocol +is defined in its own ``.ipdl`` file. Cyclic references are expected and pose +no concern. + +The second line includes the file ``MyTypes.ipdlh``, which defines types like +structs and unions, but in IPDL, which means they have behavior that goes +beyond the similar C++ concepts. Details can be found in `Generating +IPDL-Aware C++ Data Types: IPDL Structs and Unions`_. + +The final lines include types from C++ headers. Additionally, the [RefCounted] +and [MoveOnly] attributes tell IPDL that the types have special functionality +that is important to operations. These are the data type attributes currently +understood by IPDL: + +================ ============================================================== +``[RefCounted]`` Type ``T`` is reference counted (by ``AddRef``/``Release``). + As a parameter to a message or as a type in IPDL + structs/unions, it is referenced as a ``RefPtr<T>``. +``[MoveOnly]`` The type ``T`` is treated as uncopyable. When used as a + parameter in a message or an IPDL struct/union, it is as an + r-value ``T&&``. +================ ============================================================== + +Finally, note that ``using``, ``using class`` and ``using struct`` are all +valid syntax. The ``class`` and ``struct`` keywords are optional. + +Namespaces +~~~~~~~~~~ + +From the IPDL file: + +.. code-block:: c++ + + namespace mozilla { + namespace myns { + + // ... data type and actor definitions ... 
+ + } // namespace myns + } // namespace mozilla + + +Namespaces work similarly to the way they do in C++. They also mimic the +notation, in an attempt to make them comfortable to use. When IPDL actors are +compiled into C++ actors, the namespace scoping is carried over. As previously +noted, when C++ types are included into IPDL files, the same is true. The most +important way in which they differ is that IPDL also uses the namespace to +establish the path to the generated files. So, the example defines the IPDL +data type ``mozilla::myns::MyUnion`` and the actors +``mozilla::myns::PMyManagerParent`` and ``mozilla::myns::PMyManagerChild``, +which can be included from ``mozilla/myns/PMyManager.h``, +``mozilla/myns/PMyManagerParent.h`` and ``mozilla/myns/PMyManagerChild.h``, +respectively. The namespace becomes part of the path. + +Generating IPDL-Aware C++ Data Types: IPDL Structs and Unions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``PMyManager.ipdl`` and ``MyTypes.ipdlh`` define: + +.. code-block:: c++ + + [Comparable] union MyUnion { + float; + MyOtherData; + }; + + struct MyActorPair { + PMyManaged actor1; + nullable PMyManaged actor2; + }; + +From these descriptions, IPDL generates C++ classes that approximate the +behavior of C++ structs and unions but that come with pre-defined +``ParamTraits`` implementations. These objects can also be used as usual +outside of IPDL, although the lack of control over the generated code means +they are sometimes poorly suited to use as plain data. See `ParamTraits`_ for +details. + +The ``[Comparable]`` attribute tells IPDL to generate ``operator==`` and +``operator!=`` for the new type. In order for it to do that, the fields inside +the new type need to define both of those operators. + +Finally, the ``nullable`` keyword indicates that, when serialized, the actor +may be null. It is intended to help users avoid null-object dereference +errors. 
It only applies to actor types and may also be attached to parameters +in message declarations. + +Defining Actors +~~~~~~~~~~~~~~~ + +The real point of any ``.ipdl`` file is that each defines exactly one actor +protocol. The definition always matches the ``.ipdl`` filename. Repeating the +one in ``PMyManager.ipdl``: + +.. code-block:: c++ + + sync protocol PMyManager { + manages PMyManaged; + + async PMyManaged(); + // ... more message declarations ... + }; + +.. important:: + A form of reference counting is `always` used internally by IPDL to make + sure that it and its clients never address an actor the other component + deleted but this becomes fragile, and sometimes fails, when the client code + does not respect the reference count. For example, when IPDL detects that + a connection died due to a crashed remote process, deleting the actor could + leave dangling pointers, so IPDL `cannot` delete it. On the other hand, + there are many cases where IPDL is the only entity to have references to + some actors (this is very common for one side of a managed actor) so IPDL + `must` delete it. If all of those objects were reference counted then + there would be no complexity here. Indeed, new actors using + ``[ManualDealloc]`` should not be approved without a very compelling + reason. New ``[ManualDealloc]`` actors may soon be forbidden. + +The ``sync`` keyword tells IPDL that the actor contains messages that block the +sender using ``sync`` blocking, so the sending thread waits for a response to +the message. There is more on what it and the other blocking modes mean in +`IPDL messages`_. For now, just know that this is redundant information whose +value is primarily in making it easy for other developers to know that there +are ``sync`` messages defined here. 
This list gives preliminary definitions of +the options for the actor-blocking policy of messages: + +======================= ======================================================= +``async`` Actor may contain only asynchronous messages. +``sync`` Actor has ``async`` capabilities and adds ``sync`` + messages. ``sync`` messages + can only be sent from the child actor to the parent. +``intr`` (deprecated) Actor has ``sync`` capabilities and adds ``intr`` + messages. Some messages can be received while an actor + waits for an ``intr`` response. This type will be + removed soon. +======================= ======================================================= + +Beyond these protocol blocking strategies, IPDL supports annotations that +indicate the actor has messages that may be received in an order other than +the one they were sent in. These orderings attempt to handle messages in +"message thread" order (as in e.g. mailing lists). These behaviors can be +difficult to design for. Their use is discouraged but is sometimes warranted. +They will be discussed further in `Nested messages`_. + +============================== ================================================ +``[NestedUpTo=inside_sync]`` Actor has high priority messages that can be + handled while waiting for a ``sync`` response. +``[NestedUpTo=inside_cpow]`` Actor has the highest priority messages that + can be handled while waiting for a ``sync`` + response. +============================== ================================================ + +The ``manages`` clause tells IPDL that ``PMyManager`` manages the +``PMyManaged`` actor that was previously ``include`` d. As with any managed +protocol, it must also be the case that ``PMyManaged.ipdl`` includes +``PMyManager`` and declares that ``PMyManaged`` is ``managed`` by +``PMyManager``. Recalling the code: + +.. code-block:: c++ + + // PMyManaged.ipdl + include protocol PMyManager; + // ... + + protocol PMyManaged { + manager PMyManager; + // ... 
+ }; + +An actor has a ``manager`` (e.g. ``PMyManaged``) or else it is a top-level +actor (e.g. ``PMyManager``). An actor protocol may be managed by more than one +actor type. For example, ``PMyManaged`` could have also been managed by some +``PMyOtherManager`` not shown here. In that case, ``manager`` s are presented +in a list, separated by ``or`` -- e.g. ``manager PMyManager or +PMyOtherManager``. Of course, an **instance** of a managed actor type has only +one manager actor (and is therefore managed by only one of the types of +manager). The manager of an instance of a managee is always the actor that +constructed that managee. + +Finally, there is the message declaration ``async PMyManaged()``. This message +is a constructor for ``MyManaged`` actors; unlike a C++ constructor, it is +declared in the manager, ``MyManager``, rather than in the type it creates. +Every manager will need to expose constructors to create its +managed types. These constructors are the only way to create an actor that is +managed. They can take parameters and return results, like normal messages. +The implementation of IPDL constructors is discussed in `Actor Lifetimes in +C++`_. + +We haven't discussed a way to construct new top level actors. This is a more +advanced topic and is covered separately in `Top Level Actors`_. + +.. _IPDL messages: `Declaring IPDL Messages`_ + +Declaring IPDL Messages +~~~~~~~~~~~~~~~~~~~~~~~ + +The final part of the actor definition is the declaration of messages: + +.. code-block:: c++ + + sync protocol PMyManager { + // ... + parent: + async __delete__(nsString aNote); + sync SomeMsg(MyActorPair? aActors, MyData[] aMyData) + returns (int32_t x, int32_t y, MyUnion aUnion); + async PMyManaged(); + both: + [Tainted] async AnotherMsg(MyActorEnum aEnum, int32_t aNumber) + returns (MyOtherData aOtherData); + }; + +The messages are grouped into blocks by ``parent:``, ``child:`` and ``both:``. 
+These labels work the way ``public:`` and ``private:`` work in C++ -- messages +after these descriptors are sent/received (only) in the direction specified. + +.. note:: + As a mnemonic to remember which direction they indicate, remember to put + the word "to" in front of them. So, for example, ``parent:`` precedes + ``__delete__``, meaning ``__delete__`` is sent from the child **to** the + parent, and ``both:`` states that ``AnotherMsg`` can be sent **to** either + endpoint. + +IPDL messages support the following annotations: + +======================== ====================================================== +``[Compress]`` Indicates repeated messages of this type will + consolidate. +``[Tainted]`` Parameters are required to be validated before using + them. +``[Priority=Foo]`` Priority of ``MessageTask`` that runs the C++ message + handler. ``Foo`` is one of: ``normal``, ``input``, + ``vsync``, ``mediumhigh``, or ``control``. + See the ``IPC::Message::PriorityValue`` enum. +``[Nested=inside_sync]`` Indicates that the message can sometimes be handled + while a sync message waits for a response. +``[Nested=inside_cpow]`` Indicates that the message can sometimes be handled + while a sync message waits for a response. +``[LazySend]`` Messages with this annotation will be queued up to be + sent together either immediately before a non-LazySend + message, or from a direct task. +======================== ====================================================== + +``[Compress]`` provides crude protection against spamming with a flood of +messages. When messages of type ``M`` are compressed, the queue of unprocessed +messages between actors will never contain an ``M`` beside another one; they +will always be separated by a message of a different type. This is achieved by +throwing out the older of the two messages if sending the new one would break +the rule. This has been used to throttle pointer events between the main and +content processes. 
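The ``[Compress]`` rule described above can be sketched as a small queue model. This is a minimal illustration under stated assumptions, not the IPC implementation: ``ToyMessage``, ``ToyQueue`` and ``RunCompressDemo`` are hypothetical names, and the ``compress`` flag simply stands in for the annotation. The only behavior modeled is the one the text describes: if a newly sent message would sit beside an unprocessed message of the same type, the older one is thrown out.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Toy stand-in for a queued IPC message (hypothetical, not a Gecko type).
// `type` plays the role of the message name; `compress` models a
// [Compress] annotation on that message type.
struct ToyMessage {
  std::string type;
  int payload;
  bool compress;
};

// Toy model of the queue of unprocessed messages between two actors.
class ToyQueue {
 public:
  // Enqueues a message. If it is compressible and the newest queued message
  // has the same type, the older one is dropped so that two messages of that
  // type never end up side by side.
  void Send(const ToyMessage& aMsg) {
    if (aMsg.compress && !mPending.empty() &&
        mPending.back().type == aMsg.type) {
      mPending.pop_back();
    }
    mPending.push_back(aMsg);
    assert(!mPending.empty());
  }

  const std::deque<ToyMessage>& Pending() const { return mPending; }

 private:
  std::deque<ToyMessage> mPending;
};

// Sends a burst of pointer moves interrupted by one click and returns the
// payloads left in the queue, in delivery order.
std::deque<int> RunCompressDemo() {
  ToyQueue queue;
  queue.Send({"PointerMove", 1, true});
  queue.Send({"PointerMove", 2, true});  // drops payload 1
  queue.Send({"Click", 10, false});
  queue.Send({"PointerMove", 3, true});  // kept: the neighbor is a Click
  queue.Send({"PointerMove", 4, true});  // drops payload 3
  std::deque<int> payloads;
  for (const ToyMessage& msg : queue.Pending()) {
    payloads.push_back(msg.payload);
  }
  return payloads;
}
```

In this model the queue ends up holding payloads 2, 10 and 4: the click separates the two surviving pointer moves, so neither is discarded. Only the tail of the queue is inspected, which is what distinguishes this adjacency-based rule from one that would search the whole queue.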
+ +``[Compress=all]`` is similar but applies whether or not the messages are +adjacent in the message queue. + +``[Tainted]`` is a C++ mechanism designed to encourage paying attention to +parameter security. The values of tainted parameters cannot be used until you +vouch for their safety. They are discussed in `Actors and Messages in C++`_. + +The ``Nested`` annotations are closely tied to the blocking policy keyword +that follows them in the message declaration, which was briefly discussed in +`Defining Actors`_. See +`Nested messages`_ for details. + +``[LazySend]`` indicates the message doesn't need to be sent immediately, and +can be sent later, from a direct task. Worker threads which do not support +direct task dispatch will ignore this attribute. Messages with this annotation +will still be delivered in-order with other messages, meaning that if a normal +message is sent, any queued ``[LazySend]`` messages will be sent first. The +attribute allows the transport layer to combine messages to be sent together, +potentially reducing thread wake-ups for I/O and receiving threads. + +The following is a complete list of the available blocking policies. It +resembles the list in `Defining Actors`_: + +====================== ======================================================== +``async`` Actor may contain only asynchronous messages. +``sync`` Actor has ``async`` capabilities and adds ``sync`` + messages. ``sync`` messages can only be sent from the + child actor to the parent. +``intr`` (deprecated) Actor has ``sync`` capabilities and adds ``intr`` + messages. This type will be removed soon. +====================== ======================================================== + +The policy defines whether an actor will wait for a response when it sends a +certain type of message. A ``sync`` actor will wait immediately after sending +a ``sync`` message, stalling its thread, until a response is received. This is +an easy source of browser stalls. 
It is rarely required that a message be +synchronous. New ``sync`` messages are therefore required to get approval from +an IPC peer. The IPDL compiler will require such messages to be listed in the +file ``sync-messages.ini``. + +The notion that only child actors can send ``sync`` messages was introduced to +avoid potential deadlocks. It relies on the belief that a cycle (deadlock) of +sync messages is impossible because they all point in one direction. This is +no longer the case because any endpoint can be a child `or` parent and some, +like the main process, sometimes serve as both. This means that sync messages +should be used with extreme care. + +.. note:: + The notion of sync messages flowing in one direction is still the main + mechanism IPDL uses to avoid deadlock. New actors should avoid violating + this rule as the consequences are severe (and complex). Actors that break + these rules should not be approved without **extreme** extenuating + circumstances. If you think you need this, check with the IPC team on + Element first (#ipc). + +An ``async`` actor will not wait. An ``async`` response is essentially +identical to sending another ``async`` message back. It may be handled +whenever received messages are handled. The value over an ``async`` response +message comes in the ergonomics -- async responses are usually handled by C++ +lambda functions that are more like continuations than methods. This makes +them easier to write and to read. Additionally, they allow a response to +return message failure, while there would be no such response if we were +expecting to send a new async message back, and it failed. + +Following synchronization is the name of the message and its parameter list. +The message ``__delete__`` stands out as strange -- indeed, it terminates the +actor's connection. `It does not delete any actor objects itself!` It severs +the connections of the actor `and any actors it manages` at both endpoints. 
An +actor will never send or receive any messages after it sends or receives a +``__delete__``. Note that all sends and receives have to happen on a specific +*worker* thread for any actor tree so the send/receive order is well defined. +Anything sent after the actor processes ``__delete__`` is ignored (send returns +an error, messages yet to be received fail their delivery). In other words, +some future operations may fail but no unexpected behavior is possible. + +In our example, the child can break the connection by sending ``__delete__`` to +the parent. The only thing the parent can do to sever the connection is to +fail, such as by crashing. This sort of unidirectional control is both common +and desirable. + +``PMyManaged()`` is a managed actor constructor. Note the asymmetry -- an +actor contains its managed actor's constructors but its own destructor. + +The list of parameters to a message is fairly straight-forward. Parameters +can be any type that has a C++ ``ParamTraits`` specialization and is imported +by a directive. That said, there are some surprises in the list of messages: + +================= ============================================================= +``int32_t``,... The standard primitive types are included. See `builtin.py`_ + for a list. Pointer types are, unsurprisingly, forbidden. +``?`` When following a type T, the parameter is translated into + ``Maybe<T>`` in C++. +``[]`` When following a type T, the parameter is translated into + ``nsTArray<T>`` in C++. +================= ============================================================= + +Finally, the returns list declares the information sent in response, also as a +tuple of typed parameters. As previously mentioned, even ``async`` messages +can receive responses. A ``sync`` message will always wait for a response but +an ``async`` message will not get one unless it has a ``returns`` clause. + +This concludes our tour of the IPDL example file. 
The connection to C++ is +discussed in the next chapter; messages in particular are covered in `Actors +and Messages in C++`_. For suggestions on best practices when designing your +IPDL actor approach, see `IPDL Best Practices`_. + +.. _builtin.py: https://searchfox.org/mozilla-central/source/ipc/ipdl/ipdl/builtin.py + +IPDL Syntax Quick Reference +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The following is a list of the keywords and operators that have been introduced +for use in IPDL files: + +============================= ================================================= +``include`` Include a C++ header (quoted file name) or + ``.ipdlh`` file (unquoted with no file suffix). +``using (class|struct) from`` Similar to ``include`` but imports only a + specific data type. +``include protocol`` Include another actor for use in management + statements, IPDL data types or as parameters to + messages. +``[RefCounted]`` Indicates that the imported C++ data types are + reference counted. Refcounted types require a + different ``ParamTraits`` interface than + non-reference-counted types. +``[ManualDealloc]`` Indicates that the IPDL interface uses the legacy + manual allocation/deallocation interface, rather + than modern reference counting. +``[MoveOnly]`` Indicates that an imported C++ data type should + not be copied. IPDL code will move it instead. +``namespace`` Specifies the namespace for IPDL generated code. +``union`` An IPDL union definition. +``struct`` An IPDL struct definition. +``[Comparable]`` Indicates that IPDL should generate + ``operator==`` and ``operator!=`` for the given + IPDL struct/union. +``nullable`` Indicates that an actor reference in an IPDL type + may be null when sent over IPC. +``protocol`` An IPDL protocol (actor) definition. +``sync/async`` These are used in two cases: (1) to indicate + whether a message blocks as it waits for a result + and (2) because an actor that contains ``sync`` + messages must itself be labeled ``sync`` or + ``intr``. 
+``[NestedUpTo=inside_sync]`` Indicates that an actor contains + [Nested=inside_sync] messages, in addition to + normal messages. +``[NestedUpTo=inside_cpow]`` Indicates that an actor contains + [Nested=inside_cpow] messages, in addition to + normal messages. +``intr`` Used to indicate either that (1) an actor + contains ``sync``, ``async`` and (deprecated) + ``intr`` messages, or (2) a message is ``intr`` + type. +``[Nested=inside_sync]`` Indicates that the message can be handled while + waiting for lower-priority, or in-message-thread, + sync responses. +``[Nested=inside_cpow]`` Indicates that the message can be handled while + waiting for lower-priority, or in-message-thread, + sync responses. Cannot be sent by the parent + actor. +``manager`` Used in a protocol definition to indicate that + this actor is managed by another one. +``manages`` Used in a protocol definition to indicate that + this actor manages another one. +``or`` Used in a ``manager`` clause for actors that have + multiple potential managers. +``parent: / child: / both:`` Indicates direction of subsequent actor messages. + As a mnemonic to remember which direction they + indicate, put the word "to" in front of them. +``returns`` Defines return values for messages. All types + of message, including ``async``, support + returning values. +``__delete__`` A special message that destroys the related + actors at both endpoints when sent. + ``Recv__delete__`` and ``ActorDestroy`` are + called before destroying the actor at the other + endpoint, to allow for cleanup. +``int32_t``,... The standard primitive types are included. +``String`` Translated into ``nsString`` in C++. +``?`` When following a type T in an IPDL data structure + or message parameter, + the parameter is translated into ``Maybe<T>`` in + C++. +``[]`` When following a type T in an IPDL data structure + or message parameter, + the parameter is translated into ``nsTArray<T>`` + in C++.
+``[Tainted]`` Used to indicate that a message's handler should + receive parameters that it is required to + manually validate. Parameters of type ``T`` + become ``Tainted<T>`` in C++. +``[Compress]`` Indicates repeated messages of this type will + consolidate. When two messages of this type are + sent and end up side-by-side in the message queue + then the older message is discarded (not sent). +``[Compress=all]`` Like ``[Compress]`` but discards the older + message regardless of whether they are adjacent + in the message queue. +``[Priority=Foo]`` Priority of ``MessageTask`` that runs the C++ + message handler. ``Foo`` is one of: ``normal``, + ``input``, ``vsync``, ``mediumhigh``, or + ``control``. +``[LazySend]`` Messages with this annotation will be queued up to + be sent together immediately before a non-LazySend + message, or from a direct task. +``[ChildImpl="RemoteFoo"]`` Indicates that the child side implementation of + the actor is a class named ``RemoteFoo``, and the + definition is included by one of the + ``include "...";`` statements in the file. + *New uses of this attribute are discouraged.* +``[ParentImpl="FooImpl"]`` Indicates that the parent side implementation of + the actor is a class named ``FooImpl``, and the + definition is included by one of the + ``include "...";`` statements in the file. + *New uses of this attribute are discouraged.* +``[ChildImpl=virtual]`` Indicates that the child side implementation of + the actor is not exported by a header, so virtual + ``Recv`` methods should be used instead of direct + function calls. *New uses of this attribute are + discouraged.* +``[ParentImpl=virtual]`` Indicates that the parent side implementation of + the actor is not exported by a header, so virtual + ``Recv`` methods should be used instead of direct + function calls. 
*New uses of this attribute are + discouraged.* +============================= ================================================= + + +The C++ Interface +----------------- + +ParamTraits +~~~~~~~~~~~ + +Before discussing how C++ represents actors and messages, we look at how IPDL +connects to the imported C++ data types. In order for any C++ type to be +(de)serialized, it needs an implementation of the ``ParamTraits`` C++ type +class. ``ParamTraits`` is how your code tells IPDL what bytes to write to +serialize your objects for sending, and how to convert those bytes back to +objects at the other endpoint. Since ``ParamTraits`` need to be reachable by +IPDL code, they need to be declared in a C++ header and imported by your +protocol file. Failure to do so will result in a build error. + +Most basic types and many essential Mozilla types are always available for use +without inclusion. An incomplete list includes: C++ primitives, strings +(``std`` and ``mozilla``), vectors (``std`` and ``mozilla``), ``RefPtr<T>`` +(for serializable ``T``), ``UniquePtr<T>``, ``nsCOMPtr<T>``, ``nsTArray<T>``, +``std::unordered_map<T>``, ``nsresult``, etc. See `builtin.py +<https://searchfox.org/mozilla-central/source/ipc/ipdl/ipdl/builtin.py>`_, +`ipc_message_utils.h +<https://searchfox.org/mozilla-central/source/ipc/chromium/src/chrome/common/ipc_message_utils.h>`_ +and `IPCMessageUtilsSpecializations.h +<https://searchfox.org/mozilla-central/source/ipc/glue/IPCMessageUtilsSpecializations.h>`_. + +``ParamTraits`` typically bootstrap with the ``ParamTraits`` of more basic +types, until they hit bedrock (e.g. one of the basic types above). In the most +extreme cases, a ``ParamTraits`` author may have to resort to designing a +binary data format for a type. Both options are available. + +We haven't seen any of this C++ yet. Let's look at the data types included +from ``MyDataTypes.h``: + +.. 
code-block:: c++ + + // MyDataTypes.h + namespace mozilla::myns { + struct MyData { + nsCString s; + uint8_t bytes[17]; + MyData(); // IPDL requires the default constructor to be public + }; + + struct MoveonlyData { + MoveonlyData(); + MoveonlyData& operator=(const MoveonlyData&) = delete; + + MoveonlyData(MoveonlyData&& m); + MoveonlyData& operator=(MoveonlyData&& m); + }; + + typedef MoveonlyData MyOtherData; + + class MyUnusedData { + public: + NS_INLINE_DECL_REFCOUNTING(MyUnusedData) + int x; + }; + }; + + namespace IPC { + // Basic type + template<> + struct ParamTraits<mozilla::myns::MyData> { + typedef mozilla::myns::MyData paramType; + static void Write(MessageWriter* m, const paramType& in); + static bool Read(MessageReader* m, paramType* out); + }; + + // [MoveOnly] type + template<> + struct ParamTraits<mozilla::myns::MyOtherData> { + typedef mozilla::myns::MyOtherData paramType; + static void Write(MessageWriter* m, const paramType& in); + static bool Read(MessageReader* m, paramType* out); + }; + + // [RefCounted] type + template<> + struct ParamTraits<mozilla::myns::MyUnusedData*> { + typedef mozilla::myns::MyUnusedData paramType; + static void Write(MessageWriter* m, paramType* in); + static bool Read(MessageReader* m, RefPtr<paramType>* out); + }; + } + +MyData is a struct and MyOtherData is a typedef. IPDL is fine with both. +Additionally, MyOtherData is not copyable, matching its IPDL ``[MoveOnly]`` +annotation. + +``ParamTraits`` are required to be defined in the ``IPC`` namespace. They must +contain a ``Write`` method with the proper signature that is used for +serialization and a ``Read`` method, again with the correct signature, for +deserialization. + +Here we have three examples of declarations: one for an unannotated type, one +for ``[MoveOnly]`` and a ``[RefCounted]`` one. Notice the difference in the +``[RefCounted]`` type's method signatures. 
The only difference that may not be +clear from the function types is that, in the non-reference-counted case, a +default-constructed object is supplied to ``Read`` but, in the +reference-counted case, ``Read`` is given an empty ``RefPtr<MyUnusedData>`` and +should only allocate a ``MyUnusedData`` to return if it so desires. + +These are straight-forward implementations of the ``ParamTraits`` methods for +``MyData``: + +.. code-block:: c++ + + /* static */ void IPC::ParamTraits<MyData>::Write(MessageWriter* m, const paramType& in) { + WriteParam(m, in.s); + m->WriteBytes(in.bytes, sizeof(in.bytes)); + } + /* static */ bool IPC::ParamTraits<MyData>::Read(MessageReader* m, paramType* out) { + return ReadParam(m, &out->s) && + m->ReadBytesInto(out->bytes, sizeof(out->bytes)); + } + +``WriteParam`` and ``ReadParam`` call the ``ParamTraits`` for the data you pass +them, determined using the type of the object as supplied. ``WriteBytes`` and +``ReadBytesInto`` work on raw, contiguous bytes as expected. ``MessageWriter`` +and ``MessageReader`` are IPDL internal objects which hold the incoming/outgoing +message as a stream of bytes and the current spot in the stream. It is *very* +rare for client code to need to create or manipulate these objects. Their +advanced use is beyond the scope of this document. + +.. important:: + Potential failures in ``Read`` include everyday C++ failures like + out-of-memory conditions, which can be handled as usual. But ``Read`` can + also fail due to things like data validation errors. ``ParamTraits`` read + data that is considered insecure. It is important that they catch + corruption and properly handle it. Returning false from ``Read`` will + usually result in crashing the process (everywhere except in the main + process). This is the right behavior as the browser would be in an + unexpected state, even if the serialization failure was not malicious + (since it cannot process the message). 
Other responses, such as failing + with a crashing assertion, are inferior. IPDL fuzzing relies on + ``ParamTraits`` not crashing due to corruption failures. + Occasionally, validation will require access to state that ``ParamTraits`` + can't easily reach. (Only) in those cases, validation can be reasonably + done in the message handler. Such cases are a good use of the ``Tainted`` + annotation. See `Actors and Messages in C++`_ for more. + +.. note:: + In the past, it was required to specialize ``mozilla::ipc::IPDLParamTraits<T>`` + instead of ``IPC::ParamTraits<T>`` if you needed the actor object itself during + serialization or deserialization. These days the actor can be fetched using + ``IPC::Message{Reader,Writer}::GetActor()`` in ``IPC::ParamTraits``, so that + trait should be used for all new serializations. + +A special case worth mentioning is that of enums. Enums are a common source of +security holes since code is rarely safe with enum values that are not valid. +Since data obtained through IPDL messages should be considered tainted, enums +are of principal concern. ``ContiguousEnumSerializer`` and +``ContiguousEnumSerializerInclusive`` safely implement ``ParamTraits`` for +enums that are only valid for a contiguous set of values, which is most of +them. The generated ``ParamTraits`` confirm that the enum is in valid range; +``Read`` will return false otherwise. As an example, here is the +``MyActorEnum`` included from ``MyActorUtils.h``: + +.. code-block:: c++ + + enum MyActorEnum { e1, e2, e3, e4, e5 }; + + template<> + struct ParamTraits<MyActorEnum> + : public ContiguousEnumSerializerInclusive<MyActorEnum, MyActorEnum::e1, MyActorEnum::e5> {}; + +IPDL Structs and Unions in C++ +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +IPDL structs and unions become C++ classes that provide interfaces that are +fairly self-explanatory. Recalling ``MyUnion`` and ``MyActorPair`` from +`IPDL Structs and Unions`_ : + +.. 
code-block:: c++ + + union MyUnion { + float; + MyOtherData; + }; + + struct MyActorPair { + PMyManaged actor1; + nullable PMyManaged actor2; + }; + +These compile to: + +.. code-block:: c++ + + class MyUnion { + enum Type { Tfloat, TMyOtherData }; + Type type(); + MyUnion(float f); + MyUnion(MyOtherData&& aOD); + MyUnion& operator=(float f); + MyUnion& operator=(MyOtherData&& aOD); + operator float&(); + operator MyOtherData&(); + }; + + class MyActorPair { + MyActorPair(PMyManagedParent* actor1Parent, PMyManagedChild* actor1Child, + PMyManagedParent* actor2Parent, PMyManagedChild* actor2Child); + // Exactly one of { actor1Parent(), actor1Child() } must be non-null. + PMyManagedParent*& actor1Parent(); + PMyManagedChild*& actor1Child(); + // As nullable, zero or one of { actor2Parent(), actor2Child() } will be non-null. + PMyManagedParent*& actor2Parent(); + PMyManagedChild*& actor2Child(); + }; + +The generated ``ParamTraits`` use the ``ParamTraits`` for the types referenced +by the IPDL struct or union. Fields respect any annotations for their type +(see `IPDL Includes`_). For example, a ``[RefCounted]`` type ``T`` generates +``RefPtr<T>`` fields. + +Note that actor members result in members of both the parent and child actor +types, as seen in ``MyActorPair``. When actors are used to bridge processes, +only one of those could ever be used at a given endpoint. IPDL makes sure +that, when you send one type (say, ``PMyManagedChild``), the adjacent actor of +the other type (``PMyManagedParent``) is received. This is not only true for +message parameters and IPDL structs/unions but also for custom ``ParamTraits`` +implementations. If you ``Write`` a ``PFooParent*`` then you must ``Read`` a +``PFooChild*``. This is hard to confuse in message handlers since they are +members of a class named for the side they operate on, but this cannot be +enforced by the compiler.
If you are writing +``MyManagerParent::RecvSomeMsg(Maybe<MyActorPair>&& aActors, nsTArray<MyData>&& aMyData)`` +then the ``actor1Child`` and ``actor2Child`` fields cannot be valid since the +child (usually) exists in another process. + +.. _IPDL Structs and Unions: `Generating IPDL-Aware C++ Data Types: IPDL Structs and Unions`_ +.. _IPDL Includes: `Referencing Externally Defined Data Types: IPDL Includes`_ + +Actors and Messages in C++ +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As mentioned in `Using The IPDL compiler`_, the IPDL compiler generates two +header files for the protocol ``PMyManager``: ``PMyManagerParent.h`` and +``PMyManagerChild.h``, which declare the actor's base classes. There, we +discussed how the headers are visible to C++ components that include +``chromium-config.mozbuild``. We, in turn, always need to define two files +that declare our actor implementation subclasses (``MyManagerParent.h`` and +``MyManagerChild.h``). The IPDL file looked like this: + +.. literalinclude:: _static/PMyManager.ipdl + :language: c++ + :name: PMyManager.ipdl + +So ``MyManagerParent.h`` looks like this: + +.. code-block:: c++ + + #include "PMyManagerParent.h" + + namespace mozilla { + namespace myns { + + class MyManagerParent : public PMyManagerParent { + NS_INLINE_DECL_REFCOUNTING(MyManagerParent, override) + protected: + IPCResult Recv__delete__(const nsString& aNote); + IPCResult RecvSomeMsg(const Maybe<MyActorPair>& aActors, const nsTArray<MyData>& aMyData, + int32_t* x, int32_t* y, MyUnion* aUnion); + IPCResult RecvAnotherMsg(const Tainted<MyActorEnum>& aEnum, const Tainted<int32_t>& aNumber, + AnotherMsgResolver&& aResolver); + + already_AddRefed<PMyManagedParent> AllocPMyManagedParent(); + IPCResult RecvPMyManagedConstructor(PMyManagedParent* aActor); + + // ... etc ... + }; + + } // namespace myns + } // namespace mozilla + +All messages that can be sent to the actor must be handled by ``Recv`` methods +in the proper actor subclass.
They should return ``IPC_OK()`` on success and +``IPC_FAIL(actor, reason)`` if an error occurred (where ``actor`` is ``this`` +and ``reason`` is a human-readable explanation) that should be considered a failure +to process the message. The handling of such a failure is specific to the +process type. + +``Recv`` methods are called by IPDL by enqueueing a task to run them on the +``MessageLoop`` for the thread on which they are bound. This thread is the +actor's *worker thread*. All actors in a managed actor tree have the same +worker thread -- in other words, actors inherit the worker thread from their +managers. Top level actors establish their worker thread when they are +*bound*. More information on threads can be found in `Top Level Actors`_. For +the most part, client code will never engage with an IPDL actor outside of its +worker thread. + +Received parameters become stack variables that are ``std::move``-d into the +``Recv`` method. They can be received as a const lvalue reference, +rvalue reference, or by value (type-permitting). ``[MoveOnly]`` types should +not be received as const lvalues. Return values for sync messages are +assigned by writing to non-const (pointer) parameters. Return values for async +messages are handled differently -- they are passed to a resolver function. In +our example, ``AnotherMsgResolver`` would be a ``std::function<>`` and +``aResolver`` would be given the value to return by passing it a reference to a +``MyOtherData`` object. + +``MyManagerParent`` is also capable of *sending* an async message that +returns a value: ``AnotherMsg``. This is done with ``SendAnotherMsg``, which +is defined automatically by IPDL in the base class ``PMyManagerParent``. There +are two signatures for ``Send`` and they look like this: + +.. code-block:: c++ + + // Return a Promise that IPDL will resolve with the response or reject.
+ RefPtr<MozPromise<MyOtherData, ResponseRejectReason, true>> + SendAnotherMsg(const MyActorEnum& aEnum, int32_t aNumber); + + // Provide callbacks to process response / reject. The callbacks are just + // std::functions. + void SendAnotherMsg(const MyActorEnum& aEnum, int32_t aNumber, + ResolveCallback<MyOtherData>&& aResolve, RejectCallback&& aReject); + +The response is usually handled by lambda functions defined at the site of the +``Send`` call, either by attaching them to the returned promise with e.g. +``MozPromise::Then``, or by passing them as callback parameters. See docs on +``MozPromise`` for more on its use. The promise itself is either resolved or +rejected by IPDL when a valid reply is received or when the endpoint determines +that the communication failed. ``ResponseRejectReason`` is an enum IPDL +provides to explain failures. + +Additionally, the ``AnotherMsg`` handler has ``Tainted`` parameters, as a +result of the [Tainted] annotation in the protocol file. Recall that +``Tainted`` is used to force explicit validation of parameters in the message +handler before their values can be used (as opposed to validation in +``ParamTraits``). They therefore have access to any state that the message +handler does. Their APIs, along with a list of macros that are used to +validate them, are detailed `here +<https://searchfox.org/mozilla-central/source/mfbt/Tainting.h>`__. + +Send methods that are not for async messages with return values follow a +simpler form; they return a ``bool`` indicating success or failure and return +response values in non-const parameters, as the ``Recv`` methods do. For +example, ``PMyManagerChild`` defines this to send the sync message ``SomeMsg``: + +.. 
code-block:: c++ + + // generated in PMyManagerChild + bool SendSomeMsg(const Maybe<MyActorPair>& aActors, const nsTArray<MyData>& aMyData, + int32_t& x, int32_t& y, MyUnion& aUnion); + +Since it is sync, this method will not return to its caller until the response +is received or an error is detected. + +All ``Send`` methods, like all message handler ``Recv`` methods, must +only be called on the worker thread for the actor. + +Constructors, like the one for ``MyManaged``, are clearly an exception to these +rules. They are discussed in the next section. + +.. _Actor Lifetimes in C++: + +Actor Lifetimes in C++ +~~~~~~~~~~~~~~~~~~~~~~ + +The constructor message for ``MyManaged`` becomes *two* methods at the +receiving end. ``AllocPMyManagedParent`` constructs the managed actor, then +``RecvPMyManagedConstructor`` is called to update the new actor. The following +diagram shows the construction of the ``MyManaged`` actor pair: + +.. mermaid:: + :align: center + :caption: A ``MyManaged`` actor pair being created by some ``Driver`` + object. Internal IPC objects in the parent and child processes + are combined for compactness. Connected **par** blocks run + concurrently. This shows that messages can be safely sent while + the parent is still being constructed.
+ + %%{init: {'sequence': {'boxMargin': 4, 'actorMargin': 10} }}%% + sequenceDiagram + participant d as Driver + participant mgdc as MyManagedChild + participant mgrc as MyManagerChild + participant ipc as IPC Child/Parent + participant mgrp as MyManagerParent + participant mgdp as MyManagedParent + d->>mgdc: new + mgdc->>d: [mgd_child] + d->>mgrc: SendPMyManagedConstructor<br/>[mgd_child, params] + mgrc->>ipc: Form actor pair<br/>[mgd_child, params] + par + mgdc->>ipc: early PMyManaged messages + and + ipc->>mgrp: AllocPMyManagedParent<br/>[params] + mgrp->>mgdp: new + mgdp->>mgrp: [mgd_parent] + ipc->>mgrp: RecvPMyManagedConstructor<br/>[mgd_parent, params] + mgrp->>mgdp: initialization + ipc->>mgdp: early PMyManaged messages + end + Note over mgdc,mgdp: Bi-directional sending and receiving will now happen concurrently. + +The next diagram shows the destruction of the ``MyManaged`` actor pair, as +initiated by a call to ``Send__delete__``. ``__delete__`` is sent from the +child process because that is the only side that can call it, as declared in +the IPDL protocol file. + +.. mermaid:: + :align: center + :caption: A ``MyManaged`` actor pair being disconnected due to some + ``Driver`` object in the child process sending ``__delete__``. + + %%{init: {'sequence': {'boxMargin': 4, 'actorMargin': 10} }}%% + sequenceDiagram + participant d as Driver + participant mgdc as MyManagedChild + participant ipc as IPC Child/Parent + participant mgdp as MyManagedParent + d->>mgdc: Send__delete__ + mgdc->>ipc: Disconnect<br/>actor pair + par + ipc->>mgdc: ActorDestroy + ipc->>mgdc: Release + and + ipc->>mgdp: Recv__delete__ + ipc->>mgdp: ActorDestroy + ipc->>mgdp: Release + end + +Finally, let's take a look at the behavior of an actor whose peer has been lost +(e.g. due to a crashed process). + +.. mermaid:: + :align: center + :caption: A ``MyManaged`` actor pair being disconnected when its peer is + lost due to a fatal error. Note that ``Recv__delete__`` is not + called. 
+ + %%{init: {'sequence': {'boxMargin': 4, 'actorMargin': 10} }}%% + sequenceDiagram + participant mgdc as MyManagedChild + participant ipc as IPC Child/Parent + participant mgdp as MyManagedParent + Note over mgdc: CRASH!!! + ipc->>ipc: Notice fatal error. + ipc->>mgdp: ActorDestroy + ipc->>mgdp: Release + +The ``Alloc`` and ``Recv...Constructor`` methods are somewhat mirrored by +``Recv__delete__`` and ``ActorDestroy`` but there are a few differences. +First, the ``Alloc`` method really does create the actor but the +``ActorDestroy`` method does not delete it. Additionally, ``ActorDestroy`` +is run at *both* endpoints, during ``Send__delete__`` or after +``Recv__delete__``. Finally and most importantly, ``Recv__delete__`` is only +called if the ``__delete__`` message is received, which may never happen if, for +example, the remote process crashes. ``ActorDestroy``, on the other hand, is +guaranteed to run for *every* actor unless the process terminates uncleanly. +For this reason, ``ActorDestroy`` is the right place for most actor shutdown +code. ``Recv__delete__`` is rarely useful, although it is occasionally +beneficial to have it receive some final data. + +The relevant part of the parent class looks like this: + +.. code-block:: c++ + + class MyManagerParent : public PMyManagerParent { + already_AddRefed<PMyManagedParent> AllocPMyManagedParent(); + IPCResult RecvPMyManagedConstructor(PMyManagedParent* aActor); + + IPCResult Recv__delete__(const nsString& aNote); + void ActorDestroy(ActorDestroyReason why); + + // ... etc ... + }; + +The ``Alloc`` method is required for managed actors that are constructed by +IPDL when it receives a constructor message. It is not required for the actor at the +endpoint that calls ``Send``. The ``Recv...Constructor`` method is not +required -- it has a base implementation that does nothing. + +If the constructor message has parameters, they are sent to both methods.
+Parameters are given to the ``Alloc`` method by const reference but are moved +into the ``Recv`` method. They differ in that messages can be sent from the +``Recv`` method but, in ``Alloc``, the newly created actor is not yet +operational. + +The ``Send`` method for a constructor is similarly different from other +``Send`` methods. In the child actor, ours looks like this: + +.. code-block:: c++ + + IPCResult SendPMyManagedConstructor(PMyManagedChild* aActor); + +The method expects a ``PMyManagedChild`` that the caller will have constructed, +presumably using ``new`` (this is why it does not require an ``Alloc`` method). +Once ``Send...Constructor`` is called, the actor can be used to send and +receive messages. It does not matter that the remote actor may not have been +created yet due to asynchronicity. + +The destruction of actors is as unusual as their construction. Unlike +construction, it is the same for managed and top-level actors. Avoiding +``[ManualDealloc]`` actors removes a lot of the complexity but there is still +a process to understand. Actor destruction begins when a ``__delete__`` +message is sent. In ``PMyManager``, this message is declared from child to +parent. The actor calling ``Send__delete__`` is no longer connected to +anything when the method returns. Future calls to ``Send`` return an error +and no future messages will be received. This is also the case for an actor +that has run ``Recv__delete__``; it is no longer connected to the other +endpoint. + +.. note:: + Since ``Send__delete__`` may release the final reference to itself, it + cannot safely be a class instance method. Instead, unlike other ``Send`` + methods, it's a ``static`` class method and takes the actor as a parameter: + + .. code-block:: c++ + + static IPCResult Send__delete__(PMyManagerChild* aToDelete); + + Additionally, the ``__delete__`` message tells IPDL to disconnect both the + given actor *and all of its managed actors*.
So it is really deleting the + actor subtree, although ``Recv__delete__`` is only called for the actor it + was sent to. + +During the call to ``Send__delete__``, or after the call to ``Recv__delete__``, +the actor's ``ActorDestroy`` method is called. This method gives client code a +chance to do any teardown that must happen in `all` circumstances where it is +possible -- both expected and unexpected. This means that ``ActorDestroy`` +will also be called when, for example, IPDL detects that the other endpoint has +terminated unexpectedly, so it is releasing its reference to the actor, or +because an ancestral manager (manager or manager's manager...) received a +``__delete__``. The only way for an actor to avoid ``ActorDestroy`` is for its +process to crash first. ``ActorDestroy`` is always run after its actor is +disconnected so it is pointless to try to send messages from it. + +Why use ``ActorDestroy`` instead of the actor's destructor? ``ActorDestroy`` +gives a chance to clean up things that are only used for communication and +therefore don't need to live for however long the actor's (reference-counted) +object does. For example, you might have references to shared memory (Shmems) +that are no longer valid. Or perhaps the actor can now release a cache of data +that was only needed for processing messages. It is cleaner to deal with +communication-related objects in ``ActorDestroy``, where they become invalid, +than to leave them in limbo until the destructor is run. + +Consider actors to be like normal reference-counted objects, but where IPDL +holds a reference while the connection will or does exist. One common +architecture has IPDL holding the `only` reference to an actor. This is common +with actors created by sending constructor messages but the idea is available to +any actor. That only reference is then released when the ``__delete__`` +message is sent or received.
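
The lifetime rules described above can be modelled with a small standalone C++ sketch. It deliberately uses no Gecko or IPDL headers: ``MockActor``, ``MockIpc`` and ``RunLifetimeDemo`` are hypothetical names invented for illustration, and ``std::shared_ptr`` stands in for Gecko reference counting. It is only meant to show that ``ActorDestroy`` runs at disconnect time, where communication-only state can be dropped, while the destructor runs whenever the last reference goes away.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an actor implementation class (not a Gecko type).
class MockActor {
 public:
  explicit MockActor(std::vector<std::string>* aLog) : mLog(aLog) {}
  ~MockActor() { mLog->push_back("destructor"); }

  // Runs once the actor is disconnected: release communication-only state.
  void ActorDestroy() {
    mCache.clear();
    mCanSend = false;
    mLog->push_back("ActorDestroy");
  }

  bool CanSend() const { return mCanSend; }
  void CacheForMessages(std::string aData) { mCache.push_back(std::move(aData)); }
  std::size_t CacheSize() const { return mCache.size(); }

 private:
  std::vector<std::string>* mLog;   // Records the order of lifetime events.
  std::vector<std::string> mCache;  // Only needed while messages can flow.
  bool mCanSend = true;
};

// Hypothetical stand-in for IPDL: it holds a reference while connected.
class MockIpc {
 public:
  void Connect(std::shared_ptr<MockActor> aActor) { mActor = std::move(aActor); }

  // Models __delete__ (or a lost peer): notify the actor, then drop the reference.
  void Disconnect() {
    if (mActor) {
      mActor->ActorDestroy();
      mActor.reset();
    }
  }

 private:
  std::shared_ptr<MockActor> mActor;
};

// Returns the order in which the lifetime events occurred.
std::vector<std::string> RunLifetimeDemo() {
  std::vector<std::string> log;
  MockIpc ipc;
  {
    auto actor = std::make_shared<MockActor>(&log);
    actor->CacheForMessages("pending");
    ipc.Connect(actor);
  }  // Client reference dropped; the mock IPC layer now holds the only one.
  log.push_back("client reference dropped");
  ipc.Disconnect();
  return log;
}
```

In this model the actor outlives the client's reference because the mock IPC layer still holds one, and the destructor only runs after ``ActorDestroy`` has already cleared the message cache.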
+ +The dual of IPDL holding the only reference is to have client code hold the +only reference. A common pattern to achieve this has been to override the +actor's ``AddRef`` to have it send ``__delete__`` only when its count is down +to one reference (which must be IPDL if ``actor.CanSend()`` is true). A better +approach would be to create a reference-counted delegate for your actor that +can send ``__delete__`` from its destructor. IPDL does not guarantee that it +will not hold more than one reference to your actor. + +.. _Top Level Actors: + +Top Level Actors +---------------- + +Recall that top level actors are actors that have no manager. They are at the +root of every actor tree. There are two settings in which we use top-level +actors that differ pretty dramatically. The first type are top-level actors +that are created and maintained in a way that resembles managed actors, but +with some important differences we will cover in this section. The second type +of top-level actors are the very first actors in a new process -- these actors +are created through different means and closing them (usually) terminates the +process. The `new process example +<https://phabricator.services.mozilla.com/D119038>`_ demonstrates both of +these. It is discussed in detail in :ref:`Adding a New Type of Process`. + +Value of Top Level Actors +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Top-level actors are harder to create and destroy than normal actors. They +used to be more heavyweight than managed actors but this has recently been +dramatically reduced. + +.. note:: + Top-level actors previously required a dedicated *message channel*, which + is a limited OS resource. This is no longer the case -- message channels + are now shared by actors that connect the same two processes. This + *message interleaving* can affect message delivery latency but profiling + suggests that the change was basically inconsequential. + +So why use a new top level actor?
+ +* The most dramatic property distinguishing top-level actors is the ability to + *bind* to whatever ``EventTarget`` they choose. This means that any thread + that runs a ``MessageLoop`` can use the event target for that loop as the + place to send incoming messages. In other words, ``Recv`` methods would be + run by that message loop, on that thread. The IPDL apparatus will + asynchronously dispatch messages to these event targets, meaning that + multiple threads can be handling incoming messages at the same time. The + `PBackground`_ approach was born of a desire to make it easier to exploit + this, although it has some complications, detailed in that section, that + limit its value. +* Top level actors suggest modularity. Actor protocols are tough to debug, as + is just about anything that spans process boundaries. Modularity can give + other developers a clue as to what they need to know (and what they don't) + when reading an actor's code. The alternative is proverbial *dumpster + classes* that are as critical to operations (because they do so much) as they + are difficult to learn (because they do so much). +* Top level actors are required to connect two processes, regardless of whether + the actors are the first in the process or not. As said above, the first + actor is created through special means but other actors are created through + messages. In Gecko, apart from the launcher and main processes, all new + processes X are created with their first actor being between X and the main + process. To create a connection between X and, say, a content process, the + main process has to send connected ``Endpoints`` to X and to the content + process, which in turn use those endpoints to create new top level actors + that form an actor pair. This is discussed at length in :ref:`Connecting + With Other Processes`. + +Top-level actors are not as frictionless as desired but they are probably +under-utilized relative to their power. 
In cases where it is supported, +``PBackground`` is sometimes a simpler alternative to achieve the same goals. + +Creating Top Level Actors From Other Actors +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The most common way to create new top level actors is by creating a pair of +connected Endpoints and sending one to the other actor. This is done exactly +the way it sounds. For example: + +.. code-block:: c++ + + bool MyPreexistingActorParent::MakeMyActor() { + Endpoint<PMyActorParent> parentEnd; + Endpoint<PMyActorChild> childEnd; + if (NS_WARN_IF(NS_FAILED(PMyActor::CreateEndpoints( + base::GetCurrentProcId(), OtherPid(), &parentEnd, &childEnd)))) { + // ... handle failure ... + return false; + } + RefPtr<MyActorParent> parent = new MyActorParent; + if (!parentEnd.Bind(parent)) { + // ... handle failure ... + // No manual delete: the RefPtr releases parent when it goes out of scope. + return false; + } + // Do this second so we skip child if parent failed to connect properly. + if (!SendCreateMyActorChild(std::move(childEnd))) { + // ... assume an IPDL error will destroy parent. Handle failure beyond that ... + return false; + } + return true; + } + +Here ``MyPreexistingActorParent`` is used to send a child endpoint for the new +top level actor to ``MyPreexistingActorChild``, after it hooks up the parent +end. In this example, we bind our new actor to the same thread we are running +on -- which must be the same thread ``MyPreexistingActorParent`` is bound to +since we are sending ``CreateMyActorChild`` from it. We could have bound on a +different thread. + +At this point, messages can be sent on the parent. Eventually, it will start +receiving them as well. + +``MyPreexistingActorChild`` still has to receive the create message. The code +for that handler is pretty similar: + +.. code-block:: c++ + + IPCResult MyPreexistingActorChild::RecvCreateMyActorChild(Endpoint<PMyActorChild>&& childEnd) { + RefPtr<MyActorChild> child = new MyActorChild; + if (!childEnd.Bind(child)) { + // ...
handle failure and return ok, assuming a related IPDL error will alert the other side to failure ... + return IPC_OK(); + } + return IPC_OK(); + } + +Like the parent, the child is ready to send as soon as ``Bind`` is complete. +It will start receiving messages soon afterward on the event target for the +thread on which it is bound. + +Creating First Top Level Actors +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The first actor in a process is an advanced topic that is covered in +:ref:`the documentation for adding a new process<Adding a New Type of Process>`. + +PBackground +----------- + +Developed as a convenient alternative to top level actors, ``PBackground`` is +an IPDL protocol whose managees choose their worker threads in the child +process and share a thread dedicated solely to them in the parent process. +When an actor (parent or child) should run without hogging the main thread, +making that actor a managee of ``PBackground`` (aka a *background actor*) is an +option. + +.. warning:: + Background actors can be difficult to use correctly, as spelled out in this + section. It is recommended that other options -- namely, top-level actors + -- be adopted instead. + +Background actors can only be used in limited circumstances: + +* ``PBackground`` only supports the following process connections (where + ordering is parent <-> child): main <-> main, main <-> content, + main <-> socket and socket <-> content. + +.. important:: + + Socket process ``PBackground`` actor support was added after the other + options. It has some rough edges that aren't easy to anticipate. In the + future, their support may be broken out into a different actor or removed + altogether. You are strongly encouraged to use new `Top Level Actors`_ + instead of ``PBackground`` actor when communicating with socket process + worker threads. + +* Background actor creation is always initiated by the child. Of course, a + request to create one can be sent to the child by any other means. 
+* All parent background actors run in the same thread. This thread is + dedicated to serving as the worker for parent background actors. While it + has no other functions, it should remain responsive to all connected + background actors. For this reason, it is a bad idea to conduct long + operations in parent background actors. For such cases, create a top level + actor and an independent thread on the parent side instead. +* Background actors are currently *not* reference-counted. IPDL's ownership + has to be carefully respected and the (de-)allocators for the new actors have + to be defined. See `The Old Ways`_ for details. + +A hypothetical layout of ``PBackground`` threads, demonstrating some of the +process-type limitations, is shown in the diagram below. + +.. mermaid:: + :align: center + :caption: Hypothetical ``PBackground`` thread setup. Arrow direction + indicates child-to-parent ``PBackground``-managee relationships. + Parents always share a thread and may be connected to multiple + processes. Child threads can be any thread, including main. + + flowchart LR + subgraph content #1 + direction TB + c1tm[main] + c1t1[worker #1] + c1t2[worker #2] + c1t3[worker #3] + end + subgraph content #2 + direction TB + c2tm[main] + c2t1[worker #1] + c2t2[worker #2] + end + subgraph socket + direction TB + stm[main] + st1[background parent /\nworker #1] + st2[worker #2] + end + subgraph main + direction TB + mtm[main] + mt1[background parent] + end + + %% PBackground connections + c1tm --> mt1 + c1t1 --> mt1 + c1t2 --> mt1 + + c1t3 --> mt1 + c1t3 --> st1 + + c2t1 --> st1 + c2t1 --> mt1 + + c2t2 --> mt1 + + c2tm --> st1 + + stm --> mt1 + st1 --> mt1 + st2 --> mt1 + +Creating background actors is done a bit differently than normal managees. 
The +new managed type and constructor are still added to ``PBackground.ipdl`` as +with normal managees but, instead of ``new``-ing the child actor and then +passing it in a ``SendFooConstructor`` call, background actors issue the send +call to the ``BackgroundChild`` manager, which returns the new child: + +.. code-block:: c++ + + // Bind our new PMyBackgroundActorChild to the current thread. + PBackgroundChild* bc = BackgroundChild::GetOrCreateForCurrentThread(); + if (!bc) { + return false; + } + PMyBackgroundActorChild* pmyBac = bc->SendMyBackgroundActor(constructorParameters); + if (!pmyBac) { + return false; + } + auto myBac = static_cast<MyBackgroundActorChild*>(pmyBac); + +.. note:: + ``PBackgroundParent`` still needs a ``RecvMyBackgroundActorConstructor`` + handler, as usual. This must be done in the ``ParentImpl`` class. + ``ParentImpl`` is the non-standard name used for the implementation of + ``PBackgroundParent``. + +To summarize, ``PBackground`` attempts to simplify a common desire in Gecko: +to run tasks that communicate between the main and content processes but avoid +having much to do with the main thread of either. Unfortunately, it can be +complicated to use correctly and has missed out on some favorable IPDL +improvements, like reference counting. While top level actors are always a +complete option for independent jobs that need a lot of resources, +``PBackground`` offers a compromise for some cases. + +IPDL Best Practices +------------------- + +IPC performance is affected by a lot of factors. Many of them are out of our +control, like the influence of the system thread scheduler on latency or +messages whose travel internally requires multiple legs for security reasons. +On the other hand, some things we can and should control for: + +* Messages incur inherent performance overhead for a number of reasons: IPDL + internal thread latency (e.g. the I/O thread), parameter (de-)serialization, + etc. While not usually dramatic, this cost can add up.
What's more, each + message generates a fair amount of C++ code. For these reasons, it is wise + to reduce the number of messages being sent as far as is reasonable. This + can be as simple as consolidating two asynchronous messages that are always + in succession. Or it can be more complex, like consolidating two + somewhat-overlapping messages by merging their parameter lists and marking + parameters that may not be needed as optional. It is easy to go too far down + this path but careful message optimization can show big gains. +* Even ``[moveonly]`` parameters are "copied" in the sense that they are + serialized. The pipes that transmit the data are limited in size and require + allocation. So understand that the performance of your transmission will be + inversely proportional to the size of your content. Filter out data you + won't need. For complex reasons related to Linux pipe write atomicity, it is + highly desirable to keep message sizes below 4K (including a small amount for + message metadata). +* On the flip side, very large messages are not permitted by IPDL and will + result in a runtime error. The limit is currently 256M but message failures + frequently arise even with slightly smaller messages. +* Parameters to messages are C++ types and therefore can be very complex in the + sense that they generally represent a tree (or graph) of objects. If this + tree has a lot of objects in it, and each of them is serialized by + ``ParamTraits``, then we will find that serialization is allocating and + constructing a lot of objects, which will stress the allocator and cause + memory fragmentation. Avoid this by using larger objects or by sharing this + kind of data through careful use of shared memory. +* As it is with everything, concurrency is critical to the performance of IPDL. + For actors, this mostly manifests in the choice of bound thread. 
While + adding a managed actor to an existing actor tree may be a quick + implementation, this new actor will be bound to the same thread as the old + one. This contention may be undesirable. Other times it may be necessary + since message handlers may need to use data that isn't thread safe or may + need a guarantee that the two actors' messages are received in order. Plan + up front for your actor hierarchy and its thread model. Recognize when you + are better off with a new top level actor or ``PBackground`` managee that + facilitates processing messages simultaneously. +* Remember that latency will slow your entire thread, including any other + actors/messages on that thread. If you have messages that will need a long + time to be processed but can run concurrently then they should use actors + that run on a separate thread. +* Top-level actors decide a lot of properties for their managees. Probably the + most important are the process layout of the actor (including which process + is "Parent" and which is "Child") and the thread. Every top-level actor + should clearly document this, ideally in its .ipdl file. + +The Old Ways +------------ + +TODO: + +The FUD +------- + +TODO: + +The Rest +-------- + +Nested messages +~~~~~~~~~~~~~~~ + +The ``Nested`` message annotations indicate the nesting type of the message. +They attempt to process messages in the nested order of the "conversation +thread", as found in e.g. a mailing-list client. This is an advanced concept +that should be considered discouraged, legacy functionality. +Essentially, ``Nested`` messages can make other ``sync`` messages break the +policy of blocking their thread -- nested messages are allowed to be received +while a sync message is waiting for a response.
The rules for when a nested +message can be handled are somewhat complex but they try to safely allow a +``sync`` message ``M`` to handle and respond to some special (nested) messages +that may be needed for the other endpoint to finish processing ``M``. There is +a `comment in MessageChannel`_ with info on how the decision to handle nested +messages is made. For sync nested messages, note that this implies a relay +between the endpoints, which could dramatically affect their throughput. + +Declaring messages to nest requires an annotation on the actor and one on the +message itself. The nesting annotations were listed in `Defining Actors`_ and +`Declaring IPDL Messages`_. We repeat them here. The actor annotations +specify the maximum priority level of messages in the actor. It is validated +by the IPDL compiler. The annotations are: + +============================== ================================================ +``[NestedUpTo=inside_sync]`` Indicates that an actor contains messages of + priority [Nested=inside_sync] or lower. +``[NestedUpTo=inside_cpow]`` Indicates that an actor contains messages of + priority [Nested=inside_cpow] or lower. +============================== ================================================ + +.. note:: + + The order of the nesting priorities is: + (no nesting priority) < ``inside_sync`` < ``inside_cpow``. + +The message annotations are: + +========================== ==================================================== +``[Nested=inside_sync]`` Indicates that the message can be handled while + waiting for lower-priority, or in-message-thread, + sync responses. +``[Nested=inside_cpow]`` Indicates that the message can be handled while + waiting for lower-priority, or in-message-thread, + sync responses. Cannot be sent by the parent actor. +========================== ==================================================== + +.. 
note:: + + ``[Nested=inside_sync]`` messages must be sync (this is enforced by the + IPDL compiler) but ``[Nested=inside_cpow]`` may be async. + +Nested messages are obviously only interesting when sent to an actor that is +performing a synchronous wait. Therefore, we will assume we are in such a +state. Say ``actorX`` is waiting for a sync reply from ``actorY`` for message +``m1`` when ``actorY`` sends ``actorX`` a message ``m2``. We distinguish two +cases here: (1) when ``m2`` is sent while processing ``m1`` (so ``m2`` is sent +by the ``RecvM1()`` method -- this is what we mean when we say "nested") and +(2) when ``m2`` is unrelated to ``m1``. Case (2) is easy; ``m2`` is only +dispatched while ``m1`` waits if +``priority(m2) > priority(m1) > (no priority)`` and the message is being +received by the parent, or if ``priority(m2) >= priority(m1) > (no priority)`` +and the message is being received by the child. Case (1) is less +straightforward. + +To analyze case (1), we again distinguish the two possible ways we can end up +in the nested case: (A) ``m1`` is sent by the parent to the child and ``m2`` +is sent by the child to the parent, or (B) where the directions are reversed. +The following tables explain what happens in all cases: + +.. |strike| raw:: html + + <strike> + +.. |endstrike| raw:: html + + </strike> + +.. |br| raw:: html + + <br/> + +.. 
table :: Case (A): Child sends message to a parent that is awaiting a sync response + :align: center + + ============================== ======================== ======================================================== + sync ``m1`` type (from parent) ``m2`` type (from child) ``m2`` handled or rejected + ============================== ======================== ======================================================== + sync (no priority) \* IPDL compiler error: parent cannot send sync (no priority) + sync inside_sync async (no priority) |strike| ``m2`` delayed until after ``m1`` completes |endstrike| |br| + Currently ``m2`` is handled during the sync wait (bug?) + sync inside_sync sync (no priority) |strike| ``m2`` send fails: lower priority than ``m1`` |endstrike| |br| + Currently ``m2`` is handled during the sync wait (bug?) + sync inside_sync sync inside_sync ``m2`` handled during ``m1`` sync wait: same message thread and same priority + sync inside_sync async inside_cpow ``m2`` handled during ``m1`` sync wait: higher priority + sync inside_sync sync inside_cpow ``m2`` handled during ``m1`` sync wait: higher priority + sync inside_cpow \* IPDL compiler error: parent cannot use inside_cpow priority + ============================== ======================== ======================================================== + +.. 
table :: Case (B): Parent sends message to a child that is awaiting a sync response + :align: center + + ============================= ========================= ======================================================== + sync ``m1`` type (from child) ``m2`` type (from parent) ``m2`` handled or rejected + ============================= ========================= ======================================================== + \* async (no priority) ``m2`` delayed until after ``m1`` completes + \* sync (no priority) IPDL compiler error: parent cannot send sync (no priority) + sync (no priority) sync inside_sync ``m2`` send fails: no-priority sync messages cannot handle + incoming messages during wait + sync inside_sync sync inside_sync ``m2`` handled during ``m1`` sync wait: same message thread and same priority + sync inside_cpow sync inside_sync ``m2`` send fails: lower priority than ``m1`` + \* async inside_cpow IPDL compiler error: parent cannot use inside_cpow priority + \* sync inside_cpow IPDL compiler error: parent cannot use inside_cpow priority + ============================= ========================= ======================================================== + +We haven't seen rule #2 from the `comment in MessageChannel`_ in action but, as +the comment mentions, it is needed to break deadlocks in cases where both the +parent and child are initiating message-threads simultaneously. It +accomplishes this by favoring the parent's sent messages over the child's when +deciding which message-thread to pursue first (and blocks the other until the +first completes). Since this distinction is entirely thread-timing based, +client code needs only to be aware that IPDL internals will not deadlock +because of this type of race, and that this protection is limited to a single +actor tree -- the parent/child messages are only well-ordered when under the +same top-level actor so simultaneous sync messages across trees are still +capable of deadlock. 
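As a rough illustration only, the case (2) rule stated earlier in this section (an unrelated ``m2`` arriving while ``m1`` is in a sync wait) can be written as a small predicate. The names ``Nesting``, ``Side`` and ``DispatchedDuringWait`` are hypothetical and exist only for this sketch; they are not part of ``MessageChannel``, and the sketch does not cover the case (1) tables above.

```cpp
// Nesting priorities in the order given in the text:
// (no nesting priority) < inside_sync < inside_cpow.
enum class Nesting { None = 0, InsideSync = 1, InsideCpow = 2 };

// The side that is blocked in the sync wait for m1 and receives m2.
enum class Side { Parent, Child };

// Hypothetical helper: returns true if an unrelated message m2 would be
// dispatched while the receiver is still waiting on m1, per case (2).
constexpr bool DispatchedDuringWait(Side receiver, Nesting m1, Nesting m2) {
  if (m1 == Nesting::None) {
    // A sync message with no nesting priority never handles incoming
    // messages during its wait.
    return false;
  }
  // The parent requires strictly higher priority; the child also accepts
  // equal priority.
  return receiver == Side::Parent ? m2 > m1 : m2 >= m1;
}
```

The asymmetry between the two branches mirrors the text: with equal priorities, only a waiting child dispatches ``m2``.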
+ +Clearly, tight control over these types of protocols is required to predict how +they will coordinate within themselves and with the rest of the application +objects. Control flow, and hence state, can be very difficult to predict and +are just as hard to maintain. This is one of the key reasons why we have +stressed that message priorities should be avoided whenever possible. + +.. _comment in MessageChannel: https://searchfox.org/mozilla-central/rev/077501b34cca91763ae04f4633a42fddd919fdbd/ipc/glue/MessageChannel.cpp#54-118 + +.. _Message Logging: + +Message Logging +~~~~~~~~~~~~~~~ + +The environment variable ``MOZ_IPC_MESSAGE_LOG`` controls the logging of IPC +messages. It logs details about the transmission and reception of messages. +This isn't controlled by ``MOZ_LOG`` -- it is a separate system. Set this +variable to ``1`` to log information on all IPDL messages, or specify a +comma-separated list of **top-level** protocols to log (e.g. +``MOZ_IPC_MESSAGE_LOG="PMyManagerChild,PMyManagedParent,PMyManagedChild"``). +:ref:`Debugging with IPDL Logging` has an example where IPDL logging is useful +in tracking down a bug. + +.. important:: + The preceding ``P`` and the ``Parent`` or ``Child`` suffix are required + when listing individual protocols in ``MOZ_IPC_MESSAGE_LOG``. diff --git a/ipc/docs/processes.rst b/ipc/docs/processes.rst new file mode 100644 index 0000000000..e29de211ac --- /dev/null +++ b/ipc/docs/processes.rst @@ -0,0 +1,1252 @@ +Gecko Processes +=============== + +Before Creating a New Process +----------------------------- + +Firefox started out as a one process application. Then, one became two as +NPAPI plugins like Flash were pushed into their own process (plugin processes) +for security and stability reasons. Then, it split again so that the browser +could also disentangle itself from web content (content processes). Then, +implementations on some platforms developed processes for graphics ("GPU" +processes). And for media codecs. And VR. 
And file URLs. And sockets. And +even more content processes. And so on... + +Here is an incomplete list of *good* reasons we've created new processes: + +* Separating HTML and JS from the browser makes it possible to secure the + browser and the rest of the system from them, even when those APIs are + compromised. +* Browser stability was also improved by separating HTML and JS from the + browser, since catastrophic failures related to a tab could be limited to the + tab instead of crashing the browser. +* Site isolation requires additional processes to separate HTML and JS for + different sites. The separation of memory spaces undermines many types of + exploits. +* Sandboxing processes offers great security guarantees but requires making + tradeoffs between power and protection. More processes means more options. + For example, we heavily sandbox content processes to protect from external + code, while the File process, which is a content process that can access + ``file://`` URLs, has a sandbox that is similar but allows access to local + files. +* One of the benefits of the GPU process was that it improved browser + stability by separating a system component that had frequent stability + issues -- GPU drivers. The same logic inspired the NPAPI (Flash) plugin + process. + +Informed by this history, there is some non-obvious preparation that you +should do before starting down this path. This falls under the category of +"First, do no harm": + +* **Consult the Platform and IPC teams** (#ipc) to develop the plan for the + way your process will integrate with the systems in which it will exist, as + well as how it will be handled on any platforms where it will *not* exist. + For example, an application's process hierarchy forms a tree where one process + spawns another. Currently, all processes in Firefox are spawned by the main + process (excepting the `launcher process`_).
There is good reason for this, + mostly based on our sandboxing restrictions that forbid non-main processes + from launching new processes themselves. But it means that the main process + will need to know to create your process. If you make the decision to do + this from, say, a content process, you will need a safe, performant and + stable way to request this of the main process. You will also need a way to + efficiently communicate directly with your new process. And you will need to + consider limitations of some platforms (think Android) where you may not want + to or not be able to spawn the new process. +* **Consult the sandboxing team** (#hardening) to discuss what the sandbox for + your new process will look like. Anything that compromises security is a + non-starter. You may, for instance, want to create a new process to escape + the confines of the sandbox in a content process. This can be legitimate, + for example you may need access to some device API that is unavailable to a + content process, but the security for your new process will then have to come + from a different source. "I won't run Javascript" is not sufficient. Keep + in mind that your process will have to have some mechanism for communication + with other processes to be useful, so it is always a potential target. + +.. note:: + Firefox has, to date, undergone exactly one occurrence of the *removal* of + a process type. In 2020, the NPAPI plugin process was removed when the + last supported plugin, Adobe's FlashPlayer, reached its end-of-life. + +.. _launcher process: https://wiki.mozilla.org/Platform/Integration/InjectEject/Launcher_Process/ + +Firefox Process Hierarchy +------------------------- + +This diagram shows the primary process types in Firefox. + +.. 
mermaid:: + + graph TD + RDD -->|PRemoteDecoderManager| Content + RDD(Data Decoder) ==>|PRDD| Main + + Launcher --> Main + + Main ==>|PContent| Content + Main ==>|PSocketProcess| Socket(Network Socket) + Main ==>|PGMP| GMP(Gecko Media Plugins) + VR ==>|PVR| Main + GPU ==>|PGPU| Main + + Socket -->|PSocketProcessBridge| Content + + GPU -->|PCompositorManager| Main + GPU -->|PCompositorManager| Content + + Content -->|PGMPContent| GMP + + VR -->|PVRGPU| GPU + +.. warning:: + The main process is sometimes called the UI process, the chrome process, + the browser process or the parent process. This is true for documentation, + conversation and, most significantly, **code**. Due to the syntactic + overlap with IPDL actors, that last name can get pretty confusing. Less + commonly, the content process is called the renderer process, which is its + name in Chromium code. Since the content process sandbox won't allow it, + Firefox never does (hardware) rendering in the content/renderer process! + +The arrows point from the parent side to the child. Bolded arrows indicate the +first top-level actors for the various process types. The other arrows show +important actors that are usually the first connections established between the +two processes. These relationships are difficult to discern from code. Processes +should clearly document their top-level connections in their IPDL files. + +Some process types only exist on some platforms and some processes may only be +created on demand. For example, Mac builds do not use a GPU process but +instead fold the same actor connections into their main process (except ``PGPU``, +which they do not use). These exceptions are also very hard to learn from code +and should be clearly documented. + +``about:processes`` shows statistics for the processes in a currently running +browser. It is also useful to see the distribution of web pages across content +processes. + +..
_Adding a New Type of Process: + +Adding a New Type of Process +---------------------------- + +Adding a new process type doesn't require any especially difficult steps but it +does require a lot of steps that are not obvious. This section will focus on +the steps as it builds an example. It will be light on the details of the +classes and protocols involved. Some implementations may need to seek out a +deeper understanding of the components set up here but most should instead +strive for simplicity. + +In the spirit of creating a *responsible* process, the sample will connect +several components that any deployed Gecko process is likely to need. These +include configuring a sandbox, `registration with the CrashReporter service`_ +and ("minimal") XPCOM initialization. Consult documentation for these +components for more information on their integration. + +This example will be loosely based on the old (now defunct) IPDL **Extending a +Protocol** example for adding a new actor. We will add a command to the +browser's ``navigator`` JS object, ``navigator.getAssistance()``. When the +user enters the new command in, say, the browser's console window, it will +create a new process of our new **Demo** process type and ask that process for +"assistance" in the form of a string that it will then print to the console. +Once that is done, the new process will be cleanly destroyed. + +Code for the complete demo can be found `here +<https://phabricator.services.mozilla.com/D119038>`_. + +.. _registration with the CrashReporter service: `Crash Reporter`_ + +Common Architecture +~~~~~~~~~~~~~~~~~~~ + +Every type of process (besides the launcher and main processes) needs two +classes and an actor pair to launch. This sample will be adding a process type +we call **Demo**. + +* An actor pair where the parent actor is a top-level actor in the main process + and the child is the (first) top-level actor in the new process. 
It is common + for this actor to simply take the name of the process type. The sample uses + ``PDemo``, so it creates ``DemoParent`` and ``DemoChild`` actor subclasses + as usual (see :ref:`IPDL: Inter-Thread and Inter-Process Message Passing`). +* A subclass of `GeckoChildProcessHost + <https://searchfox.org/mozilla-central/source/ipc/glue/GeckoChildProcessHost.h>`_ + that exists in the main process (where new processes are created) and handles + most of the machinery needed for new process creation. It is common for these + names to be the process type plus ``ProcessParent`` or ``ProcessHost``. The + sample uses ``DemoParent::Host``, a private class, which keeps + ``GeckoChildProcessHost`` out of the **Demo** process' *public interface* + since it is large, complicated and mostly unimportant externally. This + complexity is also why it is a bad idea to add extra responsibilities to the + ``Host`` object that inherits it. +* A subclass of `ProcessChild + <https://searchfox.org/mozilla-central/source/ipc/glue/ProcessChild.h>`_ that + exists in the new process. These names are usually generated by affixing + ``ProcessChild`` or ``ProcessImpl`` to the type. The sample will use + ``DemoChild::Process``, another private class, for the same reasons it did + with the ``Host``. + +A fifth class is optional but integration with common services requires +something like it: + +* A singleton class that "manages" the collective of processes (usually the + Host objects) of the new type in the main process. In many instances, there + is at most one instance of a process type, so this becomes a singleton that + manages a singleton... that manages a singleton. Object ownership is often + hard to establish between manager objects and the hosts they manage. It is + wise to limit the power of these classes. This class will often get its name + by appending ``ProcessManager`` to the process type. The sample provides a + very simple manager in ``DemoParent::Manager``. 
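The class layout described in the list above can be pictured with the following rough sketch. It is not the code from the linked demo: the ``*Stub`` base classes are hypothetical stand-ins for ``GeckoChildProcessHost``, ``ProcessChild`` and the IPDL-generated ``PDemoParent``/``PDemoChild`` so that the example compiles on its own, and the ``Manager`` methods (``Get``, ``EnsureHost``, ``HasHost``, ``Shutdown``) are assumed names chosen for illustration.

```cpp
#include <memory>

// Hypothetical stand-ins for the real Gecko base classes.
struct GeckoChildProcessHostStub { virtual ~GeckoChildProcessHostStub() = default; };
struct ProcessChildStub { virtual ~ProcessChildStub() = default; };
struct PDemoParentStub { virtual ~PDemoParentStub() = default; };
struct PDemoChildStub { virtual ~PDemoChildStub() = default; };

// Main-process side: the actor, its private Host and its Manager.
class DemoParent final : public PDemoParentStub {
 public:
  class Manager;  // the only piece other main-process code talks to

 private:
  class Host;  // private, so the host machinery stays out of the interface
};

// Plays the role of the GeckoChildProcessHost subclass.
class DemoParent::Host final : public GeckoChildProcessHostStub {};

// Deliberately small singleton that owns at most one Host.
class DemoParent::Manager final {
 public:
  static Manager& Get() {
    static Manager sInstance;
    return sInstance;
  }
  void EnsureHost() {
    if (!mHost) {
      mHost = std::make_unique<Host>();
    }
  }
  bool HasHost() const { return mHost != nullptr; }
  void Shutdown() { mHost.reset(); }

 private:
  Manager() = default;
  std::unique_ptr<Host> mHost;
};

// New-process side: the actor and its private Process class.
class DemoChild final : public PDemoChildStub {
 private:
  class Process;
};

// Plays the role of the ProcessChild subclass.
class DemoChild::Process final : public ProcessChildStub {};
```

Nesting ``Host`` and ``Process`` as private classes follows the reasoning given above: neither needs to appear in the public interface of the **Demo** process, and the manager stays limited to owning and releasing the host.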
+ +Finally, it is highly probable and usually desirable for the new process to +include another new top-level actor that represents the top-level operations +and communications of the new process. This actor will use the new process as +a child but may have any other process as the parent, unlike ``PDemo`` whose +parent is always the main process. This new actor will be created by the main +process, which creates a pair of ``Endpoint`` objects specifically for the +desired process pairing, and then sends those ``Endpoint`` objects to their +respective processes. The **Demo** example is interesting because the user can +issue the command from a content process or the main one, by opening the +console in a normal or a privileged page (e.g. ``about:sessionrestore``), +respectively. Supporting both of these cases will involve very little +additional effort. The sample will show this as part of implementing the +second top-level actor pair ``PDemoHelpline`` in `Connecting With Other +Processes`_, where the parent can be in either the main or a content process. + +The remaining sections will explain how to compose these classes and +integrate them with Gecko. + +Process Bookkeeping +~~~~~~~~~~~~~~~~~~~ + +To begin with, look at the `geckoprocesstypes generator +<https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/geckoprocesstypes_generator/geckoprocesstypes/__init__.py>`_ +which adds the bones for a new process (by defining enum values and so on). +Some further manual intervention is still required, and you need to work +through the following checklists depending on your needs.
+ +Basic requirements +^^^^^^^^^^^^^^^^^^ + +* Add a new entry to the `enum WebIDLProcType + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/dom/chrome-webidl/ChromeUtils.webidl#610-638>`_ +* Update the `static_assert + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/xre/nsAppRunner.cpp#988-990>`_ + call checking for boundary against ``GeckoProcessType_End`` +* Add your process to the correct ``MessageLoop::TYPE_x`` in the first + ``switch(XRE_GetProcessType())`` in `XRE_InitChildProcess + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/xre/nsEmbedFunctions.cpp#572-590>`_. + You can get more information about that topic in `this comment + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/ipc/chromium/src/base/message_loop.h#159-187>`_ +* Instantiate your child within the second ``switch (XRE_GetProcessType())`` in + `XRE_InitChildProcess + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/xre/nsEmbedFunctions.cpp#615-671>`_ +* Add a new entry ``PROCESS_TYPE_x`` in `nsIXULRuntime interface + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/system/nsIXULRuntime.idl#183-196>`_ + +Graphics +######## + +If you need graphics-related interaction, hack into `gfxPlatform +<https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/gfx/thebes/gfxPlatform.cpp>`_ + +- Add a call to your process manager init in ``gfxPlatform::Init()`` in + `gfxPlatform + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/gfx/thebes/gfxPlatform.cpp#808-810>`_ +- Add a call to your process manager shutdown in ``gfxPlatform::Shutdown()`` in + `gfxPlatform + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/gfx/thebes/gfxPlatform.cpp#1255-1259>`_ + +Android +####### + +You 
might want to talk with `#geckoview` maintainers to check whether this is +required or applicable to your new process type. + +- Add a new ``<service>`` entry against + ``org.mozilla.gecko.process.GeckoChildProcessServices$XXX`` in the + `AndroidManifest + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/mobile/android/geckoview/src/main/AndroidManifest.xml#45-81>`_ +- Add matching class inheritance from `GeckoChildProcessServices + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/mobile/android/geckoview/src/main/java/org/mozilla/gecko/process/GeckoChildProcessServices.jinja#10-13>`_ +- Add new entry in `public enum GeckoProcessType + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/mobile/android/geckoview/src/main/java/org/mozilla/gecko/process/GeckoProcessType.java#11-23>`_ + +Crash reporting +############### + +- Add ``InitCrashReporter`` message to the parent-side `InitCrashReporter + <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/PUtilityProcess.ipdl#30>`_ +- Ensure your parent class inherits `public ipc::CrashReporterHelper<GeckoProcessType_Xxx> + <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessParent.h#23>`_ +- Add new ``Xxx*Status`` `annotations + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/crashreporter/CrashAnnotations.yaml#968-971>`_ + entry for your new process type description. The link here points to + `UtilityProcessStatus` so you can see the similar description you have to + write, but you should respect the ordering in that file and put your new + entry at the appropriate place.
+- Add entry in `PROCESS_CRASH_SUBMIT_ATTEMPT + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/components/telemetry/Histograms.json#13403-13422>`_ + +Memory reporting +################ + +Throughout the linked code, please consider those methods more as boilerplate code that will require some trivial modification to fit your exact usecase. + +- Add definition of memory reporter to your new :ref:`top-level actor <Top Level Actors>` + + + Type inclusion `MemoryReportTypes <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/PUtilityProcess.ipdl#6>`_ + + To parent-side `AddMemoryReport <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/PUtilityProcess.ipdl#32>`_ + + To child-side `RequestMemoryReport <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/PUtilityProcess.ipdl#44-48>`_ + +- Add handling for your new process within `nsMemoryReporterManager::GetReportsExtended <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/xpcom/base/nsMemoryReporterManager.cpp#1813-1819>`_ +- Provide a process manager level abstraction + + + Implement a new class deriving ``MemoryReportingProcess`` such as `UtilityMemoryReporter <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessManager.cpp#253-292>`_ + + Write a `GetProcessMemoryReport <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessManager.cpp#294-300>`_ + +- On the child side, provide an implementation for `RequestMemoryReport <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessChild.cpp#153-166>`_ +- On the parent side + + + Provide an implementation for `RequestMemoryReport 
<https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessParent.cpp#41-69>`_ + + Provide an implementation for `AddMemoryReport <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessParent.cpp#71-77>`_ + +If you want to add a test that ensures proper behavior, you can have a look at the `utility process memory report test <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/test/browser/browser_utility_memoryReport.js>`_ + +Process reporting +################# + +Those elements will be used for exposing processes to users in some `about:` +pages. You might want to ping `#fluent-reviewers` to check whether you need your +process there. + +- Add a `user-facing localizable name + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/locales/en-US/toolkit/global/processTypes.ftl#39-57>`_ + for your process, if needed +- Update the hashmap from process type to the user-facing string above in `const ProcessType + <https://searchfox.org/mozilla-central/rev/c5c002f81f08a73e04868e0c2bf0eb113f200b03/toolkit/modules/ProcessType.sys.mjs#10-16>`_ +- For `about:processes` you will probably want to follow these steps: + + + Add handling for your new process type producing a unique `fluentName <https://searchfox.org/mozilla-central/rev/be4604e4be8c71b3c1dbff2398a5b05f15411673/toolkit/components/aboutprocesses/content/aboutProcesses.js#472-539>`_; constructing a dynamic name is highly discouraged + + Add matching localization strings within `fluent localization file <https://searchfox.org/mozilla-central/rev/be4604e4be8c71b3c1dbff2398a5b05f15411673/toolkit/locales/en-US/toolkit/about/aboutProcesses.ftl#35-55>`_ + +Profiler +######## + +- Add definition of ``PProfiler`` to your new IPDL + + + Type inclusion `protocol PProfiler 
<https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/PUtilityProcess.ipdl#9>`_ + + Child-side `InitProfiler <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/PUtilityProcess.ipdl#42>`_ + +- Make sure your initialization path contains a `SendInitProfiler <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessHost.cpp#222-223>`_. You will want to perform the call once ``OnChannelConnected`` is issued, thus ensuring your new process is connected to IPC. +- Provide an implementation for `InitProfiler <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessChild.cpp#147-151>`_ + +- You will probably want to make sure your child process code registers a proper name with the profiler, otherwise it will default to ``GeckoMain``; this can be done by issuing ``profiler_set_process_name(nsCString("XxX"))`` on the child init side. + +Static Components +################# + +The number of changes required here is significant; `Bug 1740485: Improve +StaticComponents code generation +<https://bugzilla.mozilla.org/show_bug.cgi?id=1740485>`_ tracks improving that. + +- Update allowance in those configuration files to match new process selector + that includes your new process. When exploring those component definitions, + keep in mind that you are looking at updating the `processes` field in the + `Classes` object. The `ProcessSelector` value will come from what the reader + writes based on the instructions below. Some of these also contain several + services, so you might have to ensure you have all your bases covered. Also, some of + the components might not need to be updated. 
+ + + `libpref <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/modules/libpref/components.conf>`_ + + `telemetry <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/components/telemetry/core/components.conf>`_ + + `android <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/widget/android/components.conf>`_ + + `gtk <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/widget/gtk/components.conf>`_ + + `windows <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/widget/windows/components.conf>`_ + + `base <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/base/components.conf>`_ + + `components <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/components/components.conf>`_ + + `ds <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/ds/components.conf>`_ + + `threads <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/threads/components.conf>`_ + + `cocoa kWidgetModule <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/widget/cocoa/nsWidgetFactory.mm#194-202>`_ + + `build <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/build/components.conf>`_ + + `XPCOMinit kXPCOMModule <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/build/XPCOMInit.cpp#172-180>`_ + +- Within `static components generator + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/components/gen_static_components.py>`_ + + + Add new definition in ``ProcessSelector`` for your new process + ``ALLOW_IN_x_PROCESS = 0x..`` + + Add new process selector masks including your new process definition + + Also add those into the ``PROCESSES`` 
structure + +- Within `module definition <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/components/Module.h>`_ + + + Add new definition in ``enum ProcessSelector`` + + Add new process selector mask including the new definition + + Update ``kMaxProcessSelector`` + +- Within `nsComponentManager <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/xpcom/components/nsComponentManager.cpp>`_ + + + Add new selector match in ``ProcessSelectorMatches`` for your new process + (needed?) + + Add new process selector for ``gProcessMatchTable`` in + ``nsComponentManagerImpl::Init()`` + +Glean telemetry +############### + +- Ensure your new IPDL includes on the child side + + + `FlushFOGData + <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/PUtilityProcess.ipdl#55>`_ + + `TestTriggerMetrics + <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/PUtilityProcess.ipdl#60>`_ + +- Provide a parent-side implementation for `FOGData + <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessParent.cpp#79-82>`_ +- Provide a child-side implementation for `FlushFOGData + <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessChild.cpp#179-183>`_ +- Child-side should flush its FOG data at IPC `ActorDestroy + <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessChild.cpp#199-201>`_ +- Child-side `test metrics + <https://searchfox.org/mozilla-central/rev/fc4d4a8d01b0e50d20c238acbb1739ccab317ebc/ipc/glue/UtilityProcessChild.cpp#185-191>`_ +- Within `FOGIPC + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/components/glean/ipc/FOGIPC.cpp>`_ + + + Add handling of your new process type within ``FlushAllChildData()`` `here + 
<https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/components/glean/ipc/FOGIPC.cpp#106-121>`_ + and ``SendFOGData()`` `here + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/components/glean/ipc/FOGIPC.cpp#165-182>`_ + + Add support for sending test metrics in ``TestTriggerMetrics()`` `here + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/components/glean/ipc/FOGIPC.cpp#208-232>`_ + +- Handle process shutdown in ``register_process_shutdown()`` of `glean + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/toolkit/components/glean/api/src/ipc.rs>`_ + +Third-Party Modules +################### + +- Ensure your new IPDL includes on the child side + + + `GetUntrustedModulesData + <https://searchfox.org/mozilla-central/rev/2ce39261ea6a69e49d87f76a119494b2a7a7e42a/ipc/glue/PUtilityProcess.ipdl#106>`_ + + `UnblockUntrustedModulesThread + <https://searchfox.org/mozilla-central/rev/2ce39261ea6a69e49d87f76a119494b2a7a7e42a/ipc/glue/PUtilityProcess.ipdl#113>`_ + +- Provide a parent side implementation for both + +- Add handling of your new process type in ``MultiGetUntrustedModulesData::GetUntrustedModuleLoadEvents()`` `here <https://searchfox.org/mozilla-central/rev/2ce39261ea6a69e49d87f76a119494b2a7a7e42a/toolkit/components/telemetry/other/UntrustedModules.cpp#145-151>`_ + +- `Update your IPDL <https://searchfox.org/mozilla-central/rev/2ce39261ea6a69e49d87f76a119494b2a7a7e42a/ipc/glue/PUtilityProcess.ipdl#75>`_ and make sure your ``Init()`` can receive a boolean for + ``isReadyForBackgroundProcessing`` `like here <https://searchfox.org/mozilla-central/rev/2ce39261ea6a69e49d87f76a119494b2a7a7e42a/ipc/glue/UtilityProcessChild.cpp#157-160>`_, then within the child's ``RecvInit()`` + make sure a call to ``DllServices``'s ``StartUntrustedModulesProcessor()`` `is + performed 
<https://searchfox.org/mozilla-central/rev/2ce39261ea6a69e49d87f76a119494b2a7a7e42a/ipc/glue/UtilityProcessChild.cpp#185-186>`_. + +- Ensure your new IPDL includes for the parent side + + + `GetModulesTrust <https://searchfox.org/mozilla-central/rev/2ce39261ea6a69e49d87f76a119494b2a7a7e42a/ipc/glue/PUtilityProcess.ipdl#60-61>`_ + +- Provide an implementation on the `parent side <https://searchfox.org/mozilla-central/rev/2ce39261ea6a69e49d87f76a119494b2a7a7e42a/ipc/glue/UtilityProcessParent.cpp#69-81>`_ + +- Expose your new process type as supported in ``UntrustedModulesProcessor::IsSupportedProcessType()`` `like others <https://searchfox.org/mozilla-central/rev/2ce39261ea6a69e49d87f76a119494b2a7a7e42a/toolkit/xre/dllservices/UntrustedModulesProcessor.cpp#76-91>`_ + +- Update ``UntrustedModulesProcessor::SendGetModulesTrust()`` to call `your new child process <https://searchfox.org/mozilla-central/rev/2ce39261ea6a69e49d87f76a119494b2a7a7e42a/toolkit/xre/dllservices/UntrustedModulesProcessor.cpp#757-761>`_ + +Sandboxing +########## + +Sandboxing changes related to a new process can be non-trivial, so it is +strongly advised that you reach out to the Sandboxing team in ``#hardening`` to +discuss your needs prior to making changes. + +Linux Sandbox +_____________ + +Linux sandboxing mostly works by allowing / blocking system calls for the child +process and redirecting (brokering) some from the child to the parent. Rules +are written in a specific DSL: `BPF +<https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/chromium/sandbox/linux/bpf_dsl/bpf_dsl.h#21-72>`_. 
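The allow / block / broker split described above can be pictured with a small standalone model. This is only a sketch: it does not use the real ``bpf_dsl`` API, and ``Verdict``, ``DemoSyscallPolicy`` and the syscall names picked here are hypothetical stand-ins. It also assumes that anything not explicitly listed is denied.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical model of a sandbox policy; this is NOT the bpf_dsl API.
enum class Verdict { Allow, Deny, Broker };

class DemoSyscallPolicy {
 public:
  DemoSyscallPolicy() {
    mRules["read"] = Verdict::Allow;
    mRules["write"] = Verdict::Allow;
    // Brokered calls are not run by the child; the parent performs them.
    mRules["open"] = Verdict::Broker;
    mRules["ptrace"] = Verdict::Deny;
  }

  // Assumption of this sketch: the policy is an allow-list, so any call
  // without an explicit rule is denied.
  Verdict Evaluate(const std::string& aSyscall) const {
    auto it = mRules.find(aSyscall);
    return it == mRules.end() ? Verdict::Deny : it->second;
  }

 private:
  std::unordered_map<std::string, Verdict> mRules;
};
```

With these rules, ``DemoSyscallPolicy().Evaluate("open")`` yields ``Verdict::Broker``, which mirrors a file-open request that the child must hand over to the parent. A real policy is a class deriving from ``SandboxPolicyCommon`` or ``SandboxPolicyBase``, as the list below describes.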
+ +- Add new ``SetXXXSandbox()`` function within `linux sandbox + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/linux/Sandbox.cpp#719-748>`_ +- Within `sandbox filter + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/linux/SandboxFilter.cpp>`_ + + + Add new helper ``GetXXXSandboxPolicy()`` `like this one + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/linux/SandboxFilter.cpp#2036-2040>`_ + called by ``SetXXXSandbox()`` + + Derive new class `similar to this + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/linux/SandboxFilter.cpp#2000-2034>`_ + inheriting ``SandboxPolicyCommon`` or ``SandboxPolicyBase`` and defining + the sandboxing policy + +- Add new ``SandboxBrokerPolicyFactory::GetXXXProcessPolicy()`` in `sandbox + broker + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/linux/broker/SandboxBrokerPolicyFactory.cpp#881-932>`_ +- Add new case handling in ``GetEffectiveSandboxLevel()`` in `sandbox launch + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/linux/launch/SandboxLaunch.cpp#243-271>`_ +- Add new entry in ``enum class ProcType`` of `sandbox reporter header + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/linux/reporter/SandboxReporterCommon.h#32-39>`_ +- Add new case handling in ``SubmitToTelemetry()`` in `sandbox reporter + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/linux/reporter/SandboxReporter.cpp#131-152>`_ +- Add new case handling in ``SandboxReportWrapper::GetProcType()`` of `sandbox + reporter wrapper + 
<https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/linux/reporter/SandboxReporterWrappers.cpp#69-91>`_ + +MacOS Sandbox +_____________ + +- Add new case handling in ``GeckoChildProcessHost::StartMacSandbox()`` of + `GeckoChildProcessHost <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/ipc/glue/GeckoChildProcessHost.cpp#1720-1743>`_ +- Add new entry in ``enum MacSandboxType`` defined in `macOS sandbox header + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/mac/Sandbox.h#12-20>`_ +- Within `macOS sandbox core + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/mac/Sandbox.mm>`_ + handle the new ``MacSandboxType`` in + + + ``MacSandboxInfo::AppendAsParams()`` in the `switch statement + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/mac/Sandbox.mm#164-188>`_ + + ``StartMacSandbox()`` in the `series of if/else statements + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/mac/Sandbox.mm#286-436>`_. + This code sets template values for the sandbox string rendering, and + runs on the side of the main process. + + ``StartMacSandboxIfEnabled()`` in this `switch statement + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/mac/Sandbox.mm#753-782>`_. + You might also need a ``GetXXXSandboxParamsFromArgs()`` that performs CLI + parsing on behalf of ``StartMacSandbox()``. + +- Create the new sandbox definition file + ``security/sandbox/mac/SandboxPolicy<XXX>.h`` for your new process ``<XXX>``, + and expose it in the ``EXPORTS.mozilla`` section of `moz.build + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/mac/moz.build#7-13>`_. 
+ Those rules follow a specific Scheme-like language. You can learn more about + it in `Apple Sandbox Guide + <https://reverse.put.as/wp-content/uploads/2011/09/Apple-Sandbox-Guide-v1.0.pdf>`_ + as well as on your system within ``/System/Library/Sandbox/Profiles/``. + +Windows Sandbox +_______________ + +- Introduce a new ``SandboxBroker::SetSecurityLevelForXXXProcess()`` that + defines the new sandbox in both + + + the sandbox broker, basing yourself on this `example + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp#1241-1344>`_ + + the remote sandbox broker, taking inspiration from `this code + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/win/src/remotesandboxbroker/remoteSandboxBroker.cpp#161-165>`_ + +- Add new case handling in ``WindowsProcessLauncher::DoSetup()`` calling + ``SandboxBroker::SetSecurityLevelForXXXProcess()`` in `GeckoChildProcessHost + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/ipc/glue/GeckoChildProcessHost.cpp#1391-1470>`_. + This will apply actual sandboxing rules to your process. + +Sandbox tests +_____________ + +- The new process' first top-level actor needs to `include PSandboxTesting + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/common/test/PSandboxTesting.ipdl>`_ + and implement ``RecvInitSandboxTesting`` `as done here + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/ipc/glue/UtilityProcessChild.cpp#165-174>`_. 
+- Add your new process ``string_name`` in the ``processTypes`` list of `sandbox + tests <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/test/browser_sandbox_test.js#17>`_ +- Add a new case in ``SandboxTest::StartTests()`` in `test core + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/common/test/SandboxTest.cpp#100-232>`_ + to handle your new process +- Add a new if branch for your new process in ``SandboxTestingChild::Bind()`` + in `testing child + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/common/test/SandboxTestingChild.cpp#68-96>`_ +- Add a new ``RunTestsXXX`` function for your new process (called by ``Bind()`` + above) `similar to that implementation + <https://searchfox.org/mozilla-central/rev/d4b9c457db637fde655592d9e2048939b7ab2854/security/sandbox/common/test/SandboxTestingChildTests.h#333-363>`_ + +Creating the New Process +~~~~~~~~~~~~~~~~~~~~~~~~ + +The sample does this in ``DemoParent::LaunchDemoProcess``. The core +behavior is fairly clear: + +.. code-block:: c++ + + /* static */ + bool DemoParent::LaunchDemoProcess( + base::ProcessId aParentPid, LaunchDemoProcessResolver&& aResolver) { + UniqueHost host(new Host(aParentPid, std::move(aResolver))); + + // Prepare "command line" startup args for new process + std::vector<std::string> extraArgs; + if (!host->BuildProcessArgs(&extraArgs)) { + return false; + } + + // Async launch creates a promise that we use below. 
+ if (!host->AsyncLaunch(extraArgs)) { + return false; + } + + host->WhenProcessHandleReady()->Then( + GetCurrentSerialEventTarget(), __func__, + [host = std::move(host)]( + const ipc::ProcessHandlePromise::ResolveOrRejectValue& + aResult) mutable { + if (aResult.IsReject()) { + host->ResolveAsFailure(); + return; + } + + auto actor = MakeRefPtr<DemoParent>(std::move(host)); + actor->Init(); + }); + + // The launch was started; its outcome is reported through the resolver. + return true; + } + +First, it creates an object of our ``GeckoChildProcessHost`` subclass (storing +the parent PID and the resolver for later). ``GeckoChildProcessHost`` is a base class that +abstracts the system-level operations involved in launching the new process. +It is the most substantive part of the launch procedure. After its +construction, the code prepares a bunch of strings to pass on the "command +line", which is the only way to pass data to the new process before IPDL is +established. All new processes will at least include ``-parentBuildId`` for +validating that dynamic libraries are properly versioned, and shared memory for +passing user preferences, which can affect early process behavior. Finally, it +tells ``GeckoChildProcessHost`` to asynchronously launch the process and run +the given lambda when it has a result. The lambda creates ``DemoParent`` with +the new host, if successful. + +In this sample, the ``DemoParent`` is owned (in the reference-counting sense) +by IPDL, which is why it doesn't get assigned to anything. This simplifies the +design dramatically. IPDL takes ownership when the actor calls ``Bind`` from +the ``Init`` method: + +.. code-block:: c++ + + DemoParent::DemoParent(UniqueHost&& aHost) + : mHost(std::move(aHost)) {} + + void DemoParent::Init() { + mHost->TakeInitialEndpoint().Bind(this); + // ... + mHost->MakeBridgeAndResolve(); + } + +After the ``Bind`` call, the actor is live and communication with the new +process can begin. 
The ``Init`` method concludes by initiating the process of +connecting the ``PDemoHelpline`` actors; ``Host::MakeBridgeAndResolve`` will be +covered in `Creating a New Top Level Actor`_. However, before we get into +that, we should finish defining the lifecycle of the process. In the next +section we look at launching the new process from the new process' perspective. + +.. warning:: + The code could have chosen to create a ``DemoChild`` instead of a + ``DemoParent`` and the choice may seem cosmetic but it has substantial + implications that could affect browser stability. The most + significant is that the prohibition on synchronous IPDL messages going + from parent to child can no longer guarantee freedom from multiprocess + deadlock. + +Initializing the New Process +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The new process first adopts the **Demo** process type in +``XRE_InitChildProcess``, where it responds to the **Demo** values we added to +some enums above. Specifically, we need to choose the type of MessageLoop our +main thread will run (this is discussed later) and we need to create our +``ProcessChild`` subclass. This is not an insignificant choice so pay close +attention to the `MessageLoop` options: + +.. code-block:: c++ + + MessageLoop::Type uiLoopType; + switch (XRE_GetProcessType()) { + case GeckoProcessType_Demo: + uiLoopType = MessageLoop::TYPE_MOZILLA_CHILD; break; + // ... + } + + // ... + + UniquePtr<ProcessChild> process; + switch (XRE_GetProcessType()) { + // ... + case GeckoProcessType_Demo: + process = MakeUnique<DemoChild::Process>(parentPID); + break; + } + +We then need to create our singleton ``DemoChild`` object, which can occur in +the constructor or the ``Process::Init()`` call, which is common. We store a +strong reference to the actor (as does IPDL) so that we are guaranteed that it +exists as long as the ``ProcessChild`` does -- although the message channel may +be closed. 
We will release the reference either when the process is properly +shutting down or when an IPC error closes the channel. + +``Init`` is given the command line arguments constructed above so it will need +to be overridden to parse them. It does this, binds our actor by +calling ``Bind`` as was done with the parent, then initializes a bunch of +components that the process expects to use: + +.. code-block:: c++ + + bool DemoChild::Init(int aArgc, char* aArgv[]) { + #if defined(MOZ_SANDBOX) && defined(OS_WIN) + mozilla::SandboxTarget::Instance()->StartSandbox(); + #elif defined(__OpenBSD__) && defined(MOZ_SANDBOX) + StartOpenBSDSandbox(GeckoProcessType_Demo); + #endif + + if (!mozilla::ipc::ProcessChild::InitPrefs(aArgc, aArgv)) { + return false; + } + + if (NS_WARN_IF(NS_FAILED(nsThreadManager::get().Init()))) { + return false; + } + + if (NS_WARN_IF(!TakeInitialEndpoint().Bind(this))) { + return false; + } + + // ... initializing components ... + + if (NS_FAILED(NS_InitMinimalXPCOM())) { + return false; + } + + return true; + } + +This is a slimmed-down version of the real ``Init`` method. We see that it +establishes a sandbox (more on this later) and then reads the command line and +preferences that we sent from the main process. It then initializes the thread +manager, which is required for the subsequent ``Bind`` call. + +Among the list of components we initialize in the sample code, XPCOM is +special. XPCOM includes a suite of components, including the component +manager, and is usually required for serious Gecko development. It is also +heavyweight and should be avoided if possible. We will leave the details of +XPCOM development to that module but we mention XPCOM configuration that is +special to new processes, namely ``ProcessSelector``. ``ProcessSelector`` +is used to determine what process types have access to what XPCOM components. +By default, a process has access to none. 
The code adds enums for selecting +a subset of process types, like +``ALLOW_IN_GPU_RDD_VR_SOCKET_UTILITY_AND_DEMO_PROCESS``, to the +``ProcessSelector`` enum in `gen_static_components.py +<https://searchfox.org/mozilla-central/source/xpcom/components/gen_static_components.py>`_ +and `Module.h +<https://searchfox.org/mozilla-central/source/xpcom/components/Module.h>`_. +It then updates the selectors in various ``components.conf`` files and +hardcoded spots like ``nsComponentManager.cpp`` to add the **Demo** processes +to the list that can use them. Some modules are required to bootstrap XPCOM +and will cause it to fail to initialize if they are not permitted. + +At this point, the new process is idle, waiting for messages from the main +process that will start the ``PDemoHelpline`` actor. We discuss that in +`Creating a New Top Level Actor`_ below but, first, let's look at how the main +and **Demo** processes will handle clean destruction. + +Destroying the New Process +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Gecko processes have a clean way for clients to request that they shutdown. +Simply calling ``Close()`` on the top level actor at either endpoint will begin +the shutdown procedure (so, ``PDemoParent::Close`` or ``PDemoChild::Close``). +The only other way for a child process to terminate is to crash. Each of these +three options requires some special handling. + +.. note:: + There is no need to consider the case where the parent (main) process + crashed, because the **Demo** process would be quickly terminated by Gecko. + +In cases where ``Close()`` is called, the shutdown procedure is fairly +straightforward. Once the call completes, the actor is no longer connected to +a channel -- messages will not be sent or received, as is the case with any +normal top-level actor (or any managed actor after calling +``Send__delete__()``). In the sample code, we ``Close`` the ``DemoChild`` +when some (as yet unwritten) **Demo** process code calls +``DemoChild::Shutdown``. + +.. 
code-block:: c++ + + /* static */ + void DemoChild::Shutdown() { + if (gDemoChild) { + // Wait for the other end to get everything we sent before shutting down. + // We never want to Close during a message (response) handler, so + // we dispatch a new runnable. + auto dc = gDemoChild; + RefPtr<nsIRunnable> runnable = NS_NewRunnableFunction( + "DemoChild::FinishShutdown", + [dc2 = std::move(gDemoChild)]() { dc2->Close(); }); + dc->SendEmptyMessageQueue( + [runnable](bool) { NS_DispatchToMainThread(runnable); }, + [runnable](mozilla::ipc::ResponseRejectReason) { + NS_DispatchToMainThread(runnable); + }); + } + } + +The comment in the code makes two important points: + +* ``Close`` should never be called from a message handler (e.g. in a + ``RecvFoo`` method). We schedule it to run later. +* If the ``DemoParent`` hasn't finished handling messages the ``DemoChild`` + sent, or vice-versa, those messages will be lost. For that reason, we have a + trivial sentinel message ``EmptyMessageQueue`` that we simply send and wait + to respond before we ``Close``. This guarantees that the main process will + have handled all of the messages we sent before it. Because we know the + details of the ``PDemo`` protocol, we know that this means we won't lose any + important messages this way. Note that we say "important" messages because + we could still lose messages sent *from* the main process. For example, a + ``RequestMemoryReport`` message sent by the MemoryReporter could be lost. + The actor would need a more complex shutdown protocol to catch all of these + messages but in our case there would be no point. A process that is + terminating is probably not going to produce useful memory consumption data. + Those messages can safely be lost. + +`Debugging Process Startup`_ looks at what happens if we omit the +``EmptyMessageQueue`` message. 
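The ordering argument behind the ``EmptyMessageQueue`` sentinel can be checked with a toy model that has nothing to do with IPDL itself. In this sketch, ``ToyChannel`` and ``CloseAfterFlush`` are hypothetical names; the channel is assumed to be a single-threaded, in-order queue, and closing it simply drops whatever has not been handled yet.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-order channel: the receiver handles tasks in send order.
class ToyChannel {
 public:
  void Send(const std::string& aMsg) {
    if (mClosed) return;
    mPending.push_back([this, aMsg] { mHandled.push_back(aMsg); });
  }

  // The sentinel carries no data; its reply runs once the receiver reaches it.
  void SendSentinel(std::function<void()> aOnReply) {
    if (mClosed) return;
    mPending.push_back(std::move(aOnReply));
  }

  // Closing drops every message that was not handled yet.
  void Close() {
    mClosed = true;
    mPending.clear();
  }

  // Receiver side: drain the queue in order.
  void ProcessAll() {
    while (!mPending.empty()) {
      auto task = std::move(mPending.front());
      mPending.pop_front();
      task();
    }
  }

  const std::vector<std::string>& Handled() const { return mHandled; }

 private:
  std::deque<std::function<void()>> mPending;
  std::vector<std::string> mHandled;
  bool mClosed = false;
};

// Send the sentinel, wait for its reply, and only then close. The close
// happens after the handler has returned, never inside it.
std::size_t CloseAfterFlush(ToyChannel& aChannel) {
  bool flushed = false;
  aChannel.SendSentinel([&flushed] { flushed = true; });
  aChannel.ProcessAll();
  if (flushed) {
    aChannel.Close();
  }
  return aChannel.Handled().size();
}
```

Sending two messages and calling ``Close()`` right away leaves ``Handled()`` empty, whereas ``CloseAfterFlush`` returns 2 for the same two messages. Because the queue is in order, the sentinel's reply implies that everything sent before it has been handled.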
+ +In ``DemoChild::Shutdown``, we can also see that, once the ``EmptyMessageQueue`` response is run, we are +releasing ``gDemoChild``, which will result in the termination of the process. + +.. code-block:: c++ + + DemoChild::~DemoChild() { + // ... + XRE_ShutdownChildProcess(); + } + +At this point, the ``DemoParent`` in the main process is alerted to the +channel closure because IPDL will call its :ref:`ActorDestroy <Actor Lifetimes +in C++>` method. + +.. code-block:: c++ + + void DemoParent::ActorDestroy(ActorDestroyReason aWhy) { + if (aWhy == AbnormalShutdown) { + GenerateCrashReport(OtherPid()); + } + // ... + } + +IPDL then releases its (sole) reference to ``DemoParent`` and the destruction +of the process apparatus is complete. + +The ``ActorDestroy`` code shows how we handle the one remaining shutdown case: +a crash in the **Demo** process. In this case, IPDL will *detect* the dead +process and free the ``DemoParent`` actor as above, only with an +``AbnormalShutdown`` reason. We generate a crash report, which requires crash +reporter integration, but no additional "special" steps need to be taken. + +Creating a New Top Level Actor +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +We now have a framework that creates the new process and connects it to the +main process. We now want to make another top-level actor but this one will be +responsible for our intended behavior, not just bootstrapping the new process. +Above, we saw that this is started by ``Host::MakeBridgeAndResolve`` after the +``DemoParent`` connection is established. + +.. code-block:: c++ + + bool DemoParent::Host::MakeBridgeAndResolve() { + ipc::Endpoint<PDemoHelplineParent> parent; + ipc::Endpoint<PDemoHelplineChild> child; + + auto resolveFail = MakeScopeExit([&] { mResolver(Nothing()); }); + + // Parent side is first PID (main/content), child is second (demo). + nsresult rv = PDemoHelpline::CreateEndpoints( + mParentPid, base::GetProcId(GetChildProcessHandle()), &parent, &child); + + // ... 
+ + if (!mActor->SendCreateDemoHelplineChild(std::move(child))) { + NS_WARNING("Failed to SendCreateDemoHelplineChild"); + return false; + } + + resolveFail.release(); + mResolver(Some(std::move(parent))); + return true; + } + +Because the operation of launching a process is asynchronous, we have +configured this so that it creates the two endpoints for the new top-level +actors, then we send the child one to the new process and resolve a promise +with the other. The **Demo** process creates its ``PDemoHelplineChild`` +easily: + +.. code-block:: c++ + + mozilla::ipc::IPCResult DemoChild::RecvCreateDemoHelplineChild( + Endpoint<PDemoHelplineChild>&& aEndpoint) { + mDemoHelplineChild = new DemoHelplineChild(); + if (!aEndpoint.Bind(mDemoHelplineChild)) { + return IPC_FAIL(this, "Unable to bind DemoHelplineChild"); + } + return IPC_OK(); + } + +``MakeProcessAndGetAssistance`` binds the same way: + +.. code-block:: c++ + + RefPtr<DemoHelplineParent> demoHelplineParent = new DemoHelplineParent(); + if (!endpoint.Bind(demoHelplineParent)) { + NS_WARNING("Unable to bind DemoHelplineParent"); + aPromise->MaybeReject(NS_ERROR_FAILURE); + return; + } + +However, the parent may be in the main process or in content. We handle both +cases in the next section. + +.. _Connecting With Other Processes: + +Connecting With Other Processes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +``DemoHelplineParent::MakeProcessAndGetAssistance`` is the method that we run +from either the main or the content process and that should kick off the +procedure that will result in sending a string (that we get from a new **Demo** +process) to a DOM promise. It starts by constructing a different promise -- +one like the ``mResolver`` in ``Host::MakeBridgeAndResolve`` in the last +section that produced a ``Maybe<Endpoint<PDemoHelplineParent>>``. In the main +process, we just make the promise ourselves and call +``DemoParent::LaunchDemoProcess`` to start the procedure that will result in +it being resolved as already described. 
If we are calling from the content +process, we simply write an async ``PContent`` message that calls +``DemoParent::LaunchDemoProcess`` and use the message handler's promise as +our promise: + +.. code-block:: c++ + + /* static */ + bool DemoHelplineParent::MakeProcessAndGetAssistance( + RefPtr<mozilla::dom::Promise> aPromise) { + RefPtr<LaunchDemoProcessPromise> resolver; + + if (XRE_IsContentProcess()) { + auto* contentChild = mozilla::dom::ContentChild::GetSingleton(); + MOZ_ASSERT(contentChild); + + resolver = contentChild->SendLaunchDemoProcess(); + } else { + MOZ_ASSERT(XRE_IsParentProcess()); + auto promise = MakeRefPtr<LaunchDemoProcessPromise::Private>(__func__); + resolver = promise; + + if (!DemoParent::LaunchDemoProcess( + base::GetCurrentProcId(), + [promise = std::move(promise)]( + Maybe<Endpoint<PDemoHelplineParent>>&& aMaybeEndpoint) mutable { + promise->Resolve(std::move(aMaybeEndpoint), __func__); + })) { + NS_WARNING("Failed to launch Demo process"); + resolver->Reject(NS_ERROR_FAILURE); + return false; + } + } + + resolver->Then( + GetMainThreadSerialEventTarget(), __func__, + [aPromise](Maybe<Endpoint<PDemoHelplineParent>>&& maybeEndpoint) mutable { + if (!maybeEndpoint) { + aPromise->MaybeReject(NS_ERROR_FAILURE); + return; + } + + RefPtr<DemoHelplineParent> demoHelplineParent = new DemoHelplineParent(); + Endpoint<PDemoHelplineParent> endpoint = maybeEndpoint.extract(); + if (!endpoint.Bind(demoHelplineParent)) { + NS_WARNING("Unable to bind DemoHelplineParent"); + aPromise->MaybeReject(NS_ERROR_FAILURE); + return; + } + + // ... communicate with PDemoHelpline and write message to console ...
+ }, + [aPromise](mozilla::ipc::ResponseRejectReason&& aReason) { + aPromise->MaybeReject(NS_ERROR_FAILURE); + }); + + return true; + } + + mozilla::ipc::IPCResult ContentParent::RecvLaunchDemoProcess( + LaunchDemoProcessResolver&& aResolver) { + if (!DemoParent::LaunchDemoProcess(OtherPid(), + std::move(aResolver))) { + NS_WARNING("Failed to launch Demo process"); + } + return IPC_OK(); + } + +To summarize, connecting processes always requires endpoints to be constructed +by the main process, even when neither process being connected is the main +process. It is the only process that creates ``Endpoint`` objects. From that +point, connecting is just a matter of sending the endpoints to the right +processes, constructing an actor for them, and then calling ``Endpoint::Bind``. + +Completing the Sample +~~~~~~~~~~~~~~~~~~~~~ + +We have covered the main parts needed for the sample. Now we just need to wire +it all up. First, we add the new JS command to ``Navigator.webidl`` and +``Navigator.h``/``Navigator.cpp``: + +.. code-block:: c++ + + partial interface Navigator { + [Throws] + Promise<DOMString> getAssistance(); + }; + + already_AddRefed<Promise> Navigator::GetAssistance(ErrorResult& aRv) { + if (!mWindow || !mWindow->GetDocShell()) { + aRv.Throw(NS_ERROR_UNEXPECTED); + return nullptr; + } + + RefPtr<Promise> echoPromise = Promise::Create(mWindow->AsGlobal(), aRv); + if (NS_WARN_IF(aRv.Failed())) { + return nullptr; + } + + if (!DemoHelplineParent::MakeProcessAndGetAssistance(echoPromise)) { + aRv.Throw(NS_ERROR_FAILURE); + return nullptr; + } + + return echoPromise.forget(); + } + +Then, we need to add the part that gets the string we use to resolve the +promise in ``MakeProcessAndGetAssistance`` (or reject it if it hasn't been +resolved by the time ``ActorDestroy`` is called): + +.. 
code-block:: c++ + + using DemoPromise = MozPromise<nsString, nsresult, true>; + + /* static */ + bool DemoHelplineParent::MakeProcessAndGetAssistance( + RefPtr<mozilla::dom::Promise> aPromise) { + + // ... construct and connect demoHelplineParent ... + + RefPtr<DemoPromise> promise = demoHelplineParent->mPromise.Ensure(__func__); + promise->Then( + GetMainThreadSerialEventTarget(), __func__, + [demoHelplineParent, aPromise](nsString aMessage) mutable { + aPromise->MaybeResolve(aMessage); + }, + [demoHelplineParent, aPromise](nsresult aErr) mutable { + aPromise->MaybeReject(aErr); + }); + + if (!demoHelplineParent->SendRequestAssistance()) { + NS_WARNING("DemoHelplineParent::SendRequestAssistance failed"); + } + } + + mozilla::ipc::IPCResult DemoHelplineParent::RecvAssistance( + nsString&& aMessage, const AssistanceResolver& aResolver) { + mPromise.Resolve(aMessage, __func__); + aResolver(true); + return IPC_OK(); + } + + void DemoHelplineParent::ActorDestroy(ActorDestroyReason aWhy) { + mPromise.RejectIfExists(NS_ERROR_FAILURE, __func__); + } + +The ``DemoHelplineChild`` has to respond to the ``RequestAssistance`` method, +which it does by returning a string and then calling ``Close`` on itself when +the string has been received (but we do not call ``Close`` in the ``Recv`` +method!). We use an async response to the ``GiveAssistance`` message to detect +that the string was received. During closing, the actor's ``ActorDestroy`` +method then calls the ``DemoChild::Shutdown`` method we defined in `Destroying +the New Process`_: + +.. 
code-block:: c++ + + mozilla::ipc::IPCResult DemoHelplineChild::RecvRequestAssistance() { + RefPtr<DemoHelplineChild> me = this; + RefPtr<nsIRunnable> runnable = + NS_NewRunnableFunction("DemoHelplineChild::Close", [me]() { me->Close(); }); + + SendAssistance( + nsString(HelpMessage()), + [runnable](bool) { NS_DispatchToMainThread(runnable); }, + [runnable](mozilla::ipc::ResponseRejectReason) { + NS_DispatchToMainThread(runnable); + }); + + return IPC_OK(); + } + + void DemoHelplineChild::ActorDestroy(ActorDestroyReason aWhy) { + DemoChild::Shutdown(); + } + +During the **Demo** process lifetime, there are two references to the +``DemoHelplineChild``, one from IPDL and one from the ``DemoChild``. The call +to ``Close`` releases the one held by IPDL and the other isn't released until +the ``DemoChild`` is destroyed. + +Running the Sample +~~~~~~~~~~~~~~~~~~ + +To run the sample, build and run the browser, then open the console. The new +command is ``navigator.getAssistance().then(console.log)``. The message sent by +``SendAssistance`` is then logged to the console. The sample code also +includes the name of the type of process that was used for the +``DemoHelplineParent`` so you can confirm that it works from the main process +and from a content process. + +Debugging Process Startup +------------------------- + +Debugging a child process at the start of its life is tricky. With most +platforms/toolchains, it is surprisingly difficult to connect a debugger before +the main routine begins execution. You may also find that console logging is +not yet established by the operating system, especially when working with +sandboxed child processes. Gecko has some facilities that make this less +painful. + +.. _Debugging with IPDL Logging: + +Debugging with IPDL Logging +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This is also best seen with an example. To start, we can create a bug in the +sample by removing the ``EmptyMessageQueue`` message sent to ``DemoParent``. 
+This message was intended to guarantee that the ``DemoParent`` had handled all +messages sent before it, so we could ``Close`` with the knowledge that we +didn't miss anything. This sort of bug can be very difficult to track down +because it is likely to be intermittent and may manifest more easily on some +platforms/architectures than others. To create this bug, replace the +``SendEmptyMessageQueue`` call in ``DemoChild::Shutdown``: + +.. code-block:: c++ + + auto dc = gDemoChild; + RefPtr<nsIRunnable> runnable = NS_NewRunnableFunction( + "DemoChild::FinishShutdown", + [dc2 = std::move(gDemoChild)]() { dc2->Close(); }); + dc->SendEmptyMessageQueue( + [runnable](bool) { NS_DispatchToMainThread(runnable); }, + [runnable](mozilla::ipc::ResponseRejectReason) { + NS_DispatchToMainThread(runnable); + }); + +with just an (asynchronous) call to ``Close``: + +.. code-block:: c++ + + NS_DispatchToMainThread(NS_NewRunnableFunction( + "DemoChild::FinishShutdown", + [dc = std::move(gDemoChild)]() { dc->Close(); })); + +When we run the sample now, everything seems to behave ok but we see messages +like these in the console: :: + + ###!!! [Parent][RunMessage] Error: (msgtype=0x410001,name=PDemo::Msg_InitCrashReporter) Channel closing: too late to send/recv, messages will be lost + + [Parent 16672, IPC I/O Parent] WARNING: file c:/mozilla-src/mozilla-unified/ipc/chromium/src/base/process_util_win.cc:167 + [Parent 16672, Main Thread] WARNING: Not resolving response because actor is dead.: file c:/mozilla-src/mozilla-unified/ipc/glue/ProtocolUtils.cpp:931 + [Parent 16672, Main Thread] WARNING: IPDL resolver dropped without being called!: file c:/mozilla-src/mozilla-unified/ipc/glue/ProtocolUtils.cpp:959 + +We could probably figure out what is happening here from the messages but, +with more complex protocols, understanding what led to this may not be so easy. +To begin diagnosing, we can turn on IPC Logging, which was defined in the IPDL +section on :ref:`Message Logging`. 
We just need to set an environment variable +before starting the browser. Let's turn it on for all ``PDemo`` and +``PDemoHelpline`` actors: :: + + MOZ_IPC_MESSAGE_LOG="PDemoParent,PDemoChild,PDemoHelplineParent,PDemoHelplineChild" + +To underscore what we said above, when logging is active, the change in timing +makes the error message go away and everything closes properly on a tested +Windows desktop. However, the issue remains on a Macbook Pro and the log +shows the issue rather clearly: :: + + [time: 1627075553937959][63096->63085] [PDemoChild] Sending PDemo::Msg_InitCrashReporter + [time: 1627075553949441][63085->63096] [PDemoParent] Sending PDemo::Msg_CreateDemoHelplineChild + [time: 1627075553950293][63092->63096] [PDemoHelplineParent] Sending PDemoHelpline::Msg_RequestAssistance + [time: 1627075553979151][63096<-63085] [PDemoChild] Received PDemo::Msg_CreateDemoHelplineChild + [time: 1627075553979433][63096<-63092] [PDemoHelplineChild] Received PDemoHelpline::Msg_RequestAssistance + [time: 1627075553979498][63096->63092] [PDemoHelplineChild] Sending PDemoHelpline::Msg_GiveAssistance + [time: 1627075553980105][63092<-63096] [PDemoHelplineParent] Received PDemoHelpline::Msg_GiveAssistance + [time: 1627075553980181][63092->63096] [PDemoHelplineParent] Sending reply PDemoHelpline::Reply_GiveAssistance + [time: 1627075553980449][63096<-63092] [PDemoHelplineChild] Received PDemoHelpline::Reply_GiveAssistance + [tab 63092] NOTE: parent actor received `Goodbye' message. Closing channel. + [default 63085] NOTE: parent actor received `Goodbye' message. Closing channel. + [...] + ###!!! [Parent][RunMessage] Error: (msgtype=0x420001,name=PDemo::Msg_InitCrashReporter) Channel closing: too late to send/recv, messages will be lost + [...] + [default 63085] NOTE: parent actor received `Goodbye' message. Closing channel. + +The imbalance with ``Msg_InitCrashReporter`` is clear. The message was not +*Received* before the channel was closed. 
Note that the first ``Goodbye`` for +the main (default) process is for the ``PDemoHelpline`` actor -- in this case, +its child actor was in a content (tab) process. The second default process +``Goodbye`` is from the **Demo** process, sent when doing ``Close``. It might +seem that it should handle the ``Msg_InitCrashReporter`` if it can handle the +later ``Goodbye`` but this does not happen for safety reasons. + +Early Debugging For A New Process +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Let's assume now that we still don't understand the problem -- maybe we don't +know that the ``InitCrashReporter`` message is sent internally by the +``CrashReporterClient`` we initialized. Or maybe we're only looking at Windows +builds. We decide we'd like to be able to hook a debugger to the new process +so that we can break on the ``SendInitCrashReporter`` call. Attaching the +debugger has to happen fast -- process startup probably completes in under a +second. Debugging this is not always easy. + +Windows users have options that work with both the Visual Studio and WinDbg +debuggers. For Visual Studio users, there is an easy-to-use VS addon called +the `Child Process Debugging Tool`_ that allows you to connect to *all* +processes that are launched by a process you are debugging. So, if the VS +debugger is connected to the main process, it will automatically connect to the +new **Demo** process (and every other launched process) at the point that they +are spawned. This way, the new process never does anything outside of the +debugger. Breakpoints, etc. work as expected. The addon mostly works like a +toggle and will remain on until it is disabled from the VS menu. + +WinDbg users can achieve essentially the same behavior with the `.childdbg`_ +command. See the docs for details but essentially all there is to know is that +``.childdbg 1`` enables it and ``.childdbg 0`` disables it. 
You might add it +to a startup config file (see the WinDbg ``-c`` command line option). + +Linux and macOS users should reference gdb's ``detach-on-fork``. The command to +debug child processes is ``set detach-on-fork off``. Again, the behavior is +largely what you would expect -- that all spawned processes are added to the +current debug session. The command can be added to ``.gdbinit`` for ease. At +the time of this writing, lldb does not support automatically connecting to +newly spawned processes. + +Finally, Linux users can use ``rr`` for time-travel debugging. See `Debugging +Firefox with rr`_ for details. + +These solutions are not always desirable. For example, the fact that they hook +*all* spawned processes can mean that targeting breakpoints to one process +requires us to manually disconnect many other processes. In these cases, an +easier solution may be to use Gecko environment variables that will cause the +process to sleep for some number of seconds. During that time, you can find +the process ID (PID) for the process you want to debug and connect your +debugger to it. OS tools like ``ProcessMonitor`` can give you the PID but it +will also be clearly logged to the console just before the process waits. + +Set ``MOZ_DEBUG_CHILD_PROCESS=1`` to turn on process startup pausing. You can +also set ``MOZ_DEBUG_CHILD_PAUSE=N`` where N is the number of seconds to sleep. +The default is 10 seconds on Windows and 30 on other platforms. + +Pausing for the debugger is not a panacea. Since the environment variables +are not specific to process type, you will be forced to wait for all of the +processes Gecko creates before you wait for it to get to yours. The pauses can +also end up exposing unknown concurrency bugs in the browser before it even +gets to your issue, which is good to discover but doesn't fix your bug. That +said, any of these strategies would be enough to facilitate easily breaking on +``SendInitCrashReporter`` and finding our sender. + +.. 
_Child Process Debugging Tool: https://marketplace.visualstudio.com/items?itemName=vsdbgplat.MicrosoftChildProcessDebuggingPowerTool +.. _.childdbg: https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/-childdbg--debug-child-processes- diff --git a/ipc/docs/utility_process.rst b/ipc/docs/utility_process.rst new file mode 100644 index 0000000000..8940a56317 --- /dev/null +++ b/ipc/docs/utility_process.rst @@ -0,0 +1,69 @@ +Utility Process +=============== + +.. warning:: + As of January 2022, this process is under heavy development, and many things + may change. Documentation might not always be as accurate as it should be. + Please reach out to #ipc if you intend to add a new utility. + +The utility process provides a simple way to implement an IPC actor with +more specific sandboxing properties, in cases where you don't need or want +to deal with the extra complexity of adding a whole new process type but you +just want to apply different sandboxing policies. +To implement such an actor, you will have to follow a few steps, as in the +trivial example visible in `EmptyUtil +<https://phabricator.services.mozilla.com/D126402>`_: + + - Define a new IPC actor, e.g., ``PEmptyUtil``, that allows getting some string + via ``GetSomeString()`` from the child to the parent + + - In the ``PUtilityProcess`` definition, expose a new child-level method, + e.g., ``StartEmptyUtilService(Endpoint<PEmptyUtilChild>)`` + + - Implement ``EmptyUtilChild`` and ``EmptyUtilParent`` classes both deriving + from their ``PEmptyUtilXX``. If you want or need to run things from a + different thread, you can have a look at ``UtilityProcessGenericActor`` + + - Make sure both are refcounted + + - Expose your new service on ``UtilityProcessManager`` with a method + performing the heavy lifting of starting your process. You can take + inspiration from ``StartEmptyUtil()`` in the sample. 
+ + - Ideally, this starting method should rely on `StartUtility() <https://searchfox.org/mozilla-central/rev/fb511723f821ceabeea23b123f1c50c9e93bde9d/ipc/glue/UtilityProcessManager.cpp#210-258,266>`_ + + - To use ``StartUtility()`` mentioned above, please ensure that you provide + a ``nsresult BindToUtilityProcess(RefPtr<UtilityProcessParent> + aUtilityParent)``. Usually, it should be in charge of creating a set of + endpoints and performing ``Bind()`` to set up the IPC. You can see an example in `Utility AudioDecoder <https://searchfox.org/mozilla-central/rev/4b3039b48c3cb67774270ebcc2a7d8624d888092/ipc/glue/UtilityAudioDecoderChild.h#31-51>`_ + + - To be properly exposed to users in ``about:processes``, you will also have to provide an actor + name via a method ``UtilityActorName GetActorName() { return UtilityActorName::EmptyUtil; }`` + + + Add a member to the `enum WebIDLUtilityActorName <https://searchfox.org/mozilla-central/rev/fb511723f821ceabeea23b123f1c50c9e93bde9d/dom/chrome-webidl/ChromeUtils.webidl#686-689>`_ + + - Handle reception of ``StartEmptyUtilService`` on the child side of + ``UtilityProcess`` within ``RecvStartEmptyUtilService()`` + + - In ``UtilityProcessChild::ActorDestroy``, release any resources that + you stored a reference to in ``RecvStartEmptyUtilService()``. This + will probably include a reference to the ``EmptyUtilChild``. 
+ + - The specific sandboxing requirements can be implemented by tracking + ``SandboxingKind``, and it starts within `UtilityProcessSandboxing header + <https://searchfox.org/mozilla-central/source/ipc/glue/UtilityProcessSandboxing.h>`_ + + - Try and make sure you at least add some ``gtest`` coverage of your new + actor, for example like in `existing gtest + <https://searchfox.org/mozilla-central/source/ipc/glue/test/gtest/TestUtilityProcess.cpp>`_ + + - Also ensure actual sandbox testing within + + + ``SandboxTest`` to start your new process, + `<https://searchfox.org/mozilla-central/source/security/sandbox/common/test/SandboxTest.cpp>`_ + + + ``SandboxTestingChildTests`` to define the test + `<https://searchfox.org/mozilla-central/source/security/sandbox/common/test/SandboxTestingChildTests.h>`_ + + + ``SandboxTestingChild`` to run your test + `<https://searchfox.org/mozilla-central/source/security/sandbox/common/test/SandboxTestingChild.cpp>`_ |