diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-28 14:29:10 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-28 14:29:10 +0000 |
commit | 2aa4a82499d4becd2284cdb482213d541b8804dd (patch) | |
tree | b80bf8bf13c3766139fbacc530efd0dd9d54394c /js/src/doc/index.rst | |
parent | Initial commit. (diff) | |
download | firefox-2aa4a82499d4becd2284cdb482213d541b8804dd.tar.xz firefox-2aa4a82499d4becd2284cdb482213d541b8804dd.zip |
Adding upstream version 86.0.1.upstream/86.0.1upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'js/src/doc/index.rst')
-rw-r--r-- | js/src/doc/index.rst | 192 |
1 files changed, 192 insertions, 0 deletions
diff --git a/js/src/doc/index.rst b/js/src/doc/index.rst new file mode 100644 index 0000000000..40199f7e90 --- /dev/null +++ b/js/src/doc/index.rst @@ -0,0 +1,192 @@ +============ +SpiderMonkey +============ + +*SpiderMonkey* is the *JavaScript* and *WebAssembly* implementation library of +the *Mozilla Firefox* web browser. The implementation behaviour is defined by +the `ECMAScript <https://tc39.es/ecma262/>`_ and `WebAssembly +<https://webassembly.org/>`_ specifications. + +Much of the internal technical documentation of the engine can be found +throughout the source files themselves by looking for comments labelled with +`[SMDOC]`_. Information about the team, our processes, and about embedding +*SpiderMonkey* in your own projects can be found at https://spidermonkey.dev. + +Components of SpiderMonkey +########################## + +๐งน Garbage Collector +********************* + +*JavaScript* is a garbage collected language and at the core of *SpiderMonkey* +we manage a garbage-collected memory heap. Elements of this heap have a base +C++ type of `gc::Cell`_. Each round of garbage collection will free up any +*Cell* that is not referenced by a *root* or another live *Cell* in turn. + +See :doc:`GC overview<gc>` for more details. + + +๐ฆ JS::Value and JSObject +************************** + +*JavaScript* values are divided into either objects or primitives +(*Undefined*, *Null*, *Boolean*, *Number*, *BigInt*, *String*, or *Symbol*). +Values are represented with the `JS::Value`_ type which may in turn point to +an object that extends from the `JSObject`_ type. Objects include both plain +*JavaScript* objects and exotic objects representing various things from +functions to *ArrayBuffers* to *HTML Elements* and more. + +Most objects extend ``NativeObject`` (which is a subtype of ``JSObject``) +which provides a way to store properties as key-value pairs similar to a hash +table. These objects hold their *values* and point to a *Shape* that +represents the set of *keys*. Similar objects point to the same *Shape* which +saves memory and allows the JITs to quickly work with objects similar to ones +it has seen before. See the `[SMDOC] Shapes`_ comment for more details. + +C++ (and Rust) code may create and manipulate these objects using the +collection of interfaces we traditionally call the **JSAPI**. + + +๐๏ธ JavaScript Parser +********************* + +In order to evaluate script text, we parse it using the *Parser* into an +`Abstract Syntax Tree`_ (AST) temporarily and then run the *BytecodeEmitter* +(BCE) to generate `Bytecode`_ and associated metadata. We refer to this +resulting format as `Stencil`_ and it has the helpful characteristic that it +does not utilize the Garbage Collector. The *Stencil* can then be +instantiated into a series of GC *Cells* that can be mutated and understood +by the execution engines described below. + +Each function as well as the top-level itself generates a distinct script. +This is the unit of execution granularity since functions may be set as +callbacks that the host runs at a later time. There are both +``ScriptStencil`` and ``js::BaseScript`` forms of scripts. + +By default, the parser runs in a mode called *syntax* or *lazy* parsing where +we avoid generating full bytecode for functions within the source that we are +parsing. This lazy parsing is still required to check for all *early errors* +that the specification describes. When such a lazily compiled inner function +is first executed, we recompile just that function in a process called +*delazification*. Lazy parsing avoids allocating the AST and bytecode which +saves both CPU time and memory. In practice, many functions are never +executed during a given load of a webpage so this delayed parsing can be +quite beneficial. + + +โ๏ธ JavaScript Interpreter +************************** + +The *bytecode* generated by the parser may be executed by an interpreter +written in C++ that manipulates objects in the GC heap and invokes native +code of the host (eg. web browser). See `[SMDOC] Bytecode Definitions`_ for +descriptions of each bytecode opcode and ``js/src/vm/Interpreter.cpp`` for +their implementation. + + +โก JavaScript JITs +******************* + +In order to speed up execution of *bytecode*, we use a series of Just-In-Time +(JIT) compilers to generate specialized machine code (eg. x86, ARM, etc) +tailored to the *JavaScript* that is run and the data that is processed. + +As an individual script runs more times (or has a loop that runs many times) +we describe it as getting *hotter* and at certain thresholds we *tier-up* by +JIT-compiling it. Each subsequent JIT tier spends more time compiling but +aims for better execution performance. + +Baseline Interpreter +-------------------- + +The *Baseline Interpreter* is a hybrid interpreter/JIT that interprets the +*bytecode* one opcode at a time, but attaches small fragments of code called +*Inline Caches* (ICs) that rapidly speed-up executing the same opcode the next +time (if the data is similar enough). See the `[SMDOC] JIT Inline Caches`_ +comment for more details. + +Baseline Compiler +----------------- + +The *Baseline Compiler* use the same *Inline Caches* mechanism from the +*Baseline Interpreter* but additionally translates the entire bytecode to +native machine code. This removes dispatch overhead and does minor local +optimizations. This machine code still calls back into C++ for complex +operations. The translation is very fast but the ``BaselineScript`` uses +memory and requires ``mprotect`` and flushing CPU caches. + +WarpMonkey +---------- + +The *WarpMonkey* JIT replaces the former *IonMonkey* engine and is the +highest level of optimization for the most frequently run scripts. It is able +to inline other scripts and specialize code based on the data and arguments +being processed. + +We translate the *bytecode* and *Inline Cache* data into a Mid-level +`Intermediate Representation`_ (Ion MIR) representation. This graph is +transformed and optimized before being *lowered* to a Low-level Intermediate +Representation (Ion LIR). This *LIR* performs register allocation and then +generates native machine code in a process called *Code Generation*. + +The optimizations here assume that a script continues to see data similar +what has been seen before. The *Baseline* JITs are essential to success here +because they generate *ICs* that match observed data. If after a script is +compiled with *Warp*, it encounters data that it is not prepared to handle it +performs a *bailout*. The *bailout* mechanism reconstructs the native machine +stack frame to match the layout used by the *Baseline Interpreter* and then +branches to that interpreter as though we were running it all along. Building +this stack frame may use special side-table saved by *Warp* to reconstruct +values that are not otherwise available. + + +๐ช WebAssembly +*************** + +In addition to *JavaScript*, the engine is also able to execute *WebAssembly* +(WASM) sources. + +WASM-Baseline (RabaldrMonkey) +----------------------------- + +This engine performs fast translation to machine code in order to minimize +latency to first execution. + +WASM-Ion (BaldrMonkey) +---------------------- + +This engine translates the WASM input into same *MIR* form that *WarpMonkey* +uses and uses the *IonBackend* to optimize. These optimizations (and in +particular, the register allocation) generate very fast native machine code. + +Cranelift +--------- + +This experimental alternative to *BaldrMonkey* is an optimizing WASM compiler +written in Rust. This currently is used on ARM64-based platforms (which do +not support *BaldrMonkey*). + + +Other documentation +################### + +.. toctree:: + :maxdepth: 1 + + build + gc + Debugger/index + SavedFrame/index + + +.. _gc::Cell: https://searchfox.org/mozilla-central/search?q=[SMDOC]+GC+Cell +.. _JSObject: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JSObject+layout +.. _JS::Value: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JS%3A%3AValue+type&path=js%2F +.. _[SMDOC]: https://searchfox.org/mozilla-central/search?q=[SMDOC]&path=js%2F +.. _[SMDOC] Shapes: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Shapes +.. _[SMDOC] Bytecode Definitions: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Bytecode+Definitions&path=js%2F +.. _[SMDOC] JIT Inline Caches: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JIT+Inline+Caches +.. _Stencil: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Script+Stencil +.. _Bytecode: https://en.wikipedia.org/wiki/Bytecode +.. _Abstract Syntax Tree: https://en.wikipedia.org/wiki/Abstract_syntax_tree +.. _Intermediate Representation: https://en.wikipedia.org/wiki/Intermediate_representation |