From 26a029d407be480d791972afb5975cf62c9360a6 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Fri, 19 Apr 2024 02:47:55 +0200
Subject: Adding upstream version 124.0.1.

Signed-off-by: Daniel Baumann
---
 js/src/doc/index.rst | 204 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 204 insertions(+)
 create mode 100644 js/src/doc/index.rst

(limited to 'js/src/doc/index.rst')

diff --git a/js/src/doc/index.rst b/js/src/doc/index.rst
new file mode 100644
index 0000000000..82c36d52ad
--- /dev/null
+++ b/js/src/doc/index.rst
@@ -0,0 +1,204 @@
+============
+SpiderMonkey
+============
+
+*SpiderMonkey* is the *JavaScript* and *WebAssembly* implementation library of
+the *Mozilla Firefox* web browser. The implementation behaviour is defined by
+the ECMAScript and WebAssembly specifications.
+
+Much of the internal technical documentation of the engine can be found
+throughout the source files themselves by looking for comments labelled with
+`[SMDOC]`_. Information about the team, our processes, and about embedding
+*SpiderMonkey* in your own projects can be found at https://spidermonkey.dev.
+
+Specific documentation on a few topics is available at:
+
+.. toctree::
+   :maxdepth: 1
+
+   build
+   test
+   hacking_tips
+   Debugger/index
+   SavedFrame/index
+   feature_checklist
+   bytecode_checklist
+
+
+Components of SpiderMonkey
+##########################
+
+🧹 Garbage Collector
+*********************
+
+.. toctree::
+   :maxdepth: 2
+   :hidden:
+
+   Overview
+   Rooting Hazard Analysis
+   Running the Analysis
+
+*JavaScript* is a garbage-collected language and at the core of *SpiderMonkey*
+we manage a garbage-collected memory heap. Elements of this heap have a base
+C++ type of `gc::Cell`_. Each round of garbage collection frees any *Cell*
+that is not reachable from a *root*, either directly or through other live
+*Cells*.
+
+See :doc:`GC overview` for more details.
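
The reachability rule in the paragraph above can be illustrated with a toy mark-and-sweep pass. This is a minimal sketch in Python, not SpiderMonkey code: the Cell class, its children list, and the mark and sweep functions are hypothetical names chosen for illustration, and the real collector is written in C++ and is considerably more involved.

```python
class Cell:
    """Hypothetical stand-in for a heap element; not the real gc::Cell."""

    def __init__(self, name):
        self.name = name
        self.children = []  # outgoing references to other cells


def mark(roots):
    """Return the set of cells reachable from the roots."""
    live = set()
    stack = list(roots)
    while stack:
        cell = stack.pop()
        if cell in live:
            continue
        live.add(cell)
        stack.extend(cell.children)
    return live


def sweep(heap, live):
    """Keep only live cells; everything else would be freed."""
    return [cell for cell in heap if cell in live]


a, b, c, d = (Cell(n) for n in "abcd")
a.children.append(b)   # a -> b
c.children.append(d)   # c -> d, but nothing references c
heap = [a, b, c, d]

survivors = sweep(heap, mark(roots=[a]))
print([cell.name for cell in survivors])
```

Only a and b survive here: d is referenced, but only by c, which is itself unreachable from any root.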
+
+
+📦 JS::Value and JSObject
+**************************
+
+*JavaScript* values are divided into either objects or primitives
+(*Undefined*, *Null*, *Boolean*, *Number*, *BigInt*, *String*, or *Symbol*).
+Values are represented with the `JS::Value`_ type which may in turn point to
+an object that extends from the `JSObject`_ type. Objects include both plain
+*JavaScript* objects and exotic objects representing various things from
+functions to *ArrayBuffers* to *HTML Elements* and more.
+
+Most objects extend ``NativeObject`` (which is a subtype of ``JSObject``)
+which provides a way to store properties as key-value pairs similar to a hash
+table. These objects hold their *values* and point to a *Shape* that
+represents the set of *keys*. Similar objects point to the same *Shape* which
+saves memory and allows the JITs to quickly work with objects similar to ones
+they have seen before. See the `[SMDOC] Shapes`_ comment for more details.
+
+C++ (and Rust) code may create and manipulate these objects using the
+collection of interfaces we traditionally call the **JSAPI**.
+
+
+🗃️ JavaScript Parser
+*********************
+
+In order to evaluate script text, we parse it using the *Parser* into an
+`Abstract Syntax Tree`_ (AST) temporarily and then run the *BytecodeEmitter*
+(BCE) to generate `Bytecode`_ and associated metadata. We refer to this
+resulting format as `Stencil`_ and it has the helpful characteristic that it
+does not utilize the Garbage Collector. The *Stencil* can then be
+instantiated into a series of GC *Cells* that can be mutated and understood
+by the execution engines described below.
+
+Each function as well as the top-level itself generates a distinct script.
+This is the unit of execution granularity since functions may be set as
+callbacks that the host runs at a later time. There are both
+``ScriptStencil`` and ``js::BaseScript`` forms of scripts.
+
+By default, the parser runs in a mode called *syntax* or *lazy* parsing where
+we avoid generating full bytecode for functions within the source that we are
+parsing. This lazy parsing is still required to check for all *early errors*
+that the specification describes. When such a lazily compiled inner function
+is first executed, we recompile just that function in a process called
+*delazification*. Lazy parsing avoids allocating the AST and bytecode which
+saves both CPU time and memory. In practice, many functions are never
+executed during a given load of a webpage so this delayed parsing can be
+quite beneficial.
+
+
+⚙️ JavaScript Interpreter
+**************************
+
+The *bytecode* generated by the parser may be executed by an interpreter
+written in C++ that manipulates objects in the GC heap and invokes native
+code of the host (e.g. web browser). See `[SMDOC] Bytecode Definitions`_ for
+descriptions of each bytecode opcode and ``js/src/vm/Interpreter.cpp`` for
+their implementation.
+
+
+⚡ JavaScript JITs
+*******************
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   MIR-optimizations/index
+
+In order to speed up execution of *bytecode*, we use a series of Just-In-Time
+(JIT) compilers to generate specialized machine code (e.g. x86, ARM)
+tailored to the *JavaScript* that is run and the data that is processed.
+
+As an individual script runs more times (or has a loop that runs many times)
+we describe it as getting *hotter* and at certain thresholds we *tier-up* by
+JIT-compiling it. Each subsequent JIT tier spends more time compiling but
+aims for better execution performance.
+
+Baseline Interpreter
+--------------------
+
+The *Baseline Interpreter* is a hybrid interpreter/JIT that interprets the
+*bytecode* one opcode at a time, but attaches small fragments of code called
+*Inline Caches* (ICs) that speed up executing the same opcode the next
+time (if the data is similar enough).
+See the `[SMDOC] JIT Inline Caches`_
+comment for more details.
+
+Baseline Compiler
+-----------------
+
+The *Baseline Compiler* uses the same *Inline Caches* mechanism as the
+*Baseline Interpreter* but additionally translates the entire bytecode to
+native machine code. This removes dispatch overhead and does minor local
+optimizations. This machine code still calls back into C++ for complex
+operations. The translation is very fast but the ``BaselineScript`` uses
+memory and requires ``mprotect`` and flushing CPU caches.
+
+WarpMonkey
+----------
+
+The *WarpMonkey* JIT replaces the former *IonMonkey* engine and is the
+highest level of optimization for the most frequently run scripts. It is able
+to inline other scripts and specialize code based on the data and arguments
+being processed.
+
+We translate the *bytecode* and *Inline Cache* data into a Mid-level
+`Intermediate Representation`_ (Ion MIR). This graph is transformed and
+optimized before being *lowered* to a Low-level Intermediate Representation
+(Ion LIR). Register allocation is performed on this *LIR*, and native machine
+code is then generated from it in a process called *Code Generation*.
+
+See `MIR Optimizations`_ for an overview of MIR optimizations.
+
+The optimizations here assume that a script continues to see data similar to
+what has been seen before. The *Baseline* JITs are essential to success here
+because they generate *ICs* that match observed data. If, after a script is
+compiled with *Warp*, it encounters data that it is not prepared to handle, it
+performs a *bailout*. The *bailout* mechanism reconstructs the native machine
+stack frame to match the layout used by the *Baseline Interpreter* and then
+branches to that interpreter as though we were running it all along. Building
+this stack frame may use a special side-table saved by *Warp* to reconstruct
+values that are not otherwise available.
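
The speculate-then-bail-out pattern described above can be sketched as a toy model. This is Python under loud assumptions, not engine code: generic_add, specialize_for_ints, the Script class, its THRESHOLD value, and the Bailout exception are all hypothetical names, and the sketch makes no attempt at the stack frame reconstruction that the real bailout performs.

```python
class Bailout(Exception):
    """Raised when specialized code meets data it was not compiled for."""


def generic_add(a, b):
    """Slow path standing in for the Baseline tiers: handles any operands."""
    if isinstance(a, str) or isinstance(b, str):
        return str(a) + str(b)
    return a + b


def specialize_for_ints():
    """Return a fast path that is only valid while both operands are ints."""
    def fast_add(a, b):
        if type(a) is not int or type(b) is not int:  # guard
            raise Bailout
        return a + b
    return fast_add


class Script:
    """Hypothetical script that tiers up after a fixed number of calls."""

    THRESHOLD = 3  # illustrative value, not a real engine threshold

    def __init__(self):
        self.calls = 0
        self.fast = None
        self.bailouts = 0

    def run(self, a, b):
        self.calls += 1
        if self.fast is None and self.calls > self.THRESHOLD:
            self.fast = specialize_for_ints()
        if self.fast is not None:
            try:
                return self.fast(a, b)
            except Bailout:
                self.bailouts += 1
                self.fast = None  # discard the specialized code
        return generic_add(a, b)


script = Script()
results = [script.run(i, 1) for i in range(5)]
results.append(script.run("x", 1))  # unexpected type triggers a bailout
print(results, script.bailouts)
```

The first three calls take the generic path, the next two run the specialized code, and the string operand fails the guard, so that call falls back to the generic path and still produces a result.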
+
+
+🟪 WebAssembly
+***************
+
+In addition to *JavaScript*, the engine is able to execute *WebAssembly*
+(WASM) sources.
+
+WASM-Baseline (RabaldrMonkey)
+-----------------------------
+
+This engine performs fast translation to machine code in order to minimize
+latency to first execution.
+
+WASM-Ion (BaldrMonkey)
+----------------------
+
+This engine translates the WASM input into the same *MIR* form that
+*WarpMonkey* uses and optimizes it with the *IonBackend*. These optimizations
+(and in particular, the register allocation) generate very fast native
+machine code.
+
+
+.. _gc::Cell: https://searchfox.org/mozilla-central/search?q=[SMDOC]+GC+Cell
+.. _JSObject: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JSObject+layout
+.. _JS::Value: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JS%3A%3AValue+type&path=js%2F
+.. _[SMDOC]: https://searchfox.org/mozilla-central/search?q=[SMDOC]&path=js%2F
+.. _[SMDOC] Shapes: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Shapes
+.. _[SMDOC] Bytecode Definitions: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Bytecode+Definitions&path=js%2F
+.. _[SMDOC] JIT Inline Caches: https://searchfox.org/mozilla-central/search?q=[SMDOC]+JIT+Inline+Caches
+.. _Stencil: https://searchfox.org/mozilla-central/search?q=[SMDOC]+Script+Stencil
+.. _Bytecode: https://en.wikipedia.org/wiki/Bytecode
+.. _Abstract Syntax Tree: https://en.wikipedia.org/wiki/Abstract_syntax_tree
+.. _Intermediate Representation: https://en.wikipedia.org/wiki/Intermediate_representation
+.. _MIR Optimizations: ./MIR-optimizations/index.html
--
cgit v1.2.3