author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-07 17:32:43 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-07 17:32:43 +0000 |
commit | 6bf0a5cb5034a7e684dcc3500e841785237ce2dd (patch) | |
tree | a68f146d7fa01f0134297619fbe7e33db084e0aa /js/src/doc/HazardAnalysis/index.md | |
parent | Initial commit. (diff) | |
Adding upstream version 1:115.7.0. (refs: upstream/1%115.7.0, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'js/src/doc/HazardAnalysis/index.md')
-rw-r--r-- | js/src/doc/HazardAnalysis/index.md | 98 |
1 files changed, 98 insertions, 0 deletions
diff --git a/js/src/doc/HazardAnalysis/index.md b/js/src/doc/HazardAnalysis/index.md
new file mode 100644
index 0000000000..3a28658ed5
--- /dev/null
+++ b/js/src/doc/HazardAnalysis/index.md
@@ -0,0 +1,98 @@
+# Static Analysis for Rooting and Heap Write Hazards
+
+Treeherder can run two static analysis builds: the full browser (`linux64-haz`) and just the JS shell (`linux64-shell-haz`). They show up on Treeherder as `H` and `SM(H)`.
+
+## Diagnosing a hazard failure
+
+The first step is to look at what sort of hazard is being reported. There are two types that cause the job to fail: stack rooting hazards for garbage collection, and heap write thread safety hazards for stylo.
+
+The summary output will include either the string `<N> rooting hazards detected` or `<N> heap write hazards detected out of <M> allowed`. See the appropriate section below for each.
+
+## Diagnosing a rooting hazards failure
+
+Click on the `H` build link, select the "Artifacts" pane on the bottom left, and download the `public/build/hazards.txt.gz` file.
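Once the file is downloaded, a few lines of Python can list the hazards it contains. This is a minimal sketch, not part of the analysis tooling: the regular expression assumes hazard lines have the shape shown in the example snippet in this section, and `parse_hazard_line` and `read_hazards` are hypothetical helper names.

```python
import gzip
import re

# Assumed shape of a rooting hazard line, taken from the example snippet in
# this document; the real report format may differ between versions.
HAZARD_RE = re.compile(
    r"Function '(?P<function>.+)' has unrooted '(?P<variable>[^']+)' "
    r"of type '(?P<type>[^']+)' live across GC call '(?P<gc_call>.+)' "
    r"at (?P<location>\S+:\d+)"
)


def parse_hazard_line(line):
    """Return the fields of a hazard line as a dict, or None for any other line."""
    match = HAZARD_RE.match(line.strip())
    return match.groupdict() if match else None


def read_hazards(path):
    """Read a gzip-compressed hazards report and return one dict per hazard line."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as report:
        parsed = (parse_hazard_line(line) for line in report)
        return [hazard for hazard in parsed if hazard is not None]
```

Calling `read_hazards("hazards.txt.gz")` on the downloaded artifact would return one dictionary per matching hazard line, with the keys `function`, `variable`, `type`, `gc_call`, and `location`. Lines in any other shape, such as the data flow and call stack lines, are skipped.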
+
+Example snippet:
+
+    Function 'jsopcode.cpp:uint8 DecompileExpressionFromStack(JSContext*, int32, int32, class JS::Handle<JS::Value>, int8**)' has unrooted 'ed' of type 'ExpressionDecompiler' live across GC call 'uint8 ExpressionDecompiler::decompilePC(uint8*)' at js/src/jsopcode.cpp:1866
+        js/src/jsopcode.cpp:1866: Assume(74,75, !__temp_23*, true)
+        js/src/jsopcode.cpp:1867: Assign(75,76, return := 0)
+        js/src/jsopcode.cpp:1867: Call(76,77, ed.~ExpressionDecompiler())
+    GC Function: uint8 ExpressionDecompiler::decompilePC(uint8*)
+        JSString* js::ValueToSource(JSContext*, class JS::Handle<JS::Value>)
+        uint8 js::Invoke(JSContext*, JS::Value*, JS::Value*, uint32, JS::Value*, class JS::MutableHandle<JS::Value>)
+        uint8 js::Invoke(JSContext*, JS::CallArgs, uint32)
+        JSScript* JSFunction::getOrCreateScript(JSContext*)
+        uint8 JSFunction::createScriptForLazilyInterpretedFunction(JSContext*, class JS::Handle<JSFunction*>)
+        uint8 JSRuntime::cloneSelfHostedFunctionScript(JSContext*, class JS::Handle<js::PropertyName*>, class JS::Handle<JSFunction*>)
+        JSScript* js::CloneScript(JSContext*, class JS::Handle<JSObject*>, class JS::Handle<JSFunction*>, const class JS::Handle<JSScript*>, uint32)
+        JSObject* js::CloneStaticBlockObject(JSContext*, class JS::Handle<JSObject*>, class JS::Handle<js::StaticBlockObject*>)
+        js::StaticBlockObject* js::StaticBlockObject::create(js::ExclusiveContext*)
+        js::Shape* js::EmptyShape::getInitialShape(js::ExclusiveContext*, js::Class*, js::TaggedProto, JSObject*, JSObject*, uint32, uint32)
+        js::Shape* js::EmptyShape::getInitialShape(js::ExclusiveContext*, js::Class*, js::TaggedProto, JSObject*, JSObject*, uint64, uint32)
+        js::UnownedBaseShape* js::BaseShape::getUnowned(js::ExclusiveContext*, js::StackBaseShape*)
+        js::BaseShape* js_NewGCBaseShape(js::ThreadSafeContext*) [with js::AllowGC allowGC = (js::AllowGC)1u]
+        js::BaseShape* js::gc::NewGCThing(js::ThreadSafeContext*, uint32, uint64, uint32) [with T = js::BaseShape; js::AllowGC allowGC = (js::AllowGC)1u; size_t = long unsigned int]
+        void js::gc::RunDebugGC(JSContext*)
+        void js::MinorGC(JSRuntime*, uint32)
+        GC
+
+This means that a rooting hazard was discovered at `js/src/jsopcode.cpp` line 1866, in the function `DecompileExpressionFromStack` (it is prefixed with the filename because it's a static function). The problem is that there is an unrooted variable `ed` that holds an `ExpressionDecompiler` live across a call to `decompilePC`. "Live" means that the variable is used after the call to `decompilePC` returns. `decompilePC` may trigger a GC according to the static call stack given starting from the line beginning with "`GC Function:`".
+
+The hazard itself has some barely comprehensible `Assume(...)` and `Call(...)` gibberish that describes the exact data flow path of the variable into the function call. That stuff is rarely useful -- usually, you'll only need to look at it if it's complaining about a temporary and you want to know where the temporary came from. The type `ExpressionDecompiler` is believed to hold pointers to GC-controlled objects of some sort. The analysis currently does not describe the exact field it is worried about.
+
+To unpack this a little, the analysis is saying the following can happen:
+
+* `ExpressionDecompiler` contains some pointer to a GC thing. For example, it might have a field `obj` of type `JSObject*`.
+* `DecompileExpressionFromStack` is called.
+* A pointer is stored in that field of the `ed` variable.
+* `decompilePC` is invoked, which calls `ValueToSource`, which calls `Invoke`, which eventually calls `js::MinorGC`.
+* During the resulting garbage collection, the object pointed to by `ed.obj` is moved to a different location. All pointers stored in the JS heap are updated automatically, as are all rooted pointers. `ed.obj` is not, because the GC doesn't know about it.
+* After `decompilePC` returns, something accesses `ed.obj`. This is now a stale pointer, and may refer to just about anything -- the wrong object, an invalid object, or whatever. As TeX would say, **badness 10000**.
+
+## Diagnosing a heap write hazard failure
+
+For the thread unsafe heap write analysis, a hazard means that some `Gecko_*` function calls, directly or indirectly, code that writes to something on the heap, or calls an unknown function that *might* write to something on the heap. The analysis requires quite a few annotations to describe things that are actually safe. This section will be expanded as we gain more experience with the analysis, but here are some common issues:
+
+* Adding a new `Gecko_*` function: often, you will need to annotate any outparams or owned (thread-local) parameters in the `treatAsSafeArgument` function in `js/src/devtools/rootAnalysis/analyzeHeapWrites.js`.
+* Calling some libc function: if you add a call to some random libc function (e.g. `sin()` or `floor()` or `ceil()`, though the latter two are already annotated), the analysis will report an "External Function". Add it to `checkExternalFunction`, assuming it *doesn't* have the possibility of writing to shared heap memory.
+* If you call some non-returning (crashing) function that the analysis doesn't know about, you'll need to add it to `ignoreContents`.
+
+On the other hand, you might have a real thread safety issue on your hands. Shared caches are common problems. Fix it.
+
+## Analysis implementation
+
+These builds do the following:
+
+* set up a build environment and run the analysis within it, then upload the resulting files
+  * compile an optimized JS shell to later run the analysis
+  * compile the browser with gcc, using a slightly modified version of the sixgill (http://svn.sixgill.org) gcc plugin
+* produce a set of `.xdb` files describing everything encountered during the compilation
+* analyze the `.xdb` files with scripts in `js/src/devtools/rootAnalysis`
+
+The format of the information stored in those files is [somewhat documented][CFG].
+
+## Running the analysis
+
+### Pushing to try
+
+The easiest way to run an analysis is to push to try with `mach try fuzzy -q "'haz"` (or, if the hazards of interest are contained entirely within `js/src`, use `mach try fuzzy -q "'shell-haz"` for a much faster result). The expected turnaround time for `linux64-haz` is just under 1.5 hours (~20 minutes for `hazard-linux64-shell-haz`).
+
+The output will be uploaded and an output file `hazards.txt.xz` will be placed into the "Artifacts" info pane on Treeherder.
+
+### Running locally
+
+The rooting [hazard analysis may be run][running] using mach.
+
+## So you broke the analysis by adding a hazard. Now what?
+
+Back out the change, fix the hazard, or (as a last resort) update the expected number of hazards in `js/src/devtools/rootAnalysis/expect.browser.json` (but don't do that).
+
+The most common way to fix a hazard is to change the variable to be a `Rooted` type, as described in [RootingAPI.h][rooting].
+
+For more complicated cases, ask on the Matrix channel (see [spidermonkey.dev][spidermonkey] for contact info). If you don't get a response, ping sfink or jonco for rooting hazards, bholley or sfink for heap write hazards.
+
+[running]: running.md
+[rooting]: https://searchfox.org/mozilla-central/source/js/public/RootingAPI.h
+[spidermonkey]: https://spidermonkey.dev/
+[CFG]: CFG.md
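
Why a `Rooted` type fixes a rooting hazard can be illustrated with a toy model. The sketch below is plain Python, not SpiderMonkey code: `ToyHeap`, its `Rooted` class, and `demo` are hypothetical stand-ins that only mimic the idea that a moving collector updates the pointers it knows about and nothing else.

```python
class ToyHeap:
    """Toy moving heap: a 'pointer' is just an integer address into `slots`."""

    def __init__(self):
        self.slots = {}     # address -> payload
        self.next_addr = 0  # addresses are never reused, so stale ones are detectable
        self.roots = []     # Rooted handles registered with the collector

    def alloc(self, payload):
        addr = self.next_addr
        self.next_addr += 1
        self.slots[addr] = payload
        return addr

    def minor_gc(self):
        """Move every rooted object to a new address and discard everything else."""
        forwarded = {}  # old address -> new address
        survivors = {}
        for root in self.roots:
            if root.addr not in forwarded:
                forwarded[root.addr] = self.next_addr
                survivors[self.next_addr] = self.slots[root.addr]
                self.next_addr += 1
            root.addr = forwarded[root.addr]  # the collector fixes up rooted pointers
        self.slots = survivors

    def read(self, addr):
        # A stale address raises KeyError here; a real engine would silently
        # read whatever now occupies that memory.
        return self.slots[addr]


class Rooted:
    """Holds an address and registers itself so the collector can update it."""

    def __init__(self, heap, addr):
        self.heap = heap
        self.addr = addr
        heap.roots.append(self)

    def release(self):
        self.heap.roots.remove(self)


def demo():
    heap = ToyHeap()
    rooted = Rooted(heap, heap.alloc("some object"))
    raw = rooted.addr   # unrooted copy of the pointer, like a raw field in `ed`
    heap.minor_gc()     # stands in for the call that may trigger a GC
    try:
        heap.read(raw)
        raw_is_stale = False
    except KeyError:
        raw_is_stale = True
    return heap.read(rooted.addr), raw_is_stale
```

In this model the rooted handle still reaches the object after the collection, while the raw copy of the address no longer does. That mirrors the hazard the analysis reports: the unrooted value is the one the collector could not update.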