From 26a029d407be480d791972afb5975cf62c9360a6 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Fri, 19 Apr 2024 02:47:55 +0200
Subject: Adding upstream version 124.0.1.

Signed-off-by: Daniel Baumann
---
 .../debugging/understanding_crash_reports.rst | 325 +++++++++++++++++++++
 1 file changed, 325 insertions(+)
 create mode 100644 docs/contributing/debugging/understanding_crash_reports.rst

(limited to 'docs/contributing/debugging/understanding_crash_reports.rst')

diff --git a/docs/contributing/debugging/understanding_crash_reports.rst b/docs/contributing/debugging/understanding_crash_reports.rst
new file mode 100644
index 0000000000..9f8e5bbe03
--- /dev/null
+++ b/docs/contributing/debugging/understanding_crash_reports.rst
@@ -0,0 +1,325 @@
+Understanding Crash Reports
+===========================
+
++--------------------------------------------------------------------+
+| This page is an import from MDN and the contents might be outdated |
++--------------------------------------------------------------------+
+
+If a user experiences a crash they will be prompted to submit a raw
+crash report, which is generated by Breakpad. The raw crash report is
+received by `Socorro `__ which
+`creates `__
+a processed crash report. The processed crash report is based on the raw
+crash report but also has a signature, classifications, and a number of
+improved fields (e.g. OS, product, version). Many of the fields in both
+the raw crash report and the processed crash report are viewable and
+searchable on `crash-stats `__.
+Although there are two distinct crash reports, the raw and the
+processed, people typically talk about a single "crash report" because
+crash-stats mostly presents them in a combined way.
+
+Each crash report contains a wealth of data about the crash
+circumstances. Despite this, many crash reports lack sufficient data for
+a developer to understand why the crash occurred.
+As well as providing a general overview, this page aims to highlight
+parts of a crash report that may provide non-obvious insights.
+
+Note that most crash report fields are visible, but a few
+privacy-sensitive parts of the report are only available to users who
+are logged in and have "minidump access". A relatively small number of
+users have minidump access, and they are required to follow certain
+rules. For access, see the `Protected Data Access docs on Crash Stats
+`__.
+
+Each crash report has the following tabs: Details, Metadata, Modules,
+Raw Dump, Extensions, and (optional) Correlations.
+
+Details tab
+-----------
+
+The Details tab is the first place to look because it contains the most
+important pieces of information.
+
+Primary fields
+~~~~~~~~~~~~~~
+
+| The first part of the Details tab shows a table containing the most
+  important crash report fields. It includes such things as when the
+  crash occurred, in which product and version, the crash kind, and
+  various details about the OS and configuration of the machine on which
+  the crash occurred. The following screenshot shows some of these
+  fields.
+| |Example fields in the "Details" tab of a crash report|
+
+All fields have a tool-tip. For many fields, the tool-tip describes the
+field's meaning. For all fields, the tool-tip indicates the key to use
+when you want to do searches involving this field. (The field name is
+usually but not always similar to the search key. E.g. the field
+"Adapter Device ID" has the search key "adapter_device_id".) These
+descriptions are shown in the `SuperSearchFields
+API `__ and can be
+`modified in super_search_fields.py `__
+or by writing up a `bug in Socorro `__.
+
+The fields present in this tab vary depending on the crash kind. Not all
+fields are always present.
+
+The "Signature" field is the main identifier or label for a crash report.
+Rather than considering each crash report in isolation, we want to put
+crash reports into clusters so we can deal with groups of them at once.
+An ideal clustering algorithm would put all crash reports with the same
+root cause into a single cluster, and all crash reports with different
+root causes into different clusters. The crash signature is our
+imperfect but still useful attempt at such an algorithm. Most crash
+signatures are based on the crashing stack trace, but some
+special-purpose annotations are used to indicate particular kinds of
+crashes.
+
+- ``Abort``: A controlled abort, e.g. via ``NS_RUNTIMEABORT``.
+  (Controlled aborts that occur via ``MOZ_CRASH`` or
+  ``MOZ_RELEASE_ASSERT`` currently don't get an ``Abort`` annotation,
+  but they do get a "MOZ_CRASH Reason" field.)
+- ``OOM | <size>``, where ``<size>`` is one of ``large``, ``small``,
+  ``unknown``: an out-of-memory (OOM) abort. The ``<size>`` annotation
+  is determined by the "OOM Allocation Size" field; if that field is
+  missing ``<size>`` will be ``unknown``.
+- ``hang``: a hang prior to shutdown.
+- ``shutdownhang``: a hang during shutdown.
+- ``IPCError-browser``: a problem involving IPC. If the parent Firefox
+  process detects that the child process has sent broken or
+  unprocessable IPDL data, or is not shutting down in a timely manner,
+  it kills the child process with a crash report. These crashes will
+  now have a signature that indicates why the process was killed,
+  rather than the child stack at that moment.
+
+When no special-purpose annotation is present and the signature begins
+with a stack frame, it's usually a vanilla uncontrolled crash. The crash
+cause can be determined from the "Crash Reason" field. Most commonly
+it's a bad memory access. In that case, on Windows you can tell from the
+reason field if the crash occurred while reading, writing or executing
+memory (e.g. ``EXCEPTION_ACCESS_VIOLATION_READ`` indicates a bad memory
+read).
+On Mac and Linux the reason will be SIGSEGV or SIGBUS and you
+cannot tell from this field what kind of memory access it was.
+
+See `this
+file `__
+for a detailed explanation of the crash report signature generation
+procedure, and for information on how to modify this procedure.
+
+There are no fields that uniquely identify the user that a crash report
+came from, but if you want to know if multiple crashes come from a
+single user the "Install Time" field is a good choice. Use it in
+conjunction with other fields that don't change, such as those
+describing the OS or graphics card, for additional confidence.
+
+For bad memory accesses, the "Crash Address" field can give additional
+indications of what went wrong.
+
+- 0x0 is probably a null pointer dereference[*].
+- Small addresses like 0x8 can indicate an object access (e.g.
+  ``this->mFoo``) via a null ``this`` pointer.
+- Addresses like 0xfffffffffd8 might be stack accesses, depending on
+  the platform[*].
+- Addresses like 0x80cdefd3 might be heap accesses, depending on the
+  platform.
+- Addresses may be poisoned: 0xe4 indicates the address comes from
+  memory that has been allocated by jemalloc but not yet initialized;
+  0xe5 indicates the address comes from memory freed by jemalloc. The
+  JS engine also has multiple poison values defined in
+  ``js/src/jsutil.h``.
+
+[*] Note that due to the way addressing works on x86-64, if the crash
+address is 0x0 for a Linux/macOS crash report, or 0xffffffffffffffff for
+a Windows crash report, it's highly likely that the value is incorrect.
+(There is a `bug
+report `__ open
+for this problem.) You can sanity-check these crashes by looking at the
+raw dump or minidump in the Raw Dump tab (see below).
+
+Note that for non-release builds the "Version" field represents multiple
+different builds since nightly and beta version numbers are reused for
+builds created over a series of days until the version number is bumped.
+(The "Build ID" field can disambiguate.)
+It's not currently possible to
+`restrict searches to a given version or
+later `__ (using
+>= with a build ID and a given release channel may work around this).
+
+Some fields, such as "URL" and "Email Address", are privacy-sensitive
+and are only visible to users with minidump access.
+
+The Windows-only "Total Virtual Memory" field indicates if the Firefox
+build and OS are 32-bit or 64-bit.
+
+- A value of 2 GiB indicates 32-bit Firefox on 32-bit Windows.
+- A value of 3 or 4 GiB indicates 32-bit Firefox on 64-bit Windows
+  (a.k.a. "WoW64"). Such a user could switch to 64-bit Firefox.
+- A value much larger than 4 GiB (e.g. 128 TiB) indicates 64-bit
+  Firefox. (The "Build Architecture" field should be "amd64" in this
+  case.)
+
+Some crash reports might contain a memory report. This memory report will
+have been made some time before the crash, at a time when available
+memory was low. In this case, a number of key measurements from the
+memory report are shown in the Details tab, each one having a field name
+starting with "MR:", short for "memory report". The full memory report
+can be obtained in the Raw Dump tab (see below).
+
+Bug-related information
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The second part of the Details tab shows bug-related information, as the
+following screenshot shows.
+
+|Information relating to bug reports in the "Details" tab of a crash
+report|
+
+The "Report this bug in" links can be used to easily file bug reports.
+Each one links to a Bugzilla bug report creation page that has various
+fields pre-filled, such as the crash signature.
+
+The "Related Bugs" section shows related bug reports, as determined by
+the crash signature.
+
+Stack traces
+~~~~~~~~~~~~
+
+The third part of the Details tab shows the stack trace and thread
+number of the crashing thread, as the following screenshot shows.
+
+|Information relating to threads in the "Details" tab of a crash report|
+
+Each stack frame has a link to the source code, when possible.
+If a crash is new, the regressing changeset can often be identified by
+looking for recent changes in the blame annotations for one or more of
+the top stack frames. Blame annotations are also good for identifying
+who might know about the code in question.
+
+Sometimes the highlighted source code is puzzling, e.g. the identified
+line may not touch memory even though the crash is memory-related. This
+can be caused by compiler optimizations. It's often better to look at
+the disassembly (e.g. in a minidump) to understand exactly what code is
+being executed.
+
+Stack frame entries take on a variety of forms.
+
+- The simplest are function names, such as ``NS_InitXPCOM2``.
+- Name/address pairs such as ``nss3.dll@0x1eb720`` are within system
+  libraries.
+- Names such as ``F1398665248_____________________________`` ('F'
+  followed by many digits then many underscores) are in Flash.
+- Addresses such as ``@0xe1a850ac`` may indicate an address that wasn't
+  part of any legitimate code. If an address such as this occurs in the
+  first stack frame, the crash may be
+  `exploitable `__.
+
+Stack traces for other threads can be viewed by clicking on the small
+"Show other threads" link.
+
+If the crash report is for a hang, the crashing thread will be the
+"watchdog" thread, which exists purely to detect hangs; its top stack
+frame will be something
+like\ :literal:`mozilla::`anonymous namespace'::RunWatchdog`. In that
+case you should look at the other threads' stack traces to determine the
+problem; many of them will be waiting on some kind of response, as shown
+by a top stack frame containing a function like
+``NtWaitForSingleObject`` or ``ZwWaitForMultipleObjects``.
+
+Metadata tab
+------------
+
+The Metadata tab is similar to the first part of the Details tab,
+containing a table with various fields. These are the fields from the
+raw crash report, ordered alphabetically by field name, but with
+privacy-sensitive fields shown only to users with minidump access.
+There is some overlap with the fields shown in the Details tab.
+
+Modules tab
+-----------
+
+The Modules tab shows all the system libraries loaded at the time of the
+crash, as the following screenshot shows.
+
+|Table of modules in the "Modules" tab of a crash report|
+
+On Windows these are mostly DLLs, on Mac they are mostly ``.dylib``
+files, and on Linux they are mostly ``.so`` files.
+
+This information is most useful for Windows crashes, because DLLs loaded
+by antivirus software or malware often cause Firefox to crash.
+Correlations between loaded modules and crash signatures can be seen in
+the "Correlations" tab (see below).
+
+`This page `__
+says that files lacking version/debug identifier/debug filename are
+likely to be malware.
+
+Raw Dump tab
+------------
+
+The first part of the Raw Dump tab shows the raw crash report, in JSON
+format. Once again, privacy-sensitive fields are shown only to users
+with minidump access.
+
+|JSON data in the "Raw Dump" tab of a crash report|
+
+For users with minidump access, the second part of the Raw Dump tab has
+some links, as the following screenshot shows.
+
+|Links to downloadable files in the "Raw Dump" tab of a crash report|
+
+These links are to the following items.
+
+#. A minidump. Minidumps can be extremely useful in understanding a
+   crash report; see :ref:`this page ` for an
+   explanation of how to use them.
+#. The aforementioned JSON raw crash report.
+#. The memory report contained within the crash report.
+#. The unredacted crash report, which has additional information.
+
+Extensions tab
+--------------
+
+The Extensions tab shows which extensions are installed and enabled.
+
+|Table of extensions in the "Extensions" tab of a crash report|
+
+Usually it just shows an ID rather than the proper extension name.
+
+Note that several extensions ship by default with Firefox and so will be
+present in almost all crash reports. (The exact set of default
+extensions depends on the release channel.)
+The least obvious of these has an ID of
+``{972ce4c6-7e08-4474-a285-3208198ce6fd}``, which is the
+default Firefox theme. Some (but not all) of the other extensions
+shipped by default have the following IDs: ``webcompat@mozilla.org``,
+``e10srollout@mozilla.org``, ``firefox@getpocket.com``,
+``flyweb@mozilla.org``, ``loop@mozilla.org``.
+
+If an extension only has a hexadecimal identifier, a Google search of
+that identifier is usually enough to identify the extension's name.
+
+This information is useful because some crashes are caused by
+extensions. Correlations between extensions and crash signatures can be
+seen in the "Correlations" tab (see below).
+
+Correlations tab
+----------------
+
+This tab is only shown when crash-stats identifies correlations between
+a crash and modules or extensions that are present, which happens
+occasionally.
+
+See also
+--------
+
+- `A talk about understanding crash
+  reports `__,
+  by David Baron, from March 2016.
+- :ref:`A guide to searching crash reports`
+
+.. |Example fields in the "Details" tab of a crash report| image:: https://mdn.mozillademos.org/files/13579/Details1.png
+.. |Information relating to bug reports in the "Details" tab of a crash report| image:: https://mdn.mozillademos.org/files/13581/Details2.png
+.. |Information relating to threads in the "Details" tab of a crash report| image:: https://mdn.mozillademos.org/files/13583/Details3.png
+.. |Table of modules in the "Modules" tab of a crash report| image:: https://mdn.mozillademos.org/files/13593/Modules1.png
+.. |JSON data in the "Raw Dump" tab of a crash report| image:: https://mdn.mozillademos.org/files/13595/RawDump1.png
+.. |Links to downloadable files in the "Raw Dump" tab of a crash report| image:: https://mdn.mozillademos.org/files/14047/raw-dump-links.png
+.. |Table of extensions in the "Extensions" tab of a crash report| image:: https://mdn.mozillademos.org/files/13599/Extensions1.png
-- 
cgit v1.2.3