path: root/toolkit/crashreporter/docs/index.rst
author Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-21 11:44:51 +0000
committer Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-21 11:44:51 +0000
commit 9e3c08db40b8916968b9f30096c7be3f00ce9647
tree a68f146d7fa01f0134297619fbe7e33db084e0aa /toolkit/crashreporter/docs/index.rst
parent Initial commit.
Adding upstream version 1:115.7.0.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'toolkit/crashreporter/docs/index.rst')
-rw-r--r-- toolkit/crashreporter/docs/index.rst 265
1 files changed, 265 insertions, 0 deletions
diff --git a/toolkit/crashreporter/docs/index.rst b/toolkit/crashreporter/docs/index.rst
new file mode 100644
index 0000000000..57b4c3a5e5
--- /dev/null
+++ b/toolkit/crashreporter/docs/index.rst
@@ -0,0 +1,265 @@
+==============
+Crash Reporter
+==============
+
+Overview
+========
+
+The **crash reporter** is a subsystem to record and manage application
+crash data.
+
+While the subsystem is known as *crash reporter*, it helps to think of
+it more as a *process dump manager*. This is because the heart of this
+subsystem is really managing process dump files and these files are
+created not only from process crashes but also from hangs and other
+exceptional events.
+
+The crash reporter subsystem is composed of a number of pieces working
+together.
+
+Breakpad
+ Breakpad is a library and set of tools to make collecting process
+ information (notably dumps from crashes) easy. Breakpad is a third-party
+ project (originally developed by Google) that is imported into
+ the tree.
+
+Dump files
+ Breakpad produces files called *dump files* that hold process data
+ (stacks, heap data, etc.).
+
+Crash Reporter Client
+ The crash reporter client is a standalone executable that is launched
+ to handle dump files. This application optionally submits crashes to
+ Mozilla (or the configured server).
+
+Minidump Analyzer
+ The minidump analyzer is a standalone executable that is launched by the
+ crash reporter client or by the browser itself to extract stack traces from
+ the dump files generated during a crash. It appends the stack traces to the
+ .extra file associated with the crash dump.
+
+Ping Sender
+ The ping sender is a standalone executable that is launched by the crash
+ reporter client to deliver a crash ping to our telemetry servers. The ping
+ sender is used to speed up delivery of the crash ping which would otherwise
+ have to wait for Firefox to be restarted in order to be sent.
+
+How Main-Process Crash Handling Works
+=====================================
+
+The crash handler is hooked up very early in the Gecko process lifetime.
+It all starts in ``XREMain::XRE_mainInit()`` from ``nsAppRunner.cpp``.
+Assuming crash reporting is enabled, this startup function registers an
+exception handler for the process and tells the crash reporter subsystem
+about basic metadata such as the application name and version.
+
+The registration of the crash reporter exception handler doubles as
+initialization of the crash reporter itself. This happens in
+``CrashReporter::SetExceptionHandler()`` from ``nsExceptionHandler.cpp``.
+The crash reporter figures out what application to use for reporting
+dumped crashes and where to store these dump files on disk. The Breakpad
+exception handler (really just a mechanism for dumping process state) is
+initialized as part of this function. The Breakpad exception handler is
+a ``google_breakpad::ExceptionHandler`` instance and it's stored as
+``gExceptionHandler``.
+
+As the application runs, various other systems may write *annotations*
+or *notes* to the crash reporter to indicate state of the application,
+help with possible reasons for a current or future crash, etc. These are
+performed via ``CrashReporter::AnnotateCrashReport()`` and
+``CrashReporter::AppendAppNotesToCrashReport()`` from
+``nsExceptionHandler.h``.
+
+For well-behaved applications, this is all that happens. However, if a
+crash or similar exceptional event occurs (such as a hang), we need to
+write a crash report.
+
+When an event worthy of writing a dump occurs, the Breakpad exception
+handler is invoked and Breakpad does its thing. When Breakpad has
+finished, it calls back into ``CrashReporter::MinidumpCallback()`` from
+``nsExceptionHandler.cpp`` to tell the crash reporter about what was
+written.
+
+``MinidumpCallback()`` performs a number of actions once a dump has been
+written. It writes a file with the time of the crash so other systems can
+easily determine the time of the last crash. It supplements the dump
+file with an *extra* file containing Mozilla-specific metadata. This data
+includes the annotations set via ``CrashReporter::AnnotateCrashReport()``
+as well as time since last crash, whether garbage collection was active at
+the time of the crash, memory statistics, etc.
+
+If the *crash reporter client* is enabled, ``MinidumpCallback()`` invokes
+it. It simply tries to create a new *crash reporter client* process (e.g.
+*crashreporter.exe*) with the path to the written minidump file as an
+argument.
+
+The *crash reporter client* performs a number of roles. There's a lot going
+on, so you may want to look at ``main()`` in ``crashreporter.cpp``. First,
+stack traces are extracted from the dump via the *minidump analyzer* tool.
+The resulting traces are appended to the .extra file of the crash together with
+the SHA256 hash of the minidump file. Once this is done, a crash ping is
+assembled holding the same information as the one generated by the
+``CrashManager``, and it is sent to the telemetry servers via the
+*ping sender* program. The UUID of the ping is then stored in the extra
+file; the ``CrashManager`` will later pick it up and generate a new ping
+with the same UUID so that the telemetry server can deduplicate both pings.
+
+Then, the *crash reporter client* verifies that the dump data is sane. If it
+isn't (e.g. required metadata is missing), the dump data is ignored. If the
+dump data looks sane, it is moved into the *pending* directory for the
+configured data directory (defined via the
+``MOZ_CRASHREPORTER_DATA_DIRECTORY`` environment variable or from the UI).
+Once this is done, the main crash reporter UI is displayed
+via ``UIShowCrashUI()``. The crash reporter UI is platform specific: there
+are separate versions for Windows, OS X, and various \*NIX presentation
+flavors (such as GTK). The basic gist is that a dialog is displayed to the
+user and the user has the opportunity to submit this dump data to a remote
+server.
+
+If a dump is submitted via the crash reporter, the raw dump files are
+removed from the *pending* directory and a file containing the
+crash ID from the remote server for the submitted dump is created in the
+*submitted* directory.
+
+If the user chooses not to submit a dump in the crash reporter UI, the dump
+files are deleted.
+
+And that's pretty much what happens when a crash/dump is written!
+
+Plugin and Child Process Crashes
+================================
+
+Crashes in plugin and child processes are also managed by the crash
+reporting subsystem.
+
+Child process crashes are handled by the ``mozilla::dom::CrashReporterParent``
+class defined in ``dom/ipc``. When a child process crashes, the top-level IPDL
+actor should check for it by calling ``TakeMinidump`` in its ``ActorDestroy``
+method: see ``mozilla::plugins::PluginModuleParent::ActorDestroy`` and
+``mozilla::plugins::PluginModuleParent::ProcessFirstMinidump``. That method
+is responsible for calling
+``mozilla::dom::CrashReporterParent::GenerateCrashReportForMinidump`` with
+appropriate crash annotations specific to the crash. All child-process
+crashes are annotated with a ``ProcessType`` annotation, such as "content" or
+"plugin".
+
+Once the minidump file has been generated, the
+``mozilla::dom::CrashReporterHost`` is notified of the crash. It will first
+try to extract the stack traces from the minidump file using the
+*minidump analyzer*. Then the stack traces will be stored in the extra file
+together with the rest of the crash annotations, and finally the crash will be
+recorded by calling ``CrashService.addCrash()``. This last step adds the
+crash to the ``CrashManager`` database and automatically sends a crash ping
+with information about the crash.
+
+Submission of child process crashes is handled by application code. This
+code prompts the user to submit crashes in context-appropriate UI and then
+submits the crashes using ``CrashSubmit.jsm``.
+
+Memory Reports
+==============
+
+When a process detects that it is running low on memory, a memory report is
+saved. If the process crashes, the memory report will be included with the crash
+report. ``nsThread::SaveMemoryReportNearOOM()`` checks to see if the process is
+low on memory every 30 seconds at most and saves a report every 3 minutes at
+most. Since a child process cannot actually save to the hard drive, it instead
+notifies its parent process, which saves the report for it. If a crash does
+occur, the memory report is moved to the *pending* directory with the other dump
+data and an annotation is added to indicate the presence of the report. This
+happens in ``nsExceptionHandler.cpp``, but occurs in different functions
+depending on what process crashed. When the main process crashes, this happens
+in ``MinidumpCallback()``. When a child process crashes, it happens in
+``OnChildProcessDumpRequested()``, with the annotation being added in
+``WriteExtraData()``.
+
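The two rate limits described above (a low-memory check at most every 30 seconds, a saved report at most every 3 minutes) can be illustrated with a small sketch. This is not the C++ code in ``nsThread::SaveMemoryReportNearOOM()``; the class and method names below are hypothetical, and the caller is assumed to supply the current time and the result of the low-memory check.

```python
class NearOomThrottle:
    """Hypothetical sketch: rate-limits low-memory checks and report saves."""

    def __init__(self, check_interval=30.0, save_interval=180.0):
        # Intervals in seconds, taken from the text above: check at most
        # every 30 seconds, save at most every 3 minutes.
        self.check_interval = check_interval
        self.save_interval = save_interval
        self.last_check = None
        self.last_save = None

    def should_check(self, now):
        """Return True if enough time has passed to check memory again."""
        if self.last_check is not None and now - self.last_check < self.check_interval:
            return False
        self.last_check = now
        return True

    def should_save(self, now, low_on_memory):
        """Return True if a memory report should be saved at time `now`."""
        if not low_on_memory:
            return False
        if self.last_save is not None and now - self.last_save < self.save_interval:
            return False
        self.last_save = now
        return True
```

A caller would ask ``should_check()`` first and only consult ``should_save()`` when the check reports low memory; keeping the two timers separate means a burst of low-memory checks cannot produce more than one report per save interval.
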
+Plugin Hangs
+============
+
+Plugin hangs are handled as crash reports. If a plugin doesn't respond to an
+IPC message after 60 seconds, the plugin IPC code will take minidumps of all
+of the processes involved and then kill the plugin.
+
+In this case, there will be only one .extra file with the crash report metadata,
+but there will be multiple dump files: at least one for the browser process and
+one for the plugin process. All of these files are submitted together as a
+unit. Before submission, the filenames of the files are linked:
+
+- **uuid.extra** - *annotations, including the `additional_minidumps` annotation
+ holding a comma-separated list of the additional minidumps*
+- **uuid.dmp** - *plugin process dump file*
+- **uuid-<other>.dmp** - *other process dump file as listed in
+ additional_minidumps*
+
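As a rough illustration of the naming scheme in the list above, the following sketch derives the expected file names from a crash UUID and the value of the ``additional_minidumps`` annotation. The function name is hypothetical, and the example value ``browser`` is only an assumed name for an additional minidump.

```python
def hang_report_files(uuid, additional_minidumps):
    """Return the file names that make up one plugin hang report.

    `additional_minidumps` is the comma-separated annotation value
    described in the list above.
    """
    # Split the annotation and ignore empty entries and stray whitespace.
    names = [n.strip() for n in additional_minidumps.split(",") if n.strip()]
    files = [f"{uuid}.extra", f"{uuid}.dmp"]
    files += [f"{uuid}-{name}.dmp" for name in names]
    return files


# "browser" is an assumed example value, not taken from the source.
print(hang_report_files("1234abcd", "browser"))
```
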
+about:crashes
+=============
+
+If the crash reporter subsystem is enabled, the *about:crashes*
+page will be registered with the application. This page provides
+information about previous and submitted crashes.
+
+It is also possible to submit crashes from *about:crashes*.
+
+Environment variables affecting crash reporting
+===============================================
+
+The exception handler and crash reporter client behavior can be altered by
+setting certain environment variables. Some of these variables are used for
+testing, but quite a few have only internal users.
+
+User-specified environment variables
+------------------------------------
+
+- ``MOZ_CRASHREPORTER`` - The opposite of ``MOZ_CRASHREPORTER_DISABLE``: forces
+ crash reporting on even if it is disabled in application.ini. You must use
+ this to enable crash reporting on debug builds.
+- ``MOZ_CRASHREPORTER_DISABLE`` - Disable Breakpad crash reporting completely
+ in non-debug builds. You can use this if you would rather use the JIT
+ debugger on Windows with the symbol server, for example.
+- ``MOZ_CRASHREPORTER_FULLDUMP`` - Store full application memory in the
+ minidump, so you can open it in a Microsoft debugger. Don't submit it to the
+ server. (Windows only.)
+- ``MOZ_CRASHREPORTER_NO_DELETE_DUMP`` - Don't delete the crash report dump
+ file after submitting it to the server. Minidumps will still be moved to the
+ "Crash Reports/pending" directory.
+- ``MOZ_CRASHREPORTER_NO_REPORT`` - Save the minidump file but don't launch the
+ crash reporting UI or send the report to the server. Minidumps will be stored
+ in the user's profile directory, in a subdirectory named "minidumps".
+- ``MOZ_CRASHREPORTER_SHUTDOWN`` - Save the minidump and then force the
+ application to close. This is useful for content crashes that don't normally
+ close the chrome (main application) processes. Setting this variable causes
+ the application to close as well.
+- ``MOZ_CRASHREPORTER_URL`` - Sets the URL that the crash reporter will submit
+ reports to.
+
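One way to use the user-specified variables when testing is to build an environment for the launched application. The sketch below is only an illustration: the helper name is hypothetical, and the value ``"1"`` is an assumption, since the text above does not say which values the variables expect.

```python
import os


def crash_test_env(base=None, no_report=True, url=None):
    """Build an environment dict that forces crash reporting on.

    Hypothetical helper; the value "1" is an assumed way of "setting"
    a variable and is not specified by the documentation above.
    """
    env = dict(os.environ if base is None else base)
    # Force crash reporting on and make sure the opposite switch is absent.
    env["MOZ_CRASHREPORTER"] = "1"
    env.pop("MOZ_CRASHREPORTER_DISABLE", None)
    if no_report:
        # Keep the minidump but skip the UI and the submission.
        env["MOZ_CRASHREPORTER_NO_REPORT"] = "1"
    if url is not None:
        # Submit reports to a custom server instead of the default one.
        env["MOZ_CRASHREPORTER_URL"] = url
    return env
```

The resulting dictionary can be passed as the ``env`` argument of ``subprocess.run()`` when starting the application binary.
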
+Environment variables used internally
+-------------------------------------
+
+- ``MOZ_CRASHREPORTER_AUTO_SUBMIT`` - When set, causes the crash reporter client
+ to skip the UI flow and submit the crash report directly.
+- ``MOZ_CRASHREPORTER_DATA_DIRECTORY`` - Platform-dependent data directory. The
+ pending crash reports will be stored in a subdirectory of this path. This
+ overrides the default one generated by the client's code.
+- ``MOZ_CRASHREPORTER_DUMP_ALL_THREADS`` - When set to 1, stack traces for
+ all threads are generated and sent in the crash ping. When not set, only the
+ trace for the crashing thread is generated.
+- ``MOZ_CRASHREPORTER_EVENTS_DIRECTORY`` - Path of the directory holding the
+ crash event files.
+- ``MOZ_CRASHREPORTER_PING_DIRECTORY`` - Path of the directory holding the
+ pending crash ping files.
+- ``MOZ_CRASHREPORTER_RESTART_ARG_<n>`` - Each of these variables specifies one
+ of the arguments that had been passed to the application; the crash reporter
+ client uses them to restart it.
+- ``MOZ_CRASHREPORTER_RESTART_XUL_APP_FILE`` - If a XUL app file was specified
+ when starting the app, it has to be stored in this variable so that the crash
+ reporter client can restart the application.
+- ``MOZ_CRASHREPORTER_STRINGS_OVERRIDE`` - Overrides the path used to load the
+ .ini file holding the strings used in the crash reporter client UI.
+
+Other topics
+============
+
+.. toctree::
+ :titlesonly:
+
+ Using_the_Mozilla_symbol_server