==============
Crash Reporter
==============

Overview
========

The **crash reporter** is a subsystem to record and manage application
crash data.

While the subsystem is known as *crash reporter*, it helps to think of
it more as a *process dump manager*. This is because the heart of this
subsystem is really managing process dump files and these files are
created not only from process crashes but also from hangs and other
exceptional events.

The crash reporter subsystem is composed of a number of pieces working
together.

Breakpad
   Breakpad is a library and set of tools that make it easy to collect
   process information (notably dumps from crashes). Breakpad is a
   third-party project (originally developed by Google) that is imported
   into the tree.

Dump files
   Breakpad produces files called *dump files* that hold process data
   (stacks, heap data, etc.).

Crash Reporter Client
   The crash reporter client is a standalone executable that is launched
   to handle dump files. This application optionally submits crashes to
   Mozilla (or the configured server).

Minidump Analyzer
   The minidump analyzer is a standalone executable that is launched by the
   crash reporter client or by the browser itself to extract stack traces from
   the dump files generated during a crash. It appends the stack traces to the
   .extra file associated with the crash dump.

Ping Sender
   The ping sender is a standalone executable that is launched by the crash
   reporter client to deliver a crash ping to our telemetry servers. The ping
   sender is used to speed up delivery of the crash ping which would otherwise
   have to wait for Firefox to be restarted in order to be sent.

How Main-Process Crash Handling Works
=====================================

The crash handler is hooked up very early in the Gecko process lifetime.
It all starts in ``XREMain::XRE_mainInit()`` from ``nsAppRunner.cpp``.
Assuming crash reporting is enabled, this startup function registers an
exception handler for the process and tells the crash reporter subsystem
about basic metadata such as the application name and version.

The registration of the crash reporter exception handler doubles as
initialization of the crash reporter itself. This happens in
``CrashReporter::SetExceptionHandler()`` from ``nsExceptionHandler.cpp``.
The crash reporter figures out what application to use for reporting
dumped crashes and where to store these dump files on disk. The Breakpad
exception handler (really just a mechanism for dumping process state) is
initialized as part of this function. The Breakpad exception handler is
a ``google_breakpad::ExceptionHandler`` instance and it's stored as
``gExceptionHandler``.

As the application runs, various other systems may write *annotations*
or *notes* to the crash reporter to record the state of the application,
hint at possible reasons for a current or future crash, etc. This is done
via the ``CrashReporter::RecordAnnotation*()`` and
``CrashReporter::RegisterAnnotation*()`` functions and
``CrashReporter::AppendAppNotesToCrashReport()`` from
``nsExceptionHandler.h``.

For an application that runs normally, this is all that happens. However,
if a crash or similar exceptional event occurs (such as a hang), we need to
write a crash report.

When an event worthy of writing a dump occurs, the Breakpad exception
handler is invoked and Breakpad writes the dump. When Breakpad has
finished, it calls back into ``CrashReporter::MinidumpCallback()`` from
``nsExceptionHandler.cpp`` to tell the crash reporter about what was
written.

``MinidumpCallback()`` performs a number of actions once a dump has been
written. It writes a file with the time of the crash so other systems can
easily determine the time of the last crash. It supplements the dump
file with an *extra* file containing Mozilla-specific metadata. This data
includes the annotations set via ``CrashReporter::AnnotateCrashReport()``
as well as time since last crash, whether garbage collection was active at
the time of the crash, memory statistics, etc.
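
The "time of last crash" bookkeeping described above can be illustrated with a
small sketch. It is not the code in ``nsExceptionHandler.cpp``: the marker
file and its format (a single decimal Unix timestamp) are assumptions made for
the example, and both function names are hypothetical.

```python
from pathlib import Path
from typing import Optional


def seconds_since_last_crash(marker: Path, now: int) -> Optional[int]:
    """Return the seconds elapsed since the previously recorded crash.

    Returns None when no usable marker file exists. The marker format
    (one decimal Unix timestamp) is an assumption for this sketch.
    """
    try:
        last = int(marker.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None
    # Clamp to zero in case the clock moved backwards between crashes.
    return max(0, now - last)


def record_crash_time(marker: Path, now: int) -> None:
    """Overwrite the marker so the next crash can compute its interval."""
    marker.write_text(str(now))
```

A caller would read the interval first, add it to the crash metadata, and only
then overwrite the marker with the current time.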

If the *crash reporter client* is enabled, ``MinidumpCallback()`` invokes
it. It simply tries to create a new *crash reporter client* process (e.g.
*crashreporter.exe*) with the path to the written minidump file as an
argument.

The *crash reporter client* performs a number of roles. There's a lot going
on, so you may want to look at ``main()`` in ``crashreporter.cpp``.

First, stack traces are extracted from the dump via the *minidump analyzer*
tool. The resulting traces are appended to the .extra file of the crash
together with the SHA256 hash of the minidump file.

Once this is done, a crash ping is assembled holding the same information as
the one generated by the ``CrashManager``, and it is sent to the telemetry
servers via the *ping sender* program. The UUID of the ping is then stored in
the extra file; the ``CrashManager`` will later pick it up and generate a new
ping with the same UUID so that the telemetry server can deduplicate both
pings.

Then, the *crash reporter client* verifies that the dump data is sane. If it
isn't (e.g. required metadata is missing), the dump data is ignored. If the
dump data looks sane, it is moved into the *pending* directory for the
configured data directory (defined via the
``MOZ_CRASHREPORTER_DATA_DIRECTORY`` environment variable or from the UI).

Once this is done, the main crash reporter UI is displayed via
``UIShowCrashUI()``. The crash reporter UI is platform specific: there are
separate versions for Windows, OS X, and various \*NIX presentation flavors
(such as GTK). The basic gist is that a dialog is displayed to the user, and
the user has the opportunity to submit this dump data to a remote server.
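
As an illustration of the hashing step mentioned above, the following sketch
computes the SHA-256 digest of a dump file with Python's standard ``hashlib``
module. The real client is native code, so this only shows the idea; the
function name and chunk size are choices made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 64 * 1024) -> str:
    """Return the hexadecimal SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so a large dump is never held in memory
        # all at once; read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks matters for full memory dumps, which can be far larger than
a regular minidump.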

If a dump is submitted via the crash reporter, the raw dump files are
removed from the *pending* directory and a file containing the
crash ID from the remote server for the submitted dump is created in the
*submitted* directory.
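
The bookkeeping after a successful submission might look roughly like the
sketch below. It assumes that the raw files for a crash are named
``<dump_id>.dmp`` and ``<dump_id>.extra``, and it invents a name and format
for the file written to *submitted* (``<crash_id>.txt`` holding the ID). Both
are illustrative assumptions, not the client's actual layout.

```python
from pathlib import Path


def record_submission(data_dir: Path, dump_id: str, crash_id: str) -> Path:
    """Drop the raw files from pending/ and note the server's crash ID."""
    pending = data_dir / "pending"
    submitted = data_dir / "submitted"
    submitted.mkdir(parents=True, exist_ok=True)

    # Remove the raw dump and its metadata; a missing file is not an error.
    for suffix in (".dmp", ".extra"):
        (pending / f"{dump_id}{suffix}").unlink(missing_ok=True)

    # Hypothetical receipt file: one file per submitted crash, containing
    # the crash ID returned by the remote server.
    receipt = submitted / f"{crash_id}.txt"
    receipt.write_text(crash_id + "\n")
    return receipt
```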

If the user chooses not to submit a dump in the crash reporter UI, the dump
files are deleted.

And that's pretty much what happens when a crash/dump is written!

Plugin and Child Process Crashes
================================

Crashes in plugin and child processes are also managed by the crash
reporting subsystem.

Child process crashes are handled by the ``mozilla::dom::CrashReporterParent``
class defined in ``dom/ipc``. When a child process crashes, the toplevel IPDL
actor should check for it by calling ``TakeMinidump`` in its ``ActorDestroy``
method: see ``mozilla::plugins::PluginModuleParent::ActorDestroy`` and
``mozilla::plugins::PluginModuleParent::ProcessFirstMinidump``. That method
is responsible for calling
``mozilla::dom::CrashReporterParent::GenerateCrashReportForMinidump`` with
appropriate crash annotations specific to the crash. All child-process
crashes are annotated with a ``ProcessType`` annotation, such as "content" or
"plugin".

Once the minidump file has been generated, the
``mozilla::dom::CrashReporterHost`` is notified of the crash. It will first
try to extract the stack traces from the minidump file using the
*minidump analyzer*. Then the stack traces will be stored in the extra file
together with the rest of the crash annotations, and finally the crash will be
recorded by calling ``CrashService.addCrash()``. This last step adds the
crash to the ``CrashManager`` database and automatically sends a crash ping
with information about the crash.

Submission of child process crashes is handled by application code. This
code prompts the user to submit crashes in context-appropriate UI and then
submits the crashes using ``CrashSubmit.sys.mjs``.

Memory Reports
==============

When a process detects that it is running low on memory, a memory report is
saved. If the process crashes, the memory report will be included with the crash
report. ``nsThread::SaveMemoryReportNearOOM()`` checks to see if the process is
low on memory every 30 seconds at most and saves a report every 3 minutes at
most. Since a child process cannot actually save to the hard drive, it instead
notifies its parent process, which saves the report for it. If a crash does
occur, the memory report is moved to the *pending* directory with the other dump
data and an annotation is added to indicate the presence of the report. This
happens in ``nsExceptionHandler.cpp``, but occurs in different functions
depending on what process crashed. When the main process crashes, this happens
in ``MinidumpCallback()``. When a child process crashes, it happens in
``OnChildProcessDumpRequested()``, with the annotation being added in
``WriteExtraData()``.
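
The two rate limits described above (check at most every 30 seconds, save at
most every 3 minutes) can be sketched as follows. This is not a translation of
``nsThread::SaveMemoryReportNearOOM()``; the class, its fields, and the
``is_low_memory`` callback are hypothetical names used only to show how the
two intervals interact.

```python
from dataclasses import dataclass
from typing import Callable, Optional

CHECK_INTERVAL_S = 30.0    # probe for low memory at most this often
REPORT_INTERVAL_S = 180.0  # save a memory report at most this often


@dataclass
class NearOomThrottle:
    last_check: Optional[float] = None
    last_report: Optional[float] = None

    def should_save(self, now: float, is_low_memory: Callable[[], bool]) -> bool:
        """Return True when a memory report should be saved at time `now`."""
        # First limit: skip the (possibly costly) probe if it ran recently.
        if self.last_check is not None and now - self.last_check < CHECK_INTERVAL_S:
            return False
        self.last_check = now
        if not is_low_memory():
            return False
        # Second limit: even when memory is low, space out saved reports.
        if self.last_report is not None and now - self.last_report < REPORT_INTERVAL_S:
            return False
        self.last_report = now
        return True
```

In real use ``now`` would come from a monotonic clock such as
``time.monotonic()``; taking it as a parameter keeps the sketch easy to test.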

Plugin Hangs
============

Plugin hangs are handled as crash reports. If a plugin doesn't respond to an
IPC message after 60 seconds, the plugin IPC code will take minidumps of all
of the processes involved and then kill the plugin.

In this case, there will be only one .extra file with the crash report metadata,
but there will be multiple dump files: at least one for the browser process and
one for the plugin process. All of these files are submitted together as a
unit. Before submission, the filenames of the files are linked:

- **uuid.extra** - annotations, including the ``additional_minidumps``
  annotation, which holds a comma-separated list of the additional minidumps
- **uuid.dmp** - plugin process dump file
- **uuid-<other>.dmp** - other process dump file, as listed in
  ``additional_minidumps``
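
The naming scheme in the list above can be expressed as a short sketch that
expands the ``additional_minidumps`` annotation into the full set of filenames
for one report. The function name is hypothetical, and the value ``browser``
in the usage note is only an example entry.

```python
from typing import List


def files_for_hang_report(uuid: str, additional_minidumps: str) -> List[str]:
    """List the files that make up one hang report.

    `additional_minidumps` is the comma-separated annotation value.
    """
    names = [f"{uuid}.extra", f"{uuid}.dmp"]
    for name in additional_minidumps.split(","):
        name = name.strip()
        if name:  # ignore empty entries, e.g. from an empty annotation
            names.append(f"{uuid}-{name}.dmp")
    return names
```

For example, ``files_for_hang_report("1234", "browser")`` yields
``1234.extra``, ``1234.dmp`` and ``1234-browser.dmp``.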

about:crashes
=============

If the crash reporter subsystem is enabled, the *about:crashes*
page will be registered with the application. This page provides
information about previous and submitted crashes.

It is also possible to submit crashes from *about:crashes*.

Environment variables affecting crash reporting
===============================================

The exception handler and crash reporter client behavior can be altered by
setting certain environment variables. Some of these variables are used for
testing, but quite a few have only internal users.

User-specified environment variables
------------------------------------

- ``MOZ_CRASHREPORTER`` - The opposite of ``MOZ_CRASHREPORTER_DISABLE``: forces
  crash reporting on even if it is disabled in application.ini. You must use
  this to enable crash reporting on debug builds.
- ``MOZ_CRASHREPORTER_DISABLE`` - Disable Breakpad crash reporting completely
  in non-debug builds. You can use this if you would rather use the JIT
  debugger on Windows with the symbol server, for example.
- ``MOZ_CRASHREPORTER_FULLDUMP`` - Store full application memory in the
  minidump, so you can open it in a Microsoft debugger. Don't submit it to the
  server. (Windows only.)
- ``MOZ_CRASHREPORTER_NO_DELETE_DUMP`` - Don't delete the crash report dump
  file after submitting it to the server. Minidumps will still be moved to the
  "Crash Reports/pending" directory.
- ``MOZ_CRASHREPORTER_NO_REPORT`` - Save the minidump file but don't launch the
  crash reporting UI or send the report to the server. Minidumps will be stored
  in the user's profile directory, in a subdirectory named "minidumps".
- ``MOZ_CRASHREPORTER_SHUTDOWN`` - Save the minidump and then force the
  application to close. This is useful for content crashes that don't normally
  close the chrome (main application) processes. Setting this variable causes
  the application to close as well.
- ``MOZ_CRASHREPORTER_URL`` - Sets the URL that the crash reporter will submit
  reports to.
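
As a sketch of how a test harness might combine the variables above, the
following helper builds an environment that forces crash reporting on, keeps
minidumps on disk, and suppresses the UI and submission. The helper name is
hypothetical, and the value ``"1"`` is an assumption: the descriptions above
do not say which value each variable expects.

```python
import os
from typing import Dict, Mapping, Optional


def crash_test_environment(
    base: Optional[Mapping[str, str]] = None,
    server_url: Optional[str] = None,
) -> Dict[str, str]:
    """Return a copy of `base` (default: os.environ) set up for crash tests."""
    env = dict(os.environ if base is None else base)
    # Force crash reporting on (needed for debug builds) and make sure the
    # opposite switch is not inherited from the caller's environment.
    env["MOZ_CRASHREPORTER"] = "1"  # assumed value
    env.pop("MOZ_CRASHREPORTER_DISABLE", None)
    # Keep the minidump but do not show the UI or contact the server.
    env["MOZ_CRASHREPORTER_NO_REPORT"] = "1"  # assumed value
    if server_url is not None:
        env["MOZ_CRASHREPORTER_URL"] = server_url
    return env
```

The resulting mapping can be passed as the ``env`` argument of
``subprocess.run()`` when launching the application under test.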

Environment variables used internally
-------------------------------------

- ``MOZ_CRASHREPORTER_AUTO_SUBMIT`` - When set, causes the crash reporter client
  to skip the UI flow and submit the crash report directly.
- ``MOZ_CRASHREPORTER_DATA_DIRECTORY`` - Platform-dependent data directory. The
  pending crash reports will be stored in a subdirectory of this path. This
  overrides the default one generated by the client's code.
- ``MOZ_CRASHREPORTER_DUMP_ALL_THREADS`` - When set to 1, stack traces for
  all threads are generated and sent in the crash ping. When not set, only the
  trace for the crashing thread is generated.
- ``MOZ_CRASHREPORTER_EVENTS_DIRECTORY`` - Path of the directory holding the
  crash event files.
- ``MOZ_CRASHREPORTER_PING_DIRECTORY`` - Path of the directory holding the
  pending crash ping files.
- ``MOZ_CRASHREPORTER_RESTART_ARG_<n>`` - Each of these variables specifies one
  of the arguments that were passed to the application, starting with the
  first after the executable. The crash reporter client uses them to restart
  the application.
- ``MOZ_CRASHREPORTER_RESTART_XUL_APP_FILE`` - If a XUL app file was specified
  when starting the app, it has to be stored in this variable so that the crash
  reporter client can restart the application.
- ``MOZ_CRASHREPORTER_STRINGS_OVERRIDE`` - Overrides the path used to load the
  .ini file holding the strings used in the crash reporter client UI.
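
To illustrate how the numbered ``MOZ_CRASHREPORTER_RESTART_ARG_<n>`` variables
can be turned back into an argument list, here is a minimal sketch. It makes
no assumption about the starting index and simply orders the values by their
numeric suffix; the function name is hypothetical and this is not the crash
reporter client's own code.

```python
import re
from typing import List, Mapping, Tuple

_RESTART_ARG = re.compile(r"^MOZ_CRASHREPORTER_RESTART_ARG_(\d+)$")


def restart_arguments(env: Mapping[str, str]) -> List[str]:
    """Collect the restart arguments from `env`, ordered by their index."""
    numbered: List[Tuple[int, str]] = []
    for key, value in env.items():
        match = _RESTART_ARG.match(key)
        if match:
            numbered.append((int(match.group(1)), value))
    # Sort on the integer index so that _10 comes after _9, which a plain
    # string sort of the variable names would get wrong.
    numbered.sort()
    return [value for _, value in numbered]
```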

Other topics
============

.. toctree::
   :titlesonly:

   Using_the_Mozilla_symbol_server