summaryrefslogtreecommitdiffstats
path: root/docs/performance/memory/dmd.md
blob: ebd6b5a2f873f3f51fcd9a34ee950dc8bb143cd5 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
# Dark Matter Detector (DMD)

DMD (short for "dark matter detector") is a heap profiler within
Firefox. It has four modes.

-   "Dark Matter" mode. In this mode, DMD tracks the contents of the
    heap, including which heap blocks have been reported by memory
    reporters. It helps us reduce the "heap-unclassified" value in
    Firefox's about:memory page, and also detects if any heap blocks
    are reported twice. Originally, this was the only mode that DMD had,
    which explains DMD's name. This is the default mode.
-   "Live" mode. In this mode, DMD tracks the current contents of the
    heap. You can dump that information to file, giving a profile of the
    live heap blocks at that point in time. This is good for
    understanding how memory is used at an interesting point in time,
    such as peak memory usage.
-   "Cumulative" mode. In this mode, DMD tracks both the past and
    current contents of the heap. You can dump that information to file,
    giving a profile of the heap usage for the entire session. This is
    good for finding parts of the code that cause high heap churn, e.g.
    by allocating many short-lived allocations.
-   "Heap scanning" mode. This mode is like live mode, but it also
    records the contents of every live block in the log. This can be
    used to investigate leaks by figuring out which objects might be
    holding references to other objects.

## Building and Running

### Nightly Firefox

The easiest way to use DMD is with the normal Nightly Firefox build,
which has DMD already enabled in the build. To have DMD active while
running it, you just need to set the environment variable `DMD=1` when
running. For instance, on OSX, you can run something like:

    DMD=1 /Applications/Firefox\ Nightly.app/Contents/MacOS/firefox

You can tell it is working by going to `about:memory` and looking for
"Save DMD Output". If DMD has been properly enabled, the "Save"
button won't be grayed out. Look at the "Trigger" section below to
see the full list of ways to get a DMD report once you have it
activated. Note that the stack information you get will likely be less
detailed, due to being unable to symbolicate. You will be able to get
function names, but not line numbers.

### Desktop Firefox

#### Build

Build Firefox with this option added to your mozconfig:

    ac_add_options --enable-dmd

If building via try server, modify
`browser/config/mozconfigs/linux64/common-opt` or a similar file before
pushing.

#### Launch

Use `mach run --dmd`; use `--mode` to choose the mode.

On a Windows build done by the try server, [these
instructions](https://bugzilla.mozilla.org/show_bug.cgi?id=936784#c69) from
2013 may or may not be useful.

#### Trigger

There are a few ways to trigger a DMD snapshot. Most of these will also
first get a memory report. When DMD is working on writing its output, it
will print logging like this:

    DMD[5222] opened /tmp/dmd-1414556492-5222.json.gz for writing
    DMD[5222] Dump 1 {
    DMD[5222]   Constructing the heap block list...
    DMD[5222]   Constructing the stack trace table...
    DMD[5222]   Constructing the stack frame table...
    DMD[5222] }

You'll see separate output for each process. This step can take 10 or
more seconds and may make Firefox freeze temporarily.

If you see the "opened" line, it tells you where the file was saved.
It's always in a temp directory, and the filenames are always of the
form dmd-<pid>.

The ways to trigger a DMD snapshot are:

1.  Visit about:memory and click the "Save" button under "Save DMD output".
    The button won't be present in non-DMD builds, and will be grayed out
    in DMD builds if DMD isn't enabled at start-up.

2.  If you wish to trigger DMD dumps from within C++ or JavaScript code,
    you can use `nsIMemoryInfoDumper.dumpMemoryInfoToTempDir`. For example,
    from JavaScript code you can do the following.

        const Cc = Components.classes;
        let mydumper = Cc["@mozilla.org/memory-info-dumper;1"]
                         .getService(Ci.nsIMemoryInfoDumper);
        mydumper.dumpMemoryInfoToTempDir(identifier, anonymize, minimize);

    This will dump memory reports and DMD output to the temporary
    directory. `identifier` is a string that will be used for part of
    the filename (or a timestamp will be used if it is an empty string);
    `anonymize` is a boolean that indicates if the memory reports should
    be anonymized; and `minimize` is a boolean that indicates if memory
    usage should be minimized first.

3.  On Linux, you can send signal 34 to the firefox process, e.g.
    with the following command.

        $ killall -34 firefox

4. The `MOZ_DMD_SHUTDOWN_LOG` environment variable, if set, triggers a DMD
   run at shutdown; its value must be a directory where the logs will be
   placed. This is mostly useful for debugging leaks. Which processes get
   logged is controlled by the `MOZ_DMD_LOG_PROCESS` environment variable.
   If this is not set, it will log all processes. It can be set to any valid
   value of `XRE_GetProcessTypeString()` and will log only those processes.
   For instance, if set to `default` it will only log the parent process. If
   set to `tab`, it will log content processes only.

   For example, if you have

       MOZ_DMD_SHUTDOWN_LOG=~/dmdlogs/ MOZ_DMD_LOG_PROCESS=tab

   then DMD will create logs at shutdown for content processes and save them to
   `~/dmdlogs/`.

**NOTE:**

-   To dump DMD data from content processes, you'll need to disable the
    sandbox with `MOZ_DISABLE_CONTENT_SANDBOX=1`.
-   MOZ_DMD_SHUTDOWN_LOG must (currently) include the trailing separator
    (\'\'/\")


### Fennec

**NOTE:**

You'll note from the name of this section being "Fennec" that these instructions
are very old. Hopefully they'll be more useful than not having them.

**NOTE:**

In order to use DMD on Fennec you will need root access on the Android
device. Instructions on how to root your device is outside the scope of
this document.


#### Build

Build with these options:

    ac_add_options --enable-dmd

#### Prep

In order to prepare your device for running Fennec with DMD enabled, you
will need to do a few things. First, you will need to push the libdmd.so
library to the device so that it can by dynamically loaded by Fennec.
You can do this by running:

    adb push $OBJDIR/dist/bin/libdmd.so /sdcard/

Second, you will need to make an executable wrapper for Fennec which
sets an environment variable before launching it. (If you are familiar
with the recommended "--es env0" method for setting environment
variables when launching Fennec, note that you cannot use this method
here because those are processed too late in the startup process. If you
are not familiar with that method, you can ignore this parenthetical
note.) First make the executable wrapper on your host machine using the
editor of your choice. Name the file dmd_fennec and enter this as the
contents:

    #!/system/bin/sh
    export MOZ_REPLACE_MALLOC_LIB=/sdcard/libdmd.so
    exec "$@"

If you want to use other DMD options, you can enter additional
environment variables above. You will need to push this to the device
and make it executable. Since you cannot mark files in /sdcard/ as
executable, we will use /data/local/tmp for this purpose:

    adb push dmd_fennec /data/local/tmp
    adb shell
    cd /data/local/tmp
    chmod 755 dmd_fennec

The final step is to make Android use the above wrapper script while
launching Fennec, so that the environment variable is present when
Fennec starts up. Assuming you have done a local build, the app
identifier will be `org.mozilla.fennec_$USERNAME` (`$USERNAME` is your
username on the host machine) and so we do this as shown below. If you
are using a DMD-enabled try build, or build from other source, adjust
the app identifier as necessary.

    adb shell
    su    # You need root access for the setprop command to take effect
    setprop wrap.org.mozilla.fennec_$USERNAME "/data/local/tmp/dmd_fennec"

Once this is set up, starting the `org.mozilla.fennec_$USERNAME` app
will use the wrapper script.

#### Launch

Launch Fennec either by tapping on the icon as usual, or from the
command line (as before, be sure to replace
`org.mozilla.fennec_$USERNAME` with the app identifier as appropriate).

    adb shell am start -n org.mozilla.fennec_$USERNAME/.App

#### Trigger

Use the existing memory-report dumping hook:

    adb shell am broadcast -a org.mozilla.gecko.MEMORY_DUMP

In logcat, you should see output similar to this:

    I/DMD     (20731): opened /storage/emulated/0/Download/memory-reports/dmd-default-20731.json.gz for writing
    ...
    I/GeckoConsole(20731): nsIMemoryInfoDumper dumped reports to /storage/emulated/0/Download/memory-reports/unified-memory-report-default-20731.json.gz

The path is where the memory reports and DMD reports get dumped to. You
can pull them like so:

    adb pull /sdcard/Download/memory-reports/dmd-default-20731.json.gz
    adb pull /sdcard/Download/memory-reports/unified-memory-report-default-20731.json.gz

## Processing the output

DMD outputs one gzipped JSON file per process that contains a
description of that process's heap. You can analyze these files (either
gzipped or not) using `dmd.py`. On Nightly Firefox, `dmd.py` is included
in the distribution. For instance on OS X, it is located in the
directory `/Applications/Firefox Nightly.app/Contents/Resources/`. For
Nightly, symbolication will fail, but you can at least get some
information. In a local build, `dmd.py` will be located in the directory
`$OBJDIR/dist/bin/`.

Some platforms (Linux, Mac, Android) require stack fixing, which adds
missing filenames, function names and line number information. This will
occur automatically the first time you run `dmd.py` on the output file.
This can take 10s of seconds or more to complete. (This will fail if
your build does not contain symbols. However, if you have crash reporter
symbols for your build -- as tryserver builds do -- you can use [this
script](https://github.com/mstange/analyze-tryserver-profiles/blob/master/resymbolicate_dmd.py)
instead: clone the whole repo, edit the paths at the top of
`resymbolicate_dmd.py` and run it.) The simplest way to do this is to
just run the `dmd.py` script on your DMD report while your working
directory is `$OBJDIR/dist/bin`. This will allow the local libraries to
be found and used.

If you invoke `dmd.py` without arguments you will get output appropriate
for the mode in which DMD was invoked.

### "Dark matter" mode output

For "dark matter" mode, `dmd.py`'s output describes how the live heap
blocks are covered by memory reports. This output is broken into
multiple sections.

1.  "Invocation". This tells you how DMD was invoked, i.e. what
    options were used.
2.  "Twice-reported stack trace records". This tells you which heap
    blocks were reported twice or more. The presence of any such records
    indicates bugs in one or more memory reporters.
3.  "Unreported stack trace records". This tells you which heap blocks
    were not reported, which indicate where additional memory reporters
    would be most helpful.
4.  "Once-reported stack trace records": like the "Unreported stack
    trace records" section, but for blocks reported once.
5.  "Summary": gives measurements of the total heap, and the
    unreported/once-reported/twice-reported portions of it.

The "Twice-reported stack trace records" and "Unreported stack trace
records" sections are the most important, because they indicate ways in
which the memory reporters can be improved.

Here's an example stack trace record from the "Unreported stack trace
records" section.

    Unreported {
      150 blocks in heap block record 283 of 5,495
      21,600 bytes (20,400 requested / 1,200 slop)
      Individual block sizes: 144 x 150
      0.00% of the heap (16.85% cumulative)
      0.02% of unreported (94.68% cumulative)
      Allocated at {
        #01: replace_malloc (/home/njn/moz/mi5/go64dmd/memory/replace/dmd/../../../../memory/replace/dmd/DMD.cpp:1286)
        #02: malloc (/home/njn/moz/mi5/go64dmd/memory/build/../../../memory/build/replace_malloc.c:153)
        #03: moz_xmalloc (/home/njn/moz/mi5/memory/mozalloc/mozalloc.cpp:84)
        #04: nsCycleCollectingAutoRefCnt::incr(void*, nsCycleCollectionParticipant*) (/home/njn/moz/mi5/go64dmd/dom/xul/../../dist/include/nsISupportsImpl.h:250)
        #05: nsXULElement::Create(nsXULPrototypeElement*, nsIDocument*, bool, bool,mozilla::dom::Element**) (/home/njn/moz/mi5/dom/xul/nsXULElement.cpp:287)
        #06: nsXBLContentSink::CreateElement(char16_t const**, unsigned int, mozilla::dom::NodeInfo*, unsigned int, nsIContent**, bool*, mozilla::dom::FromParser) (/home/njn/moz/mi5/dom/xbl/nsXBLContentSink.cpp:874)
        #07: nsCOMPtr<nsIContent>::StartAssignment() (/home/njn/moz/mi5/go64dmd/dom/xml/../../dist/include/nsCOMPtr.h:753)
        #08: nsXMLContentSink::HandleStartElement(char16_t const*, char16_t const**, unsigned int, unsigned int, bool) (/home/njn/moz/mi5/dom/xml/nsXMLContentSink.cpp:1007)
      }
    }

It tells you that there were 150 heap blocks that were allocated from
the program point indicated by the "Allocated at" stack trace, that
these blocks took up 21,600 bytes, that all 150 blocks had a size of 144
bytes, and that 1,200 of those bytes were "slop" (wasted space caused
by the heap allocator rounding up request sizes). It also indicates what
percentage of the total heap size and the unreported portion of the heap
these blocks represent.

Within each section, records are listed from largest to smallest.

Once-reported and twice-reported stack trace records also have stack
traces for the report point(s). For example:

    Reported at {
      #01: mozilla::dmd::Report(void const*) (/home/njn/moz/mi2/memory/replace/dmd/DMD.cpp:1740) 0x7f68652581ca
      #02: CycleCollectorMallocSizeOf(void const*) (/home/njn/moz/mi2/xpcom/base/nsCycleCollector.cpp:3008) 0x7f6860fdfe02
      #03: nsPurpleBuffer::SizeOfExcludingThis(unsigned long (*)(void const*)) const (/home/njn/moz/mi2/xpcom/base/nsCycleCollector.cpp:933) 0x7f6860fdb7af
      #04: nsCycleCollector::SizeOfIncludingThis(unsigned long (*)(void const*), unsigned long*, unsigned long*, unsigned long*, unsigned long*, unsigned long*) const (/home/njn/moz/mi2/xpcom/base/nsCycleCollector.cpp:3029) 0x7f6860fdb6b1
      #05: CycleCollectorMultiReporter::CollectReports(nsIMemoryMultiReporterCallback*, nsISupports*) (/home/njn/moz/mi2/xpcom/base/nsCycleCollector.cpp:3075) 0x7f6860fde432
      #06: nsMemoryInfoDumper::DumpMemoryReportsToFileImpl(nsAString_internal const&) (/home/njn/moz/mi2/xpcom/base/nsMemoryInfoDumper.cpp:626) 0x7f6860fece79
      #07: nsMemoryInfoDumper::DumpMemoryReportsToFile(nsAString_internal const&, bool, bool) (/home/njn/moz/mi2/xpcom/base/nsMemoryInfoDumper.cpp:344) 0x7f6860febaf9
      #08: mozilla::(anonymous namespace)::DumpMemoryReportsRunnable::Run() (/home/njn/moz/mi2/xpcom/base/nsMemoryInfoDumper.cpp:58) 0x7f6860fefe03
    }

You can tell which memory reporter made the report by the name of the
`MallocSizeOf` function near the top of the stack trace. In this case it
was the cycle collector's reporter.

By default, DMD does not record an allocation stack trace for most
blocks, to make it run faster. The decision on whether to record is done
probabilistically, and larger blocks are more likely to have an
allocation stack trace recorded. All unreported blocks that lack an
allocation stack trace will end up in a single record. For example:

    Unreported {
      420,010 blocks in heap block record 2 of 5,495
      29,203,408 bytes (27,777,288 requested / 1,426,120 slop)
      Individual block sizes: 2,048 x 3; 1,024 x 103; 512 x 147; 496 x 7; 480 x 31; 464 x 6; 448 x 50; 432 x 41; 416 x 28; 400 x 53; 384 x 43; 368 x 216; 352 x 141; 336 x 58; 320 x 104; 304 x 5,130; 288 x 150; 272 x 591; 256 x 6,017; 240 x 1,372; 224 x 93; 208 x 488; 192 x 1,919; 176 x 18,903; 160 x 1,754; 144 x 5,041; 128 x 36,709; 112 x 5,571; 96 x 6,280; 80 x 40,738; 64 x 37,925; 48 x 78,392; 32 x 136,199; 16 x 31,001; 8 x 4,706
      3.78% of the heap (10.24% cumulative)
      21.24% of unreported (57.53% cumulative)
      Allocated at {
        #01: (no stack trace recorded due to --stacks=partial)
      }
    }

In contrast, stack traces are always recorded when a block is reported,
which means you can end up with records like this where the allocation
point is unknown but the reporting point *is* known:

    Once-reported {
      104,491 blocks in heap block record 13 of 4,689
      10,392,000 bytes (10,392,000 requested / 0 slop)
      Individual block sizes: 512 x 124; 256 x 242; 192 x 813; 128 x 54,664; 64 x 48,648
      1.35% of the heap (48.65% cumulative)
      1.64% of once-reported (59.18% cumulative)
      Allocated at {
        #01: (no stack trace recorded due to --stacks=partial)
      }
      Reported at {
        #01: mozilla::dmd::DMDFuncs::Report(void const*) (/home/njn/moz/mi5/go64dmd/memory/replace/dmd/../../../../memory/replace/dmd/DMD.cpp:1646)
        #02: WindowsMallocSizeOf(void const*) (/home/njn/moz/mi5/dom/base/nsWindowMemoryReporter.cpp:189)
        #03: nsAttrAndChildArray::SizeOfExcludingThis(unsigned long (*)(void const*)) const (/home/njn/moz/mi5/dom/base/nsAttrAndChildArray.cpp:880)
        #04: mozilla::dom::FragmentOrElement::SizeOfExcludingThis(unsigned long (*)(void const*)) const (/home/njn/moz/mi5/dom/base/FragmentOrElement.cpp:2337)
        #05: nsINode::SizeOfIncludingThis(unsigned long (*)(void const*)) const (/home/njn/moz/mi5/go64dmd/parser/html/../../../dom/base/nsINode.h:307)
        #06: mozilla::dom::NodeInfo::NodeType() const (/home/njn/moz/mi5/go64dmd/dom/base/../../dist/include/mozilla/dom/NodeInfo.h:127)
        #07: nsHTMLDocument::DocAddSizeOfExcludingThis(nsWindowSizes*) const (/home/njn/moz/mi5/dom/html/nsHTMLDocument.cpp:3710)
        #08: nsIDocument::DocAddSizeOfIncludingThis(nsWindowSizes*) const (/home/njn/moz/mi5/dom/base/nsDocument.cpp:12820)
      }
    }

The choice of whether to record an allocation stack trace for all blocks
is controlled by an option (see below).

### "Live" mode output


For "live" mode, dmd.py's output describes what live heap blocks are
present. This output is broken into multiple sections.

1.  "Invocation". This tells you how DMD was invoked, i.e. what
    options were used.
2.  "Live stack trace records". This tells you which heap blocks were
    present.
3.  "Summary": gives measurements of the total heap.

The individual records are similar to those output in "dark matter"
mode.

### "Cumulative" mode output

For "cumulative" mode, dmd.py's output describes how the live heap
blocks are covered by memory reports. This output is broken into
multiple sections.

1.  "Invocation". This tells you how DMD was invoked, i.e. what
    options were used.
2.  "Cumulative stack trace records". This tells you which heap blocks
    were allocated during the session.
3.  "Summary": gives measurements of the total (cumulative) heap.

The individual records are similar to those output in "dark matter"
mode.

### "Scan" mode output

For "scan" mode, the output of `dmd.py` is the same as "live" mode.
A separate script, `block_analyzer.py`, can be used to find out
information about which blocks refer to a particular block.
`dmd.py --clamp-contents` needs to be run on the log first. See [this
other page](heap_scan_mode.md) for an
overview of how to use heap scan mode to fix a leak involving refcounted
objects.

## Options

### Runtime

When you run `mach run --dmd` you can specify additional options to
control how DMD runs. Run `mach help run` for documentation on these.

The most interesting one is `--mode`. Acceptable values are
`dark-matter` (the default), `live`, `cumulative`, and `scan`.

Another interesting one is `--stacks`. Acceptable values are `partial`
(the default) and `full`. In the former case most blocks will not have
an allocation stack trace recorded. However, because larger blocks are
more likely to have one recorded, most allocated bytes should have an
allocation stack trace even though most allocated blocks do not. Use
`--stacks=full` if you want complete information, but note that DMD will
run substantially slower in that case.

The options may also be put in the environment variable DMD, or set DMD
to 1 to enable DMD with default options (dark-matter and partial
stacks).

### Post-processing

`dmd.py` also takes options that control how it works. Run `dmd.py -h`
for documentation. The following options are the most interesting ones.

-   `-f` / `--max-frames`. By default, records show up to 8 stack
    frames. You can choose a smaller number, in which case more
    allocations will be aggregated into each record, but you'll have
    less context. Or you can choose a larger number, in which cases
    allocations will be split across more records, but you will have
    more context. There is no single best value, but values in the range
    2..10 are often good. The maximum is 24.

-   `-a` / `--ignore-alloc-fns`. Many allocation stack traces start
    with multiple frames that mention allocation wrapper functions, e.g.
    `js_calloc()` calls `replace_calloc()`. This option filters these
    out. It often helps improve the quality of the output when using a
    small `--max-frames` value.

-   `-s` / `--sort-by`. This controls how records are sorted. Acceptable
    values are `usable` (the default), `req`, `slop` and `num-blocks`.

-   `--clamp-contents`. For a heap scan log, this performs a
    conservative pointer analysis on the contents of each block,
    changing any value that is a pointer into the middle of a live block
    into a pointer to the start of that block. All other values are
    changes to null. In addition, all trailing nulls are removed from
    the block contents.

As an example that combines multiple options, if you apply the following
command to a profile obtained in "live" mode:

    dmd.py -r -f 2 -a -s slop

it will give you a good idea of where the major sources of slop are.

`dmd.py` can also compute the difference between two DMD output files,
so long as those files were produced in the same mode. Simply pass it
two filenames instead of one to get the difference.

## Which heap blocks are reported?

At this stage you might wonder how DMD knows, in "dark matter" mode,
which allocations have been reported and which haven't. DMD only knows
about heap blocks that are measured via a function created with one of
the following two macros:

    MOZ_DEFINE_MALLOC_SIZE_OF
    MOZ_DEFINE_MALLOC_SIZE_OF_ON_ALLOC

Fortunately, most of the existing memory reporters do this. See
[Performance/Memory_Reporting](https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Memory_reporting "Platform/Memory Reporting")
for more details about how memory reporters are written.