Diffstat (limited to 'docs/performance/powermetrics.md')
-rw-r--r-- docs/performance/powermetrics.md 167
1 files changed, 167 insertions, 0 deletions
diff --git a/docs/performance/powermetrics.md b/docs/performance/powermetrics.md
new file mode 100644
index 0000000000..44df0eda9c
--- /dev/null
+++ b/docs/performance/powermetrics.md
@@ -0,0 +1,167 @@
+# powermetrics
+
+`powermetrics` is a Mac-only command-line utility that provides many
+high-quality power-related measurements. It is most useful for getting
+CPU, GPU and wakeup measurements in a precise and easily scriptable
+fashion (unlike [Activity Monitor and
+top](activity_monitor_and_top.md)),
+especially in combination with
+[rapl](tools_power_rapl.md) via the
+`mach power` command. This document describes the version of
+`powermetrics` that comes with Mac OS 10.10. The one that comes with
+10.9 is less powerful.
+
+**Note**: The [power profiling
+overview](power_profiling_overview.md) is
+worth reading at this point if you haven't already. It may make parts
+of this document easier to understand.
+
+## Quick start
+
+`powermetrics` provides a vast number of measurements. The following
+command encompasses the most useful ones:
+
+    sudo powermetrics --samplers tasks --show-process-coalition --show-process-gpu -n 1 -i 5000
+
+- `--samplers tasks` tells it to just do per-process measurements.
+- `--show-process-coalition` tells it to group *coalitions* of
+  related processes, e.g. the Firefox parent process and child
+  processes.
+- `--show-process-gpu` tells it to show per-process GPU measurements.
+- `-n 1` tells it to take one sample and then stop.
+- `-i 5000` tells it to use a sample length of 5 seconds (5000 ms).
+ Change this number to get shorter or longer samples.
+
+The following is example output from such an invocation:
+
+    *** Sampled system activity (Fri Sep 4 17:15:14 2015 +1000) (5009.63ms elapsed) ***
+
+    *** Running tasks ***
+
+    Name                             ID     CPU ms/s  User%  Deadlines (<2 ms, 2-5 ms)  Wakeups (Intr, Pkg idle)  GPU ms/s
+    com.apple.Terminal               293    447.66                                      274.83  120.35            221.74
+      firefox                        84627  77.59     55.55  15.37   2.59               91.42   42.12             204.47
+      plugin-container               84628  377.22    37.18  43.91   18.56              178.65  75.85             17.29
+      Terminal                       694    9.86      79.94  0.00    0.00               4.39    2.20              0.00
+      powermetrics                   84694  1.21      31.53  0.00    0.00               0.20    0.20              0.00
+    com.google.Chrome                489    233.83                                      48.10   25.95             0.00
+      Google Chrome Helper           84688  181.57    92.81  0.00    0.00               23.95   12.77             0.00
+      Google Chrome                  84681  57.26     76.07  4.39    0.00               23.75   12.97             0.00
+      Google Chrome Helper           84685  0.13      48.08  0.00    0.00               0.40    0.20              0.00
+    kernel_coalition                 1      128.64                                      780.19  330.52            0.00
+      kernel_task                    0      109.97    0.00   0.20    0.00               779.47  330.35            0.00
+      launchd                        1      18.88     2.44   0.00    0.00               0.40    0.20              0.00
+    com.apple.Safari                 488    90.60                                       108.58  56.48             26.65
+      com.apple.WebKit.WebContent    84679  64.21     84.69  0.00    0.00               104.19  54.89             26.66
+      com.apple.WebKit.Networking    84678  26.89     58.89  0.40    0.00               1.60    0.00              0.00
+      Safari                         84676  1.56      55.74  0.00    0.00               2.59    1.40              0.00
+      com.apple.Safari.SearchHelper  84690  0.15      49.49  0.00    0.00               0.20    0.20              0.00
+    org.mozilla.firefox              482    76.56                                       124.34  63.47             0.00
+      firefox                        84496  76.70     89.18  10.58   5.59               124.55  63.48             0.00
+
+This sample was taken while the following programs were running:
+
+- Firefox Beta (single process, invoked from the Mac OS dock, shown in
+  the `org.mozilla.firefox` coalition).
+- Firefox Nightly (multi-process, invoked from the command line, shown
+  in the `com.apple.Terminal` coalition).
+- Google Chrome.
+- Safari.
+
+The grouping of parent and child processes (in coalitions) is obvious.
+The meaning of the columns is as follows.
+
+- **Name**: Coalition/process name. Process names within coalitions
+ are indented.
+- **ID**: Coalition/process ID number.
+- **CPU ms/s**: CPU time used by the coalition/process, per second,
+ during the sample period. The sum of the process values typically
+ exceeds the coalition value slightly, for unknown reasons.
+- **User%**: Percentage of that CPU time spent in user space (as
+  opposed to kernel mode).
+- **Deadlines (\<2 ms, 2-5 ms)**: These two columns count how many
+  "short" timers woke up threads in the process, per second, during
+  the sample period. High frequency timers, which typically have short
+  time-to-deadlines, can cause high power consumption and should be
+  avoided if possible.
+- **Wakeups (Intr, Pkg idle)**: These two columns count how many
+  wakeups occurred, per second, during the sample period. The first
+  column counts interrupt-level wakeups that resulted in a thread
+  being dispatched in the process. The second column counts "package
+  idle exit" wakeups, which wake up the entire package as opposed to
+  just a single core; such wakeups are particularly expensive, and
+  this count is a subset of the first column's count.
+- **GPU ms/s**: GPU time used by the coalition/process, per second,
+ during the sample period.
+
+Other things to note.
+
+- Smaller is better --- i.e. results in lower power consumption ---
+ for all of these measurements.
+- There is some overlap between the two "Deadlines" columns and the
+  two "Wakeups" columns. For example, firing a single sub-2ms
+  deadline can also cause a package idle exit wakeup.
+- Many of these measurements are also obtainable by passing the
+ `TASK_POWER_INFO` flag and a `task_power_info` struct to the
+ `task_info` function.
+- By default, the coalitions/processes are sorted by a composite value
+ computed from several factors, though this can be changed via
+ command-line options.
+
+## Other measurements
+
+`powermetrics` can also report measurements of backlight usage, network
+activity, disk activity, interrupt distribution, device power states,
+C-state residency, P-state residency, quality of service classes, and
+thermal pressure. These are less likely to be useful for profiling
+Firefox, however. Run with the `--show-all` option to see all of these
+at once, but note that you'll need a very wide window to see all the data.
+
+Also note that `powermetrics -h` is a better guide to the
+command-line options than `man powermetrics`.
+
+## mach power
+
+You can use the `mach power` command to run `powermetrics` in
+combination with `rapl` in a way that gives the most useful summary
+measurements for each of Firefox, Chrome and Safari. The following is
+sample output.
+
+        total W = _pkg_ (cores + _gpu_ + other) + _ram_ W
+    #01 17.14 W = 14.98 ( 5.50 +  1.19 +  8.29) +  2.16 W
+
+    1 sample taken over a period of 30.000 seconds
+
+    Name                             ID     CPU ms/s  User%  Deadlines (<2 ms, 2-5 ms)  Wakeups (Intr, Pkg idle)  GPU ms/s
+    com.google.Chrome                500    439.64                                      585.35  218.62            19.17
+      Google Chrome Helper           67319  284.75    83.03  296.67  0.00               454.05  172.74            0.00
+      Google Chrome Helper           67304  55.23     64.83  0.03    0.00               9.43    4.33              19.17
+      Google Chrome                  67301  63.77     68.09  29.46   0.13               76.11   22.26             0.00
+      Google Chrome Helper           67320  38.30     66.70  17.83   0.00               45.78   19.29             0.00
+    com.apple.WindowServer           68     102.58                                      112.36  43.15             80.52
+      WindowServer                   141    103.03    58.19  60.48   6.40               112.36  43.15             80.53
+    com.apple.Safari                 499    267.19                                      110.53  46.05             1.69
+      com.apple.WebKit.WebContent    67372  190.15    79.34  2.02    0.14               129.28  53.79             2.33
+      com.apple.WebKit.Networking    67292  65.23     52.74  0.07    0.00               4.33    1.40              0.00
+      Safari                         67290  29.09     77.65  0.23    0.00               7.13    3.37              0.00
+      com.apple.Safari.SearchHelper  67371  13.88     91.18  0.00    0.00               0.36    0.05              0.00
+      com.apple.WebKit.WebContent    67297  0.81      56.84  0.10    0.00               2.20    1.30              0.00
+      com.apple.WebKit.WebContent    67293  0.46      76.40  0.03    0.00               0.57    0.20              0.00
+      com.apple.WebKit.WebContent    67295  0.24      67.72  0.00    0.00               0.90    0.37              0.00
+      com.apple.WebKit.WebContent    67298  0.17      59.88  0.00    0.00               0.50    0.13              0.00
+      com.apple.WebKit.WebContent    67296  0.07      43.51  0.00    0.00               0.10    0.03              0.00
+    kernel_coalition                 1      111.76                                      724.80  213.09            0.12
+      kernel_task                    0      107.06    0.00   5.86    0.00               724.46  212.99            0.12
+    org.mozilla.firefox              498    92.17                                       212.69  75.67             1.81
+      firefox                        63865  61.00     87.18  1.00    0.87               25.79   9.00              1.81
+      plugin-container               67269  31.49     72.46  1.80    0.00               186.90  66.68             0.00
+      com.apple.WebKit.Plugin.64     67373  55.55     74.38  0.74    0.00               9.51    3.13              0.02
+    com.apple.Terminal               109    6.22                                        0.40    0.23              0.00
+      Terminal                       208    6.25      92.99  0.00    0.00               0.33    0.20              0.00
+
+The `rapl` output is first, then the `powermetrics` output. As well as
+the browser processes, the `WindowServer` and kernel tasks are shown
+because browsers often trigger significant load in them.
+
+The default sample period is 30,000 milliseconds (30 seconds), but that
+can be changed with the `-i` option.
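
The column layout described in the Quick start section of the added document lends itself to simple scripting. The following is a minimal Python sketch, not part of `powermetrics` or `mach`, that parses rows of the "Running tasks" table and compares each coalition's CPU ms/s with the sum over its processes. The names `Row`, `parse_row` and `cpu_by_coalition` are hypothetical. The sketch assumes rows look like the samples in the document: a coalition row ends in five numeric fields (ID, CPU ms/s, two Wakeups columns, GPU ms/s) and a process row ends in eight.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Row:
    name: str
    id: int
    cpu_ms_per_s: float
    wakeups_intr: float
    wakeups_pkg_idle: float
    gpu_ms_per_s: float
    is_coalition: bool
    # Only process rows carry the following three columns.
    user_pct: Optional[float] = None
    deadlines_lt_2ms: Optional[float] = None
    deadlines_2_5ms: Optional[float] = None


def _is_number(token: str) -> bool:
    try:
        float(token)
        return True
    except ValueError:
        return False


def parse_row(line: str) -> Optional[Row]:
    """Parse one table row; return None for headers and other lines."""
    tokens = line.split()
    # Count trailing numeric tokens, always leaving one token for the name.
    n = 0
    while n < len(tokens) - 1 and _is_number(tokens[-1 - n]):
        n += 1
    if n >= 8:  # assumed process row: ID plus seven measurements
        name, nums = " ".join(tokens[:-8]), tokens[-8:]
        cpu, user, d1, d2, intr, pkg, gpu = map(float, nums[1:])
        return Row(name, int(nums[0]), cpu, intr, pkg, gpu, False, user, d1, d2)
    if n >= 5:  # assumed coalition row: ID plus four measurements
        name, nums = " ".join(tokens[:-5]), tokens[-5:]
        cpu, intr, pkg, gpu = map(float, nums[1:])
        return Row(name, int(nums[0]), cpu, intr, pkg, gpu, True)
    return None


def cpu_by_coalition(lines: Iterable[str]) -> Dict[str, Tuple[float, float]]:
    """Map coalition name to (coalition CPU ms/s, sum of process CPU ms/s)."""
    totals: Dict[str, list] = {}
    current = None
    for line in lines:
        row = parse_row(line)
        if row is None:
            continue
        if row.is_coalition:
            current = row.name
            totals[current] = [row.cpu_ms_per_s, 0.0]
        elif current is not None:
            totals[current][1] += row.cpu_ms_per_s
    return {name: (c, s) for name, (c, s) in totals.items()}


# Rows copied from the first sample output in the document.
SAMPLE = """\
Name ID CPU ms/s User% Deadlines (<2 ms, 2-5 ms) Wakeups (Intr, Pkg idle) GPU ms/s
com.google.Chrome 489 233.83 48.10 25.95 0.00
  Google Chrome Helper 84688 181.57 92.81 0.00 0.00 23.95 12.77 0.00
  Google Chrome 84681 57.26 76.07 4.39 0.00 23.75 12.97 0.00
  Google Chrome Helper 84685 0.13 48.08 0.00 0.00 0.40 0.20 0.00
"""

if __name__ == "__main__":
    for name, (coalition, summed) in cpu_by_coalition(SAMPLE.splitlines()).items():
        print(f"{name}: coalition={coalition:.2f} processes={summed:.2f}")
```

Counting trailing numeric fields is a heuristic chosen so the sketch works even when indentation has been lost; a name that itself ends in several numeric words could confuse it. When the original indentation is available, using the leading whitespace to tell coalitions from processes is likely more robust.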