################
Raptor Test List
################

The following Raptor tests are currently available. Note: check the test details below to see which browsers (e.g. Firefox, Google Chrome, Android) each test is supported on.

Page-Load Tests
---------------

Raptor page-load test documentation is generated by `PerfDocs <https://firefox-source-docs.mozilla.org/code-quality/lint/linters/perfdocs.html>`_ and available in the `Firefox Source Docs <https://firefox-source-docs.mozilla.org/testing/perfdocs/raptor.html>`_.

Benchmark Tests
---------------

motionmark-animometer, motionmark-htmlsuite
===========================================

* summarization:

    * subtest: FPS from the subtest. Each subtest is run for 15 seconds; this is repeated 5 times and the median value is reported.
    * suite: we take the geometric mean of all the subtests (9 for animometer, 11 for html suite)
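
The summarization above can be sketched in a few lines of Python. This is only an illustration, not the harness code: the function names, subtest names, and FPS values are hypothetical, and only two subtests are shown instead of the 9 or 11 in the real suites.

```python
from statistics import geometric_mean, median


def summarize_subtest(fps_replicates):
    """Return the median FPS over the replicates of one subtest."""
    return median(fps_replicates)


def summarize_suite(replicates_by_subtest):
    """Return the geometric mean of the per-subtest medians."""
    medians = [summarize_subtest(r) for r in replicates_by_subtest.values()]
    return geometric_mean(medians)


# Hypothetical data: 5 replicates (one FPS value per 15-second run) per subtest.
example = {
    "subtest-a": [58.0, 60.0, 59.0, 61.0, 60.0],
    "subtest-b": [30.0, 29.0, 31.0, 30.0, 28.0],
}
suite_score = summarize_suite(example)
```

The median keeps a single noisy run from dominating a subtest value, and the geometric mean keeps subtests with large FPS values from dominating the suite value.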

speedometer
===========

* measuring: responsiveness of web applications
* data: there are 16 subtests in Speedometer; each of these is made up of 9 internal benchmarks.
* summarization:

    * subtest: For each of the 16 subtests, we collect `a sum of all its internal benchmark results <https://searchfox.org/mozilla-central/source/third_party/webkit/PerformanceTests/Speedometer/resources/benchmark-report.js#66-67>`_. To obtain a single score per subtest, we take `a median of the replicates <https://searchfox.org/mozilla-central/source/testing/raptor/raptor/output.py#427-470>`_.
    * score: `geometric mean of the 16 subtest metrics (along with some special corrections) <https://searchfox.org/mozilla-central/source/testing/raptor/raptor/output.py#317-330>`_.

This is the `Speedometer v1.0 <http://browserbench.org/Speedometer/>`_ JavaScript benchmark taken verbatim and slightly modified to work with the Raptor harness.
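
As a rough illustration of the Speedometer summarization described above (sum per replicate, median over replicates, geometric mean over subtests), here is a minimal Python sketch. It is not the code in ``output.py``: the function names and numbers are hypothetical, the "special corrections" mentioned above are not modeled, and the example uses 2 subtests with 3 internal results each rather than 16 and 9.

```python
from statistics import geometric_mean, median


def subtest_score(replicates):
    """Sum the internal benchmark results of each replicate, then take the median."""
    sums = [sum(internal_results) for internal_results in replicates]
    return median(sums)


def overall_score(results_by_subtest):
    """Geometric mean of the subtest scores (no special corrections applied)."""
    scores = [subtest_score(r) for r in results_by_subtest.values()]
    return geometric_mean(scores)


# Hypothetical data: each inner list holds the internal results of one replicate.
example = {
    "subtest-a": [[10, 20, 30], [12, 18, 30], [9, 21, 34]],
    "subtest-b": [[5, 5, 5], [4, 6, 5], [5, 7, 4]],
}
score = overall_score(example)
```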

youtube-playback
================

* details: `YouTube playback performance <https://wiki.mozilla.org/TestEngineering/Performance/Raptor/Youtube_playback_performance>`_
* measuring: media streaming playback performance (dropped video frames)
* reporting: For each video, the number of dropped and decoded frames, as well as the corresponding percentage, is recorded. The overall reported result is the mean value of dropped video frames across all tested video files.
* data: Given the size of the media files used, these tests are currently run as live site tests, and are kept up-to-date via the `perf-youtube-playback <https://github.com/mozilla/perf-youtube-playback/>`_ repository on GitHub.

This is the `Playback Performance Tests <https://ytlr-cert.appspot.com/2019/main.html?test_type=playbackperf-test>`_ benchmark, taken verbatim and slightly modified to work with the Raptor harness.
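
The reporting described above can be sketched as follows. This is an illustration only: the function names and frame counts are hypothetical, and it assumes the percentage is computed as dropped frames relative to decoded frames, which the text above does not state explicitly.

```python
from statistics import mean


def dropped_percentage(dropped, decoded):
    """Percentage of dropped frames, assuming decoded frames are the denominator."""
    if decoded == 0:
        return 0.0
    return 100.0 * dropped / decoded


def summarize_playback(frames_by_video):
    """Return per-video percentages and the mean dropped-frame count over all videos."""
    percentages = {
        name: dropped_percentage(dropped, decoded)
        for name, (dropped, decoded) in frames_by_video.items()
    }
    overall = mean(dropped for dropped, _ in frames_by_video.values())
    return percentages, overall


# Hypothetical data: video name -> (dropped frames, decoded frames).
example = {
    "video-a": (3, 300),
    "video-b": (0, 600),
    "video-c": (6, 400),
}
percentages, overall = summarize_playback(example)
```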

webaudio
========

* measuring: Rendering speed of various synthetic Web Audio API workloads
* reporting: The time it took to render the audio of each test case, and a geometric mean of the full test suite. Lower is better.
* data: Upstream is https://github.com/padenot/webaudio-benchmark/. These benchmarks are run by other projects. Upstream is vendored in mozilla-central via a simple update script, at `third_party/webkit/PerformanceTests/webaudio`

Scenario Tests
--------------

This test type runs browser tests that use idle pages for a specified amount of time to gather resource usage information such as power usage. The pages used for testing do not need to be recorded with mitmproxy.

When creating a new scenario test, ensure that the `page-timeout` is greater than the `scenario-time` so that Raptor doesn't exit the test before the scenario timer ends.
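
A check like the following could be used to catch a misconfigured scenario test early. It is a hypothetical helper, not part of Raptor: it assumes the two settings have been read into a dictionary under the names used above, and that both values use the same time unit.

```python
def check_scenario_timeouts(settings):
    """Return the safety margin, or raise if page-timeout does not exceed scenario-time.

    Assumes both values are expressed in the same unit.
    """
    page_timeout = int(settings["page-timeout"])
    scenario_time = int(settings["scenario-time"])
    if page_timeout <= scenario_time:
        raise ValueError(
            f"page-timeout ({page_timeout}) must be greater than "
            f"scenario-time ({scenario_time})"
        )
    return page_timeout - scenario_time


# Hypothetical values: the page timeout leaves a margin after the scenario ends.
margin = check_scenario_timeouts({"page-timeout": 1320000, "scenario-time": 1200000})
```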

This test type can also be used for specialized tests that require communication with the control-server to do things like sending the browser to the background for X minutes.

Power-Usage Measurement Tests
=============================

These Android power measurement tests output 3 different PERFHERDER_DATA entries. The first contains the power usage of the test itself. The second contains the power usage of the Android OS (named os-baseline) over the course of 1 minute. The third (named with the test name followed by '%change-power') combines these two measures to show the percentage increase in power consumption when the test is running compared to when it is not. In these Perfherder data blobs, we provide the power consumption attributed to the CPU, Wi-Fi, and screen in milliampere-hours (mAh).
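
To make the relationship between the three entries concrete, here is a minimal Python sketch of one way a percentage change could be derived. The exact formula Raptor uses is not given here, so this is an assumption: the 1-minute baseline is scaled linearly to the test duration and the change is taken relative to that scaled baseline. All names and numbers are hypothetical.

```python
def scale_baseline(baseline_mah, baseline_minutes, test_minutes):
    """Scale a baseline measurement linearly to the test duration (assumption)."""
    return baseline_mah * test_minutes / baseline_minutes


def percent_change(test_mah, baseline_mah):
    """Percentage increase of the test measurement over the baseline."""
    if baseline_mah == 0:
        raise ValueError("baseline must be non-zero")
    return 100.0 * (test_mah - baseline_mah) / baseline_mah


# Hypothetical measurements in mAh: a 20-minute test and a 1-minute OS baseline.
test_usage = {"cpu": 30.0, "wifi": 6.0, "screen": 90.0}
os_baseline = {"cpu": 1.0, "wifi": 0.25, "screen": 4.0}

change = {
    component: percent_change(
        test_usage[component],
        scale_baseline(os_baseline[component], 1, 20),
    )
    for component in test_usage
}
```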

raptor-scn-power-idle
^^^^^^^^^^^^^^^^^^^^^

* measuring: Power consumption for idle Android browsers, with about:blank loaded and app foregrounded, over a 20-minute duration

raptor-scn-power-idle-bg
^^^^^^^^^^^^^^^^^^^^^^^^

* measuring: Power consumption for idle Android browsers, with about:blank loaded and app backgrounded, over a 10-minute duration