Diffstat (limited to 'testing/web-platform/tests/tools/wptrunner/docs/expectation.rst')
-rw-r--r-- | testing/web-platform/tests/tools/wptrunner/docs/expectation.rst | 366
1 file changed, 366 insertions, 0 deletions
diff --git a/testing/web-platform/tests/tools/wptrunner/docs/expectation.rst b/testing/web-platform/tests/tools/wptrunner/docs/expectation.rst
new file mode 100644
index 0000000000..fea676565b
--- /dev/null
+++ b/testing/web-platform/tests/tools/wptrunner/docs/expectation.rst
@@ -0,0 +1,366 @@

Test Metadata
=============

Directory Layout
----------------

Metadata files must be stored under the ``metadata`` directory passed
to the test runner. The directory layout follows that of
web-platform-tests, with each test source path having a corresponding
metadata file. Because the metadata path is based on the source file
path, files that generate multiple URLs, e.g. tests with multiple
variants or multi-global tests generated from an ``any.js`` input
file, share the same metadata file for all their corresponding
tests. The metadata path under the ``metadata`` directory is the same
as the source path under the ``tests`` directory, with an additional
``.ini`` suffix.

For example, a test with URL::

  /spec/section/file.html?query=param

generated from a source file with path::

  <tests root>/spec/section/file.html

would have the metadata file::

  <metadata root>/spec/section/file.html.ini

As an optimisation, files which produce only default results
(i.e. ``PASS`` or ``OK``), and which don't have any other associated
metadata, don't require a corresponding metadata file.

Directory Metadata
~~~~~~~~~~~~~~~~~~

In addition to per-test metadata, default metadata can be applied to
all the tests in a given source location, using a ``__dir__.ini``
metadata file. For example, to apply metadata to all tests under
``<tests root>/spec/``, add the metadata in
``<tests root>/spec/__dir__.ini``.

Metadata Format
---------------

The format of the metadata files is based on the ini format. Files are
divided into sections, each (apart from the root section) having a
heading enclosed in square brackets. Within each section are key-value
pairs. There are several notable differences from standard .ini files,
however:

 * Sections may be hierarchically nested, with significant whitespace
   indicating nesting depth.

 * Only ``:`` is valid as a key/value separator.

A simple example of a metadata file is::

  root_key: root_value

  [section]
    section_key: section_value

    [subsection]
      subsection_key: subsection_value

  [another_section]
    another_key: [list, value]

Conditional Values
~~~~~~~~~~~~~~~~~~

In order to support values that depend on some external data, the
right hand side of a key/value pair can take a set of conditionals
rather than a plain value. These values are placed on a new line
following the key, with significant indentation. Conditional values
are prefixed with ``if`` and terminated with a colon, for example::

  key:
    if cond1: value1
    if cond2: value2
    value3

In this example, the value associated with ``key`` is determined by
first evaluating ``cond1`` against external data. If that is true,
``key`` is assigned the value ``value1``; otherwise ``cond2`` is
evaluated in the same way. If both ``cond1`` and ``cond2`` are false,
the unconditional ``value3`` is used.

Conditions themselves use a Python-like expression syntax. Operands
can be variables (corresponding to data passed in), numbers (integer
or floating point; exponential notation is not supported) or
quote-delimited strings. Equality is tested using ``==`` and
inequality using ``!=``. The operators ``and``, ``or`` and ``not`` are
used in the expected way. Parentheses can also be used for
grouping. For example::

  key:
    if (a == 2 or a == 3) and b == "abc": value1
    if a == 1 or b != "abc": value2
    value3

Here ``a`` and ``b`` are variables whose values will be supplied when
the metadata is used.
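For illustration only, the following Python sketch mirrors this
first-match-wins resolution; the ``resolve`` helper and the plain
``run_info`` dictionary are invented for this example and are not part
of wptrunner::

  # Illustrative only: conditions are written as Python callables here
  # rather than in the metadata expression syntax, but the resolution
  # order is the same as described above.

  def resolve(conditions, default, run_info):
      """Return the value of the first condition that evaluates true,
      falling back to the unconditional default."""
      for condition, value in conditions:
          if condition(run_info):
              return value
      return default

  run_info = {"a": 1, "b": "abc"}
  value = resolve(
      [(lambda r: (r["a"] == 2 or r["a"] == 3) and r["b"] == "abc", "value1"),
       (lambda r: r["a"] == 1 or r["b"] != "abc", "value2")],
      "value3",
      run_info)
  assert value == "value2"

In real metadata the conditions are written in the expression syntax
shown above rather than as Python callables; only the evaluation order
is the same.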
Web-Platform-Tests Metadata
---------------------------

When used for expectation data, metadata files have the following format:

 * A section per test URL provided by the corresponding source file,
   with the section heading being the part of the test URL following
   the last ``/`` in the path (this allows multiple tests in a single
   metadata file with the same path part of the URL, but different
   query parts). This may be omitted if there's no non-default
   metadata for the test.

 * A subsection per subtest, with the heading being the title of the
   subtest. This may be omitted if there's no non-default metadata for
   the subtest.

 * The following known keys:

   :expected:
     The expectation value or values of each (sub)test. In the case
     that this value is a list, the first value represents the typical
     expected test outcome, and subsequent values indicate known
     intermittent outcomes, e.g. ``expected: [PASS, ERROR]`` would
     indicate a test that usually passes but has a known-flaky
     ``ERROR`` outcome.

   :disabled:
     Any value apart from the special value ``@False`` indicates that
     the (sub)test is disabled and should either not be run (for
     tests) or that its results should be ignored (for subtests).

   :restart-after:
     Any value apart from the special value ``@False`` indicates that
     the runner should restart the browser after running this test
     (e.g. to clear out unwanted state).

   :fuzzy:
     Used for reftests. This is interpreted as a list containing
     entries like the ``<meta name=fuzzy>`` content value, which
     consists of an optional reference identifier followed by a colon,
     then a range indicating the maximum permitted pixel difference
     per channel, then a semicolon, then a range indicating the
     maximum permitted total number of differing pixels. The reference
     identifier is either a single relative URL, resolved against the
     base test URL, in which case the fuzziness applies to any
     comparison with that URL, or takes the form
     ``lhs url == rhs url``, in which case the fuzziness only applies
     to comparisons involving that specific pair of URLs. Some
     illustrative examples are given below.

   :implementation-status:
     One of the values ``implementing``, ``not-implementing`` or
     ``default``. This is used in conjunction with the
     ``--skip-implementation-status`` command line argument to
     ``wptrunner`` to ignore certain features where running the test
     is low value.

   :tags:
     A list of labels associated with a given test that can be used
     in conjunction with the ``--tag`` command line argument to
     ``wptrunner`` for test selection.

   In addition there are extra keys which are currently tied to
   specific implementations. For example, Gecko-based browsers support
   the ``min-asserts``, ``max-asserts``, ``prefs``, ``lsan-disabled``,
   ``lsan-allowed``, ``lsan-max-stack-depth``, ``leak-allowed``, and
   ``leak-threshold`` properties.

 * Variables taken from the ``RunInfo`` data which describe the
   configuration of the test run. Common properties include:

   :product: A string giving the name of the browser under test
   :browser_channel: A string giving the release channel of the browser under test
   :debug: A Boolean indicating whether the build is a debug build
   :os: A string indicating the operating system
   :version: A string indicating the particular version of that operating system
   :processor: A string indicating the processor architecture

   This information is typically provided by :py:mod:`mozinfo`, but
   different environments may add additional information, and not all
   the properties above are guaranteed to be present in all
   environments. The definitive list of available properties for a
   specific run may be determined by looking at the ``run_info`` key
   in the ``wptreport.json`` output for the run (see the sketch after
   this list).

 * Top level keys are taken as defaults for the whole file. So, for
   example, a top level key with ``expected: FAIL`` would indicate
   that all tests and subtests in the file are expected to fail,
   unless they have an ``expected`` key of their own.
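The sketch referred to in the list above is simply a matter of reading
the report. Assuming a ``wptreport.json`` file produced with
``--log-wptreport=wptreport.json`` (the filename is only an example), a
minimal standalone Python snippet to list the available properties is::

  import json

  # Load a wptreport log and list the run_info properties recorded for
  # that run; these are the names that can appear in metadata conditions.
  with open("wptreport.json") as f:
      report = json.load(f)

  for prop, value in sorted(report["run_info"].items()):
      print(f"{prop} = {value!r}")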
A simple example metadata file might look like::

  [test.html?variant=basic]
    type: testharness

    [Test something unsupported]
      expected: FAIL

    [Test with intermittent statuses]
      expected: [PASS, TIMEOUT]

  [test.html?variant=broken]
    expected: ERROR

  [test.html?variant=unstable]
    disabled: http://test.bugs.example.org/bugs/12345

A more complex metadata file with conditional properties might be::

  [canvas_test.html]
    expected:
      if os == "mac": FAIL
      if os == "windows" and version == "XP": FAIL
      PASS

Note that ``PASS`` in the above works, but is unnecessary since it's
the default expected result.

A metadata file with fuzzy reftest values might be::

  [reftest.html]
    fuzzy: [10;200, ref1.html:20;200-300, subtest1.html==ref2.html:10-15;20]

In this case the default fuzziness for any comparison would be to
require a maximum difference per channel of less than or equal to 10
and less than or equal to 200 total pixels different. For any
comparison involving ``ref1.html`` on the right hand side, the limits
would instead be a difference per channel of not more than 20 and a
total difference count of not less than 200 and not more than 300. For
the specific comparison ``subtest1.html == ref2.html`` (both resolved
against the test URL) these limits would instead be 10 to 15 and 0 to
20, respectively.
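To make the syntax above concrete, here is a small illustrative Python
sketch that breaks such entries into their parts; ``parse_fuzzy`` and
``parse_range`` are invented names, the sketch simplifies by assuming
reference identifiers contain no colons, and it is not wptrunner's
actual parser::

  def parse_range(text):
      """Return (minimum, maximum); a single number N means 0 to N."""
      if "-" in text:
          low, high = text.split("-")
          return int(low), int(high)
      return 0, int(text)

  def parse_fuzzy(entry):
      """Split an entry into (reference identifier or None,
      per-channel difference range, differing-pixel count range)."""
      if ":" in entry:
          ref, ranges = entry.rsplit(":", 1)
      else:
          ref, ranges = None, entry
      per_channel, total_pixels = ranges.split(";")
      return ref, parse_range(per_channel), parse_range(total_pixels)

  # The entries from the example metadata file above:
  assert parse_fuzzy("10;200") == (None, (0, 10), (0, 200))
  assert parse_fuzzy("ref1.html:20;200-300") == \
      ("ref1.html", (0, 20), (200, 300))
  assert parse_fuzzy("subtest1.html==ref2.html:10-15;20") == \
      ("subtest1.html==ref2.html", (10, 15), (0, 20))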
Generating Expectation Files
----------------------------

wpt provides the ``wpt update-expectations`` command to generate
expectation files from the results of a set of test runs. The basic
syntax for this is::

  ./wpt update-expectations [options] [logfile]...

Each ``logfile`` is a wptreport log file from a previous run. These
can be generated from wptrunner using the ``--log-wptreport`` option,
e.g. ``--log-wptreport=wptreport.json``.

``update-expectations`` takes several options:

--full  Overwrite all the expectation data for any tests that have a
        result in the passed log files, not just data for the same run
        configuration.

--disable-intermittent  When updating test results, disable tests that
                        have inconsistent results across many runs.
                        This option can be followed by a message giving
                        the reason why the test was disabled. If no
                        message is provided, ``unstable`` is the
                        default text.

--update-intermittent  When this option is used, the ``expected`` key
                       stores expected intermittent statuses in
                       addition to the primary expected status. If
                       there is more than one status, it appears as a
                       list. The default behaviour of this option is
                       to retain any existing intermittent statuses in
                       the list unless ``--remove-intermittent`` is
                       specified.

--remove-intermittent  This option is used in conjunction with
                       ``--update-intermittent``. When the
                       ``expected`` statuses are updated, any obsolete
                       intermittent statuses that did not occur in the
                       specified log files are removed from the list.

Property Configuration
~~~~~~~~~~~~~~~~~~~~~~

In cases where the expectation depends on the run configuration, ``wpt
update-expectations`` is able to generate conditional values. Because
the relevant variables depend on the range of configurations that need
to be covered, it's necessary to specify the list of configuration
variables that should be used. This is done using a JSON-format file
that can be specified with the ``--properties-file`` command line
argument to ``wpt update-expectations``. When this isn't supplied, the
defaults from ``<metadata root>/update_properties.json`` are used, if
present.

Properties File Format
++++++++++++++++++++++

The file is JSON formatted with two top-level keys:

:``properties``:
  A list of property names to consider for conditionals,
  e.g. ``["product", "os"]``.

:``dependents``:
  An optional dictionary containing properties that should only be
  used as "tie-breakers" when differentiation based on a top-level
  property alone has failed. This is useful when the dependent
  property is always more specific than the top-level property, but
  less understandable when used directly. For example, the ``version``
  property covering different OS versions is typically unique amongst
  different operating systems, but using it when the ``os`` property
  would do instead is likely to produce metadata that's too specific
  to the current configuration and more difficult to read. But where
  there are multiple versions of the same operating system with
  different results, it can be necessary. So specifying
  ``{"os": ["version"]}`` as a dependent property means that the
  ``version`` property will only be used if the condition already
  contains the ``os`` property and further conditions are required to
  separate the observed results.

So an example ``update_properties.json`` file might look like::

  {
    "properties": ["product", "os"],
    "dependents": {"product": ["browser_channel"], "os": ["version"]}
  }
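As a quick sanity check of such a file, the following illustrative
Python snippet (the ``check_properties_file`` helper is an invented
name, not part of wpt) loads it and verifies the structure described
above::

  import json

  def check_properties_file(path):
      """Check the structure described above: "properties" is a list
      of property names, and every key in the optional "dependents"
      mapping is itself listed in "properties"."""
      with open(path) as f:
          data = json.load(f)

      properties = data["properties"]
      assert isinstance(properties, list) and all(
          isinstance(prop, str) for prop in properties)

      for parent, extras in data.get("dependents", {}).items():
          assert parent in properties, f"{parent} is not a top-level property"
          assert isinstance(extras, list)

      return data

  # Example: check_properties_file("update_properties.json")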
Examples
~~~~~~~~

Update all the expectations from a set of cross-platform test runs::

  wpt update-expectations --full osx.log linux.log windows.log

Add expectation data for some new tests that are expected to be
platform-independent::

  wpt update-expectations tests.log

Why a Custom Format?
--------------------

Given the use of the metadata files in CI systems, it was desirable to
have something with the following properties:

 * Human readable

 * Human editable

 * Machine readable / writable

 * Capable of storing key-value pairs

 * Suitable for storing in a version control system (i.e. text-based)

The need for different results per platform means either having
multiple expectation files, one per platform, or having a way to
express conditional values within a single file. The former would be
rather cumbersome for humans updating the expectation files, so the
latter approach has been adopted, leading to one further requirement:

 * Capable of storing result values that are conditional on the platform.

There are few extant formats that clearly meet these requirements. In
particular, although conditional properties could be expressed in many
existing formats, the representation would likely be cumbersome and
error-prone for hand authoring. Therefore it was decided that a custom
format offered the best tradeoffs given the requirements.