366 lines
14 KiB
ReStructuredText
366 lines
14 KiB
ReStructuredText
Test Metadata
|
|
=============
|
|
|
|
Directory Layout
|
|
----------------
|
|
|
|
Metadata files must be stored under the ``metadata`` directory passed
|
|
to the test runner. The directory layout follows that of
|
|
web-platform-tests with each test source path having a corresponding
|
|
metadata file. Because the metadata path is based on the source file
|
|
path, files that generate multiple URLs e.g. tests with multiple
|
|
variants, or multi-global tests generated from an ``any.js`` input
|
|
file, share the same metadata file for all their corresponding
|
|
tests. The metadata path under the ``metadata`` directory is the same
|
|
as the source path under the ``tests`` directory, with an additional
|
|
``.ini`` suffix.
|
|
|
|
For example a test with URL::
|
|
|
|
/spec/section/file.html?query=param
|
|
|
|
generated from a source file with path::
|
|
|
|
<tests root>/spec/section.file.html
|
|
|
|
would have a metadata file ::
|
|
|
|
<metadata root>/spec/section/file.html.ini
|
|
|
|
As an optimisation, files which produce only default results
|
|
(i.e. ``PASS`` or ``OK``), and which don't have any other associated
|
|
metadata, don't require a corresponding metadata file.
|
|
|
|
Directory Metadata
|
|
~~~~~~~~~~~~~~~~~~
|
|
|
|
In addition to per-test metadata, default metadata can be applied to
|
|
all the tests in a given source location, using a ``__dir__.ini``
|
|
metadata file. For example to apply metadata to all tests under
|
|
``<tests root>/spec/`` add the metadata in ``<tests
|
|
root>/spec/__dir__.ini``.
|
|
|
|
Metadata Format
|
|
---------------
|
|
The format of the metadata files is based on the ini format. Files are
|
|
divided into sections, each (apart from the root section) having a
|
|
heading enclosed in square braces. Within each section are key-value
|
|
pairs. There are several notable differences from standard .ini files,
|
|
however:
|
|
|
|
* Sections may be hierarchically nested, with significant whitespace
|
|
indicating nesting depth.
|
|
|
|
* Only ``:`` is valid as a key/value separator
|
|
|
|
A simple example of a metadata file is::
|
|
|
|
root_key: root_value
|
|
|
|
[section]
|
|
section_key: section_value
|
|
|
|
[subsection]
|
|
subsection_key: subsection_value
|
|
|
|
[another_section]
|
|
another_key: [list, value]
|
|
|
|
Conditional Values
|
|
~~~~~~~~~~~~~~~~~~
|
|
|
|
In order to support values that depend on some external data, the
|
|
right hand side of a key/value pair can take a set of conditionals
|
|
rather than a plain value. These values are placed on a new line
|
|
following the key, with significant indentation. Conditional values
|
|
are prefixed with ``if`` and terminated with a colon, for example::
|
|
|
|
key:
|
|
if cond1: value1
|
|
if cond2: value2
|
|
value3
|
|
|
|
In this example, the value associated with ``key`` is determined by
|
|
first evaluating ``cond1`` against external data. If that is true,
|
|
``key`` is assigned the value ``value1``, otherwise ``cond2`` is
|
|
evaluated in the same way. If both ``cond1`` and ``cond2`` are false,
|
|
the unconditional ``value3`` is used.
|
|
|
|
Conditions themselves use a Python-like expression syntax. Operands
|
|
can either be variables, corresponding to data passed in, numbers
|
|
(integer or floating point; exponential notation is not supported) or
|
|
quote-delimited strings. Equality is tested using ``==`` and
|
|
inequality by ``!=``. The operators ``and``, ``or`` and ``not`` are
|
|
used in the expected way. Parentheses can also be used for
|
|
grouping. For example::
|
|
|
|
key:
|
|
if (a == 2 or a == 3) and b == "abc": value1
|
|
if a == 1 or b != "abc": value2
|
|
value3
|
|
|
|
Here ``a`` and ``b`` are variables, the value of which will be
|
|
supplied when the metadata is used.
|
|
|
|
Web-Platform-Tests Metadata
|
|
---------------------------
|
|
|
|
When used for expectation data, metadata files have the following format:
|
|
|
|
* A section per test URL provided by the corresponding source file,
|
|
with the section heading being the part of the test URL following
|
|
the last ``/`` in the path (this allows multiple tests in a single
|
|
metadata file with the same path part of the URL, but different
|
|
query parts). This may be omitted if there's no non-default
|
|
metadata for the test.
|
|
|
|
* A subsection per subtest, with the heading being the title of the
|
|
subtest. This may be omitted if there's no non-default metadata for
|
|
the subtest.
|
|
|
|
* The following known keys:
|
|
|
|
:expected:
|
|
The expectation value or values of each (sub)test. In
|
|
the case this value is a list, the first value represents the
|
|
typical expected test outcome, and subsequent values indicate
|
|
known intermittent outcomes e.g. ``expected: [PASS, ERROR]``
|
|
would indicate a test that usually passes but has a known-flaky
|
|
``ERROR`` outcome.
|
|
|
|
:disabled:
|
|
Any values apart from the special value ``@False``
|
|
indicates that the (sub)test is disabled and should either not be
|
|
run (for tests) or that its results should be ignored (subtests).
|
|
|
|
:restart-after:
|
|
Any value apart from the special value ``@False``
|
|
indicates that the runner should restart the browser after running
|
|
this test (e.g. to clear out unwanted state).
|
|
|
|
:fuzzy:
|
|
Used for reftests. This is interpreted as a list containing
|
|
entries like ``<meta name=fuzzy>`` content value, which consists of
|
|
an optional reference identifier followed by a colon, then a range
|
|
indicating the maximum permitted pixel difference per channel, then
|
|
semicolon, then a range indicating the maximum permitted total
|
|
number of differing pixels. The reference identifier is either a
|
|
single relative URL, resolved against the base test URL, in which
|
|
case the fuzziness applies to any comparison with that URL, or
|
|
takes the form lhs URL, comparison, rhs URL, in which case the
|
|
fuzziness only applies for any comparison involving that specific
|
|
pair of URLs. Some illustrative examples are given below.
|
|
|
|
:implementation-status:
|
|
One of the values ``implementing``,
|
|
``not-implementing`` or ``backlog``. This is used in conjunction
|
|
with the ``--skip-implementation-status`` command line argument to
|
|
``wptrunner`` to ignore certain features where running the test is
|
|
low value.
|
|
|
|
:tags:
|
|
A list of labels associated with a given test that can be
|
|
used in conjunction with the ``--tag`` command line argument to
|
|
``wptrunner`` for test selection.
|
|
|
|
In addition there are extra arguments which are currently tied to
|
|
specific implementations. For example Gecko-based browsers support
|
|
``min-asserts``, ``max-asserts``, ``prefs``, ``lsan-disabled``,
|
|
``lsan-allowed``, ``lsan-max-stack-depth``, ``leak-allowed``, and
|
|
``leak-threshold`` properties.
|
|
|
|
* Variables taken from the ``RunInfo`` data which describe the
|
|
configuration of the test run. Common properties include:
|
|
|
|
:product: A string giving the name of the browser under test
|
|
:browser_channel: A string giving the release channel of the browser under test
|
|
:debug: A Boolean indicating whether the build is a debug build
|
|
:os: A string the operating system
|
|
:version: A string indicating the particular version of that operating system
|
|
:processor: A string indicating the processor architecture.
|
|
|
|
This information is typically provided by :py:mod:`mozinfo`, but
|
|
different environments may add additional information, and not all
|
|
the properties above are guaranteed to be present in all
|
|
environments. The definitive list of available properties for a
|
|
specific run may be determined by looking at the ``run_info`` key
|
|
in the ``wptreport.json`` output for the run.
|
|
|
|
* Top level keys are taken as defaults for the whole file. So, for
|
|
example, a top level key with ``expected: FAIL`` would indicate
|
|
that all tests and subtests in the file are expected to fail,
|
|
unless they have an ``expected`` key of their own.
|
|
|
|
An simple example metadata file might look like::
|
|
|
|
[test.html?variant=basic]
|
|
type: testharness
|
|
|
|
[Test something unsupported]
|
|
expected: FAIL
|
|
|
|
[Test with intermittent statuses]
|
|
expected: [PASS, TIMEOUT]
|
|
|
|
[test.html?variant=broken]
|
|
expected: ERROR
|
|
|
|
[test.html?variant=unstable]
|
|
disabled: http://test.bugs.example.org/bugs/12345
|
|
|
|
A more complex metadata file with conditional properties might be::
|
|
|
|
[canvas_test.html]
|
|
expected:
|
|
if os == "mac": FAIL
|
|
if os == "windows" and version == "XP": FAIL
|
|
PASS
|
|
|
|
Note that ``PASS`` in the above works, but is unnecessary since it's
|
|
the default expected result.
|
|
|
|
A metadata file with fuzzy reftest values might be::
|
|
|
|
[reftest.html]
|
|
fuzzy: [10;200, ref1.html:20;200-300, subtest1.html==ref2.html:10-15;20]
|
|
|
|
In this case the default fuzziness for any comparison would be to
|
|
require a maximum difference per channel of less than or equal to 10
|
|
and less than or equal to 200 total pixels different. For any
|
|
comparison involving ref1.html on the right hand side, the limits
|
|
would instead be a difference per channel not more than 20 and a total
|
|
difference count of not less than 200 and not more than 300. For the
|
|
specific comparison ``subtest1.html == ref2.html`` (both resolved against
|
|
the test URL) these limits would instead be 10 to 15 and 0 to 20,
|
|
respectively.
|
|
|
|
Generating Expectation Files
|
|
----------------------------
|
|
|
|
wpt provides the tool ``wpt update-expectations`` command to generate
|
|
expectation files from the results of a set of test runs. The basic
|
|
syntax for this is::
|
|
|
|
./wpt update-expectations [options] [logfile]...
|
|
|
|
Each ``logfile`` is a wptreport log file from a previous run. These
|
|
can be generated from wptrunner using the ``--log-wptreport`` option
|
|
e.g. ``--log-wptreport=wptreport.json``.
|
|
|
|
``update-expectations`` takes several options:
|
|
|
|
--full Overwrite all the expectation data for any tests that have a
|
|
result in the passed log files, not just data for the same run
|
|
configuration.
|
|
|
|
--disable-intermittent When updating test results, disable tests that
|
|
have inconsistent results across many
|
|
runs. This can precede a message providing a
|
|
reason why that test is disable. If no message
|
|
is provided, ``unstable`` is the default text.
|
|
|
|
--update-intermittent When this option is used, the ``expected`` key
|
|
stores expected intermittent statuses in
|
|
addition to the primary expected status. If
|
|
there is more than one status, it appears as a
|
|
list. The default behaviour of this option is to
|
|
retain any existing intermittent statuses in the
|
|
list unless ``--remove-intermittent`` is
|
|
specified.
|
|
|
|
--remove-intermittent This option is used in conjunction with
|
|
``--update-intermittent``. When the
|
|
``expected`` statuses are updated, any obsolete
|
|
intermittent statuses that did not occur in the
|
|
specified log files are removed from the list.
|
|
|
|
Property Configuration
|
|
~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
In cases where the expectation depends on the run configuration ``wpt
|
|
update-expectations`` is able to generate conditional values. Because
|
|
the relevant variables depend on the range of configurations that need
|
|
to be covered, it's necessary to specify the list of configuration
|
|
variables that should be used. This is done using a ``json`` format
|
|
file that can be specified with the ``--properties-file`` command line
|
|
argument to ``wpt update-expectations``. When this isn't supplied the
|
|
defaults from ``<metadata root>/update_properties.json`` are used, if
|
|
present.
|
|
|
|
Properties File Format
|
|
++++++++++++++++++++++
|
|
|
|
The file is JSON formatted with two top-level keys:
|
|
|
|
:``properties``:
|
|
A list of property names to consider for conditionals
|
|
e.g ``["product", "os"]``.
|
|
|
|
:``dependents``:
|
|
An optional dictionary containing properties that
|
|
should only be used as "tie-breakers" when differentiating based on a
|
|
specific top-level property has failed. This is useful when the
|
|
dependent property is always more specific than the top-level
|
|
property, but less understandable when used directly. For example the
|
|
``version`` property covering different OS versions is typically
|
|
unique amongst different operating systems, but using it when the
|
|
``os`` property would do instead is likely to produce metadata that's
|
|
too specific to the current configuration and more difficult to
|
|
read. But where there are multiple versions of the same operating
|
|
system with different results, it can be necessary. So specifying
|
|
``{"os": ["version"]}`` as a dependent property means that the
|
|
``version`` property will only be used if the condition already
|
|
contains the ``os`` property and further conditions are required to
|
|
separate the observed results.
|
|
|
|
So an example ``update-properties.json`` file might look like::
|
|
|
|
{
|
|
"properties": ["product", "os"],
|
|
"dependents": {"product": ["browser_channel"], "os": ["version"]}
|
|
}
|
|
|
|
Examples
|
|
~~~~~~~~
|
|
|
|
Update all the expectations from a set of cross-platform test runs::
|
|
|
|
wpt update-expectations --full osx.log linux.log windows.log
|
|
|
|
Add expectation data for some new tests that are expected to be
|
|
platform-independent::
|
|
|
|
wpt update-expectations tests.log
|
|
|
|
Why a Custom Format?
|
|
--------------------
|
|
|
|
Introduction
|
|
------------
|
|
|
|
Given the use of the metadata files in CI systems, it was desirable to
|
|
have something with the following properties:
|
|
|
|
* Human readable
|
|
|
|
* Human editable
|
|
|
|
* Machine readable / writable
|
|
|
|
* Capable of storing key-value pairs
|
|
|
|
* Suitable for storing in a version control system (i.e. text-based)
|
|
|
|
The need for different results per platform means either having
|
|
multiple expectation files for each platform, or having a way to
|
|
express conditional values within a certain file. The former would be
|
|
rather cumbersome for humans updating the expectation files, so the
|
|
latter approach has been adopted, leading to the requirement:
|
|
|
|
* Capable of storing result values that are conditional on the platform.
|
|
|
|
There are few extant formats that clearly meet these requirements. In
|
|
particular although conditional properties could be expressed in many
|
|
existing formats, the representation would likely be cumbersome and
|
|
error-prone for hand authoring. Therefore it was decided that a custom
|
|
format offered the best tradeoffs given the requirements.
|