summaryrefslogtreecommitdiffstats
path: root/dom/webgpu/tests/cts/checkout/docs/organization.md
blob: fd7020afd623713e2f9d30d712594a429ebf62bd (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
# Test Organization

## `src/webgpu/`

Because of the glorious amount of test needed, the WebGPU CTS is organized as a tree of arbitrary
depth (a filesystem with multiple tests per file).

Each directory may have a `README.txt` describing its contents.
Tests are grouped in large families (each of which has a `README.txt`);
the root and first few levels looks like the following (some nodes omitted for simplicity):

- **`api`** with tests for full coverage of the Javascript API surface of WebGPU.
    - **`validation`** with positive and negative tests for all the validation rules of the API.
    - **`operation`** with tests that checks the result of performing valid WebGPU operations,
      taking advantage of parametrization to exercise interactions between parts of the API.
    - **`regression`** for one-off tests that reproduce bugs found in implementations to prevent
      the bugs from appearing again.
- **`shader`** with tests for full coverage of the shaders that can be passed to WebGPU.
    - **`validation`**.
    - **`execution`** similar to `api/operation`.
    - **`regression`**.
- **`idl`** with tests to check that the WebGPU IDL is correctly implemented, for examples that
  objects exposed exactly the correct members, and that methods throw when passed incomplete
  dictionaries.
- **`web-platform`** with tests for Web platform-specific interactions like `GPUSwapChain` and
  `<canvas>`, WebXR and `GPUQueue.copyExternalImageToTexture`.

At the same time test hierarchies can be used to split the testing of a single sub-object into
several file for maintainability. For example `GPURenderPipeline` has a large descriptor and some
parts could be tested independently like `vertex_input` vs. `primitive_topology` vs. `blending`
but all live under the `render_pipeline` directory.

In addition to the test tree, each test can be parameterized. For coverage it is important to
test all enums values, for example for `GPUTextureFormat`. Instead of having a loop to iterate
over all the `GPUTextureFormat`, it is better to parameterize the test over them. Each format
will have a different entry in the test list which will help WebGPU implementers debug the test,
or suppress the failure without losing test coverage while they fix the bug.

Extra capabilities (limits and features) are often tested in the same files as the rest of the API.
For example, a compressed texture format capability would simply add a `GPUTextureFormat` to the
parametrization lists of many tests, while a capability adding significant new functionality
like ray-tracing could have a separate subtree.

Operation tests for optional features should be skipped using `t.selectDeviceOrSkipTestCase()` or
`t.skip()`. Validation tests should be written that test the behavior with and without the
capability enabled via `t.selectDeviceOrSkipTestCase()`, to ensure the functionality is valid
only with the capability enabled.

### Validation tests

Validation tests check the validation rules that are (or will be) set by the
WebGPU spec. Validation tests try to carefully trigger the individual validation
rules in the spec, without simultaneously triggering other rules.

Validation errors *generally* generate WebGPU errors, not exceptions.
But check the spec on a case-by-case basis.

Like all `GPUTest`s, `ValidationTest`s are wrapped in both types of error scope. These
"catch-all" error scopes look for any errors during the test, and report them as test failures.
Since error scopes can be nested, validation tests can nest an error scope to expect that there
*are* errors from specific operations.

#### Parameterization

Test parameterization can help write many validation tests more succinctly,
while making it easier for both authors and reviewers to be confident that
an aspect of the API is tested fully. Examples:

- [`webgpu:api,validation,render_pass,resolve:resolve_attachment:*`](https://github.com/gpuweb/cts/blob/ded3b7c8a4680a1a01621a8ac859facefadf32d0/src/webgpu/api/validation/render_pass/resolve.spec.ts#L35)
- [`webgpu:api,validation,createBindGroupLayout:bindingTypeSpecific_optional_members:*`](https://github.com/gpuweb/cts/blob/ded3b7c8a4680a1a01621a8ac859facefadf32d0/src/webgpu/api/validation/createBindGroupLayout.spec.ts#L68)

Use your own discretion when deciding the balance between heavily parameterizing
a test and writing multiple separate tests.

#### Guidelines

There are many aspects that should be tested in all validation tests:

- each individual argument to a method call (including `this`) or member of a descriptor
  dictionary should be tested including:
    - what happens when an error object is passed.
    - what happens when an optional feature enum or method is used.
    - what happens for numeric values when they are at 0, too large, too small, etc.
- each validation rule in the specification should be checked both with a control success case,
  and error cases.
- each set of arguments or state that interact for validation.

When testing numeric values, it is important to check on both sides of the boundary: if the error
happens for value N and not N - 1, both should be tested. Alignment of integer values should also
be tested but boundary testing of alignment should be between a value aligned to 2^N and a value
aligned to 2^(N-1).

Finally, this is probably also where we would test that extensions follow the rule that: if the
browser supports a feature but it is not enabled on the device, then calling methods from that
feature throws `TypeError`.

- Test providing unknown properties *that are definitely not part of any feature* are
  valid/ignored. (Unfortunately, due to the rules of IDL, adding a member to a dictionary is
  always a breaking change. So this is how we have to test this unless we can get a "strict"
  dictionary type in IDL. We can't test adding members from non-enabled extensions.)

### Operation tests

Operation tests test the actual results of using the API. They execute
(sometimes significant) code and check that the result is within the expected
set of behaviors (which can be quite complex to compute).

Note that operation tests need to test a lot of interactions between different
parts of the API, and so can become quite complex. Try to reduce the complexity by
utilizing combinatorics and [helpers](./helper_index.txt), and splitting/merging test files as needed.

#### Errors

Operation tests are usually `GPUTest`s. As a result, they automatically fail on any validation
errors that occur during the test.

When it's easier to write an operation test with invalid cases, use
`ParamsBuilder.filter`/`.unless` to avoid invalid cases, or detect and
`expect` validation errors in some cases.

#### Implementation

Use helpers like `expectContents` (and more to come) to check the values of data on the GPU.
(These are "eventual expectations" - the harness will wait for them to finish at the end).

When testing something inside a shader, it's not always necessary to output the result to a
render output. In fragment shaders, you can output to a storage buffer. In vertex shaders, you
can't - but you can render with points (simplest), send the result to the fragment shader, and
output it from there. (Someday, we may end up wanting a helper for this.)

#### Testing Default Values

Default value tests (for arguments and dictionary members) should usually be operation tests -
all you have to do is include `undefined` in parameterizations of other tests to make sure the
behavior with `undefined` has the same expected result that you have when the default value is
specified explicitly.

### IDL tests

TODO: figure out how to implement these. https://github.com/gpuweb/cts/issues/332

These tests test only rules that come directly from WebIDL. For example:

- Values out of range for `[EnforceRange]` cause exceptions.
- Required function arguments and dictionary members cause exceptions if omitted.
- Arguments and dictionary members cause exceptions if passed the wrong type.

They may also test positive cases like the following, but the behavior of these should be tested in
operation tests.

- OK to omit optional arguments/members.
- OK to pass the correct argument/member type (or of any type in a union type).

Every overload of every method should be tested.

## `src/stress/`, `src/manual/`

Stress tests and manual tests for WebGPU that are not intended to be run in an automated way.

## `src/unittests/`

Unit tests for the test framework (`src/common/framework/`).

## `src/demo/`

A demo of test hierarchies for the purpose of testing the `standalone` test runner page.