From 36d22d82aa202bb199967e9512281e9a53db42c9 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sun, 7 Apr 2024 21:33:14 +0200 Subject: Adding upstream version 115.7.0esr. Signed-off-by: Daniel Baumann --- dom/webgpu/tests/cts/checkout/docs/build.md | 43 ++ dom/webgpu/tests/cts/checkout/docs/deno.md | 24 + dom/webgpu/tests/cts/checkout/docs/fp_primer.md | 516 +++++++++++++++++++++ .../tests/cts/checkout/docs/helper_index.txt | 92 ++++ dom/webgpu/tests/cts/checkout/docs/implementing.md | 97 ++++ dom/webgpu/tests/cts/checkout/docs/intro/README.md | 99 ++++ .../cts/checkout/docs/intro/convert_to_issue.png | Bin 0 -> 2061 bytes .../tests/cts/checkout/docs/intro/developing.md | 134 ++++++ .../tests/cts/checkout/docs/intro/life_of.md | 46 ++ dom/webgpu/tests/cts/checkout/docs/intro/plans.md | 82 ++++ dom/webgpu/tests/cts/checkout/docs/intro/tests.md | 25 + dom/webgpu/tests/cts/checkout/docs/organization.md | 166 +++++++ dom/webgpu/tests/cts/checkout/docs/reviews.md | 70 +++ dom/webgpu/tests/cts/checkout/docs/terms.md | 270 +++++++++++ 14 files changed, 1664 insertions(+) create mode 100644 dom/webgpu/tests/cts/checkout/docs/build.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/deno.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/fp_primer.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/helper_index.txt create mode 100644 dom/webgpu/tests/cts/checkout/docs/implementing.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/intro/README.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/intro/convert_to_issue.png create mode 100644 dom/webgpu/tests/cts/checkout/docs/intro/developing.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/intro/life_of.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/intro/plans.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/intro/tests.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/organization.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/reviews.md create mode 100644 dom/webgpu/tests/cts/checkout/docs/terms.md (limited to 'dom/webgpu/tests/cts/checkout/docs') diff --git a/dom/webgpu/tests/cts/checkout/docs/build.md b/dom/webgpu/tests/cts/checkout/docs/build.md new file mode 100644 index 0000000000..2d7b2f968c --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/build.md @@ -0,0 +1,43 @@ +# Building + +Building the project is not usually needed for local development. +However, for exports to WPT, or deployment (https://gpuweb.github.io/cts/), +files can be pre-generated. + +The project builds into two directories: + +- `out/`: Built framework and test files, needed to run standalone or command line. +- `out-wpt/`: Build directory for export into WPT. Contains: + - An adapter for running WebGPU CTS tests under WPT + - A copy of the needed files from `out/` + - A copy of any `.html` test cases from `src/` + +To build and run all pre-submit checks (including type and lint checks and +unittests), use: + +```sh +npm test +``` + +For checks only: + +```sh +npm run check +``` + +For a quicker iterative build: + +```sh +npm run standalone +``` + +## Run + +To serve the built files (rather than using the dev server), run `npx grunt serve`. + +## Export to WPT + +Run `npm run wpt`. + +Copy (or symlink) the `out-wpt/` directory as the `webgpu/` directory in your +WPT checkout or your browser's "internal" WPT test directory. 
diff --git a/dom/webgpu/tests/cts/checkout/docs/deno.md b/dom/webgpu/tests/cts/checkout/docs/deno.md new file mode 100644 index 0000000000..22a54c79bd --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/deno.md @@ -0,0 +1,24 @@ +# Running the CTS on Deno + +Since version 1.8, Deno experimentally implements the WebGPU API out of the box. +You can use the `./tools/deno` script to run the CTS in Deno. To do this you +will first need to install Deno: [stable](https://deno.land#installation), or +build the main branch from source +(`cargo install --git https://github.com/denoland/deno --bin deno`). + +On macOS and recent Linux, you can just run `./tools/run_deno` as is. On Windows and +older Linux releases you will need to run +`deno run --unstable --allow-read --allow-write --allow-env ./tools/deno`. + +## Usage + +``` +Usage: + tools/run_deno [OPTIONS...] QUERIES... + tools/run_deno 'unittests:*' 'webgpu:buffers,*' +Options: + --verbose Print result/log of every test as it runs. + --debug Include debug messages in logging. + --print-json Print the complete result JSON in the output. + --expectations Path to expectations file. +``` diff --git a/dom/webgpu/tests/cts/checkout/docs/fp_primer.md b/dom/webgpu/tests/cts/checkout/docs/fp_primer.md new file mode 100644 index 0000000000..234a43de40 --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/fp_primer.md @@ -0,0 +1,516 @@ +# Floating Point Primer + +This document is meant to be a primer of the concepts related to floating point +numbers that are needed to be understood when working on tests in WebGPU's CTS. + +WebGPU's CTS is responsible for testing if implementations of WebGPU are +conformant to the spec, and thus interoperable with each other. + +Floating point math makes up a significant portion of the WGSL spec, and has +many subtle corner cases to get correct. + +Additionally, floating point math, unlike integer math, is broadly not exact, so +how inaccurate a calculation is allowed to be is required to be stated in the +spec and tested in the CTS, as opposed to testing for a singular correct +response. + +Thus, the WebGPU CTS has a significant amount of machinery around how to +correctly test floating point expectations in a fluent manner. + +## Floating Point Numbers + +For the context of this discussion floating point numbers, fp for short, are +single precision IEEE floating point numbers, f32 for short. + +Details of how this format works are discussed as needed below, but for a more +involved discussion, please see the references in the Resources sections. + +Additionally, in the Appendix there is a table of interesting/common values that +are often referenced in tests or this document. + +*In the future support for f16 and abstract floats will be added to the CTS, and +this document will need to be updated.* + +Floating point numbers are effectively lossy compression of the infinite number +of possible values over their range down to 32-bits of distinct points. + +This means that not all numbers in the range can be exactly represented as a f32. + +For example, the integer `1` is exactly represented as `0x3f800000`, but the next +nearest number `0x3f800001` is `1.00000011920928955`. + +So any number between `1` and `1.00000011920928955` is not exactly represented +as a f32 and instead is approximated as either `1` or `1.00000011920928955`. 
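This is easy to see from TypeScript, since a `Float32Array` and a `Uint32Array` can share the same buffer. The snippet below is purely illustrative (it is not one of the CTS helpers):

```ts
// View the bits of a f32 value directly.
const buffer = new ArrayBuffer(4);
const f32 = new Float32Array(buffer);
const u32 = new Uint32Array(buffer);

f32[0] = 1;
console.log(u32[0].toString(16)); // '3f800000', the exact encoding of 1

u32[0] += 1; // step to the next representable f32 value
console.log(f32[0]); // ~1.00000011920928955

// Any real number strictly between these two values has no exact f32
// encoding, and must be approximated by one of them.
```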
+ +When a number X is not exactly represented by a f32 value, there are normally +two neighbouring numbers that could reasonably represent X: the nearest f32 +value above X, and the nearest f32 value below X. Which of these values gets +used is dictated by the rounding mode being used, which may be something like +always round towards 0 or go to the nearest neighbour, or something else +entirely. + +The process of converting numbers between precisions, like non-f32 to f32, is +called quantization. WGSL does not prescribe a specific rounding mode when +quantizing, so either of the neighbouring values is considered valid +when converting a non-exactly representable value to f32. This has significant +implications on the CTS that are discussed later. + +From here on, we assume you are familiar with the internal structure of a f32 +value: a sign bit, a biased exponent, and a mantissa. For reference, see +[float32 on Wikipedia](https://en.wikipedia.org/wiki/Single-precision_floating-point_format) + +In the f32 format as described above, there are two possible zero values, one +with all bits being 0, called positive zero, and one all the same except with +the sign bit being 1, called negative zero. + +For WGSL, and thus the CTS's purposes, these values are considered equivalent. +Typescript, which the CTS is written in, treats all zeros as positive zeros, +unless you explicitly escape hatch to differentiate between them, so most of the +time there being two zeros doesn't materially affect code. + +### Normals + +Normal numbers are floating point numbers whose biased exponent is not all 0s or +all 1s. For WGSL these numbers behave as you expect for floating point values +with no interesting caveats. + +### Subnormals + +Subnormal numbers are numbers whose biased exponent is all 0s, also called +denorms. + +These are the closest numbers to zero, both positive and negative, and fill in +the gap between the normal numbers with smallest magnitude, and 0. + +Some devices, for performance reasons, do not handle operations on the +subnormal numbers, and instead treat them as being zero, this is called *flush +to zero* or FTZ behaviour. + +This means in the CTS that when a subnormal number is consumed or produced by an +operation, an implementation may choose to replace it with zero. + +Like the rounding mode for quantization, this adds significant complexity to the +CTS, which will be discussed later. + +### Inf & NaNs + +Floating point numbers include positive and negative infinity to represent +values that are out of the bounds supported by the current precision. + +Implementations may assume that infinities are not present. When an evaluation +would produce an infinity, an undefined value is produced instead. + +Additionally, when a calculation would produce a finite value outside the +bounds of the current precision, the implementation may convert that value to +either an infinity with same sign, or the min/max representable value as +appropriate. + +The CTS encodes the least restrictive interpretation of the rules in the spec, +i.e. assuming someone has made a slightly adversarial implementation that always +chooses the thing with the least accuracy. + +This means that the above rules about infinities combine to say that any time an +out of bounds value is seen, any finite value is acceptable afterwards. + +This is because the out of bounds value may be converted to an infinity and then +an undefined value can be used instead of the infinity. 
+ +This is actually a significant boon for the CTS implementation, because it short +circuits a bunch of complexity about clamping to edge values and handling +infinities. + +Signaling NaNs are treated as quiet NaNs in the WGSL spec. And quiet NaNs have +the same "may-convert-to-undefined-value" behaviour that infinities have, so for +the purpose of the CTS they are handled by the infinite/out of bounds logic +normally. + +## Notation/Terminology + +When discussing floating point values in the CTS, there are a few terms used +with precise meanings, which will be elaborated here. + +Additionally, any specific notation used will be specified here to avoid +confusion. + +### Operations + +The CTS tests for the proper execution of f32 builtins, i.e. sin, sqrt, abs, +etc, and expressions, i.e. *, /, <, etc. These collectively can be referred to +as f32 operations. + +Operations, which can be thought of as mathematical functions, are mappings from +a set of inputs to a set of outputs. + +Denoted `f(x, y) = X`, where f is a placeholder or the name of the operation, +lower case variables are the inputs to the function, and uppercase variables are +the outputs of the function. + +Operations have one or more inputs and an output. Being a f32 operation means +that the primary space for input and output values is f32, but there is some +flexibility in this definition. For example, operations with values being +restricted to a subset of integers that are representable as f32 are often +referred to as being f32 based. + +Values are generally floats, integers, booleans, vectors, and matrices. Consult +the WGSL spec for the exact list of types and their definitions. + +For composite outputs where there are multiple values being returned, there is a +single result value made of structured data, whereas inputs handle this by +having multiple input parameters. + +Some examples of different types of operations: + +`multiplication(x, y) = X`, which represents the WGSL expression `x * y`, takes +in f32 values, `x` and `y`, and produces a f32 value `X`. + +`lessThan(x, y) = X`, which represents the WGSL expression `x < y`, again takes +in f32 values, but in this case returns a boolean value. + +`ldexp(x, y) = X`, which builds a f32 value, takes in a f32 value `x` and a +restricted integer `y`. + +### Domain, Range, and Intervals + +For an operation `f(x) = X`, the interval of valid values for the input, `x`, is +called the *domain*, and the interval for valid results, `X`, is called the +*range*. + +An interval, `[a, b]`, is a set of real numbers that contains `a`, `b`, and all +the real numbers between them. + +Open-ended intervals, i.e. ones that don't include `a` and/or `b`, are avoided, +and are called out explicitly when they occur. + +The convention in this doc and the CTS code is that `a <= b`, so `a` can be +referred to as the beginning of the interval and `b` as the end of the interval. + +When talking about intervals, this doc and the code endeavours to avoid using +the term **range** to refer to the span of values that an interval covers, +instead using the term bounds to avoid confusion of terminology around output of +operations. + +## Accuracy + +As mentioned above, floating point numbers are not able to represent all the +possible values over their bounds, but instead represent discrete values in that +interval, and approximate the remainder.
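For instance, quantizing a few familiar constants to f32 (here with TypeScript's `Math.fround`, purely for illustration) shows which values survive exactly and which get approximated:

```ts
// Math.fround quantizes a JS number (f64) to the nearest f32 value.
console.log(Math.fround(0.25)); // 0.25 - exactly representable, unchanged
console.log(Math.fround(0.1)); // 0.10000000149011612 - approximated
console.log(Math.fround(1 / 3)); // 0.3333333432674408 - approximated
```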
+ +Additionally, floating point numbers are not evenly distributed over the real +number line, but instead are clustered closer together near zero, and further +apart as their magnitudes grow. + +When discussing operations on floating point numbers, there is often reference +to a true value. This is the value that given no performance constraints and +infinite precision you would get, i.e `acos(1) = π`, where π has infinite +digits of precision. + +For the CTS it is often sufficient to calculate the true value using TypeScript, +since its native number format is higher precision (double-precision/f64), and +all f32 values can be represented in it. + +The true value is sometimes representable exactly as a f32 value, but often is +not. + +Additionally, many operations are implemented using approximations from +numerical analysis, where there is a tradeoff between the precision of the +result and the cost. + +Thus, the spec specifies what the accuracy constraints for specific operations +is, how close to truth an implementation is required to be, to be +considered conformant. + +There are 5 different ways that accuracy requirements are defined in the spec: + +1. *Exact* + + This is the situation where it is expected that true value for an operation + is always expected to be exactly representable. This doesn't happen for any + of the operations that return floating point values, but does occur for + logical operations that return boolean values. + + +2. *Correctly Rounded* + + For the case that the true value is exactly representable as a f32, this is + the equivalent of exactly from above. In the event that the true value is not + exact, then the acceptable answer for most numbers is either the nearest f32 + above or the nearest f32 below the true value. + + For values near the subnormal range, e.g. close to zero, this becomes more + complex, since an implementation may FTZ at any point. So if the exact + solution is subnormal or either of the neighbours of the true value are + subnormal, zero becomes a possible result, thus the acceptance interval is + wider than naively expected. + + +3. *Absolute Error* + + This type of accuracy specifies an error value, ε, and the calculated result + is expected to be within that distance from the true value, i.e. + `[ X - ε, X + ε ]`. + + The main drawback with this manner of specifying accuracy is that it doesn't + scale with the level of precision in floating point numbers themselves at a + specific value. Thus, it tends to be only used for specifying accuracy over + specific limited intervals, i.e. [-π, π]. + + +4. *Units of Least Precision (ULP)* + + The solution to the issue of not scaling with precision of floating point is + to use units of least precision. + + ULP(X) is min (b-a) over all pairs (a,b) of representable floating point + numbers such that (a <= X <= b and a =/= b). For a more formal discussion of + ULP see + [On the definition of ulp(x)](https://hal.inria.fr/inria-00070503/document). + + n * ULP or nULP means `[X - n * ULP @ X, X + n * ULP @ X]`. + + +5. *Inherited* + + When an operation's accuracy is defined in terms of other operations, then + its accuracy is said to be inherited. Handling of inherited accuracies is + one of the main driving factors in the design of testing framework, so will + need to be discussed in detail. 
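As a rough illustration of the ULP definition above, the distance to the neighbouring f32 values can be computed by stepping the bit pattern. This is only a sketch with hypothetical helper names; the CTS's actual implementation differs and handles more edge cases:

```ts
const f32 = new Float32Array(1);
const u32 = new Uint32Array(f32.buffer);

// Next representable f32 value above or below a finite, non-zero f32 value x.
// Zero, infinity, and NaN handling is omitted for brevity.
function nextAfterF32(x: number, up: boolean): number {
  f32[0] = x;
  u32[0] += (x > 0) === up ? 1 : -1;
  return f32[0];
}

// ULP(x): min(b - a) over representable pairs a <= x <= b with a !== b.
function ulpF32(x: number): number {
  f32[0] = x; // quantize first, so the value itself is representable
  const q = f32[0];
  return Math.min(nextAfterF32(q, true) - q, q - nextAfterF32(q, false));
}

// An n-ULP acceptance interval around a true value X.
function ulpInterval(X: number, n: number): [number, number] {
  return [X - n * ulpF32(X), X + n * ulpF32(X)];
}

console.log(ulpF32(1)); // ~5.96e-8 (2^-24, the gap just below 1)
console.log(ulpInterval(1, 3)); // [~0.99999982, ~1.00000018]
```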
+ +## Acceptance Intervals + +The first four accuracy types; Exact, Correctly Rounded, Absolute Error, and +ULP, sometimes called simple accuracies, can be defined in isolation from each +other, and by association can be implemented using relatively independent +implementations. + +The original implementation of the floating point framework did this as it was +being built out, but ran into difficulties when defining the inherited +accuracies. + +For examples, `tan(x) inherits from sin(x)/cos(x)`, one can take the defined +rules and manually build up a bespoke solution for checking the results, but +this is tedious, error-prone, and doesn't allow for code re-use. + +Instead, it would be better if there was a single conceptual framework that one +can express all the 'simple' accuracy requirements in, and then have a mechanism +for composing them to define inherited accuracies. + +In the WebGPU CTS this is done via the concept of acceptance intervals, which is +derived from a similar concept in the Vulkan CTS, though implemented +significantly differently. + +The core of this idea is that each of different accuracy types can be integrated +into the definition of the operation, so that instead of transforming an input +from the domain to a point in the range, the operation is producing an interval +in the range, that is the acceptable values an implementation may emit. + + +The simple accuracies can be defined as follows: + +1. *Exact* + + `f(x) => [X, X]` + + +2. *Correctly Rounded* + + If `X` is precisely defined as a f32 + + `f(x) => [X, X]` + + otherwise, + + `[a, b]` where `a` is the largest representable number with `a <= X`, and `b` + is the smallest representable number with `X <= b` + + +3. *Absolute Error* + + `f(x) => [ X - ε, X + ε ]`, where ε is the absolute error value + + +4. **ULP Error** + + `f(x) = X => [X - n*ULP(X), X + n*ULP(X)]` + +As defined, these definitions handle mapping from a point in the domain into an +interval in the range. + +This is insufficient for implementing inherited accuracies, since inheritance +sometimes involve mapping domain intervals to range intervals. + +Here we use the convention for naturally extending a function on real numbers +into a function on intervals of real numbers, i.e. `f([a, b]) = [A, B]`. + +Given that floating point numbers have a finite number of precise values for any +given interval, one could implement just running the accuracy computation for +every point in the interval and then spanning together the resultant intervals. +That would be very inefficient though and make your reviewer sad to read. + +For mapping intervals to intervals the key insight is that we only need to be +concerned with the extrema of the operation in the interval, since the +acceptance interval is the bounds of the possible outputs. + +In more precise terms: +``` + f(x) => X, x = [a, b] and X = [A, B] + + X = [min(f(x)), max(f(x))] + X = [min(f([a, b])), max(f([a, b]))] + X = [f(m), f(M)] +``` +where m and M are in `[a, b]`, `m <= M`, and produce the min and max results +for `f` on the interval, respectively. + +So how do we find the minima and maxima for our operation in the domain? + +The common general solution for this requires using calculus to calculate the +derivative of `f`, `f'`, and then find the zeroes `f'` to find inflection +points of `f`. + +This solution wouldn't be sufficient for all builtins, i.e. `step` which is not +differentiable at 'edge' values. 
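Before getting to how the CTS sidesteps that problem, note that the four simple accuracies above, already written as point-to-interval rules, translate almost directly into code. The sketch below uses illustrative names rather than the CTS's actual interval machinery, ignores FTZ and out-of-bounds handling, and reuses the hypothetical `nextAfterF32`/`ulpInterval` helpers from the earlier sketch:

```ts
type IntervalBounds = [number, number]; // [begin, end] with begin <= end

// Correctly rounded: either f32 neighbour of the true value X is acceptable.
function correctlyRoundedInterval(X: number): IntervalBounds {
  const q = Math.fround(X);
  if (q === X) return [X, X]; // exactly representable
  return q < X ? [q, nextAfterF32(q, true)] : [nextAfterF32(q, false), q];
}

// Absolute error: anything within epsilon of the true value is acceptable.
function absoluteErrorInterval(X: number, epsilon: number): IntervalBounds {
  return [X - epsilon, X + epsilon];
}

// n-ULP error: see ulpInterval in the earlier sketch.
```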
+ +Thankfully we do not need a general solution for the CTS, since all the builtin +operations are defined in the spec, so `f` is from a known set of options. + +These operations can be divided into two broad categories: monotonic, and +non-monotonic, with respect to an interval. + +The monotonic operations are ones that preserve the order of inputs in their +outputs (or reverse it). Their graph only ever decreases or increases, +never changing from one or the other, though it can have flat sections. + +The non-monotonic operations are ones whose graph would have both regions of +increase and decrease. + +The monotonic operations, when mapping an interval to an interval, are simple to +handle, since the extrema are guaranteed to be the ends of the domain, `a` and `b`. + +So `f([a, b])` = `[f(a), f(b)]` or `[f(b), f(a)]`. We could figure out if `f` is +increasing or decreasing beforehand to determine if it should be `[f(a), f(b)]` +or `[f(b), f(a)]`. + +It is simpler to just use min & max to have an implementation that is agnostic +to the details of `f`. +``` +  A = f(a), B = f(b) +  X = [min(A, B), max(A, B)] +``` + +The non-monotonic functions that we need to handle for interval-to-interval +mappings are more complex. Thankfully there are only a small number of the overall +operations that need to be handled, since they are only the operations that are +used in an inherited accuracy and take in the output of another operation as +part of that inherited accuracy. + +So in the CTS we just have bespoke implementations for each of them. + +Part of the operation definition in the CTS is a function that takes in the +domain interval, and returns a sub-interval such that the subject function is +monotonic over that sub-interval, and hence the function's minima and maxima are +at the ends. + +This adjusted domain interval can then be fed through the same machinery as the +monotonic functions. + +### Inherited Accuracy + +So with all of that background out of the way, we can now define an inherited +accuracy in terms of acceptance intervals. + +The crux of this is the insight that the range of one operation can become the +domain of another operation to compose them together. + +And since we have defined how to do this interval to interval mapping above, +transforming things becomes mechanical and thus implementable in reusable code. + +When talking about inherited accuracies, `f(x) => g(x)` is used to denote that +`f`'s accuracy is defined as `g`. + +An example to illustrate inherited accuracies: + +``` +  tan(x) => sin(x)/cos(x) + +  sin(x) => [sin(x) - 2^-11, sin(x) + 2^-11] +  cos(x) => [cos(x) - 2^-11, cos(x) + 2^-11] + +  x/y => [x/y - 2.5 * ULP(x/y), x/y + 2.5 * ULP(x/y)] +``` + +`sin(x)` and `cos(x)` are non-monotonic, so calculating out a closed generic +form over an interval is a pain, since the min and max vary depending on the +value of x. Let's isolate this to a single point, so you don't have to read +literally pages of expanded intervals. + +``` +  x = π/2 + +  sin(π/2) => [sin(π/2) - 2^-11, sin(π/2) + 2^-11] +           => [0 - 2^-11, 0 + 2^-11] +           => [-0.000488..., 0.000488...] +  cos(π/2) => [cos(π/2) - 2^-11, cos(π/2) + 2^-11] +           => [-0.500488..., -0.499511...] + +  tan(π/2) => sin(π/2)/cos(π/2) +           => [-0.000488..., 0.000488...]/[-0.500488..., -0.499511...] +           => [min({-0.000488.../-0.500488..., -0.000488.../-0.499511..., ...}), +               max({-0.000488.../-0.500488..., -0.000488.../-0.499511..., ...})] +           => [0.000488.../-0.499511..., -0.000488.../-0.499511...]
+ => [-0.0009775171, 0.0009775171] +``` + +For clarity this has omitted a bunch of complexity around FTZ behaviours, and +that these operations are only defined for specific domains, but the high-level +concepts hold. + +For each of the inherited operations we could implement a manually written out +closed form solution, but that would be quite error-prone and not be +re-using code between builtins. + +Instead, the CTS takes advantage of the fact in addition to testing +implementations of `tan(x)` we are going to be testing implementations of +`sin(x)`, `cos(x)` and `x/y`, so there should be functions to generate +acceptance intervals for those operations. + +The `tan(x)` acceptance interval can be constructed by generating the acceptance +intervals for `sin(x)`, `cos(x)` and `x/y` via function calls and composing the +results. + +This algorithmically looks something like this: + +``` + tan(x): + Calculate sin(x) interval + Calculate cos(x) interval + Calculate sin(x) result divided by cos(x) result + Return division result +``` + +# Appendix + +### Significant f32 Values + +| Name | Decimal (~) | Hex | Sign Bit | Exponent Bits | Significand Bits | +| ---------------------- | --------------: | ----------: | -------: | ------------: | ---------------------------: | +| Negative Infinity | -∞ | 0xff80 0000 | 1 | 1111 1111 | 0000 0000 0000 0000 0000 000 | +| Min Negative Normal | -3.40282346E38 | 0xff7f ffff | 1 | 1111 1110 | 1111 1111 1111 1111 1111 111 | +| Max Negative Normal | -1.1754943E−38 | 0x8080 0000 | 1 | 0000 0001 | 0000 0000 0000 0000 0000 000 | +| Min Negative Subnormal | -1.1754942E-38 | 0x807f ffff | 1 | 0000 0000 | 1111 1111 1111 1111 1111 111 | +| Max Negative Subnormal | -1.4012984E−45 | 0x8000 0001 | 1 | 0000 0000 | 0000 0000 0000 0000 0000 001 | +| Negative Zero | -0 | 0x8000 0000 | 1 | 0000 0000 | 0000 0000 0000 0000 0000 000 | +| Positive Zero | 0 | 0x0000 0000 | 0 | 0000 0000 | 0000 0000 0000 0000 0000 000 | +| Min Positive Subnormal | 1.4012984E−45 | 0x0000 0001 | 0 | 0000 0000 | 0000 0000 0000 0000 0000 001 | +| Max Positive Subnormal | 1.1754942E-38 | 0x007f ffff | 0 | 0000 0000 | 1111 1111 1111 1111 1111 111 | +| Min Positive Normal | 1.1754943E−38 | 0x0080 0000 | 0 | 0000 0001 | 0000 0000 0000 0000 0000 000 | +| Max Positive Normal | 3.40282346E38 | 0x7f7f ffff | 0 | 1111 1110 | 1111 1111 1111 1111 1111 111 | +| Negative Infinity | ∞ | 0x7f80 0000 | 0 | 1111 1111 | 0000 0000 0000 0000 0000 000 | + +# Resources +- [WebGPU Spec](https://www.w3.org/TR/webgpu/) +- [WGSL Spec](https://www.w3.org/TR/WGSL/) +- [float32 on Wikipedia](https://en.wikipedia.org/wiki/Single-precision_floating-point_format) +- [IEEE-754 Floating Point Converter](https://www.h-schmidt.net/FloatConverter/IEEE754.html) +- [IEEE 754 Calculator](http://weitz.de/ieee/) +- [Keisan High Precision Calculator](https://keisan.casio.com/calculator) +- [On the definition of ulp(x)](https://hal.inria.fr/inria-00070503/document) diff --git a/dom/webgpu/tests/cts/checkout/docs/helper_index.txt b/dom/webgpu/tests/cts/checkout/docs/helper_index.txt new file mode 100644 index 0000000000..1b0a503246 --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/helper_index.txt @@ -0,0 +1,92 @@ + + +## Index of Test Helpers + +This index is a quick-reference of helper functions in the test suite. +Use it to determine whether you can reuse a helper, instead of writing new code, +to improve readability and reviewability. + +Whenever a new generally-useful helper is added, it should be indexed here. 
+ +**See linked documentation for full helper listings.** + +- {@link common/framework/params_builder!CaseParamsBuilder} and {@link common/framework/params_builder!SubcaseParamsBuilder}: + Combinatorial generation of test parameters. They are iterated by the test framework at runtime. + See `examples.spec.ts` for basic examples of how this behaves. + - {@link common/framework/params_builder!CaseParamsBuilder}: + `ParamsBuilder` for adding "cases" to a test. + - {@link common/framework/params_builder!CaseParamsBuilder#beginSubcases}: + "Finalizes" the `CaseParamsBuilder`, returning a `SubcaseParamsBuilder`. + - {@link common/framework/params_builder!SubcaseParamsBuilder}: + `ParamsBuilder` for adding "subcases" to a test. + +### Fixtures + +(Uncheck the "Inherited" box to hide inherited methods from documentation pages.) + +- {@link common/framework/fixture!Fixture}: Base fixture for all tests. +- {@link webgpu/gpu_test!GPUTest}: Base fixture for WebGPU tests. +- {@link webgpu/api/validation/validation_test!ValidationTest}: Base fixture for WebGPU validation tests. +- {@link webgpu/shader/validation/shader_validation_test!ShaderValidationTest}: Base fixture for WGSL shader validation tests. +- {@link webgpu/idl/idl_test!IDLTest}: + Base fixture for testing the exposed interface is correct (without actually using WebGPU). + +### WebGPU Helpers + +- {@link webgpu/capability_info}: Structured information about texture formats, binding types, etc. +- {@link webgpu/constants}: + Constant values (needed anytime a WebGPU constant is needed outside of a test function). +- {@link webgpu/util/buffer}: Helpers for GPUBuffers. +- {@link webgpu/util/texture}: Helpers for GPUTextures. +- {@link webgpu/util/unions}: Helpers for various union typedefs in the WebGPU spec. +- {@link webgpu/util/math}: Helpers for common math operations. +- {@link webgpu/util/check_contents}: Check the contents of TypedArrays, with nice messages. + Also can be composed with {@link webgpu/gpu_test!GPUTest#expectGPUBufferValuesPassCheck}, used to implement + GPUBuffer checking helpers in GPUTest. +- {@link webgpu/util/conversion}: Numeric encoding/decoding for float/unorm/snorm values, etc. +- {@link webgpu/util/copy_to_texture}: + Helper class for copyToTexture test suites for execution copy and check results. +- {@link webgpu/util/color_space_conversion}: + Helper functions to do color space conversion. The algorithm is the same as defined in + CSS Color Module Level 4. +- {@link webgpu/util/create_elements}: + Helpers for creating web elements like HTMLCanvasElement, OffscreenCanvas, etc. +- {@link webgpu/util/shader}: Helpers for creating fragment shader based on intended output values, plainType, and componentCount. +- {@link webgpu/util/texture/base}: General texture-related helpers. +- {@link webgpu/util/texture/data_generation}: Helper for generating dummy texture data. +- {@link webgpu/util/texture/layout}: Helpers for working with linear image data + (like in copyBufferToTexture, copyTextureToBuffer, writeTexture). +- {@link webgpu/util/texture/subresource}: Helpers for working with texture subresource ranges. +- {@link webgpu/util/texture/texel_data}: Helpers encoding/decoding texel formats. +- {@link webgpu/util/texture/texel_view}: Helper class to create and view texture data through various representations. +- {@link webgpu/util/texture/texture_ok}: Helpers for checking texture contents. +- {@link webgpu/shader/types}: Helpers for WGSL data types. 
+- {@link webgpu/shader/execution/expression/expression}: Helpers for WGSL expression execution tests. +- {@link webgpu/web_platform/util}: Helpers for web platform features (e.g. video elements). + +### General Helpers + +- {@link common/framework/resources}: Provides the path to the `resources/` directory. +- {@link common/util/navigator_gpu}: Finds and returns the `navigator.gpu` object or equivalent. +- {@link common/util/util}: Miscellaneous utilities. + - {@link common/util/util!assert}: Assert a condition, otherwise throw an exception. + - {@link common/util/util!unreachable}: Assert unreachable code. + - {@link common/util/util!assertReject}, {@link common/util/util!resolveOnTimeout}, + {@link common/util/util!rejectOnTimeout}, + {@link common/util/util!raceWithRejectOnTimeout}, and more. +- {@link common/util/collect_garbage}: + Attempt to trigger garbage collection, for testing that garbage collection is not observable. +- {@link common/util/preprocessor}: A simple template-based, non-line-based preprocessor, + implementing if/elif/else/endif. Possibly useful for WGSL shader generation. +- {@link common/util/timeout}: Use this instead of `setTimeout`. +- {@link common/util/types}: Type metaprogramming helpers. diff --git a/dom/webgpu/tests/cts/checkout/docs/implementing.md b/dom/webgpu/tests/cts/checkout/docs/implementing.md new file mode 100644 index 0000000000..ae6848839a --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/implementing.md @@ -0,0 +1,97 @@ +# Test Implementation + +Concepts important to understand when writing tests. See existing tests for examples to copy from. + +## Test fixtures + +Most tests can use one of the several common test fixtures: + +- `Fixture`: Base fixture, provides core functions like `expect()`, `skip()`. +- `GPUTest`: Wraps every test in error scopes. Provides helpers like `expectContents()`. +- `ValidationTest`: Extends `GPUTest`, provides helpers like `expectValidationError()`, `getErrorTextureView()`. +- Or create your own. (Often not necessary - helper functions can be used instead.) + +Test fixtures or helper functions may be defined in `.spec.ts` files, but if used by multiple +test files, should be defined in separate `.ts` files (without `.spec`) alongside the files that +use them. + +### GPUDevices in tests + +`GPUDevice`s are largely stateless (except for `lost`-ness, error scope stack, and `label`). +This allows the CTS to reuse one device across multiple test cases using the `DevicePool`, +which provides `GPUDevice` objects to tests. + +Currently, there is one `GPUDevice` with the default descriptor, and +a cache of several more, for devices with additional capabilities. +Devices in the `DevicePool` are automatically removed when certain things go wrong. + +Later, there may be multiple `GPUDevice`s to allow multiple test cases to run concurrently. + +## Test parameterization + +The CTS provides helpers (`.params()` and friends) for creating large cartesian products of test parameters. +These generate "test cases" further subdivided into "test subcases". +See `basic,*` in `examples.spec.ts` for examples, and the [helper index](./helper_index.txt) +for a list of capabilities. + +Test parameterization should be applied liberally to ensure the maximum coverage +possible within reasonable time. You can skip some with `.filter()`. And remember: computers are +pretty fast - thousands of test cases can be reasonable. 
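For illustration, a parameterized test skeleton might look something like the following. Names and values here are made up, and the usual `export const g = makeTestGroup(GPUTest);` boilerplate at the top of the `.spec.ts` file is assumed; see `examples.spec.ts` and the [helper index](./helper_index.txt) for real, working usage.

```ts
g.test('foo,behavior')
  .desc('Illustrative only: tests foo() across a cartesian product of parameters.')
  .params(u =>
    u
      .combine('mode', ['solid', 'wireframe'] as const) // one case per mode...
      .combine('indexed', [false, true]) // ...crossed with indexed or not
      .filter(p => !(p.mode === 'wireframe' && p.indexed)) // drop combinations that don't apply
      .beginSubcases()
      .combine('count', [0, 1, 4096]) // subcases run together inside each case
  )
  .fn(t => {
    const { mode, indexed, count } = t.params;
    // ... exercise the API using these parameters and check the result ...
  });
```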
+ +Use existing lists of parameters values (such as +[`kTextureFormats`](https://github.com/gpuweb/cts/blob/0f38b85/src/suites/cts/capability_info.ts#L61), +to parameterize tests), instead of making your own list. Use the info tables (such as +`kTextureFormatInfo`) to define and retrieve information about the parameters. + +## Asynchrony in tests + +Since there are no synchronous operations in WebGPU, almost every test is asynchronous in some +way. For example: + +- Checking the result of a readback. +- Capturing the result of a `popErrorScope()`. + +That said, test functions don't always need to be `async`; see below. + +### Checking asynchronous errors/results + +Validation is inherently asynchronous (`popErrorScope()` returns a promise). However, the error +scope stack itself is synchronous - operations immediately after a `popErrorScope()` are outside +that error scope. + +As a result, tests can assert things like validation errors/successes without having an `async` +test body. + +**Example:** + +```typescript +t.expectValidationError(() => { + device.createThing(); +}); +``` + +does: + +- `pushErrorScope('validation')` +- `popErrorScope()` and "eventually" check whether it returned an error. + +**Example:** + +```typescript +t.expectGPUBufferValuesEqual(srcBuffer, expectedData); +``` + +does: + +- copy `srcBuffer` into a new mappable buffer `dst` +- `dst.mapReadAsync()`, and "eventually" check what data it returned. + +Internally, this is accomplished via an "eventual expectation": `eventualAsyncExpectation()` +takes an async function, calls it immediately, and stores off the resulting `Promise` to +automatically await at the end before determining the pass/fail state. + +### Asynchronous parallelism + +A side effect of test asynchrony is that it's possible for multiple tests to be in flight at +once. We do not currently do this, but it will eventually be an option to run `N` tests in +"parallel", for faster local test runs. diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/README.md b/dom/webgpu/tests/cts/checkout/docs/intro/README.md new file mode 100644 index 0000000000..e5f8bcedc6 --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/intro/README.md @@ -0,0 +1,99 @@ +# Introduction + +These documents contains guidelines for contributors to the WebGPU CTS (Conformance Test Suite) +on how to write effective tests, and on the testing philosophy to adopt. + +The WebGPU CTS is arguably more important than the WebGPU specification itself, because +it is what forces implementation to be interoperable by checking they conform to the specification. +However writing a CTS is hard and requires a lot of effort to reach good coverage. + +More than a collection of tests like regular end2end and unit tests for software artifacts, a CTS +needs to be exhaustive. Contrast for example the WebGL2 CTS with the ANGLE end2end tests: they +cover the same functionality (WebGL 2 / OpenGL ES 3) but are structured very differently: + +- ANGLE's test suite has one or two tests per functionality to check it works correctly, plus + regression tests and special tests to cover implementation details. +- WebGL2's CTS can have thousands of tests per API aspect to cover every combination of + parameters (and global state) used by an operation. + +Below are guidelines based on our collective experience with graphics API CTSes like WebGL's. +They are expected to evolve over time and have exceptions, but should give a general idea of what +to do. 
+ +## Contributing + +Testing tasks are tracked in the [CTS project tracker](https://github.com/orgs/gpuweb/projects/3). +Go here if you're looking for tasks, or if you have a test idea that isn't already covered. + +If contributing conformance tests, the directory you'll work in is [`src/webgpu/`](../src/webgpu/). +This directory is organized according to the goal of the test (API validation behavior vs +actual results) and its target (API entry points and spec areas, e.g. texture sampling). + +The contents of a test file (`src/webgpu/**/*.spec.ts`) are twofold: + +- Documentation ("test plans") on what tests do, how they do it, and what cases they cover. + Some test plans are fully or partially unimplemented: + they either contain "TODO" in a description or are `.unimplemented()`. +- Actual tests. + +**Please read the following short documents before contributing.** + +### 0. [Developing](developing.md) + +- Reviewers should also read [Review Requirements](../reviews.md). + +### 1. [Life of a Test Change](life_of.md) + +### 2. [Adding or Editing Test Plans](plans.md) + +### 3. [Implementing Tests](tests.md) + +## [Additional Documentation](../) + +## Examples + +### Operation testing of vertex input id generation + +This section provides an example of the planning process for a test. +It has not been refined into a set of final test plan descriptions. +(Note: this predates the actual implementation of these tests, so doesn't match the actual tests.) + +Somewhere under the `api/operation` node are tests checking that running `GPURenderPipelines` on +the device using the `GPURenderEncoderBase.draw` family of functions works correctly. Render +pipelines are composed of several stages that are mostly independent so they can be split in +several parts such as `vertex_input`, `rasterization`, `blending`. + +Vertex input itself has several parts that are mostly separate in hardware: + +- generation of the vertex and instance indices to run for this draw +- fetching of vertex data from vertex buffers based on these indices +- conversion from the vertex attribute `GPUVertexFormat` to the datatype for the input variable + in the shader + +Each of these are tested separately and have cases for each combination of the variables that may +affect them. This means that `api/operation/render/vertex_input/id_generation` checks that the +correct operation is performed for the cartesian product of all the following dimensions: + +- for encoding in a `GPURenderPassEncoder` or a `GPURenderBundleEncoder` +- whether the draw is direct or indirect +- whether the draw is indexed or not +- for various values of the `firstInstance` argument +- for various values of the `instanceCount` argument +- if the draw is not indexed: + - for various values of the `firstVertex` argument + - for various values of the `vertexCount` argument +- if the draw is indexed: + - for each `GPUIndexFormat` + - for various values of the indices in the index buffer including the primitive restart values + - for various values for the `offset` argument to `setIndexBuffer` + - for various values of the `firstIndex` argument + - for various values of the `indexCount` argument + - for various values of the `baseVertex` argument + +"Various values" above mean several small values, including `0` and the second smallest valid +value to check for corner cases, as well as some large value. + +An instance of the test sets up a `draw*` call based on the parameters, using point rendering and +a fragment shader that outputs to a storage buffer. 
After the draw the test checks the content of +the storage buffer to make sure all expected vertex shader invocation, and only these ones have +been generated. diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/convert_to_issue.png b/dom/webgpu/tests/cts/checkout/docs/intro/convert_to_issue.png new file mode 100644 index 0000000000..672324a9d9 Binary files /dev/null and b/dom/webgpu/tests/cts/checkout/docs/intro/convert_to_issue.png differ diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/developing.md b/dom/webgpu/tests/cts/checkout/docs/intro/developing.md new file mode 100644 index 0000000000..5b1aeed36d --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/intro/developing.md @@ -0,0 +1,134 @@ +# Developing + +The WebGPU CTS is written in TypeScript. + +## Setup + +After checking out the repository and installing node/npm, run: + +```sh +npm ci +``` + +Before uploading, you can run pre-submit checks (`npm test`) to make sure it will pass CI. +Use `npm run fix` to fix linting issues. + +`npm run` will show available npm scripts. +Some more scripts can be listed using `npx grunt`. + +## Dev Server + +To start the development server, use: + +```sh +npm start +``` + +Then, browse to the standalone test runner at the printed URL. + +The server will generate and compile code on the fly, so no build step is necessary. +Only a reload is needed to see saved changes. +(TODO: except, currently, `README.txt` and file `description` changes won't be reflected in +the standalone runner.) + +Note: The first load of a test suite may take some time as generating the test suite listing can +take a few seconds. + +## Standalone Test Runner / Test Plan Viewer + +**The standalone test runner also serves as a test plan viewer.** +(This can be done in a browser without WebGPU support.) +You can use this to preview how your test plan will appear. + +You can view different suites (webgpu, unittests, stress, etc.) or different subtrees of +the test suite. + +- `http://localhost:8080/standalone/` (defaults to `?runnow=0&worker=0&debug=0&q=webgpu:*`) +- `http://localhost:8080/standalone/?q=unittests:*` +- `http://localhost:8080/standalone/?q=unittests:basic:*` + +The following url parameters change how the harness runs: + +- `runnow=1` runs all matching tests on page load. +- `debug=1` enables verbose debug logging from tests. +- `worker=1` runs the tests on a Web Worker instead of the main thread. +- `power_preference=low-power` runs most tests passing `powerPreference: low-power` to `requestAdapter` +- `power_preference=high-performance` runs most tests passing `powerPreference: high-performance` to `requestAdapter` + +### Web Platform Tests (wpt) - Ref Tests + +You can inspect the actual and reference pages for web platform reftests in the standalone +runner by navigating to them. For example, by loading: + + - `http://localhost:8080/out/webgpu/web_platform/reftests/canvas_clear.https.html` + - `http://localhost:8080/out/webgpu/web_platform/reftests/ref/canvas_clear-ref.html` + +You can also run a minimal ref test runner. + + - open 2 terminals / command lines. + - in one, `npm start` + - in the other, `node tools/run_wpt_ref_tests [name-of-test]` + +Without `[name-of-test]` all ref tests will be run. `[name-of-test]` is just a simple check for +substring so passing in `rgba` will run every test with `rgba` in its filename. 
+ +Examples: + +MacOS + +``` +# Chrome +node tools/run_wpt_ref_tests /Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary +``` + +Windows + +``` +# Chrome +node .\tools\run_wpt_ref_tests "C:\Users\your-user-name\AppData\Local\Google\Chrome SxS\Application\chrome.exe" +``` + +## Editor + +Since this project is written in TypeScript, it integrates best with +[Visual Studio Code](https://code.visualstudio.com/). +This is optional, but highly recommended: it automatically adds `import` lines and +provides robust completions, cross-references, renames, error highlighting, +deprecation highlighting, and type/JSDoc popups. + +Open the `cts.code-workspace` workspace file to load settings convenient for this project. +You can make local configuration changes in `.vscode/`, which is untracked by Git. + +## Pull Requests + +When opening a pull request, fill out the PR checklist and attach the issue number. +If an issue hasn't been opened, find the draft issue on the +[project tracker](https://github.com/orgs/gpuweb/projects/3) and choose "Convert to issue": + +![convert to issue button screenshot](convert_to_issue.png) + +Opening a pull request will automatically notify reviewers. + +To make the review process smoother, once a reviewer has started looking at your change: + +- Avoid major additions or changes that would be best done in a follow-up PR. +- Avoid rebases (`git rebase`) and force pushes (`git push -f`). These can make + it difficult for reviewers to review incremental changes as GitHub often cannot + view a useful diff across a rebase. If it's necessary to resolve conflicts + with upstream changes, use a merge commit (`git merge`) and don't include any + consequential changes in the merge, so a reviewer can skip over merge commits + when working through the individual commits in the PR. +- When you address a review comment, mark the thread as "Resolved". + +Pull requests will (usually) be landed with the "Squash and merge" option. + +### TODOs + +The word "TODO" refers to missing test coverage. It may only appear inside file/test descriptions +and README files (enforced by linting). + +To use comments to refer to TODOs inside the description, use a backreference, e.g., in the +description, `TODO: Also test the FROBNICATE usage flag [1]`, and somewhere in the code, `[1]: +Need to add FROBNICATE to this list.`. + +Use `MAINTENANCE_TODO` for TODOs which don't impact test coverage. diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/life_of.md b/dom/webgpu/tests/cts/checkout/docs/intro/life_of.md new file mode 100644 index 0000000000..8dced4ad84 --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/intro/life_of.md @@ -0,0 +1,46 @@ +# Life of a Test Change + +A "test change" could be a new test, an expansion of an existing test, a test bug fix, or a +modification to existing tests to make them match new spec changes. + +**CTS contributors should contribute to the tracker and strive to keep it up to date, especially +relating to their own changes.** + +Filing new draft issues in the CTS project tracker is very lightweight. +Anyone with access should do this eagerly, to ensure no testing ideas are forgotten. +(And if you don't have access, just file a regular issue.) + +1. Enter a [draft issue](https://github.com/orgs/gpuweb/projects/3), with the Status + set to "New (not in repo)", and any available info included in the issue description + (notes/plans to ensure full test coverage of the change). 
The source of this may be: + +   - Anything in the spec/API that is found not to be covered by the CTS yet. +   - Any test found to be outdated or otherwise buggy. +   - A spec change from the "Needs CTS Issue" column in the +     [spec project tracker](https://github.com/orgs/gpuweb/projects/1). +     Once information on the required test changes is entered into the CTS project tracker, +     the spec issue moves to "Specification Done". + +   Note: at some point, someone may make a PR to flush "New (not in repo)" issues into `TODO`s in +   CTS file/test description text, changing their "Status" to "Open". +   These may be done in bulk without linking back to the issue. + +1. As necessary: + +   - Convert the draft issue to a full, numbered issue for linking from later PRs. + +     ![convert to issue button screenshot](convert_to_issue.png) + +   - Update the "Assignees" of the issue when an issue is assigned or unassigned +     (you can assign yourself). +   - Change the "Status" of the issue to "Started" once you start the task. + +1. Open one or more PRs, **each linking to the associated issue**. +   Each PR is reviewed and landed, and may leave further TODOs for parts it doesn't complete. + +   1. Tests are "planned" in test descriptions. (For complex tests, open a separate PR with the +      tests `.unimplemented()` so a reviewer can evaluate the plan before you implement tests.) +   1. Tests are implemented. + +1. When **no TODOs remain** for an issue, close it and change its status to "Complete". +   (Enter a new, more specific draft issue into the tracker if you need to track related TODOs.) diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/plans.md b/dom/webgpu/tests/cts/checkout/docs/intro/plans.md new file mode 100644 index 0000000000..f8d7af3a78 --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/intro/plans.md @@ -0,0 +1,82 @@ +# Adding or Editing Test Plans + +## 1. Write a test plan + +For new tests, if some notes exist already, incorporate them into your plan. + +A detailed test plan should be written and reviewed before substantial test code is written. +This allows reviewers a chance to identify additional tests and cases, opportunities for +generalizations that would improve the strength of tests, similar existing tests or test plans, +and potentially useful [helpers](../helper_index.txt). + +**A test plan must serve two functions:** + +- Describes the test, succinctly, but in enough detail that a reader can read *only* the test +  plans and evaluate coverage completeness of a file/directory. +- Describes the test precisely enough that, when code is added, the reviewer can ensure that the +  test really covers what the test plan says. + +There should be one test plan for each test. It should describe what it tests, how, and describe +important cases that need to be covered. Here's an example: + +```ts +g.test('x,some_detail') +  .desc( +    ` +Tests [some detail] about x. Tests calling x in various 'mode's { mode1, mode2 }, +with various values of 'arg', and checks correctness of the result. +Tries to trigger [some conditional path]. + +- Valid values (control case) // <- (to make sure the test function works well) +- Unaligned values (should fail) // <- (only validation tests need to intentionally hit invalid cases) +- Extreme values` +  ) +  .params(u => +    u // +      .combine('mode', ['mode1', 'mode2']) +      .beginSubcases() +      .combine('arg', [ +        // Valid // <- Comment params as you see fit.
+ 4, + 8, + 100, + // Invalid + 2, + 6, + 1e30, + ]) + ) + .unimplemented(); +``` + +"Cases" each appear as individual items in the `/standalone/` runner. +"Subcases" run inside each case, like a for-loop wrapping the `.fn(`test function`)`. +Documentation on the parameter builder can be found in the [helper index](../helper_index.txt). + +It's often impossible to predict the exact case/subcase structure before implementing tests, so they +can be added during implementation, instead of planning. + +For any notes which are not specific to a single test, or for preliminary notes for tests that +haven't been planned in full detail, put them in the test file's `description` variable at +the top. Or, if they aren't associated with a test file, put them in a `README.txt` file. + +**Any notes about missing test coverage must be marked with the word `TODO` inside a +description or README.** This makes them appear on the `/standalone/` page. + +## 2. Open a pull request + +Open a PR, and work with the reviewer(s) to revise the test plan. + +Usually (probably), plans will be landed in separate PRs before test implementations. + +## Conventions used in test plans + +- `Iff`: If and only if +- `x=`: "cartesian-cross equals", like `+=` for cartesian product. + Used for combinatorial test coverage. + - Sometimes this will result in too many test cases; simplify/reduce as needed + during planning *or* implementation. +- `{x,y,z}`: list of cases to test + - e.g. `x= texture format {r8unorm, r8snorm}` +- *Control case*: a case included to make sure that the rest of the cases aren't + missing their target by testing some other error case. diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/tests.md b/dom/webgpu/tests/cts/checkout/docs/intro/tests.md new file mode 100644 index 0000000000..a67b6a20cc --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/intro/tests.md @@ -0,0 +1,25 @@ +# Implementing Tests + +Once a test plan is done, you can start writing tests. +To add new tests, imitate the pattern in neigboring tests or neighboring files. +New test files must be named ending in `.spec.ts`. + +For an example test file, see [`src/webgpu/examples.spec.ts`](../../src/webgpu/examples.spec.ts). +For a more complex, well-structured reference test file, see +[`src/webgpu/api/validation/vertex_state.spec.ts`](../../src/webgpu/api/validation/vertex_state.spec.ts). + +Implement some tests and open a pull request. You can open a PR any time you're ready for a review. +(If two tests are non-trivial but independent, consider separate pull requests.) + +Before uploading, you can run pre-submit checks (`npm test`) to make sure it will pass CI. +Use `npm run fix` to fix linting issues. + +## Test Helpers + +It's best to be familiar with helpers available in the test suite for simplifying +test implementations. + +New test helpers can be added at any time to either of those files, or to new `.ts` files anywhere +near the `.spec.ts` file where they're used. + +Documentation on existing helpers can be found in the [helper index](../helper_index.txt). diff --git a/dom/webgpu/tests/cts/checkout/docs/organization.md b/dom/webgpu/tests/cts/checkout/docs/organization.md new file mode 100644 index 0000000000..fd7020afd6 --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/organization.md @@ -0,0 +1,166 @@ +# Test Organization + +## `src/webgpu/` + +Because of the glorious amount of test needed, the WebGPU CTS is organized as a tree of arbitrary +depth (a filesystem with multiple tests per file). 
+ +Each directory may have a `README.txt` describing its contents. +Tests are grouped in large families (each of which has a `README.txt`); +the root and first few levels look like the following (some nodes omitted for simplicity): + +- **`api`** with tests for full coverage of the JavaScript API surface of WebGPU. +  - **`validation`** with positive and negative tests for all the validation rules of the API. +  - **`operation`** with tests that check the result of performing valid WebGPU operations, +    taking advantage of parametrization to exercise interactions between parts of the API. +  - **`regression`** for one-off tests that reproduce bugs found in implementations to prevent +    the bugs from appearing again. +- **`shader`** with tests for full coverage of the shaders that can be passed to WebGPU. +  - **`validation`**. +  - **`execution`** similar to `api/operation`. +  - **`regression`**. +- **`idl`** with tests to check that the WebGPU IDL is correctly implemented, for example that +  objects expose exactly the correct members, and that methods throw when passed incomplete +  dictionaries. +- **`web-platform`** with tests for Web platform-specific interactions like `GPUSwapChain` and +  `<canvas>`, WebXR and `GPUQueue.copyExternalImageToTexture`. + +At the same time, test hierarchies can be used to split the testing of a single sub-object into +several files for maintainability. For example, `GPURenderPipeline` has a large descriptor and some +parts could be tested independently like `vertex_input` vs. `primitive_topology` vs. `blending` +but all live under the `render_pipeline` directory. + +In addition to the test tree, each test can be parameterized. For coverage, it is important to +test all enum values, for example for `GPUTextureFormat`. Instead of having a loop to iterate +over all the `GPUTextureFormat`, it is better to parameterize the test over them. Each format +will have a different entry in the test list, which will help WebGPU implementers debug the test, +or suppress the failure without losing test coverage while they fix the bug. + +Extra capabilities (limits and features) are often tested in the same files as the rest of the API. +For example, a compressed texture format capability would simply add a `GPUTextureFormat` to the +parametrization lists of many tests, while a capability adding significant new functionality +like ray-tracing could have a separate subtree. + +Operation tests for optional features should be skipped using `t.selectDeviceOrSkipTestCase()` or +`t.skip()`. Validation tests should be written that test the behavior with and without the +capability enabled via `t.selectDeviceOrSkipTestCase()`, to ensure the functionality is valid +only with the capability enabled. + +### Validation tests + +Validation tests check the validation rules that are (or will be) set by the +WebGPU spec. Validation tests try to carefully trigger the individual validation +rules in the spec, without simultaneously triggering other rules. + +Validation errors *generally* generate WebGPU errors, not exceptions. +But check the spec on a case-by-case basis. + +Like all `GPUTest`s, `ValidationTest`s are wrapped in both types of error scope. These +"catch-all" error scopes look for any errors during the test, and report them as test failures. +Since error scopes can be nested, validation tests can nest an error scope to expect that there +*are* errors from specific operations.
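As a very rough sketch, a validation test typically runs a control case and error cases through the same code path and tells the fixture which ones should produce an error. This uses the same placeholder `createThing()` call as the test implementation guide's example, so the specific rule below is made up:

```ts
g.test('createThing,size')
  .desc('Illustrative only: control case plus error cases for a made-up validation rule.')
  .params(u => u.combine('size', [4, 0, 7])) // 4 = valid control case; 0 and 7 should fail
  .fn(t => {
    const { size } = t.params;
    const isValid = size > 0 && size % 4 === 0; // hypothetical rule, not a real WebGPU one

    t.expectValidationError(() => {
      t.device.createThing({ size }); // placeholder API, as in implementing.md
    }, !isValid);
  });
```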
#### Parameterization

Test parameterization can help write many validation tests more succinctly,
while making it easier for both authors and reviewers to be confident that
an aspect of the API is tested fully. Examples:

- [`webgpu:api,validation,render_pass,resolve:resolve_attachment:*`](https://github.com/gpuweb/cts/blob/ded3b7c8a4680a1a01621a8ac859facefadf32d0/src/webgpu/api/validation/render_pass/resolve.spec.ts#L35)
- [`webgpu:api,validation,createBindGroupLayout:bindingTypeSpecific_optional_members:*`](https://github.com/gpuweb/cts/blob/ded3b7c8a4680a1a01621a8ac859facefadf32d0/src/webgpu/api/validation/createBindGroupLayout.spec.ts#L68)

Use your own discretion when deciding the balance between heavily parameterizing
a test and writing multiple separate tests.

#### Guidelines

There are many aspects that should be tested in all validation tests:

- each individual argument to a method call (including `this`) or member of a descriptor
  dictionary should be tested, including:
  - what happens when an error object is passed.
  - what happens when an optional feature enum or method is used.
  - what happens for numeric values when they are at 0, too large, too small, etc.
- each validation rule in the specification should be checked both with a control success case,
  and error cases.
- each set of arguments or pieces of state that interact for validation.

When testing numeric values, it is important to check on both sides of the boundary: if the error
happens for value N and not N - 1, both should be tested. Alignment of integer values should also
be tested, but boundary testing of alignment should be between a value aligned to 2^N and a value
aligned only to 2^(N-1).

Finally, this is probably also where we would test that extensions follow the rule that if the
browser supports a feature but it is not enabled on the device, then calling methods from that
feature throws `TypeError`.

- Test that providing unknown properties *that are definitely not part of any feature* is
  valid (they are ignored). (Unfortunately, due to the rules of IDL, adding a member to a dictionary is
  always a breaking change. So this is how we have to test this unless we can get a "strict"
  dictionary type in IDL. We can't test adding members from non-enabled extensions.)

### Operation tests

Operation tests test the actual results of using the API. They execute
(sometimes significant) code and check that the result is within the expected
set of behaviors (which can be quite complex to compute).

Note that operation tests need to test a lot of interactions between different
parts of the API, and so can become quite complex. Try to reduce the complexity by
utilizing combinatorics and [helpers](./helper_index.txt), and splitting/merging test files as needed.

#### Errors

Operation tests are usually `GPUTest`s. As a result, they automatically fail on any validation
errors that occur during the test.

When it's easier to write an operation test with invalid cases, use
`ParamsBuilder.filter`/`.unless` to avoid invalid cases, or detect and
`expect` validation errors in some cases.

#### Implementation

Use helpers like `expectContents` (and more to come) to check the values of data on the GPU.
(These are "eventual expectations" - the harness will wait for them to finish at the end.)

When testing something inside a shader, it's not always necessary to output the result to a
render output. In fragment shaders, you can output to a storage buffer.
In vertex shaders, you
can't - but you can render with points (simplest), send the result to the fragment shader, and
output it from there. (Someday, we may end up wanting a helper for this.)

#### Testing Default Values

Default value tests (for arguments and dictionary members) should usually be operation tests -
all you have to do is include `undefined` in parameterizations of other tests to make sure the
behavior with `undefined` has the same expected result as when the default value is
specified explicitly.

### IDL tests

TODO: figure out how to implement these. https://github.com/gpuweb/cts/issues/332

These tests test only rules that come directly from WebIDL. For example:

- Values out of range for `[EnforceRange]` cause exceptions.
- Required function arguments and dictionary members cause exceptions if omitted.
- Arguments and dictionary members cause exceptions if passed the wrong type.

They may also test positive cases like the following, but the behavior of these should be tested in
operation tests.

- OK to omit optional arguments/members.
- OK to pass the correct argument/member type (or of any type in a union type).

Every overload of every method should be tested.

## `src/stress/`, `src/manual/`

Stress tests and manual tests for WebGPU that are not intended to be run in an automated way.

## `src/unittests/`

Unit tests for the test framework (`src/common/framework/`).

## `src/demo/`

A demo of test hierarchies for the purpose of testing the `standalone` test runner page.
diff --git a/dom/webgpu/tests/cts/checkout/docs/reviews.md b/dom/webgpu/tests/cts/checkout/docs/reviews.md new file mode 100644 index 0000000000..1a8c3f9624 --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/reviews.md @@ -0,0 +1,70 @@ +# Review Requirements

A review should have several items checked off before it is landed.
Checkboxes are pre-filled into the pull request summary when it's created.

The uploader may check off boxes ahead of time if they are not applicable
(e.g. TypeScript readability on a plan PR).

## Readability

A reviewer has "readability" for a topic if they have enough expertise in that topic to ensure
good practices are followed in pull requests, or know when to loop in other reviewers.
Perfection is not required!

**It is up to reviewers' own discretion** whether they are qualified to check off a
"readability" checkbox on any given pull request.

- WebGPU Readability: Familiarity with the API to ensure:

  - WebGPU is being used correctly; expected results seem reasonable.
  - WebGPU is being tested completely; tests have control cases.
  - Test code has a clear correspondence with the test description.
  - [Test helpers](./helper_index.txt) are used or created appropriately
    (where the reviewer is familiar with the helpers).

- TypeScript Readability: Make sure TypeScript is utilized in a way that:

  - Ensures test code is reasonably type-safe.
    Reviewers may recommend changes to make type-safety either weaker (`as`, etc.) or stronger.
  - Is understandable and has appropriate verbosity and dynamicity
    (e.g. type inference and `as const` are used to reduce unnecessary boilerplate).

## Plan Reviews

**Changes *must* have an author or reviewer with the following readability:** WebGPU

Reviewers must carefully ensure the following:

- The test plan name accurately describes the area being tested.
+- The test plan covers the area described by the file/test name and file/test description + as fully as possible (or adds TODOs for incomplete areas). +- Validation tests have control cases (where no validation error should occur). +- Each validation rule is tested in isolation, in at least one case which does not validate any + other validation rules. + +See also: [Adding or Editing Test Plans](intro/plans.md). + +## Implementation Reviews + +**Changes *must* have an author or reviewer with the following readability:** WebGPU, TypeScript + +Reviewers must carefully ensure the following: + +- The coverage of the test implementation precisely matches the test description. +- Everything required for test plan reviews above. + +Reviewers should ensure the following: + +- New test helpers are documented in [helper index](./helper_index.txt). +- Framework and test helpers are used where they would make test code clearer. + +See also: [Implementing Tests](intro/tests.md). + +## Framework + +**Changes *must* have an author or reviewer with the following readability:** TypeScript + +Reviewers should ensure the following: + +- Changes are reasonably type-safe, and covered by unit tests where appropriate. diff --git a/dom/webgpu/tests/cts/checkout/docs/terms.md b/dom/webgpu/tests/cts/checkout/docs/terms.md new file mode 100644 index 0000000000..032639be57 --- /dev/null +++ b/dom/webgpu/tests/cts/checkout/docs/terms.md @@ -0,0 +1,270 @@ +# Terminology + +Each test suite is organized as a tree, both in the filesystem and further within each file. + +- _Suites_, e.g. `src/webgpu/`. + - _READMEs_, e.g. `src/webgpu/README.txt`. + - _Test Spec Files_, e.g. `src/webgpu/examples.spec.ts`. + Identified by their file path. + Each test spec file provides a description and a _Test Group_. + A _Test Group_ defines a test fixture, and contains multiple: + - _Tests_. + Identified by a comma-separated list of parts (e.g. `basic,async`) + which define a path through a filesystem-like tree (analogy: `basic/async.txt`). + Defines a _test function_ and contains multiple: + - _Test Cases_. + Identified by a list of _Public Parameters_ (e.g. `x` = `1`, `y` = `2`). + Each Test Case has the same test function but different Public Parameters. + +## Test Tree + +A _Test Tree_ is a tree whose leaves are individual Test Cases. + +A Test Tree can be thought of as follows: + +- Suite, which is the root of a tree with "leaves" which are: + - Test Spec Files, each of which is a tree with "leaves" which are: + - Tests, each of which is a tree with leaves which are: + - Test Cases. + +(In the implementation, this conceptual tree of trees is decomposed into one big tree +whose leaves are Test Cases.) + +**Type:** `TestTree` + +## Suite + +A suite of tests. +A single suite has a directory structure, and many _test spec files_ +(`.spec.ts` files containing tests) and _READMEs_. +Each member of a suite is identified by its path within the suite. + +**Example:** `src/webgpu/` + +### README + +**Example:** `src/webgpu/README.txt` + +Describes (in prose) the contents of a subdirectory in a suite. + +READMEs are only processed at build time, when generating the _Listing_ for a suite. + +**Type:** `TestSuiteListingEntryReadme` + +## Queries + +A _Query_ is a structured object which specifies a subset of cases in exactly one Suite. +A Query can be represented uniquely as a string. +Queries are used to: + +- Identify a subtree of a suite (by identifying the root node of that subtree). +- Identify individual cases. 
- Represent the list of tests that a test runner (standalone, wpt, or cmdline) should run.
- Identify subtrees which should not be "collapsed" during WPT `cts.https.html` generation,
  so that the `cts.https.html` "variants" can have individual test expectations
  (i.e. marked as "expected to fail", "skip", etc.).

There are four types of `TestQuery`:

- `TestQueryMultiFile` represents any subtree of the file hierarchy:
  - `suite:*`
  - `suite:path,to,*`
  - `suite:path,to,file,*`
- `TestQueryMultiTest` represents any subtree of the test hierarchy:
  - `suite:path,to,file:*`
  - `suite:path,to,file:path,to,*`
  - `suite:path,to,file:path,to,test,*`
- `TestQueryMultiCase` represents any subtree of the case hierarchy:
  - `suite:path,to,file:path,to,test:*`
  - `suite:path,to,file:path,to,test:my=0;*`
  - `suite:path,to,file:path,to,test:my=0;params="here";*`
- `TestQuerySingleCase` represents a single case:
  - `suite:path,to,file:path,to,test:my=0;params="here"`

Test Queries are a **weakly ordered set**: any query is
_Unordered_, _Equal_, _StrictSuperset_, or _StrictSubset_ relative to any other.
This property is used to construct the complete tree of test cases.
In the examples above, every example query is a StrictSubset of the previous one
(note: even `:*` is a subset of `,*`).

In the WPT and standalone harnesses, the query is stored in the URL, e.g.
`index.html?q=q:u,e:r,y:*`.

Queries are selectively URL-encoded for readability and compatibility with browsers
(see `encodeURIComponentSelectively`).

**Type:** `TestQuery`

## Listing

A listing of the **test spec files** in a suite.

This can be generated only in Node, which has filesystem access (see `src/tools/crawl.ts`).
As part of the build step, a _listing file_ is generated (see `src/tools/gen.ts`) so that the
Test Spec Files can be discovered by the web runner (since it does not have filesystem access).

**Type:** `TestSuiteListing`

### Listing File

Each Suite has one Listing File (`suite/listing.[tj]s`), containing a list of the files
in the suite.

In `src/suite/listing.ts`, this is computed dynamically.
In `out/suite/listing.js`, the listing has been pre-baked (by `tools/gen_listings`).

**Type:** Once `import`ed, `ListingFile`

**Example:** `out/webgpu/listing.js`

## Test Spec File

A Test Spec File has a `description` and a Test Group (under which tests and cases are defined).

**Type:** Once `import`ed, `SpecFile`

**Example:** `src/webgpu/**/*.spec.ts`

## Test Group

A subtree of tests. There is one Test Group per Test Spec File.

The Test Fixture used for tests is defined at TestGroup creation.

**Type:** `TestGroup`

## Test

One test. It has a single _test function_.

It may represent multiple _test cases_, each of which runs the same Test Function with different
Parameters.

A test is named using `TestGroup.test()`, which returns a `TestBuilder`.
`TestBuilder.params()`/`.paramsSimple()`/`.paramsSubcasesOnly()`
can optionally be used to parametrically generate instances (cases and subcases) of the test.
Finally, `TestBuilder.fn()` provides the Test Function
(or, a test can be marked unimplemented with `TestBuilder.unimplemented()`).

### Test Function

When a test subcase is run, the Test Function receives an instance of the
Test Fixture provided to the Test Group, producing test results.

**Type:** `TestFn`

## Test Case / Case

A single case of a test. It is identified by a `TestCaseID`: a test name, and its parameters.
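
For example, a parameterized test expands into one case per combination of its public parameters.
A minimal sketch using the parameter builder (the test and parameter names are hypothetical):

```ts
g.test('math,add')
  .params(u => u.combine('x', [1, 2]).combine('y', [10, 20]))
  .fn(t => {
    const { x, y } = t.params;
    // ... exercise the API using x and y, and check the result ...
  });
```

This test would have four cases, identified as `math,add:x=1;y=10`, `math,add:x=1;y=20`,
`math,add:x=2;y=10`, and `math,add:x=2;y=20`.
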
Each case appears as an individual item (tree leaf) in `/standalone/`,
and as an individual "step" in WPT.

If `TestBuilder.params()`/`.paramsSimple()`/`.paramsSubcasesOnly()` are not used,
there is exactly one case with one subcase, with parameters `{}`.

**Type:** During test run time, a case is encapsulated as a `RunCase`.

## Test Subcase / Subcase

A single "subcase" of a test. It can also be identified by a `TestCaseID`, though
not all contexts allow subdividing cases into subcases.

All of the subcases of a case will run _inside_ the case, essentially as a for-loop wrapping the
test function. They do _not_ appear individually in `/standalone/` or WPT.

If `CaseParamsBuilder.beginSubcases()` is not used, there is exactly one subcase per case.

## Test Parameters / Params

Each Test Subcase has a (possibly empty) set of Test Parameters.
The parameters are passed to the Test Function `f(t)` via `t.params`.

A set of Public Parameters identifies a Test Case or Test Subcase within a Test.

There are also Private Parameters: any parameter name beginning with an underscore (`_`).
These parameters are not part of the Test Case identification, but are still passed into
the Test Function. They can be used, e.g., to manually specify expected results.

**Type:** `TestParams`

## Test Fixture / Fixture

_Test Fixtures_ provide helpers for tests to use.
A new instance of the fixture is created for every run of every test case.

There is always one fixture class for a whole test group (though this may change).

The fixture is also how a test gets access to the _case recorder_,
which allows it to produce test results.

Fixtures are also how tests produce results: `.skip()`, `.fail()`, etc.

**Type:** `Fixture`

### `UnitTest` Fixture

Provides basic fixture utilities most useful in the `unittests` suite.

### `GPUTest` Fixture

Provides utilities useful in WebGPU CTS tests.

# Test Results

## Logger

A logger logs the results of a whole test run.

It saves an empty `LiveTestSpecResult` into its results map, then creates a
_test spec recorder_, which records the results for a group into the `LiveTestSpecResult`.

**Type:** `Logger`

### Test Case Recorder

Refers to a `LiveTestCaseResult` created by the logger.
Records the results of running a test case (its pass-status, run time, and logs) into it.

**Types:** `TestCaseRecorder`, `LiveTestCaseResult`

#### Test Case Status

The `status` of a `LiveTestCaseResult` can be one of:

- `'running'` (only while still running)
- `'pass'`
- `'skip'`
- `'warn'`
- `'fail'`

The "worst" result from running a case is always reported (fail > warn > skip > pass).
Note this means a test can still fail even if it's "skipped", if it failed before
`.skip()` was called.

**Type:** `Status`

## Results Format

The results are returned in JSON format.

They are designed to be easily merged in JavaScript:
the `"results"` can be passed into the constructor of `Map` and merged from there.

(TODO: Write a merge tool, if needed.)

```js
{
  "version": "bf472c5698138cdf801006cd400f587e9b1910a5-dirty",
  "results": [
    [
      "unittests:async_mutex:basic:",
      { "status": "pass", "timems": 0.286, "logs": [] }
    ],
    [
      "unittests:async_mutex:serial:",
      { "status": "pass", "timems": 0.415, "logs": [] }
    ]
  ]
}
```
--
cgit v1.2.3