author    Daniel Baumann <daniel.baumann@progress-linux.org>  2024-04-19 00:47:55 +0000
committer Daniel Baumann <daniel.baumann@progress-linux.org>  2024-04-19 00:47:55 +0000
commit    26a029d407be480d791972afb5975cf62c9360a6 (patch)
tree      f435a8308119effd964b339f76abb83a57c29483 /dom/webgpu/tests/cts/checkout/docs
parent    Initial commit. (diff)
Adding upstream version 124.0.1. (tag: upstream/124.0.1)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'dom/webgpu/tests/cts/checkout/docs')
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/adding_timing_metadata.md  | 163
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/build.md                   |  43
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/deno.md                    |  24
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/fp_primer.md               | 871
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/helper_index.txt           |  93
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/implementing.md            |  97
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/intro/README.md            |  99
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/intro/convert_to_issue.png | bin 0 -> 2061 bytes
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/intro/developing.md        | 134
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/intro/life_of.md           |  46
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/intro/plans.md             |  82
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/intro/tests.md             |  25
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/organization.md            | 166
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/reviews.md                 |  70
-rw-r--r--  dom/webgpu/tests/cts/checkout/docs/terms.md                   | 270
15 files changed, 2183 insertions, 0 deletions
diff --git a/dom/webgpu/tests/cts/checkout/docs/adding_timing_metadata.md b/dom/webgpu/tests/cts/checkout/docs/adding_timing_metadata.md
new file mode 100644
index 0000000000..fe32cead20
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/adding_timing_metadata.md
@@ -0,0 +1,163 @@
+# Adding Timing Metadata
+
+## listing_meta.json files
+
+`listing_meta.json` files are SEMI AUTO-GENERATED.
+
+The raw data may be edited manually to add entries or change timing values.
+
+The **list** of tests must stay up to date, so it can be used by external
+tools. This is verified by presubmit checks.
+
+The `subcaseMS` values are estimates. They can be set to 0 if for some reason
+you can't estimate the time (or there's an existing test with a long name and
+slow subcases that would result in query strings that are too long), but this
+will produce a non-fatal warning. Avoid creating new warnings whenever
+possible. Any existing failures should be fixed (eventually).
+
+### Performance
+
+Note this data is typically captured by developers using higher-end
+computers, so typical test machines might execute more slowly. For this
+reason, the WPT chunking should be configured to generate chunks much shorter
+than 5 seconds (a typical default time limit in WPT test executors), so that
+they still execute in under 5 seconds on lower-end computers.
+
+## Problem
+
+When adding new tests to the CTS you may occasionally see an error like this
+when running `npm test` or `npm run standalone`:
+
+```
+ERROR: Tests missing from listing_meta.json. Please add the new tests (set subcaseMS to 0 if you cannot estimate it):
+ webgpu:shader,execution,expression,binary,af_matrix_addition:matrix:*
+
+/home/runner/work/cts/cts/src/common/util/util.ts:38
+ throw new Error(msg && (typeof msg === 'string' ? msg : msg()));
+ ^
+Error:
+ at assert (/home/runner/work/cts/cts/src/common/util/util.ts:38:11)
+ at crawl (/home/runner/work/cts/cts/src/common/tools/crawl.ts:155:11)
+Warning: non-zero exit code 1
+ Use --force to continue.
+
+Aborted due to warnings.
+```
+
+What this error message is telling us is that there is no entry for
+`webgpu:shader,execution,expression,binary,af_matrix_addition:matrix:*` in
+`src/webgpu/listing_meta.json`.
+
+These entries are estimates for the amount of time that subcases take to run,
+and are used as inputs into the WPT tooling to attempt to portion out tests into
+approximately same-sized chunks.
+
+If a value has been defaulted to 0 by someone, you will see warnings like this:
+
+```
+...
+WARNING: subcaseMS≤0 found in listing_meta.json (allowed, but try to avoid):
+ webgpu:shader,execution,expression,binary,af_matrix_addition:matrix:*
+...
+```
+
+These messages should be resolved by adding appropriate entries to the JSON
+file.
+
+## Solution 1 (manual, best for simple tests)
+
+If you're developing new tests and need to update this file, it is sometimes
+easiest to do so manually. Run your tests under your usual development workflow
+and see how long they take. In the standalone web runner `npm start`, the total
+time for a test case is reported on the right-hand side when the case logs are
+expanded.
+
+Record the average time per *subcase* across all cases of the test (you may need
+to compute this) into the `listing_meta.json` file.
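+
+For example, with hypothetical numbers and a made-up test name: if a test has 4
+cases of 25 subcases each, and the standalone runner reports a total of 300ms
+across those cases, the average is `300 / 100 = 3` milliseconds per subcase,
+recorded as:
+
+```
+ "webgpu:path,to,my,test:my_test:*": { "subcaseMS": 3.000 },
+```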
+
+## Solution 2 (semi-automated)
+
+There is tooling in the CTS repo for generating appropriate estimates for
+these values, though it does require some manual intervention. The rest of
+this doc is a walkthrough of running these tools.
+
+Timing data can be captured in bulk and "merged" into this file using
+the `merge_listing_times` tool. This is useful when a large number of tests
+change or otherwise a lot of tests need to be updated, but it also automates the
+manual steps above.
+
+The tool can also be used without any inputs to reformat `listing_meta.json`.
+Please read the help message of `merge_listing_times` for more information.
+
+### Placeholder Value
+
+If your development workflow requires a clean build, the first step is to add a
+placeholder value for your entry to `src/webgpu/listing_meta.json`, since there
+is a chicken-and-egg problem for updating these values.
+
+```
+ "webgpu:shader,execution,expression,binary,af_matrix_addition:matrix:*": { "subcaseMS": 0 },
+```
+
+(It should have a value of 0, since the later tooling only updates the value if
+the newer value is higher.)
+
+### Websocket Logger
+
+The first tool that needs to be run is `websocket-logger`, which receives data
+on a WebSocket channel to capture timing data when CTS is run. This
+should be run in a separate process/terminal, since it needs to stay running
+throughout the following steps.
+
+In the `tools/websocket-logger/` directory:
+
+```
+npm ci
+npm start
+```
+
+The output from this command will indicate where the results are being logged,
+which will be needed later. For example:
+
+```
+...
+Writing to wslog-2023-09-12T18-57-34.txt
+...
+```
+
+### Running CTS
+
+Now we need to run the specific cases in CTS that we need to time. This should
+be possible under any development workflow (as long as its runtime environment,
+like Node, supports WebSockets), but the most well-tested way is using the
+standalone web runner.
+
+This requires serving the CTS locally. In the project root:
+
+```
+npm run standalone
+npm start
+```
+
+Once this is started you can then direct a WebGPU enabled browser to the
+specific CTS entry and run the tests, for example:
+
+```
+http://localhost:8080/standalone/?q=webgpu:shader,execution,expression,binary,af_matrix_addition:matrix:*
+```
+
+If the tests have a high variance in runtime, you can run them multiple times.
+The longest recorded time will be used.
+
+### Merging metadata
+
+The final step is to merge the new data that has been captured into the JSON
+file.
+
+This can be done using the following command:
+
+```
+tools/merge_listing_times webgpu -- tools/websocket-logger/wslog-2023-09-12T18-57-34.txt
+```
+
+where the text file is the result file from websocket-logger.
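+
+After a successful merge, the entries for the tests that were run will have
+been updated in place. The updated entry should look something like this (the
+timing value here is illustrative):
+
+```
+ "webgpu:shader,execution,expression,binary,af_matrix_addition:matrix:*": { "subcaseMS": 1.322 },
+```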
+
+Now you just need to commit the pending diff in your repo.
diff --git a/dom/webgpu/tests/cts/checkout/docs/build.md b/dom/webgpu/tests/cts/checkout/docs/build.md
new file mode 100644
index 0000000000..2d7b2f968c
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/build.md
@@ -0,0 +1,43 @@
+# Building
+
+Building the project is not usually needed for local development.
+However, for exports to WPT, or deployment (https://gpuweb.github.io/cts/),
+files can be pre-generated.
+
+The project builds into two directories:
+
+- `out/`: Built framework and test files, needed to run standalone or command line.
+- `out-wpt/`: Build directory for export into WPT. Contains:
+ - An adapter for running WebGPU CTS tests under WPT
+ - A copy of the needed files from `out/`
+ - A copy of any `.html` test cases from `src/`
+
+To build and run all pre-submit checks (including type and lint checks and
+unittests), use:
+
+```sh
+npm test
+```
+
+For checks only:
+
+```sh
+npm run check
+```
+
+For a quicker iterative build:
+
+```sh
+npm run standalone
+```
+
+## Run
+
+To serve the built files (rather than using the dev server), run `npx grunt serve`.
+
+## Export to WPT
+
+Run `npm run wpt`.
+
+Copy (or symlink) the `out-wpt/` directory as the `webgpu/` directory in your
+WPT checkout or your browser's "internal" WPT test directory.
diff --git a/dom/webgpu/tests/cts/checkout/docs/deno.md b/dom/webgpu/tests/cts/checkout/docs/deno.md
new file mode 100644
index 0000000000..22a54c79bd
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/deno.md
@@ -0,0 +1,24 @@
+# Running the CTS on Deno
+
+Since version 1.8, Deno experimentally implements the WebGPU API out of the box.
+You can use the `./tools/deno` script to run the CTS in Deno. To do this you
+will first need to install Deno: [stable](https://deno.land#installation), or
+build the main branch from source
+(`cargo install --git https://github.com/denoland/deno --bin deno`).
+
+On macOS and recent Linux, you can just run `./tools/run_deno` as is. On Windows and
+older Linux releases you will need to run
+`deno run --unstable --allow-read --allow-write --allow-env ./tools/deno`.
+
+## Usage
+
+```
+Usage:
+ tools/run_deno [OPTIONS...] QUERIES...
+ tools/run_deno 'unittests:*' 'webgpu:buffers,*'
+Options:
+ --verbose Print result/log of every test as it runs.
+ --debug Include debug messages in logging.
+ --print-json Print the complete result JSON in the output.
+ --expectations Path to expectations file.
+```
diff --git a/dom/webgpu/tests/cts/checkout/docs/fp_primer.md b/dom/webgpu/tests/cts/checkout/docs/fp_primer.md
new file mode 100644
index 0000000000..a8302fb461
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/fp_primer.md
@@ -0,0 +1,871 @@
+# Floating Point Primer
+
+This document is meant to be a primer on the floating point concepts that need
+to be understood when working on tests in WebGPU's CTS.
+
+WebGPU's CTS is responsible for testing if an implementation of WebGPU
+satisfies the spec, and thus meets the expectations of programmers based on the
+contract defined by the spec.
+
+Floating point math makes up a significant portion of the WGSL spec, and has
+many subtle corner cases to get correct.
+
+Additionally, floating point math, unlike integer math, is broadly not exact,
+so the spec must state how inaccurate a calculation is allowed to be, and the
+CTS must test against that allowance, as opposed to testing for a singular
+correct response.
+
+Thus, the WebGPU CTS has a significant amount of machinery around how to
+correctly test floating point expectations in a fluent manner.
+
+## Floating Point Numbers
+
+For some of the following discussion of floating point numbers, 32-bit
+floating point numbers are assumed, also known as single precision IEEE
+floating point numbers or `f32`s. Most of the discussion that applies to this
+format applies to the other concrete formats that are handled, e.g.
+16-bit/f16/half-precision. There are some significant differences with respect
+to AbstractFloats, which will be discussed in their own section.
+
+Details of how these formats work are discussed as needed below, but for a more
+involved discussion, please see the references in the Resources sections.
+
+Additionally, in the Appendix there is a table of interesting/common values that
+are often referenced in tests or this document.
+
+A floating point number system defines
+- A finite set of values to stand as representatives for the infinite set of
+ real numbers, and
+- Arithmetic operations on those representatives, trying to approximate the
+ ideal operations on real numbers.
+
+The cardinality mismatch alone implies that any floating point number system necessarily loses information.
+
+This means that not all numbers in the bounds can be exactly represented as a
+floating point value.
+
+For example, the integer `1` is exactly represented as a f32 as `0x3f800000`,
+but the next nearest representable f32 value, `0x3f800001`, is `1.00000011920928955`.
+
+So any number between `1` and `1.00000011920928955` is not exactly representable
+as a f32 and instead is approximated as either `1` or `1.00000011920928955`.
+
+When a number X is not exactly representable by a floating point value, there
+are normally two neighbouring numbers that could reasonably represent X: the
+nearest floating point value above X, and the nearest floating point value below
+X. Which of these values gets used is dictated by the rounding mode being used,
+which may be something like always round towards 0 or go to the nearest
+neighbour, or something else entirely.
+
+The process of converting numbers between different precisions is called
+quantization. WGSL does not prescribe a specific rounding mode when
+quantizing, so either of the neighbouring values is considered valid when
+converting a non-exactly representable value to a floating point value. This has
+significant implications on the CTS that are discussed later.
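+
+To make this concrete, here is a minimal TypeScript sketch of finding the valid
+f32 quantizations of a value under an unspecified rounding mode. It assumes the
+IEEE-754 binary32 bit layout; the CTS's real helpers are more involved.
+
+```
+const f32Buf = new Float32Array(1);
+const u32Buf = new Uint32Array(f32Buf.buffer);
+
+// Step to the next representable f32 above (up) or below (!up) a value v.
+function nextF32(v: number, up: boolean): number {
+  if (v === 0) return up ? 2 ** -149 : -(2 ** -149); // ±0: smallest subnormals
+  f32Buf[0] = v;
+  // Stepping the bit pattern by +1 moves away from zero, by -1 towards zero.
+  u32Buf[0] += (v > 0) === up ? 1 : -1;
+  return f32Buf[0];
+}
+
+// All acceptable f32 quantizations of an infinitely-precise value x.
+function validQuantizations(x: number): number[] {
+  const r = Math.fround(x); // the nearest f32, but either neighbour is valid
+  if (r === x) return [r]; // exactly representable
+  return r < x ? [r, nextF32(r, true)] : [nextF32(r, false), r];
+}
+```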
+
+From here on, we assume you are familiar with the internal structure of a
+floating point number (a sign bit, a biased exponent, and a mantissa). For
+reference, see
+[binary64 on Wikipedia](https://en.wikipedia.org/wiki/Double-precision_floating-point_format),
+[binary32 on Wikipedia](https://en.wikipedia.org/wiki/Single-precision_floating-point_format),
+and
+[binary16 on Wikipedia](https://en.wikipedia.org/wiki/Half-precision_floating-point_format).
+
+In the floating points formats described above, there are two possible zero
+values, one with all bits being 0, called positive zero, and one all the same
+except with the sign bit being 1, called negative zero.
+
+For WGSL, and thus the CTS's purposes, these values are considered equivalent.
+TypeScript, which the CTS is written in, treats all zeros as positive zeros,
+unless you explicitly use an escape hatch to differentiate between them, so
+most of the time there being two zeros doesn't materially affect code.
+
+### Normal Numbers
+
+Normal numbers are floating point numbers whose biased exponent is not all 0s or
+all 1s. When working with normal numbers the mantissa starts with an implied
+leading 1. For WGSL these numbers behave as you expect for floating point values
+with no interesting caveats.
+
+### Subnormal Numbers
+
+Subnormal numbers are finite non-zero numbers whose biased exponent is all 0s,
+sometimes called denorms.
+
+These are the closest numbers to zero, both positive and negative, and fill in
+the gap between the normal numbers with the smallest magnitude, and 0.
+
+Some devices, for performance reasons, do not handle operations on the
+subnormal numbers, and instead treat them as being zero; this is called *flush
+to zero* or FTZ behaviour.
+
+This means in the CTS that when a subnormal number is consumed or produced by an
+operation, an implementation may choose to replace it with zero.
+
+Like the rounding mode for quantization, this adds significant complexity to the
+CTS, which will be discussed later.
+
+### Inf & NaNs
+
+Floating point numbers include positive and negative infinity to represent
+values that are out of the bounds supported by the current precision.
+
+Implementations may assume that infinities are not present. When an evaluation
+at runtime would produce an infinity, an indeterminate value is produced
+instead.
+
+When a value goes out of bounds for a specific precision, there are special
+rounding rules that apply. If it is 'near' the edge of finite values for that
+precision, it is considered to be near-overflowing, and the implementation may
+choose to round it to the edge value or the appropriate infinity. If it is not
+near the finite values, it is considered to be far-overflowing, and it must be
+rounded to the appropriate infinity.
+
+This of course is vague, but the spec does have a precise definition of where
+the transition from near to far overflow is.
+
+Let `x` be our value.
+
+Let `exp_max` be the (unbiased) exponent of the largest finite value for the
+floating point type.
+
+If `|x|` < `2 ** (exp_max + 1)`, but `x` is not in the finite range, then it is
+considered to be near-overflowing for the floating point type.
+
+If the magnitude is equal to or greater than this limit, then it is
+far-overflowing for the floating point type.
+
+This concept of near-overflow vs far-overflow divides the real number line into
+5 distinct regions.
+
+| Region                                        | Rule                          |
+|-----------------------------------------------|-------------------------------|
+| -∞ < `x` <= `-(2 ** (exp_max + 1))`           | must round to -∞              |
+| `-(2 ** (exp_max + 1))` < `x` <= min fp value | must round to -∞ or min value |
+| min fp value < `x` < max fp value             | round as discussed below      |
+| max fp value <= `x` < `2 ** (exp_max + 1)`    | must round to max value or ∞  |
+| `2 ** (exp_max + 1)` <= `x`                   | must round to ∞               |
+
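+A sketch of these regions in TypeScript, for f32 (`exp_max = 127`), where `x`
+is an infinitely-precise value carried as an ECMAScript number:
+
+```
+const kExpMax = 127;
+const kFarLimit = 2 ** (kExpMax + 1); // 2 ** 128
+const kF32Max = 3.4028234663852886e38;
+
+function overflowRegion(x: number): string {
+  if (x <= -kFarLimit) return 'must round to -∞';
+  if (x < -kF32Max) return 'must round to -∞ or min value';
+  if (x <= kF32Max) return 'finite; round as discussed below';
+  if (x < kFarLimit) return 'must round to max value or ∞';
+  return 'must round to ∞';
+}
+```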
+
+The CTS encodes the least restrictive interpretation of the rules in the spec,
+i.e. assuming someone has made a slightly adversarial implementation that always
+chooses the thing with the least accuracy.
+
+This means that the above rules about infinities and overflow combine to say
+that any time a non-finite value for the specific floating point type is seen,
+any finite value is acceptable afterward. This is because the non-finite value
+may be converted to an infinity and then an indeterminate value can be used
+instead of the infinity.
+
+(This comes with the caveat that this is only for runtime execution on a GPU,
+the rules for compile time execution will be discussed below.)
+
+Signaling NaNs are treated as quiet NaNs in the WGSL spec, and quiet NaNs have
+the same "may-convert-to-indeterminate-value" behaviour that infinities have,
+so for the purposes of the CTS they are handled by the infinite/out-of-bounds
+logic normally.
+
+## Notation/Terminology
+
+When discussing floating point values in the CTS, there are a few terms used
+with precise meanings, which will be elaborated here.
+
+Additionally, any specific notation used will be specified here to avoid
+confusion.
+
+### Operations
+
+The CTS tests for the proper execution of builtins, e.g. `sin`, `sqrt`, `abs`,
+etc., and expressions, e.g. `*`, `/`, `<`, etc., when provided with floating
+point inputs. These collectively can be referred to as floating point
+operations.
+
+Operations, which can be thought of as mathematical functions, are mappings from
+a set of inputs to a set of outputs.
+
+Denoted `f(x, y) = X`, where `f` is a placeholder or the name of the operation,
+lower case variables are the inputs to the function, and uppercase variables are
+the outputs of the function.
+
+Operations have one or more inputs and an output value.
+
+Values are generally defined as floats, integers, booleans, vectors, and
+matrices. Consult the [WGSL Spec](https://www.w3.org/TR/WGSL/) for the exact
+list of types and their definitions.
+
+Most operations' inputs and outputs are the same type of value. There are some
+exceptions that accept or emit heterogeneous data types, normally a floating
+point type and an integer type or a boolean.
+
+There are a couple of builtins (`frexp` and `modf`) that return composite
+outputs: multiple values are returned as a single result value made of
+structured data. Composite inputs, on the other hand, are handled by having
+multiple input parameters.
+
+Some examples of different types of operations:
+
+`multiplication(x, y) = X`, which represents the WGSL expression `x * y`, takes
+in floating point values, `x` and `y`, and produces a floating point value `X`.
+
+`lessThan(x, y) = X`, which represents the WGSL expression `x < y`, again takes
+in floating point values, but in this case returns a boolean value.
+
+`ldexp(x, y) = X`, which builds a floating point value, takes in a floating
+point value `x` and a restricted integer `y`.
+
+### Domain, Range, and Intervals
+
+For an operation `f(x) = X`, the interval of valid values for the input, `x`, is
+called the *domain*, and the interval for valid results, `X`, is called the
+*range*.
+
+An interval, `[a, b]`, is a set of real numbers that contains `a`, `b`, and all
+the real numbers between them.
+
+Open-ended intervals, i.e. ones that don't include `a` and/or `b`, are avoided,
+and are called out explicitly when they occur.
+
+The convention in this doc and the CTS code is that `a <= b`, so `a` can be
+referred to as the beginning of the interval and `b` as the end of the interval.
+
+When talking about intervals, this doc and the code endeavour to avoid using
+the term **range** to refer to the span of values that an interval covers,
+instead using the term **bounds**, to avoid confusion with the terminology
+around outputs of operations.
+
+## Accuracy
+
+As mentioned above floating point numbers are not able to represent all the
+possible values over their bounds, but instead represent discrete values in that
+interval, and approximate the remainder.
+
+Additionally, floating point numbers are not evenly distributed over the real
+number line, but instead are more densely clustered around zero, with the space
+between values increasing in steps as the magnitude increases.
+
+When discussing operations on floating point numbers, there is often reference
+to a true value. This is the value that, given no performance constraints and
+infinite precision, you would get, e.g. `acos(-1) = π`, where π has infinite
+digits of precision.
+
+For the CTS it is often sufficient to calculate the true value using TypeScript,
+since its native number format is higher precision (double-precision/f64), so
+all f64, f32, and f16 values can be represented in it. Where this breaks down
+will be discussed in the section on compile time vs runtime execution.
+
+The true value is sometimes representable exactly as a floating point value, but
+often is not.
+
+Additionally, many operations are implemented using approximations from
+numerical analysis, where there is a tradeoff between the precision of the
+result and the cost.
+
+Thus, the spec specifies what the accuracy constraints for specific operations
+are, i.e. how close to the true value an implementation is required to be, to
+be considered conforming.
+
+There are 5 different ways that accuracy requirements are defined in the spec:
+
+1. *Exact*
+
+   This is the situation where the true value for an operation is always
+   expected to be exactly representable. This doesn't happen for any of the
+   operations that return floating point values, but does occur for logical
+   operations that return boolean values.
+
+
+2. *Correctly Rounded*
+
+ For the case that the true value is exactly representable as a floating
+ point, this is the equivalent of exactly from above. In the event that the
+ true value is not exact, then the acceptable answer for most numbers is
+ either the nearest representable value above or below the true value.
+
+ For values near the subnormal range, e.g. close to zero, this becomes more
+ complex, since an implementation may FTZ at any point. So if the exact
+ solution is subnormal or either of the neighbours of the true value are
+ subnormal, zero becomes a possible result, thus the acceptance interval is
+ wider than naively expected.
+
+ On the edge of and beyond the bounds of a floating point type the definition
+ of correctly rounded becomes complex, which is discussed in detail in the
+ section on overflow.
+
+
+3. *Absolute Error*
+
+ This type of accuracy specifies an error value, ε, and the calculated result
+ is expected to be within that distance from the true value, i.e.
+ `[ X - ε, X + ε ]`.
+
+ The main drawback with this manner of specifying accuracy is that it doesn't
+ scale with the level of precision in floating point numbers themselves at a
+ specific value. Thus, it tends to be only used for specifying accuracy over
+ specific limited intervals, i.e. [-π, π].
+
+
+4. *Units of Least Precision (ULP)*
+
+ The solution to the issue of not scaling with precision of floating point is
+ to use units of least precision.
+
+ ULP(X) is min (b-a) over all pairs (a,b) of representable floating point
+ numbers such that (a <= X <= b and a =/= b). For a more formal discussion of
+ ULP see
+ [On the definition of ulp(x)](https://hal.inria.fr/inria-00070503/document).
+
+ n * ULP or nULP means `[X - n * ULP @ X, X + n * ULP @ X]`.
+
+
+5. *Inherited*
+
+ When an operation's accuracy is defined in terms of other operations, then
+ its accuracy is said to be inherited. Handling of inherited accuracies is
+ one of the main driving factors in the design of testing framework, so will
+ need to be discussed in detail.
+
+## Acceptance Intervals
+
+The first four accuracy types; Exact, Correctly Rounded, Absolute Error, and
+ULP, sometimes called simple accuracies, can be defined in isolation from each
+other, and by association can be implemented using relatively independent
+implementations.
+
+The original implementation of the floating point framework did this as it was
+being built out, but ran into difficulties when defining the inherited
+accuracies.
+
+For example, since `tan(x)` inherits from `sin(x)/cos(x)`, one could take the
+defined rules and manually build up a bespoke solution for checking the
+results, but this is tedious, error-prone, and doesn't allow for code re-use.
+
+Instead, it would be better if there was a single conceptual framework that one
+can express all the 'simple' accuracy requirements in, and then have a mechanism
+for composing them to define inherited accuracies.
+
+In the WebGPU CTS this is done via the concept of acceptance intervals, which is
+derived from a similar concept in the Vulkan CTS, though implemented
+significantly differently.
+
+The core of this idea is that each of the different accuracy types can be
+integrated into the definition of the operation, so that instead of
+transforming an input from the domain to a point in the range, the operation
+produces an interval in the range, which is the set of acceptable values an
+implementation may emit.
+
+The simple accuracies can be defined as follows:
+
+1. *Exact*
+
+ `f(x) => [X, X]`
+
+
+2. *Correctly Rounded*
+
+ If `X` is precisely defined as a floating point value
+
+ `f(x) => [X, X]`
+
+ otherwise,
+
+ `[a, b]` where `a` is the largest representable number with `a <= X`, and `b`
+ is the smallest representable number with `X <= b`
+
+
+3. *Absolute Error*
+
+ `f(x) => [ X - ε, X + ε ]`, where ε is the absolute error value
+
+
+4. *ULP Error*
+
+ `f(x) = X => [X - n*ULP(X), X + n*ULP(X)]`
+
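+A minimal TypeScript sketch of these four definitions, for f32. The `ulpF32`
+helper here uses exponent arithmetic and glosses over some edge cases (e.g. the
+smaller gap just below exact powers of two), and all of the FTZ and overflow
+hedges discussed elsewhere are omitted:
+
+```
+type Interval = [number, number]; // [begin, end] with begin <= end
+
+function ulpF32(x: number): number {
+  const m = Math.abs(x);
+  if (m < 2 ** -126) return 2 ** -149; // zero/subnormal region: fixed spacing
+  return 2 ** (Math.floor(Math.log2(m)) - 23); // 23 explicit mantissa bits
+}
+
+function correctlyRoundedInterval(X: number): Interval {
+  const r = Math.fround(X); // nearest f32 (ties to even)
+  if (r === X) return [X, X]; // exactly representable; also the Exact case
+  // Span from the representable neighbour below X to the one above it.
+  return r < X ? [r, r + ulpF32(r)] : [r - ulpF32(r), r];
+}
+
+function absoluteErrorInterval(X: number, epsilon: number): Interval {
+  return [X - epsilon, X + epsilon];
+}
+
+function ulpInterval(X: number, n: number): Interval {
+  return [X - n * ulpF32(X), X + n * ulpF32(X)];
+}
+```
+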
+As defined, these accuracies handle mapping from a point in the domain into an
+interval in the range.
+
+This is insufficient for implementing inherited accuracies, since inheritance
+sometimes involves mapping domain intervals to range intervals.
+
+Here we use the convention for naturally extending a function on real numbers
+into a function on intervals of real numbers, i.e. `f([a, b]) = [A, B]`.
+
+Given that floating point numbers have a finite number of precise values for
+any given interval, one could just run the accuracy computation for every point
+in the interval and then span together the resultant intervals. That would be
+very inefficient though, and would make your reviewer sad to read.
+
+For mapping intervals to intervals the key insight is that we only need to be
+concerned with the extrema of the operation in the interval, since the
+acceptance interval is the bounds of the possible outputs.
+
+In more precise terms:
+```
+ f(x) => X, x = [a, b] and X = [A, B]
+
+ X = [min(f(x)), max(f(x))]
+ X = [min(f([a, b])), max(f([a, b]))]
+ X = [f(m), f(n)]
+```
+where `m` and `n` are in `[a, b]`, `m <= n`, and produce the min and max results
+for `f` on the interval, respectively.
+
+So how do we find the minima and maxima for our operation in the domain?
+
+The common general solution for this requires using calculus to calculate the
+derivative of `f`, `f'`, and then finding the zeroes of `f'` to locate the
+critical points of `f`.
+
+This solution wouldn't be sufficient for all builtins, e.g. `step`, which is
+not differentiable at edge values.
+
+Thankfully we do not need a general solution for the CTS, since all the builtin
+operations are defined in the spec, so `f` is from a known set of options.
+
+These operations can be divided into two broad categories: monotonic, and
+non-monotonic, with respect to an interval.
+
+The monotonic operations are ones that preserve the order of inputs in their
+outputs (or reverse it). Their graph only ever decreases or increases, never
+changing from one to the other, though it can have flat sections.
+
+The non-monotonic operations are ones whose graph would have both regions of
+increase and decrease.
+
+The monotonic operations, when mapping an interval to an interval, are simple to
+handle, since the extrema are guaranteed to be the ends of the domain, `a` and
+`b`.
+
+So `f([a, b])` = `[f(a), f(b)]` or `[f(b), f(a)]`. We could figure out if `f` is
+increasing or decreasing beforehand to determine if it should be `[f(a), f(b)]`
+or `[f(b), f(a)]`.
+
+It is simpler to just use min & max to have an implementation that is agnostic
+to the details of `f`.
+```
+ A = f(a), B = f(b)
+ X = [min(A, B), max(A, B)]
+```
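+
+For instance, a generic interval-to-interval mapping for monotonic operations
+can be sketched in TypeScript as:
+
+```
+type Interval = [number, number];
+
+// Map a domain interval through a monotonic f by evaluating the endpoints.
+function mapMonotonic(f: (x: number) => number, domain: Interval): Interval {
+  const A = f(domain[0]);
+  const B = f(domain[1]);
+  return [Math.min(A, B), Math.max(A, B)];
+}
+
+// e.g. mapMonotonic(Math.exp, [0, 1]) yields [1, Math.E]
+```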
+
+The non-monotonic functions that we need to handle for interval-to-interval
+mappings are more complex. Thankfully these are a small fraction of the overall
+operations that need to be handled, since they are only the operations that are
+used in an inherited accuracy and take in the output of another operation as
+part of that inherited accuracy.
+
+So in the CTS we just have bespoke implementations for each of them.
+
+Part of the operation definition in the CTS is a function that takes in the
+domain interval, and returns a sub-interval such that the subject function is
+monotonic over that sub-interval, and hence the function's minima and maxima are
+at the ends.
+
+This adjusted domain interval can then be fed through the same machinery as the
+monotonic functions.
+
+### Inherited Accuracy
+
+So with all of that background out of the way, we can now define an inherited
+accuracy in terms of acceptance intervals.
+
+The crux of this is the insight that the range of one operation can become the
+domain of another operation to compose them together.
+
+And since we have defined how to do this interval to interval mapping above,
+transforming things becomes mechanical and thus implementable in reusable code.
+
+When talking about inherited accuracies, `f(x) => g(x)` is used to denote that
+`f`'s accuracy is defined as `g`.
+
+An example to illustrate inherited accuracies, in f32:
+
+```
+  tan(x) => sin(x)/cos(x)
+
+  sin(x) => [sin(x) - 2 ** -11, sin(x) + 2 ** -11]
+  cos(x) => [cos(x) - 2 ** -11, cos(x) + 2 ** -11]
+
+  x/y => [x/y - 2.5 * ULP(x/y), x/y + 2.5 * ULP(x/y)]
+```
+
+`sin(x)` and `cos(x)` are non-monotonic, so calculating a closed generic form
+over an interval is a pain, since the min and max vary depending on the value
+of x. Let's isolate this to a single point, so you don't have to read literally
+pages of expanded intervals.
+
+```
+  x = π
+
+  sin(π) => [sin(π) - 2 ** -11, sin(π) + 2 ** -11]
+         => [0 - 2 ** -11, 0 + 2 ** -11]
+         => [-0.000488…, 0.000488…]
+  cos(π) => [cos(π) - 2 ** -11, cos(π) + 2 ** -11]
+         => [-1.000488…, -0.999511…]
+
+  tan(π) => sin(π)/cos(π)
+         => [-0.000488…, 0.000488…]/[-1.000488…, -0.999511…]
+         => [min(-0.000488…/-1.000488…, -0.000488…/-0.999511…, 0.000488…/-1.000488…, 0.000488…/-0.999511…),
+             max(-0.000488…/-1.000488…, -0.000488…/-0.999511…, 0.000488…/-1.000488…, 0.000488…/-0.999511…)]
+         => [0.000488…/-0.999511…, -0.000488…/-0.999511…]
+         => [-0.000488…, 0.000488…]
+```
+
+For clarity this has omitted a bunch of complexity around FTZ behaviours, and
+the fact that these operations are only defined for specific domains, but the
+high-level concepts hold.
+
+For each of the inherited operations we could implement a manually written out
+closed form solution, but that would be quite error-prone and would not re-use
+code between builtins.
+
+Instead, the CTS takes advantage of the fact that in addition to testing
+implementations of `tan(x)` we are going to be testing implementations of
+`sin(x)`, `cos(x)` and `x/y`, so there should be functions to generate
+acceptance intervals for those operations.
+
+The `tan(x)` acceptance interval can be constructed by generating the acceptance
+intervals for `sin(x)`, `cos(x)` and `x/y` via function calls and composing the
+results.
+
+Algorithmically, it looks something like this:
+
+```
+ tan(x):
+ Calculate sin(x) interval
+ Calculate cos(x) interval
+ Calculate sin(x) result divided by cos(x) result
+ Return division result
+```
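+
+A TypeScript sketch of this composition, where `sinInterval`, `cosInterval`,
+and `divisionInterval` are assumed helpers standing in for the CTS's real
+acceptance-interval machinery:
+
+```
+type Interval = [number, number];
+declare function sinInterval(x: number): Interval;
+declare function cosInterval(x: number): Interval;
+declare function divisionInterval(x: Interval, y: Interval): Interval;
+
+// tan's inherited accuracy, built by composing the acceptance intervals of
+// the operations it is defined in terms of.
+function tanInterval(x: number): Interval {
+  return divisionInterval(sinInterval(x), cosInterval(x));
+}
+```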
+
+## Compile vs Run Time Evaluation
+
+The above discussions have been primarily agnostic to when and where a
+calculation is occurring, with an implicit bias to runtime execution on a GPU.
+
+In reality, where/when a computation occurs has a significant impact on the
+expected outcome when dealing with edge cases.
+
+### Terminology
+
+There are two related axes that will be referred to when it comes to evaluation.
+These are compile vs run time, and CPU vs GPU. Broadly speaking compile time
+execution happens on the host CPU, and run time evaluation occurs on a dedicated
+GPU.
+
+(Software graphics implementations like WARP and SwiftShader technically break this by
+being a software emulation of a GPU that runs on the CPU, but conceptually one can
+think of these implementations being a type of GPU in this context, since it has
+similar constraints when it comes to precision, etc.)
+
+Compile time evaluation is execution that occurs when setting up a shader
+module, i.e. when compiling WGSL to a platform specific shading language. It is
+part of resolving values for things like constants, and occurs once before the
+shader is run by the caller. It includes constant evaluation and override
+evaluation. All AbstractFloat operations are compile time evaluated.
+
+Runtime evaluation is execution that occurs every time the shader is run, and
+may include dynamic data that is provided between invocations. It is work that
+is sent to the GPU for execution in the shader.
+
+WGSL const-expressions and override-expressions are evaluated before runtime and
+both are considered "compile time" in this discussion. WGSL runtime-expressions
+are evaluated at runtime.
+
+### Behavioural Differences
+
+For a well-defined operation with a finite result, runtime and compile time
+evaluation should be indistinguishable.
+
+For example:
+```
+// runtime
+@group(0) @binding(0) var<uniform> a : f32;
+@group(0) @binding(1) var<uniform> b : f32;
+
+let c: f32 = a + b;
+```
+and
+```
+// compile time
+const c: f32 = 1.0f + 2.0f;
+```
+should produce the same result of `3.0` in the variable `c`, assuming `1.0` and `2.0`
+were passed in as `a` and `b`.
+
+The only difference is when/where the execution occurs.
+
+The difference in behaviour between these two occur when the result of the
+operation is not finite for the underlying floating point type.
+
+If instead of `1.0` and `2.0` we had `10.0` and `f32.max`, so that the true
+result would be `f32.max + 10.0`, the behaviours differ. Specifically, the
+runtime-evaluated version will still run, but the result in `c` will be an
+indeterminate value, which may be any finite f32 value. For the compile time
+example instead, compiling the shader will fail validation.
+
+This applies to any operation, and isn't restricted to just addition. Anytime a
+value goes outside the finite range, the shader will hit these results: an
+indeterminate value for runtime execution, and validation failure for compile
+time execution.
+
+Unfortunately, we are dealing with intervals of results and not precise
+results, so this leads to even more conceptual complexity. For runtime
+evaluation, this isn't too bad, because the rule becomes: if any part of the
+interval is non-finite, then an indeterminate value can be a result, and the
+interval for an indeterminate result, `[fp min, fp max]`, will include any
+finite portions of the interval.
+
+Compile time evaluation becomes significantly more complex, because the
+difference isn't what interval is returned, but whether the shader compiles or
+not, which are mutually exclusive outcomes. This is compounded even further by
+having to consider near-overflow vs far-overflow behaviour. Thankfully this can
+be broken down on a case-by-case basis based on where an interval falls.
+
+Assuming `X` is the well-defined result of an operation, i.e. not indeterminate
+because the operation isn't defined for the inputs:
+
+| Region                       | Description                                          | Result                         |
+|------------------------------|------------------------------------------------------|--------------------------------|
+| `abs(X) <= fp max`           | interval falls completely in the finite bounds       | validation succeeds            |
+| `abs(X) >= 2 ** (exp_max+1)` | interval falls completely in the far-overflow bounds | validation fails               |
+| Otherwise                    | interval intersects the near-overflow region         | validation may succeed or fail |
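+
+A sketch of this classification in TypeScript, for f32, assuming the interval
+lies entirely in the finite range or entirely on one side of it:
+
+```
+type Expectation = 'pass' | 'fail' | 'either';
+
+const kF32Max = 3.4028234663852886e38;
+const kFarLimit = 2 ** 128; // 2 ** (exp_max + 1) for f32
+
+function compileTimeExpectation(begin: number, end: number): Expectation {
+  if (Math.abs(begin) <= kF32Max && Math.abs(end) <= kF32Max) return 'pass';
+  if (Math.abs(begin) >= kFarLimit && Math.abs(end) >= kFarLimit) return 'fail';
+  return 'either'; // intersects the near-overflow region
+}
+```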
+
+The final case is somewhat difficult from a CTS perspective, because it is no
+longer sufficient to know that a non-finite result has occurred; the specific
+result needs to be tracked. Additionally, the expected result is somewhat
+ambiguous, since a shader may or may not compile. This could in theory still be
+tested by the CTS, via switching logic for this region: if the shader compiles,
+expect these results; otherwise, pass the test. This adds a significant amount
+of complexity to the testing code for thoroughly testing a relatively small
+segment of values. Other environments do not have the behaviour in this region
+as rigorously defined nor tested, so fully testing here would likely find lots
+of issues that would just need to be mitigated in the CTS.
+
+Currently, we choose to avoid testing validation of near-overflow scenarios.
+
+### Additional Technical Limitations
+
+The above description of compile and runtime evaluation was somewhat based on
+the theoretical assumption that the intervals being used for testing are
+infinitely precise, when in actuality they are implemented with the ECMAScript
+`number` type, which is a f64 value.
+
+For the vast majority of cases, even out of bounds and overflow, this is
+sufficient. There is one small slice where this breaks down: specifically, when
+the result is just outside the finite range, by less than 1 f64 ULP of the edge
+value. An example of this is `2 ** -11 + f32.max`. This will be between
+`f32.max` and `f32.max + ULPF64(f32.max)`. This becomes a problem, because this
+value technically falls into the out-of-bounds region, but depending on how
+quantization to f64 is handled in the test runner, it will be either `f32.max`
+or `f32.max + ULPF64(f32.max)`. So for compile time evaluation, either we
+expect an implementation to always handle this, or it might fail, but we cannot
+easily detect it, since this is pushing hard on the limits of precision of the
+testing environment.
+
+(A parallel version of this probably exists on the other side of the
+out-of-bounds region, but I don't have a proven example of this.)
+
+The high road fix to this problem is to use an arbitrary precision floating
+point implementation. Unfortunately such a library is not on the standards
+track for ECMAScript at this time, so we would have to evaluate and pick a
+third party dependency to use. Beyond the selection process, this would also
+require a significant refactoring of the existing framework code for fixing a
+very marginal case.
+
+(This differs from Float16 support, where the prototyped version of the
+proposed API has been pulled in, and the long term plan is to use the
+ECMAScript implementation's version, once all the major runtimes support it. So
+it can be viewed as a polyfill.)
+
+This region currently is not tested, as part of the decision to defer testing
+of the entire out-of-bounds-but-not-overflowing region.
+
+In the future if we decide to add testing to the out-of-bounds region, to avoid
+perfect being the enemy of good here, it is likely the CTS would still avoid
+testing these regions where f64 precision breaks down. If someone is interested
+in taking on the effort needed to migrate to an arbitrary precision float
+library, or if this turns out to be a significant issue in the future, this
+decision can be revisited.
+
+## Abstract Float
+
+### Accuracy
+
+For the concrete floating point types (f32 & f16), the accuracy of operations
+is defined in terms of their own type. Specifically for f32, correctly rounded
+refers to the nearest f32 values, and ULP is in terms of the distance between
+f32 values.
+
+AbstractFloat internally is defined as a f64, and this applies for exact and
+correctly rounded accuracies. Thus, correctly rounded refers to the nearest f64
+values. However, AbstractFloat differs for ULP and absolute errors. Reading
+the spec strictly, these all have unbounded accuracies, but it is recommended
+that their accuracies be at least as good as the f32 equivalent.
+
+The difference between f32 and f64 ULP at a specific value X is significant, so
+the "at least as good as f32" requirement is always less strict than if it were
+calculated in terms of f64. Similarly, for absolute accuracies, the interval
+`[x - epsilon, x + epsilon]` is always equal or wider if calculated in terms of
+f32s vs f64s.
+
+If an inherited accuracy is only defined in terms of correctly rounded
+accuracies, then the interval is calculated in terms of f64s. If any of the
+defining accuracies are ULP or absolute errors, then the result falls into the
+"unbounded, but recommended to be at least as good as f32" accuracy bucket.
+
+What this means for the CTS implementation is that for these "at least as good
+as f32" error intervals, if the infinitely accurate result is finite for f32,
+then the error interval for f64 is just the f32 interval. If the result is not
+finite for f32, then the accuracy interval is just the unbounded interval.
+
+How this is implemented in the CTS is by having the FPTraits for AbstractFloat
+forward to the f32 implementation for the operations that are tested to be as
+good as f32.
+
+### Implementation
+
+AbstractFloats are a compile time construct that exists in WGSL. They are
+expressible as literal values or the result of operations that return them, but
+a variable cannot be typed as an AbstractFloat. Instead, the variable needs to
+be a concrete type, i.e. f32 or f16, and the AbstractFloat value will be
+quantized on assignment.
+
+Because they cannot be stored nor passed via buffers, it is tricky to test them.
+There are two approaches that have been proposed for testing the results of
+operations that return AbstractFloats.
+
+As of the writing of this doc, the second option below, extracting bits, is the
+one being pursued in the CTS.
+
+#### const_assert
+
+The first proposal is to lean on the `const_assert` statement that exists in
+WGSL. For each test case a snippet of code would be written out that has a form
+something like this
+
+```
+// foo(x) is the operation under test
+const_assert lower < foo(x) // Result was below the acceptance interval
+const_assert upper > foo(x) // Result was above the acceptance interval
+```
+
+where lower and upper would actually be string replaced with literals for the
+bounds of the acceptance interval when generating the shader text.
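+
+For instance, a single generated case might look something like this (the
+bounds and the call here are illustrative):
+
+```
+const_assert 0.99951171875 < foo(1.0);
+const_assert 1.00048828125 > foo(1.0);
+```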
+
+This approach has a number of limitations that made it unacceptable for the
+CTS. First, how errors are reported makes them a pain to debug. Someone working
+with the CTS would either get a report of a failed shader compile, or a failed
+compile with the line number, but they will not get the result of `foo(x)`,
+just that it is out of range. Additionally, if you place many of these stanzas
+in the same shader to optimize dispatch, you will not get a report that, say, 3
+of 32 cases failed with specific results; you will just get a report that the
+batch failed. All of this makes for a very poor experience in attempting to
+understand what is failing.
+
+Beyond the lack of ergonomics, this approach also makes things like AF
+comparison and const_assert very load bearing for the CTS. It is possible that
+a bug could exist in an implementation of const_assert, for example, that would
+cause it to not fail shader compilation, which could lead to silent passing of
+tests. Conceptually, instead of depending on a signal to indicate something is
+working, we would be depending on a signal that it isn't working, and assuming
+that if we don't receive that signal everything is good, rather than that our
+signal mechanism was broken.
+
+#### Extracting Bits
+
+The other proposal that was developed depends on the fact that AbstractFloat is
+spec'd to be a f64 internally. So the CTS could store the result of an operation
+as two 32-bit unsigned integers (or broken up into sign, exponent, and
+mantissa). These stored integers could be exported to the testing framework via
+a buffer, which could in turn rebuild the f64 values.
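+
+On the framework side, reconstituting the value is straightforward. A sketch,
+assuming the shader wrote the low and high 32-bit words of the f64 bit pattern
+into a buffer:
+
+```
+// Rebuild an f64 from two 32-bit words (little-endian layout assumed).
+function f64FromWords(lo: number, hi: number): number {
+  const words = new Uint32Array(2);
+  words[0] = lo; // low 32 bits of the f64 bit pattern
+  words[1] = hi; // sign, exponent, and high mantissa bits
+  return new Float64Array(words.buffer)[0];
+}
+```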
+
+This approach allows the CTS to test values directly in the testing framework,
+thus provide the same diagnostics as other tests, as well as reusing the same
+running harness.
+
+The major downsides come from actually implementing the bit extraction. Due to
+the restrictions on AbstractFloats, the actual code to extract the bits is
+tricky. Specifically, there is no simple bit cast to something like an
+AbstractInt that can be used. Instead, `frexp` needs to be used with additional
+operations. This leads to problems, since as specified `frexp` is not defined
+for subnormal values, so it is impossible to extract a subnormal AbstractFloat,
+though 0 could be returned when one is encountered.
+
+Tests that do try to extract bits to determine the result should either avoid
+cases with subnormal results or check for the nearest normal number or zero.
+
+The inability to store AbstractFloats in a non-lossy fashion also causes
+additional issues, since it means that user-defined functions that take in or
+return them do not exist in WGSL. Thus, the snippet of code for extracting
+AbstractFloats cannot just be inserted as a function at the top of a testing
+shader and then invoked on each test case. Instead, it needs to be inlined
+into the shader at each call-site. Actually implementing this in the CTS isn't
+difficult, but it does make the shaders significantly longer and more
+difficult to read. It also may have an impact on how many test cases can be in
+a batch, since runtime for some backends is sensitive to the length of the
+shader being run.
+
+# Appendix
+
+### Significant f64 Values
+
+| Name | Decimal (~) | Hex | Sign Bit | Exponent Bits | Significand Bits |
+|------------------------|----------------:|----------------------:|---------:|--------------:|-----------------------------------------------------------------:|
+| Negative Infinity | -∞ | 0xfff0 0000 0000 0000 | 1 | 111 1111 1111 | 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 |
+| Min Negative Normal | -1.79769313E308 | 0xffef ffff ffff ffff | 1 | 111 1111 1110 | 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 |
+| Max Negative Normal | -2.2250738E−308 | 0x8010 0000 0000 0000 | 1 | 000 0000 0001 | 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 |
+| Min Negative Subnormal | -2.2250738E−308 | 0x800f ffff ffff ffff | 1 | 000 0000 0000 | 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 |
+| Max Negative Subnormal | -4.9406564E−324 | 0x8000 0000 0000 0001 | 1 | 000 0000 0000 | 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 |
+| Negative Zero | -0 | 0x8000 0000 0000 0000 | 1 | 000 0000 0000 | 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 |
+| Positive Zero | 0 | 0x0000 0000 0000 0000 | 0 | 000 0000 0000 | 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 |
+| Min Positive Subnormal | 4.9406564E−324 | 0x0000 0000 0000 0001 | 0 | 000 0000 0000 | 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0001 |
+| Max Positive Subnormal | 2.2250738E−308 | 0x000f ffff ffff ffff | 0 | 000 0000 0000 | 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 |
+| Min Positive Normal | 2.2250738E−308 | 0x0010 0000 0000 0000 | 0 | 000 0000 0001 | 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 |
+| Max Positive Normal | 1.79769313E308 | 0x7fef ffff ffff ffff | 0 | 111 1111 1110 | 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 1111 |
+| Positive Infinity      | ∞               | 0x7ff0 0000 0000 0000 | 0        | 111 1111 1111 | 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 |
+
+### Significant f32 Values
+
+| Name | Decimal (~) | Hex | Sign Bit | Exponent Bits | Significand Bits |
+|------------------------|---------------:|------------:|---------:|--------------:|-----------------------------:|
+| Negative Infinity | -∞ | 0xff80 0000 | 1 | 1111 1111 | 0000 0000 0000 0000 0000 000 |
+| Min Negative Normal | -3.40282346E38 | 0xff7f ffff | 1 | 1111 1110 | 1111 1111 1111 1111 1111 111 |
+| Max Negative Normal | -1.1754943E−38 | 0x8080 0000 | 1 | 0000 0001 | 0000 0000 0000 0000 0000 000 |
+| Min Negative Subnormal | -1.1754942E-38 | 0x807f ffff | 1 | 0000 0000 | 1111 1111 1111 1111 1111 111 |
+| Max Negative Subnormal | -1.4012984E−45 | 0x8000 0001 | 1 | 0000 0000 | 0000 0000 0000 0000 0000 001 |
+| Negative Zero | -0 | 0x8000 0000 | 1 | 0000 0000 | 0000 0000 0000 0000 0000 000 |
+| Positive Zero | 0 | 0x0000 0000 | 0 | 0000 0000 | 0000 0000 0000 0000 0000 000 |
+| Min Positive Subnormal | 1.4012984E−45 | 0x0000 0001 | 0 | 0000 0000 | 0000 0000 0000 0000 0000 001 |
+| Max Positive Subnormal | 1.1754942E-38 | 0x007f ffff | 0 | 0000 0000 | 1111 1111 1111 1111 1111 111 |
+| Min Positive Normal | 1.1754943E−38 | 0x0080 0000 | 0 | 0000 0001 | 0000 0000 0000 0000 0000 000 |
+| Max Positive Normal | 3.40282346E38 | 0x7f7f ffff | 0 | 1111 1110 | 1111 1111 1111 1111 1111 111 |
+| Positive Infinity      | ∞              | 0x7f80 0000 | 0        | 1111 1111     | 0000 0000 0000 0000 0000 000 |
+
+### Significant f16 Values
+
+| Name | Decimal (~) | Hex | Sign Bit | Exponent Bits | Significand Bits |
+|------------------------|--------------:|-------:|---------:|--------------:|-----------------:|
+| Negative Infinity | -∞ | 0xfc00 | 1 | 111 11 | 00 0000 0000 |
+| Min Negative Normal | -65504 | 0xfbff | 1 | 111 10 | 11 1111 1111 |
+| Max Negative Normal | -6.1035156E−5 | 0x8400 | 1 | 000 01 | 00 0000 0000 |
+| Min Negative Subnormal | -6.0975552E−5 | 0x83ff | 1 | 000 00 | 11 1111 1111 |
+| Max Negative Subnormal | -5.9604645E−8 | 0x8001 | 1 | 000 00 | 00 0000 0001 |
+| Negative Zero | -0 | 0x8000 | 1 | 000 00 | 00 0000 0000 |
+| Positive Zero | 0 | 0x0000 | 0 | 000 00 | 00 0000 0000 |
+| Min Positive Subnormal | 5.9604645E−8 | 0x0001 | 0 | 000 00 | 00 0000 0001 |
+| Max Positive Subnormal | 6.0975552E−5 | 0x03ff | 0 | 000 00 | 11 1111 1111 |
+| Min Positive Normal | 6.1035156E−5 | 0x0400 | 0 | 000 01 | 00 0000 0000 |
+| Max Positive Normal | 65504 | 0x7bff | 0 | 111 10 | 11 1111 1111 |
+| Positive Infinity      | ∞             | 0x7c00 | 0        | 111 11        | 00 0000 0000 |
+
+# Resources
+- [WebGPU Spec](https://www.w3.org/TR/webgpu/)
+- [WGSL Spec](https://www.w3.org/TR/WGSL/)
+- [binary64 on Wikipedia](https://en.wikipedia.org/wiki/Double-precision_floating-point_format)
+- [binary32 on Wikipedia](https://en.wikipedia.org/wiki/Single-precision_floating-point_format)
+- [binary16 on Wikipedia](https://en.wikipedia.org/wiki/Half-precision_floating-point_format)
+- [IEEE-754 Floating Point Converter](https://www.h-schmidt.net/FloatConverter/IEEE754.html)
+- [IEEE 754 Calculator](http://weitz.de/ieee/)
+- [On the definition of ulp(x)](https://hal.inria.fr/inria-00070503/document)
+- [Float Exposed](https://float.exposed/)
diff --git a/dom/webgpu/tests/cts/checkout/docs/helper_index.txt b/dom/webgpu/tests/cts/checkout/docs/helper_index.txt
new file mode 100644
index 0000000000..3cdf868bb4
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/helper_index.txt
@@ -0,0 +1,93 @@
+<!--
+ View this file in Typedoc!
+
+ - At https://gpuweb.github.io/cts/docs/tsdoc/
+ - Or locally:
+ - npm run tsdoc
+ - npm start
+ - http://localhost:8080/docs/tsdoc/
+
+ This file is parsed as a tsdoc.
+-->
+
+## Index of Test Helpers
+
+This index is a quick-reference of helper functions in the test suite.
+Use it to determine whether you can reuse a helper, instead of writing new code,
+to improve readability and reviewability.
+
+Whenever a new generally-useful helper is added, it should be indexed here.
+
+**See linked documentation for full helper listings.**
+
+- {@link common/framework/params_builder!CaseParamsBuilder} and {@link common/framework/params_builder!SubcaseParamsBuilder}:
+ Combinatorial generation of test parameters. They are iterated by the test framework at runtime.
+ See `examples.spec.ts` for basic examples of how this behaves.
+ - {@link common/framework/params_builder!CaseParamsBuilder}:
+ `ParamsBuilder` for adding "cases" to a test.
+ - {@link common/framework/params_builder!CaseParamsBuilder#beginSubcases}:
+ "Finalizes" the `CaseParamsBuilder`, returning a `SubcaseParamsBuilder`.
+ - {@link common/framework/params_builder!SubcaseParamsBuilder}:
+ `ParamsBuilder` for adding "subcases" to a test.
+
+### Fixtures
+
+(Uncheck the "Inherited" box to hide inherited methods from documentation pages.)
+
+- {@link common/framework/fixture!Fixture}: Base fixture for all tests.
+- {@link webgpu/gpu_test!GPUTest}: Base fixture for WebGPU tests.
+- {@link webgpu/api/validation/validation_test!ValidationTest}: Base fixture for WebGPU validation tests.
+- {@link webgpu/shader/validation/shader_validation_test!ShaderValidationTest}: Base fixture for WGSL shader validation tests.
+- {@link webgpu/idl/idl_test!IDLTest}:
+ Base fixture for testing the exposed interface is correct (without actually using WebGPU).
+
+### WebGPU Helpers
+
+- {@link webgpu/capability_info}: Structured information about texture formats, binding types, etc.
+- {@link webgpu/constants}:
+ Constant values (needed anytime a WebGPU constant is needed outside of a test function).
+- {@link webgpu/util/buffer}: Helpers for GPUBuffers.
+- {@link webgpu/util/texture}: Helpers for GPUTextures.
+- {@link webgpu/util/unions}: Helpers for various union typedefs in the WebGPU spec.
+- {@link webgpu/util/math}: Helpers for common math operations.
+- {@link webgpu/util/check_contents}: Check the contents of TypedArrays, with nice messages.
+ Also can be composed with {@link webgpu/gpu_test!GPUTest#expectGPUBufferValuesPassCheck}, used to implement
+ GPUBuffer checking helpers in GPUTest.
+- {@link webgpu/util/conversion}: Numeric encoding/decoding for float/unorm/snorm values, etc.
+- {@link webgpu/util/copy_to_texture}:
+  Helper class for copyToTexture test suites, for executing copies and checking results.
+- {@link webgpu/util/color_space_conversion}:
+ Helper functions to do color space conversion. The algorithm is the same as defined in
+ CSS Color Module Level 4.
+- {@link webgpu/util/create_elements}:
+ Helpers for creating web elements like HTMLCanvasElement, OffscreenCanvas, etc.
+- {@link webgpu/util/shader}: Helpers for creating fragment shaders based on intended output values, plainType, and componentCount.
+- {@link webgpu/util/prng}: Seed-able deterministic pseudo random number generator. Replacement for Math.random().
+- {@link webgpu/util/texture/base}: General texture-related helpers.
+- {@link webgpu/util/texture/data_generation}: Helper for generating dummy texture data.
+- {@link webgpu/util/texture/layout}: Helpers for working with linear image data
+ (like in copyBufferToTexture, copyTextureToBuffer, writeTexture).
+- {@link webgpu/util/texture/subresource}: Helpers for working with texture subresource ranges.
+- {@link webgpu/util/texture/texel_data}: Helpers encoding/decoding texel formats.
+- {@link webgpu/util/texture/texel_view}: Helper class to create and view texture data through various representations.
+- {@link webgpu/util/texture/texture_ok}: Helpers for checking texture contents.
+- {@link webgpu/shader/types}: Helpers for WGSL data types.
+- {@link webgpu/shader/execution/expression/expression}: Helpers for WGSL expression execution tests.
+- {@link webgpu/web_platform/util}: Helpers for web platform features (e.g. video elements).
+
+### General Helpers
+
+- {@link common/framework/resources}: Provides the path to the `resources/` directory.
+- {@link common/util/navigator_gpu}: Finds and returns the `navigator.gpu` object or equivalent.
+- {@link common/util/util}: Miscellaneous utilities.
+ - {@link common/util/util!assert}: Assert a condition, otherwise throw an exception.
+ - {@link common/util/util!unreachable}: Assert unreachable code.
+ - {@link common/util/util!assertReject}, {@link common/util/util!resolveOnTimeout},
+ {@link common/util/util!rejectOnTimeout},
+ {@link common/util/util!raceWithRejectOnTimeout}, and more.
+- {@link common/util/collect_garbage}:
+ Attempt to trigger garbage collection, for testing that garbage collection is not observable.
+- {@link common/util/preprocessor}: A simple template-based, non-line-based preprocessor,
+ implementing if/elif/else/endif. Possibly useful for WGSL shader generation.
+- {@link common/util/timeout}: Use this instead of `setTimeout`.
+- {@link common/util/types}: Type metaprogramming helpers.
diff --git a/dom/webgpu/tests/cts/checkout/docs/implementing.md b/dom/webgpu/tests/cts/checkout/docs/implementing.md
new file mode 100644
index 0000000000..ae6848839a
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/implementing.md
@@ -0,0 +1,97 @@
+# Test Implementation
+
+Concepts important to understand when writing tests. See existing tests for examples to copy from.
+
+## Test fixtures
+
+Most tests can use one of several common test fixtures (a short usage sketch follows this list):
+
+- `Fixture`: Base fixture, provides core functions like `expect()`, `skip()`.
+- `GPUTest`: Wraps every test in error scopes. Provides helpers like `expectContents()`.
+- `ValidationTest`: Extends `GPUTest`, provides helpers like `expectValidationError()`, `getErrorTextureView()`.
+- Or create your own. (Often not necessary - helper functions can be used instead.)
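+
+For instance, a minimal sketch of the top of a test file using `GPUTest` (exact import paths
+depend on where the file lives within the suite):
+
+```ts
+import { makeTestGroup } from '../common/framework/test_group.js';
+import { GPUTest } from './gpu_test.js';
+
+export const description = `Example test file (sketch).`;
+
+export const g = makeTestGroup(GPUTest);
+
+g.test('example').fn(t => {
+  t.expect(1 + 1 === 2, 'helpers from GPUTest are available on t');
+});
+```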
+
+Test fixtures or helper functions may be defined in `.spec.ts` files, but if used by multiple
+test files, should be defined in separate `.ts` files (without `.spec`) alongside the files that
+use them.
+
+### GPUDevices in tests
+
+`GPUDevice`s are largely stateless (except for `lost`-ness, error scope stack, and `label`).
+This allows the CTS to reuse one device across multiple test cases using the `DevicePool`,
+which provides `GPUDevice` objects to tests.
+
+Currently, there is one `GPUDevice` with the default descriptor, and
+a cache of several more, for devices with additional capabilities.
+Devices in the `DevicePool` are automatically removed when certain things go wrong.
+
+Later, there may be multiple `GPUDevice`s to allow multiple test cases to run concurrently.
+
+## Test parameterization
+
+The CTS provides helpers (`.params()` and friends) for creating large cartesian products of test parameters.
+These generate "test cases" further subdivided into "test subcases".
+See `basic,*` in `examples.spec.ts` for examples, and the [helper index](./helper_index.txt)
+for a list of capabilities.
+
+Test parameterization should be applied liberally to ensure the maximum coverage
+possible within reasonable time. You can skip some with `.filter()`. And remember: computers are
+pretty fast - thousands of test cases can be reasonable.
+
+Use existing lists of parameter values (such as
+[`kTextureFormats`](https://github.com/gpuweb/cts/blob/0f38b85/src/suites/cts/capability_info.ts#L61))
+to parameterize tests, instead of making your own list. Use the info tables (such as
+`kTextureFormatInfo`) to define and retrieve information about the parameters.
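+
+A rough sketch of this pattern (assuming imports appropriate to your file's location):
+
+```ts
+import { kTextureFormats, kTextureFormatInfo } from './capability_info.js';
+
+g.test('per_format')
+  .params(u => u.combine('format', kTextureFormats))
+  .fn(t => {
+    // Each format is a separate case; look up its properties in the info table.
+    const info = kTextureFormatInfo[t.params.format];
+    // ... use `info` to drive the test ...
+  });
+```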
+
+## Asynchrony in tests
+
+Since there are no synchronous operations in WebGPU, almost every test is asynchronous in some
+way. For example:
+
+- Checking the result of a readback.
+- Capturing the result of a `popErrorScope()`.
+
+That said, test functions don't always need to be `async`; see below.
+
+### Checking asynchronous errors/results
+
+Validation is inherently asynchronous (`popErrorScope()` returns a promise). However, the error
+scope stack itself is synchronous - operations immediately after a `popErrorScope()` are outside
+that error scope.
+
+As a result, tests can assert things like validation errors/successes without having an `async`
+test body.
+
+**Example:**
+
+```typescript
+t.expectValidationError(() => {
+ device.createThing();
+});
+```
+
+does:
+
+- `pushErrorScope('validation')`
+- `popErrorScope()` and "eventually" check whether it returned an error.
+
+**Example:**
+
+```typescript
+t.expectGPUBufferValuesEqual(srcBuffer, expectedData);
+```
+
+does:
+
+- copy `srcBuffer` into a new mappable buffer `dst`
+- `dst.mapAsync(GPUMapMode.READ)`, and "eventually" check what data it returned.
+
+Internally, this is accomplished via an "eventual expectation": `eventualAsyncExpectation()`
+takes an async function, calls it immediately, and stores off the resulting `Promise` to
+automatically await at the end before determining the pass/fail state.
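+
+Conceptually (this is a sketch, not the framework's actual code), it behaves like:
+
+```ts
+class EventualExpectations {
+  private pending: Promise<void>[] = [];
+
+  // Start the async check immediately, without blocking the test function.
+  add(check: () => Promise<void>): void {
+    this.pending.push(check());
+  }
+
+  // The harness awaits all stored promises before deciding pass/fail.
+  async finalize(): Promise<void> {
+    await Promise.all(this.pending);
+  }
+}
+```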
+
+### Asynchronous parallelism
+
+A side effect of test asynchrony is that it's possible for multiple tests to be in flight at
+once. We do not currently do this, but it will eventually be an option to run `N` tests in
+"parallel", for faster local test runs.
diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/README.md b/dom/webgpu/tests/cts/checkout/docs/intro/README.md
new file mode 100644
index 0000000000..e5f8bcedc6
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/intro/README.md
@@ -0,0 +1,99 @@
+# Introduction
+
+These documents contain guidelines for contributors to the WebGPU CTS (Conformance Test Suite)
+on how to write effective tests, and on the testing philosophy to adopt.
+
+The WebGPU CTS is arguably more important than the WebGPU specification itself, because
+it is what forces implementations to be interoperable by checking that they conform to the specification.
+However, writing a CTS is hard and requires a lot of effort to reach good coverage.
+
+More than a regular collection of end2end and unit tests for software artifacts, a CTS
+needs to be exhaustive. Contrast, for example, the WebGL2 CTS with the ANGLE end2end tests: they
+cover the same functionality (WebGL 2 / OpenGL ES 3) but are structured very differently:
+
+- ANGLE's test suite has one or two tests per functionality to check it works correctly, plus
+ regression tests and special tests to cover implementation details.
+- WebGL2's CTS can have thousands of tests per API aspect to cover every combination of
+ parameters (and global state) used by an operation.
+
+Below are guidelines based on our collective experience with graphics API CTSes like WebGL's.
+They are expected to evolve over time and have exceptions, but should give a general idea of what
+to do.
+
+## Contributing
+
+Testing tasks are tracked in the [CTS project tracker](https://github.com/orgs/gpuweb/projects/3).
+Go here if you're looking for tasks, or if you have a test idea that isn't already covered.
+
+If contributing conformance tests, the directory you'll work in is [`src/webgpu/`](../src/webgpu/).
+This directory is organized according to the goal of the test (API validation behavior vs
+actual results) and its target (API entry points and spec areas, e.g. texture sampling).
+
+The contents of a test file (`src/webgpu/**/*.spec.ts`) are twofold:
+
+- Documentation ("test plans") on what tests do, how they do it, and what cases they cover.
+ Some test plans are fully or partially unimplemented:
+ they either contain "TODO" in a description or are `.unimplemented()`.
+- Actual tests.
+
+**Please read the following short documents before contributing.**
+
+### 0. [Developing](developing.md)
+
+- Reviewers should also read [Review Requirements](../reviews.md).
+
+### 1. [Life of a Test Change](life_of.md)
+
+### 2. [Adding or Editing Test Plans](plans.md)
+
+### 3. [Implementing Tests](tests.md)
+
+## [Additional Documentation](../)
+
+## Examples
+
+### Operation testing of vertex input id generation
+
+This section provides an example of the planning process for a test.
+It has not been refined into a set of final test plan descriptions.
+(Note: this predates the actual implementation of these tests, so doesn't match the actual tests.)
+
+Somewhere under the `api/operation` node are tests checking that running `GPURenderPipelines` on
+the device using the `GPURenderEncoderBase.draw` family of functions works correctly. Render
+pipelines are composed of several stages that are mostly independent, so they can be split into
+several parts such as `vertex_input`, `rasterization`, `blending`.
+
+Vertex input itself has several parts that are mostly separate in hardware:
+
+- generation of the vertex and instance indices to run for this draw
+- fetching of vertex data from vertex buffers based on these indices
+- conversion from the vertex attribute `GPUVertexFormat` to the datatype for the input variable
+ in the shader
+
+Each of these is tested separately and has cases for each combination of the variables that may
+affect them. This means that `api/operation/render/vertex_input/id_generation` checks that the
+correct operation is performed for the cartesian product of all the following dimensions:
+
+- for encoding in a `GPURenderPassEncoder` or a `GPURenderBundleEncoder`
+- whether the draw is direct or indirect
+- whether the draw is indexed or not
+- for various values of the `firstInstance` argument
+- for various values of the `instanceCount` argument
+- if the draw is not indexed:
+ - for various values of the `firstVertex` argument
+ - for various values of the `vertexCount` argument
+- if the draw is indexed:
+ - for each `GPUIndexFormat`
+ - for various values of the indices in the index buffer including the primitive restart values
+ - for various values for the `offset` argument to `setIndexBuffer`
+ - for various values of the `firstIndex` argument
+ - for various values of the `indexCount` argument
+ - for various values of the `baseVertex` argument
+
+"Various values" above mean several small values, including `0` and the second smallest valid
+value to check for corner cases, as well as some large value.
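+
+As an illustrative sketch (not the actual values used by the tests), a "various values" list
+for `vertexCount` might look like:
+
+```ts
+const kVertexCounts = [0, 1, 3, 10000]; // several small values including 0, plus a large value
+```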
+
+An instance of the test sets up a `draw*` call based on the parameters, using point rendering and
+a fragment shader that outputs to a storage buffer. After the draw, the test checks the contents of
+the storage buffer to make sure that all expected vertex shader invocations, and only those,
+have been generated.
diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/convert_to_issue.png b/dom/webgpu/tests/cts/checkout/docs/intro/convert_to_issue.png
new file mode 100644
index 0000000000..672324a9d9
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/intro/convert_to_issue.png
Binary files differ
diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/developing.md b/dom/webgpu/tests/cts/checkout/docs/intro/developing.md
new file mode 100644
index 0000000000..5b1aeed36d
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/intro/developing.md
@@ -0,0 +1,134 @@
+# Developing
+
+The WebGPU CTS is written in TypeScript.
+
+## Setup
+
+After checking out the repository and installing node/npm, run:
+
+```sh
+npm ci
+```
+
+Before uploading, you can run pre-submit checks (`npm test`) to make sure it will pass CI.
+Use `npm run fix` to fix linting issues.
+
+`npm run` will show available npm scripts.
+Some more scripts can be listed using `npx grunt`.
+
+## Dev Server
+
+To start the development server, use:
+
+```sh
+npm start
+```
+
+Then, browse to the standalone test runner at the printed URL.
+
+The server will generate and compile code on the fly, so no build step is necessary.
+Only a reload is needed to see saved changes.
+(TODO: except, currently, `README.txt` and file `description` changes won't be reflected in
+the standalone runner.)
+
+Note: The first load of a test suite may take some time as generating the test suite listing can
+take a few seconds.
+
+## Standalone Test Runner / Test Plan Viewer
+
+**The standalone test runner also serves as a test plan viewer.**
+(This can be done in a browser without WebGPU support.)
+You can use this to preview how your test plan will appear.
+
+You can view different suites (webgpu, unittests, stress, etc.) or different subtrees of
+the test suite.
+
+- `http://localhost:8080/standalone/` (defaults to `?runnow=0&worker=0&debug=0&q=webgpu:*`)
+- `http://localhost:8080/standalone/?q=unittests:*`
+- `http://localhost:8080/standalone/?q=unittests:basic:*`
+
+The following url parameters change how the harness runs:
+
+- `runnow=1` runs all matching tests on page load.
+- `debug=1` enables verbose debug logging from tests.
+- `worker=1` runs the tests on a Web Worker instead of the main thread.
+- `power_preference=low-power` runs most tests passing `powerPreference: 'low-power'` to `requestAdapter`.
+- `power_preference=high-performance` runs most tests passing `powerPreference: 'high-performance'` to `requestAdapter`.
+
+### Web Platform Tests (wpt) - Ref Tests
+
+You can inspect the actual and reference pages for web platform reftests in the standalone
+runner by navigating to them. For example, by loading:
+
+ - `http://localhost:8080/out/webgpu/web_platform/reftests/canvas_clear.https.html`
+ - `http://localhost:8080/out/webgpu/web_platform/reftests/ref/canvas_clear-ref.html`
+
+You can also run a minimal ref test runner.
+
+ - open 2 terminals / command lines.
+ - in one, `npm start`
+ - in the other, `node tools/run_wpt_ref_tests <path-to-browser-executable> [name-of-test]`
+
+Without `[name-of-test]`, all ref tests will be run. `[name-of-test]` is just a simple substring
+check, so passing in `rgba` will run every test with `rgba` in its filename.
+
+Examples:
+
+macOS
+
+```
+# Chrome
+node tools/run_wpt_ref_tests /Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary
+```
+
+Windows
+
+```
+# Chrome
+node .\tools\run_wpt_ref_tests "C:\Users\your-user-name\AppData\Local\Google\Chrome SxS\Application\chrome.exe"
+```
+
+## Editor
+
+Since this project is written in TypeScript, it integrates best with
+[Visual Studio Code](https://code.visualstudio.com/).
+This is optional, but highly recommended: it automatically adds `import` lines and
+provides robust completions, cross-references, renames, error highlighting,
+deprecation highlighting, and type/JSDoc popups.
+
+Open the `cts.code-workspace` workspace file to load settings convenient for this project.
+You can make local configuration changes in `.vscode/`, which is untracked by Git.
+
+## Pull Requests
+
+When opening a pull request, fill out the PR checklist and attach the issue number.
+If an issue hasn't been opened, find the draft issue on the
+[project tracker](https://github.com/orgs/gpuweb/projects/3) and choose "Convert to issue":
+
+![convert to issue button screenshot](convert_to_issue.png)
+
+Opening a pull request will automatically notify reviewers.
+
+To make the review process smoother, once a reviewer has started looking at your change:
+
+- Avoid major additions or changes that would be best done in a follow-up PR.
+- Avoid rebases (`git rebase`) and force pushes (`git push -f`). These can make
+ it difficult for reviewers to review incremental changes as GitHub often cannot
+ view a useful diff across a rebase. If it's necessary to resolve conflicts
+ with upstream changes, use a merge commit (`git merge`) and don't include any
+ consequential changes in the merge, so a reviewer can skip over merge commits
+ when working through the individual commits in the PR.
+- When you address a review comment, mark the thread as "Resolved".
+
+Pull requests will (usually) be landed with the "Squash and merge" option.
+
+### TODOs
+
+The word "TODO" refers to missing test coverage. It may only appear inside file/test descriptions
+and README files (enforced by linting).
+
+To use comments to refer to TODOs inside the description, use a backreference, e.g., in the
+description, `TODO: Also test the FROBNICATE usage flag [1]`, and somewhere in the code, `[1]:
+Need to add FROBNICATE to this list.`.
+
+Use `MAINTENANCE_TODO` for TODOs which don't impact test coverage.
diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/life_of.md b/dom/webgpu/tests/cts/checkout/docs/intro/life_of.md
new file mode 100644
index 0000000000..8dced4ad84
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/intro/life_of.md
@@ -0,0 +1,46 @@
+# Life of a Test Change
+
+A "test change" could be a new test, an expansion of an existing test, a test bug fix, or a
+modification to existing tests to make them match new spec changes.
+
+**CTS contributors should contribute to the tracker and strive to keep it up to date, especially
+relating to their own changes.**
+
+Filing new draft issues in the CTS project tracker is very lightweight.
+Anyone with access should do this eagerly, to ensure no testing ideas are forgotten.
+(And if you don't have access, just file a regular issue.)
+
+1. Enter a [draft issue](https://github.com/orgs/gpuweb/projects/3), with the Status
+ set to "New (not in repo)", and any available info included in the issue description
+ (notes/plans to ensure full test coverage of the change). The source of this may be:
+
+ - Anything in the spec/API that is found not to be covered by the CTS yet.
+ - Any test that is found to be outdated or otherwise buggy.
+ - A spec change from the "Needs CTS Issue" column in the
+ [spec project tracker](https://github.com/orgs/gpuweb/projects/1).
+ Once information on the required test changes is entered into the CTS project tracker,
+ the spec issue moves to "Specification Done".
+
+ Note: at some point, someone may make a PR to flush "New (not in repo)" issues into `TODO`s in
+ CTS file/test description text, changing their "Status" to "Open".
+ These may be done in bulk without linking back to the issue.
+
+1. As necessary:
+
+ - Convert the draft issue to a full, numbered issue for linking from later PRs.
+
+ ![convert to issue button screenshot](convert_to_issue.png)
+
+ - Update the "Assignees" of the issue when an issue is assigned or unassigned
+ (you can assign yourself).
+ - Change the "Status" of the issue to "Started" once you start the task.
+
+1. Open one or more PRs, **each linking to the associated issue**.
+ Each PR is reviewed and landed, and may leave further TODOs for parts it doesn't complete.
+
+ 1. Tests are "planned" in test descriptions. (For complex tests, open a separate PR with the
+ tests `.unimplemented()` so a reviewer can evaluate the plan before you implement tests.)
+ 1. Tests are implemented.
+
+1. When **no TODOs remain** for an issue, close it and change its status to "Complete".
+ (Enter a new, more specific draft issue into the tracker if you need to track related TODOs.)
diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/plans.md b/dom/webgpu/tests/cts/checkout/docs/intro/plans.md
new file mode 100644
index 0000000000..f8d7af3a78
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/intro/plans.md
@@ -0,0 +1,82 @@
+# Adding or Editing Test Plans
+
+## 1. Write a test plan
+
+For new tests, if some notes exist already, incorporate them into your plan.
+
+A detailed test plan should be written and reviewed before substantial test code is written.
+This allows reviewers a chance to identify additional tests and cases, opportunities for
+generalizations that would improve the strength of tests, similar existing tests or test plans,
+and potentially useful [helpers](../helper_index.txt).
+
+**A test plan must serve two functions:**
+
+- Describe the test, succinctly, but in enough detail that a reader can read *only* the test
+ plans and evaluate coverage completeness of a file/directory.
+- Describe the test precisely enough that, when code is added, the reviewer can ensure that the
+ test really covers what the test plan says.
+
+There should be one test plan for each test. It should describe what it tests, how it does so,
+and which important cases need to be covered. Here's an example:
+
+```ts
+g.test('x,some_detail')
+ .desc(
+ `
+Tests [some detail] about x. Tests calling x in various 'mode's { mode1, mode2 },
+with various values of 'arg', and checks correctness of the result.
+Tries to trigger [some conditional path].
+
+- Valid values (control case) // <- (to make sure the test function works well)
+- Unaligned values (should fail) // <- (only validation tests need to intentionally hit invalid cases)
+- Extreme values`
+ )
+ .params(u =>
+ u //
+ .combine('mode', ['mode1', 'mode2'])
+ .beginSubcases()
+ .combine('arg', [
+ // Valid // <- Comment params as you see fit.
+ 4,
+ 8,
+ 100,
+ // Invalid
+ 2,
+ 6,
+ 1e30,
+ ])
+ )
+ .unimplemented();
+```
+
+"Cases" each appear as individual items in the `/standalone/` runner.
+"Subcases" run inside each case, like a for-loop wrapping the `.fn(`test function`)`.
+Documentation on the parameter builder can be found in the [helper index](../helper_index.txt).
+
+It's often impossible to predict the exact case/subcase structure before implementing tests, so they
+can be added during implementation, instead of planning.
+
+For any notes which are not specific to a single test, or for preliminary notes for tests that
+haven't been planned in full detail, put them in the test file's `description` variable at
+the top. Or, if they aren't associated with a test file, put them in a `README.txt` file.
+
+**Any notes about missing test coverage must be marked with the word `TODO` inside a
+description or README.** This makes them appear on the `/standalone/` page.
+
+## 2. Open a pull request
+
+Open a PR, and work with the reviewer(s) to revise the test plan.
+
+Usually (probably), plans will be landed in separate PRs before test implementations.
+
+## Conventions used in test plans
+
+- `Iff`: If and only if
+- `x=`: "cartesian-cross equals", like `+=` for cartesian product.
+ Used for combinatorial test coverage.
+ - Sometimes this will result in too many test cases; simplify/reduce as needed
+ during planning *or* implementation.
+- `{x,y,z}`: list of cases to test
+ - e.g. `x= texture format {r8unorm, r8snorm}`
+- *Control case*: a case included to make sure that the rest of the cases aren't
+ missing their target by testing some other error case.
diff --git a/dom/webgpu/tests/cts/checkout/docs/intro/tests.md b/dom/webgpu/tests/cts/checkout/docs/intro/tests.md
new file mode 100644
index 0000000000..a67b6a20cc
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/intro/tests.md
@@ -0,0 +1,25 @@
+# Implementing Tests
+
+Once a test plan is done, you can start writing tests.
+To add new tests, imitate the pattern in neighboring tests or neighboring files.
+New test files must be named ending in `.spec.ts`.
+
+For an example test file, see [`src/webgpu/examples.spec.ts`](../../src/webgpu/examples.spec.ts).
+For a more complex, well-structured reference test file, see
+[`src/webgpu/api/validation/vertex_state.spec.ts`](../../src/webgpu/api/validation/vertex_state.spec.ts).
+
+Implement some tests and open a pull request. You can open a PR any time you're ready for a review.
+(If two tests are non-trivial but independent, consider separate pull requests.)
+
+Before uploading, you can run pre-submit checks (`npm test`) to make sure it will pass CI.
+Use `npm run fix` to fix linting issues.
+
+## Test Helpers
+
+It's best to be familiar with helpers available in the test suite for simplifying
+test implementations.
+
+New test helpers can be added at any time, either to existing helper files or to new `.ts` files
+anywhere near the `.spec.ts` file where they're used.
+
+Documentation on existing helpers can be found in the [helper index](../helper_index.txt).
diff --git a/dom/webgpu/tests/cts/checkout/docs/organization.md b/dom/webgpu/tests/cts/checkout/docs/organization.md
new file mode 100644
index 0000000000..fd7020afd6
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/organization.md
@@ -0,0 +1,166 @@
+# Test Organization
+
+## `src/webgpu/`
+
+Because of the glorious amount of tests needed, the WebGPU CTS is organized as a tree of arbitrary
+depth (like a filesystem, but with multiple tests per file).
+
+Each directory may have a `README.txt` describing its contents.
+Tests are grouped in large families (each of which has a `README.txt`);
+the root and first few levels look like the following (some nodes omitted for simplicity):
+
+- **`api`** with tests for full coverage of the JavaScript API surface of WebGPU.
+ - **`validation`** with positive and negative tests for all the validation rules of the API.
+ - **`operation`** with tests that check the result of performing valid WebGPU operations,
+ taking advantage of parametrization to exercise interactions between parts of the API.
+ - **`regression`** for one-off tests that reproduce bugs found in implementations to prevent
+ the bugs from appearing again.
+- **`shader`** with tests for full coverage of the shaders that can be passed to WebGPU.
+ - **`validation`**.
+ - **`execution`** similar to `api/operation`.
+ - **`regression`**.
+- **`idl`** with tests to check that the WebGPU IDL is correctly implemented, for example that
+ objects expose exactly the correct members, and that methods throw when passed incomplete
+ dictionaries.
+- **`web-platform`** with tests for Web platform-specific interactions like `GPUSwapChain` and
+ `<canvas>`, WebXR and `GPUQueue.copyExternalImageToTexture`.
+
+At the same time, test hierarchies can be used to split the testing of a single sub-object into
+several files for maintainability. For example, `GPURenderPipeline` has a large descriptor, and some
+parts could be tested independently, like `vertex_input` vs. `primitive_topology` vs. `blending`,
+but all live under the `render_pipeline` directory.
+
+In addition to the test tree, each test can be parameterized. For coverage it is important to
+test all enum values, for example for `GPUTextureFormat`. Instead of having a loop to iterate
+over all the `GPUTextureFormat` values, it is better to parameterize the test over them. Each format
+will have a different entry in the test list, which will help WebGPU implementers debug the test
+or suppress the failure without losing test coverage while they fix the bug.
+
+Extra capabilities (limits and features) are often tested in the same files as the rest of the API.
+For example, a compressed texture format capability would simply add a `GPUTextureFormat` to the
+parametrization lists of many tests, while a capability adding significant new functionality
+like ray-tracing could have a separate subtree.
+
+Operation tests for optional features should be skipped using `t.selectDeviceOrSkipTestCase()` or
+`t.skip()`. Validation tests should be written to test the behavior both with and without the
+capability enabled (via `t.selectDeviceOrSkipTestCase()`), to ensure the functionality is valid
+only with the capability enabled.
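+
+A hedged sketch of the skip-if-unavailable pattern (the feature name is just an example):
+
+```ts
+g.test('compressed_texture_operation')
+  .beforeAllSubcases(t => {
+    // Skips the test (rather than failing) on devices without the feature.
+    t.selectDeviceOrSkipTestCase('texture-compression-bc');
+  })
+  .fn(t => {
+    // t.device was created with 'texture-compression-bc' enabled.
+  });
+```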
+
+### Validation tests
+
+Validation tests check the validation rules that are (or will be) set by the
+WebGPU spec. Validation tests try to carefully trigger the individual validation
+rules in the spec, without simultaneously triggering other rules.
+
+Validation errors *generally* generate WebGPU errors, not exceptions.
+But check the spec on a case-by-case basis.
+
+Like all `GPUTest`s, `ValidationTest`s are wrapped in both types of error scope. These
+"catch-all" error scopes look for any errors during the test, and report them as test failures.
+Since error scopes can be nested, validation tests can nest an error scope to expect that there
+*are* errors from specific operations.
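+
+For example, a minimal sketch of expecting an error from one specific operation:
+
+```ts
+t.expectValidationError(() => {
+  // Runs inside a nested 'validation' error scope; a usage of 0 is invalid.
+  t.device.createBuffer({ size: 4, usage: 0 });
+});
+```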
+
+#### Parameterization
+
+Test parameterization can help write many validation tests more succinctly,
+while making it easier for both authors and reviewers to be confident that
+an aspect of the API is tested fully. Examples:
+
+- [`webgpu:api,validation,render_pass,resolve:resolve_attachment:*`](https://github.com/gpuweb/cts/blob/ded3b7c8a4680a1a01621a8ac859facefadf32d0/src/webgpu/api/validation/render_pass/resolve.spec.ts#L35)
+- [`webgpu:api,validation,createBindGroupLayout:bindingTypeSpecific_optional_members:*`](https://github.com/gpuweb/cts/blob/ded3b7c8a4680a1a01621a8ac859facefadf32d0/src/webgpu/api/validation/createBindGroupLayout.spec.ts#L68)
+
+Use your own discretion when deciding the balance between heavily parameterizing
+a test and writing multiple separate tests.
+
+#### Guidelines
+
+There are many aspects that should be tested in all validation tests:
+
+- each individual argument to a method call (including `this`) or member of a descriptor
+ dictionary should be tested, including:
+ - what happens when an error object is passed.
+ - what happens when an optional feature enum or method is used.
+ - what happens for numeric values when they are at 0, too large, too small, etc.
+- each validation rule in the specification should be checked both with a control success case,
+ and error cases.
+- each set of arguments or state that interact for validation.
+
+When testing numeric values, it is important to check on both sides of the boundary: if the error
+happens for value N and not N - 1, both should be tested. Alignment of integer values should also
+be tested, but boundary testing of alignment should be between a value aligned to 2^N and a value
+aligned to 2^(N-1).
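+
+A sketch of boundary values for a hypothetical offset requiring 256-byte alignment:
+
+```ts
+// 0, 256, 512 are valid; 128 is aligned only to 2^7, so it should fail validation.
+const kOffsetsToTest = [0, 256, 512, 128];
+```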
+
+Finally, this is probably also where we would test that extensions follow the rule that if the
+browser supports a feature but it is not enabled on the device, then calling methods from that
+feature throws `TypeError`.
+
+- Test that providing unknown properties *that are definitely not part of any feature* is
+ valid/ignored. (Unfortunately, due to the rules of IDL, adding a member to a dictionary is
+ always a breaking change. So this is how we have to test this unless we can get a "strict"
+ dictionary type in IDL. We can't test adding members from non-enabled extensions.)
+
+### Operation tests
+
+Operation tests test the actual results of using the API. They execute
+(sometimes significant) code and check that the result is within the expected
+set of behaviors (which can be quite complex to compute).
+
+Note that operation tests need to test a lot of interactions between different
+parts of the API, and so can become quite complex. Try to reduce the complexity by
+utilizing combinatorics and [helpers](./helper_index.txt), and splitting/merging test files as needed.
+
+#### Errors
+
+Operation tests are usually `GPUTest`s. As a result, they automatically fail on any validation
+errors that occur during the test.
+
+When it's easier to write an operation test that would otherwise include invalid cases, use
+`ParamsBuilder.filter`/`.unless` to avoid the invalid cases, or detect them and
+`expect` the resulting validation errors.
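+
+A short sketch of filtering out invalid combinations (parameter names are hypothetical):
+
+```ts
+u.combine('bufferSize', [16, 32])
+  .combine('offset', [0, 16, 32])
+  // Drop combinations where the offset is at or past the end of the buffer.
+  .unless(p => p.offset >= p.bufferSize)
+```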
+
+#### Implementation
+
+Use helpers like `expectContents` (and more to come) to check the values of data on the GPU.
+(These are "eventual expectations" - the harness will wait for them to finish at the end).
+
+When testing something inside a shader, it's not always necessary to output the result to a
+render output. In fragment shaders, you can output to a storage buffer. In vertex shaders, you
+can't - but you can render with points (simplest), send the result to the fragment shader, and
+output it from there. (Someday, we may end up wanting a helper for this.)
+
+#### Testing Default Values
+
+Default value tests (for arguments and dictionary members) should usually be operation tests -
+all you have to do is include `undefined` in parameterizations of other tests to make sure the
+behavior with `undefined` has the same expected result that you have when the default value is
+specified explicitly.
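+
+For example (a sketch; `dimension` here stands in for any defaulted member):
+
+```ts
+// `undefined` exercises the default value alongside the explicit values.
+u.combine('dimension', [undefined, '1d', '2d', '3d'] as const)
+```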
+
+### IDL tests
+
+TODO: figure out how to implement these. https://github.com/gpuweb/cts/issues/332
+
+These tests test only rules that come directly from WebIDL. For example:
+
+- Values out of range for `[EnforceRange]` cause exceptions.
+- Required function arguments and dictionary members cause exceptions if omitted.
+- Arguments and dictionary members cause exceptions if passed the wrong type.
+
+They may also test positive cases like the following, but the behavior of these should be tested in
+operation tests.
+
+- OK to omit optional arguments/members.
+- OK to pass the correct argument/member type (or of any type in a union type).
+
+Every overload of every method should be tested.
+
+## `src/stress/`, `src/manual/`
+
+Stress tests and manual tests for WebGPU that are not intended to be run in an automated way.
+
+## `src/unittests/`
+
+Unit tests for the test framework (`src/common/framework/`).
+
+## `src/demo/`
+
+A demo of test hierarchies for the purpose of testing the `standalone` test runner page.
diff --git a/dom/webgpu/tests/cts/checkout/docs/reviews.md b/dom/webgpu/tests/cts/checkout/docs/reviews.md
new file mode 100644
index 0000000000..1a8c3f9624
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/reviews.md
@@ -0,0 +1,70 @@
+# Review Requirements
+
+A review should have several items checked off before it is landed.
+Checkboxes are pre-filled into the pull request summary when it's created.
+
+The uploader may pre-check boxes if they are not applicable
+(e.g. TypeScript readability on a plan PR).
+
+## Readability
+
+A reviewer has "readability" for a topic if they have enough expertise in that topic to ensure
+good practices are followed in pull requests, or know when to loop in other reviewers.
+Perfection is not required!
+
+**It is up to reviewers' own discretion** whether they are qualified to check off a
+"readability" checkbox on any given pull request.
+
+- WebGPU Readability: Familiarity with the API to ensure:
+
+ - WebGPU is being used correctly; expected results seem reasonable.
+ - WebGPU is being tested completely; tests have control cases.
+ - Test code has a clear correspondence with the test description.
+ - [Test helpers](./helper_index.txt) are used or created appropriately
+ (where the reviewer is familiar with the helpers).
+
+- TypeScript Readability: Make sure TypeScript is utilized in a way that:
+
+ - Ensures test code is reasonably type-safe.
+ Reviewers may recommend changes to make type-safety either weaker (`as`, etc.) or stronger.
+ - Is understandable and has appropriate verbosity and dynamicity
+ (e.g. type inference and `as const` are used to reduce unnecessary boilerplate).
+
+## Plan Reviews
+
+**Changes *must* have an author or reviewer with the following readability:** WebGPU
+
+Reviewers must carefully ensure the following:
+
+- The test plan name accurately describes the area being tested.
+- The test plan covers the area described by the file/test name and file/test description
+ as fully as possible (or adds TODOs for incomplete areas).
+- Validation tests have control cases (where no validation error should occur).
+- Each validation rule is tested in isolation, in at least one case which does not validate any
+ other validation rules.
+
+See also: [Adding or Editing Test Plans](intro/plans.md).
+
+## Implementation Reviews
+
+**Changes *must* have an author or reviewer with the following readability:** WebGPU, TypeScript
+
+Reviewers must carefully ensure the following:
+
+- The coverage of the test implementation precisely matches the test description.
+- Everything required for test plan reviews above.
+
+Reviewers should ensure the following:
+
+- New test helpers are documented in [helper index](./helper_index.txt).
+- Framework and test helpers are used where they would make test code clearer.
+
+See also: [Implementing Tests](intro/tests.md).
+
+## Framework
+
+**Changes *must* have an author or reviewer with the following readability:** TypeScript
+
+Reviewers should ensure the following:
+
+- Changes are reasonably type-safe, and covered by unit tests where appropriate.
diff --git a/dom/webgpu/tests/cts/checkout/docs/terms.md b/dom/webgpu/tests/cts/checkout/docs/terms.md
new file mode 100644
index 0000000000..032639be57
--- /dev/null
+++ b/dom/webgpu/tests/cts/checkout/docs/terms.md
@@ -0,0 +1,270 @@
+# Terminology
+
+Each test suite is organized as a tree, both in the filesystem and further within each file.
+
+- _Suites_, e.g. `src/webgpu/`.
+ - _READMEs_, e.g. `src/webgpu/README.txt`.
+ - _Test Spec Files_, e.g. `src/webgpu/examples.spec.ts`.
+ Identified by their file path.
+ Each test spec file provides a description and a _Test Group_.
+ A _Test Group_ defines a test fixture, and contains multiple:
+ - _Tests_.
+ Identified by a comma-separated list of parts (e.g. `basic,async`)
+ which define a path through a filesystem-like tree (analogy: `basic/async.txt`).
+ Defines a _test function_ and contains multiple:
+ - _Test Cases_.
+ Identified by a list of _Public Parameters_ (e.g. `x` = `1`, `y` = `2`).
+ Each Test Case has the same test function but different Public Parameters.
+
+## Test Tree
+
+A _Test Tree_ is a tree whose leaves are individual Test Cases.
+
+A Test Tree can be thought of as follows:
+
+- Suite, which is the root of a tree with "leaves" which are:
+ - Test Spec Files, each of which is a tree with "leaves" which are:
+ - Tests, each of which is a tree with leaves which are:
+ - Test Cases.
+
+(In the implementation, this conceptual tree of trees is decomposed into one big tree
+whose leaves are Test Cases.)
+
+**Type:** `TestTree`
+
+## Suite
+
+A suite of tests.
+A single suite has a directory structure, and many _test spec files_
+(`.spec.ts` files containing tests) and _READMEs_.
+Each member of a suite is identified by its path within the suite.
+
+**Example:** `src/webgpu/`
+
+### README
+
+**Example:** `src/webgpu/README.txt`
+
+Describes (in prose) the contents of a subdirectory in a suite.
+
+READMEs are only processed at build time, when generating the _Listing_ for a suite.
+
+**Type:** `TestSuiteListingEntryReadme`
+
+## Queries
+
+A _Query_ is a structured object which specifies a subset of cases in exactly one Suite.
+A Query can be represented uniquely as a string.
+Queries are used to:
+
+- Identify a subtree of a suite (by identifying the root node of that subtree).
+- Identify individual cases.
+- Represent the list of tests that a test runner (standalone, wpt, or cmdline) should run.
+- Identify subtrees which should not be "collapsed" during WPT `cts.https.html` generation,
+ so that cts.https.html "variants" can have individual test expectations
+ (i.e. marked as "expected to fail", "skip", etc.).
+
+There are four types of `TestQuery`:
+
+- `TestQueryMultiFile` represents any subtree of the file hierarchy:
+ - `suite:*`
+ - `suite:path,to,*`
+ - `suite:path,to,file,*`
+- `TestQueryMultiTest` represents any subtree of the test hierarchy:
+ - `suite:path,to,file:*`
+ - `suite:path,to,file:path,to,*`
+ - `suite:path,to,file:path,to,test,*`
+- `TestQueryMultiCase` represents any subtree of the case hierarchy:
+ - `suite:path,to,file:path,to,test:*`
+ - `suite:path,to,file:path,to,test:my=0;*`
+ - `suite:path,to,file:path,to,test:my=0;params="here";*`
+- `TestQuerySingleCase` represents a single case:
+ - `suite:path,to,file:path,to,test:my=0;params="here"`
+
+Test Queries are a **weakly ordered set**: any query is
+_Unordered_, _Equal_, _StrictSuperset_, or _StrictSubset_ relative to any other.
+This property is used to construct the complete tree of test cases.
+In the examples above, every example query is a StrictSubset of the previous one
+(note: even `:*` is a subset of `,*`).
+
+In the WPT and standalone harnesses, the query is stored in the URL, e.g.
+`index.html?q=q:u,e:r,y:*`.
+
+Queries are selectively URL-encoded for readability and compatibility with browsers
+(see `encodeURIComponentSelectively`).
+
+**Type:** `TestQuery`
+
+## Listing
+
+A listing of the **test spec files** in a suite.
+
+This can be generated only in Node, which has filesystem access (see `src/tools/crawl.ts`).
+As part of the build step, a _listing file_ is generated (see `src/tools/gen.ts`) so that the
+Test Spec Files can be discovered by the web runner (since it does not have filesystem access).
+
+**Type:** `TestSuiteListing`
+
+### Listing File
+
+Each Suite has one Listing File (`suite/listing.[tj]s`), containing a list of the files
+in the suite.
+
+In `src/suite/listing.ts`, this is computed dynamically.
+In `out/suite/listing.js`, the listing has been pre-baked (by `tools/gen_listings`).
+
+**Type:** Once `import`ed, `ListingFile`
+
+**Example:** `out/webgpu/listing.js`
+
+## Test Spec File
+
+A Test Spec File has a `description` and a Test Group (under which tests and cases are defined).
+
+**Type:** Once `import`ed, `SpecFile`
+
+**Example:** `src/webgpu/**/*.spec.ts`
+
+## Test Group
+
+A subtree of tests. There is one Test Group per Test Spec File.
+
+The Test Fixture used for tests is defined at TestGroup creation.
+
+**Type:** `TestGroup`
+
+## Test
+
+One test. It has a single _test function_.
+
+It may represent multiple _test cases_, each of which runs the same Test Function with different
+Parameters.
+
+A test is named using `TestGroup.test()`, which returns a `TestBuilder`.
+`TestBuilder.params()`/`.paramsSimple()`/`.paramsSubcasesOnly()`
+can optionally be used to parametrically generate instances (cases and subcases) of the test.
+Finally, `TestBuilder.fn()` provides the Test Function
+(or, a test can be marked unimplemented with `TestBuilder.unimplemented()`).
+
+### Test Function
+
+When a test subcase is run, the Test Function receives an instance of the
+Test Fixture provided to the Test Group, producing test results.
+
+**Type:** `TestFn`
+
+## Test Case / Case
+
+A single case of a test. It is identified by a `TestCaseID`: a test name, and its parameters.
+
+Each case appears as an individual item (tree leaf) in `/standalone/`,
+and as an individual "step" in WPT.
+
+If `TestBuilder.params()`/`.paramsSimple()`/`.paramsSubcasesOnly()` are not used,
+there is exactly one case with one subcase, with parameters `{}`.
+
+**Type:** During test run time, a case is encapsulated as a `RunCase`.
+
+## Test Subcase / Subcase
+
+A single "subcase" of a test. It can also be identified by a `TestCaseID`, though
+not all contexts allow subdividing cases into subcases.
+
+All of the subcases of a case will run _inside_ the case, essentially as a for-loop wrapping the
+test function. They do _not_ appear individually in `/standalone/` or WPT.
+
+If `CaseParamsBuilder.beginSubcases()` is not used, there is exactly one subcase per case.
+
+## Test Parameters / Params
+
+Each Test Subcase has a (possibly empty) set of Test Parameters.
+The parameters are passed to the Test Function `f(t)` via `t.params`.
+
+A set of Public Parameters identifies a Test Case or Test Subcase within a Test.
+
+There are also Private Parameters: any parameter name beginning with an underscore (`_`).
+These parameters are not part of the Test Case identification, but are still passed into
+the Test Function. They can be used, e.g., to manually specify expected results.
+
+**Type:** `TestParams`
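+
+A minimal sketch of Private Parameters used for expected results:
+
+```ts
+g.test('add')
+  .paramsSimple([
+    { a: 1, b: 2, _expected: 3 }, // `_expected` is not part of the case ID
+    { a: 5, b: 7, _expected: 12 },
+  ])
+  .fn(t => {
+    t.expect(t.params.a + t.params.b === t.params._expected);
+  });
+```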
+
+## Test Fixture / Fixture
+
+_Test Fixtures_ provide helpers for tests to use.
+A new instance of the fixture is created for every run of every test case.
+
+There is always one fixture class for a whole test group (though this may change).
+
+The fixture is also how a test gets access to the _case recorder_,
+which allows it to produce test results: `.skip()`, `.fail()`, etc.
+
+**Type:** `Fixture`
+
+### `UnitTest` Fixture
+
+Provides basic fixture utilities most useful in the `unittests` suite.
+
+### `GPUTest` Fixture
+
+Provides utilities useful in WebGPU CTS tests.
+
+# Test Results
+
+## Logger
+
+A logger logs the results of a whole test run.
+
+It saves an empty `LiveTestSpecResult` into its results map, then creates a
+_test spec recorder_, which records the results for a group into the `LiveTestSpecResult`.
+
+**Type:** `Logger`
+
+### Test Case Recorder
+
+Refers to a `LiveTestCaseResult` created by the logger.
+Records the results of running a test case (its pass-status, run time, and logs) into it.
+
+**Types:** `TestCaseRecorder`, `LiveTestCaseResult`
+
+#### Test Case Status
+
+The `status` of a `LiveTestCaseResult` can be one of:
+
+- `'running'` (only while still running)
+- `'pass'`
+- `'skip'`
+- `'warn'`
+- `'fail'`
+
+The "worst" result from running a case is always reported (fail > warn > skip > pass).
+Note this means a test can still fail if it's "skipped", if it failed before
+`.skip()` was called.
+
+**Type:** `Status`
+
+## Results Format
+
+The results are returned in JSON format.
+
+They are designed to be easily merged in JavaScript:
+the `"results"` can be passed into the constructor of `Map` and merged from there.
+
+(TODO: Write a merge tool, if needed.)
+
+```js
+{
+ "version": "bf472c5698138cdf801006cd400f587e9b1910a5-dirty",
+ "results": [
+ [
+ "unittests:async_mutex:basic:",
+ { "status": "pass", "timems": 0.286, "logs": [] }
+ ],
+ [
+ "unittests:async_mutex:serial:",
+ { "status": "pass", "timems": 0.415, "logs": [] }
+ ]
+ ]
+}
+```
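+
+For instance, a minimal sketch of such a merge (assuming `resultsA` and `resultsB` were parsed
+from two results files like the one above):
+
+```js
+const merged = new Map([...resultsA.results, ...resultsB.results]);
+```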