Diffstat (limited to 'third_party/rust/packed_simd/perf-guide/src/target-feature')
-rw-r--r-- third_party/rust/packed_simd/perf-guide/src/target-feature/attribute.md | 5
-rw-r--r-- third_party/rust/packed_simd/perf-guide/src/target-feature/features.md | 13
-rw-r--r-- third_party/rust/packed_simd/perf-guide/src/target-feature/inlining.md | 5
-rw-r--r-- third_party/rust/packed_simd/perf-guide/src/target-feature/practice.md | 31
-rw-r--r-- third_party/rust/packed_simd/perf-guide/src/target-feature/runtime.md | 5
-rw-r--r-- third_party/rust/packed_simd/perf-guide/src/target-feature/rustflags.md | 77
6 files changed, 136 insertions, 0 deletions
diff --git a/third_party/rust/packed_simd/perf-guide/src/target-feature/attribute.md b/third_party/rust/packed_simd/perf-guide/src/target-feature/attribute.md
new file mode 100644
index 0000000000..ee670fea5b
--- /dev/null
+++ b/third_party/rust/packed_simd/perf-guide/src/target-feature/attribute.md
@@ -0,0 +1,5 @@
+# The `target_feature` attribute
+
+<!-- TODO:
+Explain the `#[target_feature]` attribute
+-->
diff --git a/third_party/rust/packed_simd/perf-guide/src/target-feature/features.md b/third_party/rust/packed_simd/perf-guide/src/target-feature/features.md
new file mode 100644
index 0000000000..b93030ca67
--- /dev/null
+++ b/third_party/rust/packed_simd/perf-guide/src/target-feature/features.md
@@ -0,0 +1,13 @@
+# Enabling target features
+
+Not all processors of a certain architecture will have SIMD processing units,
+and using a SIMD instruction which is not supported will trigger undefined behavior.
+
+To allow building safe, portable programs, the Rust compiler will **not**, by default,
+generate any sort of vector instructions, unless it can statically determine
+they are supported. For example, on AMD64, SSE2 support is architecturally guaranteed.
+The `x86_64-apple-darwin` target enables up to SSSE3. To get a definitive list of
+which features are enabled by default on various platforms, refer to the target
+specifications [in the compiler's source code][targets].
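+
+To see which features the compiler treats as statically available for the current
+build, the `cfg!` macro can be queried. The sketch below is only an illustration:
+the function name `static_features` is hypothetical, and the three features it
+checks are an arbitrary sample.
+
+```rust
+/// Returns the names of a few x86 features that were enabled at compile time.
+/// `static_features` is a hypothetical helper, not part of any library.
+fn static_features() -> Vec<&'static str> {
+    let mut enabled = Vec::new();
+    // `cfg!` is resolved during compilation, so each check is a constant.
+    if cfg!(target_feature = "sse2") {
+        enabled.push("sse2");
+    }
+    if cfg!(target_feature = "ssse3") {
+        enabled.push("ssse3");
+    }
+    if cfg!(target_feature = "avx2") {
+        enabled.push("avx2");
+    }
+    enabled
+}
+
+fn main() {
+    println!("statically enabled: {:?}", static_features());
+}
+```
+
+The result depends on the target and on any flags passed to the compiler, not on
+the machine the program later runs on.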
+
+[targets]: https://github.com/rust-lang/rust/tree/master/src/librustc_target/spec
diff --git a/third_party/rust/packed_simd/perf-guide/src/target-feature/inlining.md b/third_party/rust/packed_simd/perf-guide/src/target-feature/inlining.md
new file mode 100644
index 0000000000..86705102a7
--- /dev/null
+++ b/third_party/rust/packed_simd/perf-guide/src/target-feature/inlining.md
@@ -0,0 +1,5 @@
+# Inlining
+
+<!-- TODO:
+Explain how the `#[target_feature]` attribute interacts with inlining
+-->
diff --git a/third_party/rust/packed_simd/perf-guide/src/target-feature/practice.md b/third_party/rust/packed_simd/perf-guide/src/target-feature/practice.md
new file mode 100644
index 0000000000..5b55c61c26
--- /dev/null
+++ b/third_party/rust/packed_simd/perf-guide/src/target-feature/practice.md
@@ -0,0 +1,31 @@
+# Target features in practice
+
+Using `RUSTFLAGS` will allow the crate being compiled, as well as all its
+transitive dependencies, to use certain target features.
+
+A technique used to avoid undefined behavior at runtime is to compile and
+ship multiple binaries, each compiled with a certain set of features.
+This might not be feasible in some cases, and can quickly get out of hand
+as more and more vector extensions are added to an architecture.
+
+Rust can be more flexible: you can build a single binary/library which automatically
+picks the best supported vector instructions depending on the host machine.
+The trick consists of monomorphizing parts of the code during building, and then
+using run-time feature detection to select the right code path when running.
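+
+A minimal sketch of this dispatch pattern, assuming x86 hardware where AVX2 may or
+may not be present. The names `scale`, `scale_impl` and `scale_avx2` are
+hypothetical; `#[target_feature]` and `is_x86_feature_detected!` are provided by
+the language and the standard library.
+
+```rust
+/// Shared loop body. `#[inline(always)]` lets each caller compile its own copy
+/// with that caller's target features. (Hypothetical helper.)
+#[inline(always)]
+fn scale_impl(xs: &mut [f32], k: f32) {
+    for x in xs.iter_mut() {
+        *x *= k;
+    }
+}
+
+/// Copy of the loop compiled with AVX2 enabled. Calling it on a CPU without
+/// AVX2 is undefined behavior, hence `unsafe`.
+#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
+#[target_feature(enable = "avx2")]
+unsafe fn scale_avx2(xs: &mut [f32], k: f32) {
+    scale_impl(xs, k)
+}
+
+/// Safe entry point: checks the host CPU at run time and picks a code path.
+pub fn scale(xs: &mut [f32], k: f32) {
+    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
+    {
+        if is_x86_feature_detected!("avx2") {
+            // SAFETY: AVX2 support was confirmed on the line above.
+            return unsafe { scale_avx2(xs, k) };
+        }
+    }
+    // Fallback built with the default target features.
+    scale_impl(xs, k)
+}
+
+fn main() {
+    let mut v = vec![1.0_f32, 2.0, 3.0];
+    scale(&mut v, 2.0);
+    println!("{:?}", v);
+}
+```
+
+Doing the detection once per call to `scale`, rather than per element, keeps the
+cost of the check small relative to the work being dispatched.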
+
+<!-- TODO
+Explain how to create efficient functions that dispatch to different
+implementations at run-time without issues (e.g. using `#[inline(always)]` for
+the impls, wrapping in `#[target_feature]`, and then wrapping those in a function
+that does run-time feature detection).
+-->
+
+**NOTE** (x86 specific): because the AVX (256-bit) registers extend the existing
+SSE (128-bit) registers, mixing SSE and AVX instructions in a program can cause
+performance issues.
+
+The solution is to compile all code, even the code written with 128-bit vectors,
+with the AVX target feature enabled. This will cause the compiler to prefix the
+generated instructions with the [VEX] prefix.
+
+[VEX]: https://en.wikipedia.org/wiki/VEX_prefix
diff --git a/third_party/rust/packed_simd/perf-guide/src/target-feature/runtime.md b/third_party/rust/packed_simd/perf-guide/src/target-feature/runtime.md
new file mode 100644
index 0000000000..47ddcc8660
--- /dev/null
+++ b/third_party/rust/packed_simd/perf-guide/src/target-feature/runtime.md
@@ -0,0 +1,5 @@
+# Detecting host features at runtime
+
+<!-- TODO:
+Explain cost (how it works).
+-->
diff --git a/third_party/rust/packed_simd/perf-guide/src/target-feature/rustflags.md b/third_party/rust/packed_simd/perf-guide/src/target-feature/rustflags.md
new file mode 100644
index 0000000000..f4c1d1304a
--- /dev/null
+++ b/third_party/rust/packed_simd/perf-guide/src/target-feature/rustflags.md
@@ -0,0 +1,77 @@
+# Using RUSTFLAGS
+
+One of the easiest ways to benefit from SIMD is to allow the compiler
+to generate code using certain vector instruction extensions.
+
+The environment variable `RUSTFLAGS` can be used to pass options for code
+generation to the Rust compiler. These flags will affect **all** compiled crates.
+
+There are two flags which can be used to enable specific vector extensions:
+
+## target-feature
+
+- Syntax: `-C target-feature=<features>`
+
+- Provides the compiler with a comma-separated set of instruction extensions
+ to enable.
+
+ **Example**: Use `-C target-feature=+sse3,+avx` to enable generating instructions
+ for [Streaming SIMD Extensions 3](https://en.wikipedia.org/wiki/SSE3) and
+ [Advanced Vector Extensions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions).
+
+- To list target triples for all targets supported by Rust, use:
+
+ ```sh
+ rustc --print target-list
+ ```
+
+- To list all supported target features for a certain target triple, use:
+
+ ```sh
+ rustc --target=${TRIPLE} --print target-features
+ ```
+
+- Note that features which usually ship together in hardware are still separate flags, and each has to be enabled individually.
+
+ **Example**: Setting `-C target-feature=+avx2` will _not_ enable `fma`, even though
+ all CPUs which support AVX2 also support FMA. To enable both, one has to use
+ `-C target-feature=+avx2,+fma`.
+
+- Some features also depend on other features, which need to be enabled for the
+ target instructions to be generated.
+
+ **Example**: Unless `v7` is specified as the target CPU (see below), to enable
+ NEON on ARM it is necessary to use `-C target-feature=+v7,+neon`.
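+
+Code can react to the features enabled through these flags at compile time. The
+sketch below assumes a hypothetical helper named `mul_add` that only uses a fused
+multiply-add when the `fma` feature was enabled for the build:
+
+```rust
+/// Computes `a * b + c`. (Hypothetical helper for illustration.)
+#[inline]
+pub fn mul_add(a: f32, b: f32, c: f32) -> f32 {
+    if cfg!(target_feature = "fma") {
+        // Built with `+fma`: this can compile to a single fused instruction.
+        a.mul_add(b, c)
+    } else {
+        // Built without `+fma`: use a separate multiply and add.
+        a * b + c
+    }
+}
+
+fn main() {
+    println!("{}", mul_add(2.0, 3.0, 1.0));
+}
+```
+
+The two branches can differ in the last bit of the result, because the fused
+operation rounds once instead of twice.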
+
+## target-cpu
+
+- Syntax: `-C target-cpu=<cpu>`
+
+- Sets the identifier of a CPU family / model for which to build and optimize the code.
+
+ **Example**: `RUSTFLAGS='-C target-cpu=cortex-a75'`
+
+- To list all supported target CPUs for a certain target triple, use:
+
+ ```sh
+ rustc --target=${TRIPLE} --print target-cpus
+ ```
+
+ **Example**:
+
+ ```sh
+ rustc --target=i686-pc-windows-msvc --print target-cpus
+ ```
+
+- The compiler will translate this into a list of target features. Therefore,
+ individual feature checks (`#[cfg(target_feature = "...")]`) will still
+ work properly.
+
+- It will cause the code generator to optimize the generated code for that
+ specific CPU model.
+
+- Using `native` as the CPU model will cause Rust to generate and optimize code
+ for the CPU running the compiler. It is useful when building programs which you
+ plan to only use locally. This should never be used when the generated programs
+ are meant to be run on other computers, such as when packaging for distribution
+ or cross-compiling.