From 102b0d2daa97dae68d3eed54d8fe37a9cc38a892 Mon Sep 17 00:00:00 2001
From: Daniel Baumann <daniel.baumann@progress-linux.org>
Date: Sun, 28 Apr 2024 11:13:47 +0200
Subject: Adding upstream version 2.8.0+dfsg.

Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
---
 docs/plat/nvidia-tegra.rst | 148 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 148 insertions(+)
 create mode 100644 docs/plat/nvidia-tegra.rst

(limited to 'docs/plat/nvidia-tegra.rst')

diff --git a/docs/plat/nvidia-tegra.rst b/docs/plat/nvidia-tegra.rst
new file mode 100644
index 0000000..391c7c8
--- /dev/null
+++ b/docs/plat/nvidia-tegra.rst
@@ -0,0 +1,148 @@
+NVIDIA Tegra
+============
+
+-  .. rubric:: T194
+      :name: t194
+
+T194 has eight NVIDIA Carmel CPU cores in a coherent multi-processor
+configuration. The Carmel cores support the ARM Architecture version 8.2,
+executing both 64-bit AArch64 code, and 32-bit AArch32 code. The Carmel
+processors are organized as four dual-core clusters, where each cluster has
+a dedicated 2 MiB Level-2 unified cache. A high speed coherency fabric connects
+these processor complexes and allows heterogeneous multi-processing with all
+eight cores if required.
+
+-  .. rubric:: T186
+      :name: t186
+
+The NVIDIA® Parker (T186) series system-on-chip (SoC) delivers a heterogeneous
+multi-processing (HMP) solution designed to optimize performance and
+efficiency.
+
+T186 has Dual NVIDIA Denver2 ARM® CPU cores, plus Quad ARM Cortex®-A57 cores,
+in a coherent multiprocessor configuration. The Denver 2 and Cortex-A57 cores
+support ARMv8, executing both 64-bit Aarch64 code, and 32-bit Aarch32 code
+including legacy ARMv7 applications. The Denver 2 processors each have 128 KB
+Instruction and 64 KB Data Level 1 caches; and have a 2MB shared Level 2
+unified cache. The Cortex-A57 processors each have 48 KB Instruction and 32 KB
+Data Level 1 caches; and also have a 2 MB shared Level 2 unified cache. A
+high speed coherency fabric connects these two processor complexes and allows
+heterogeneous multi-processing with all six cores if required.
+
+Denver is NVIDIA's own custom-designed, 64-bit, dual-core CPU which is
+fully Armv8-A architecture compatible. Each of the two Denver cores
+implements a 7-way superscalar microarchitecture (up to 7 concurrent
+micro-ops can be executed per clock), and includes a 128KB 4-way L1
+instruction cache, a 64KB 4-way L1 data cache, and a 2MB 16-way L2
+cache, which services both cores.
+
+Denver implements an innovative process called Dynamic Code Optimization,
+which optimizes frequently used software routines at runtime into dense,
+highly tuned microcode-equivalent routines. These are stored in a
+dedicated, 128MB main-memory-based optimization cache. After being read
+into the instruction cache, the optimized micro-ops are executed,
+re-fetched and executed from the instruction cache as long as needed and
+capacity allows.
+
+Effectively, this reduces the need to re-optimize the software routines.
+Instead of using hardware to extract the instruction-level parallelism
+(ILP) inherent in the code, Denver extracts the ILP once via software
+techniques, and then executes those routines repeatedly, thus amortizing
+the cost of ILP extraction over the many execution instances.
+
+Denver also features new low latency power-state transitions, in addition
+to extensive power-gating and dynamic voltage and clock scaling based on
+workloads.
+
+-  .. rubric:: T210
+      :name: t210
+
+T210 has Quad Arm® Cortex®-A57 cores in a switched configuration with a
+companion set of quad Arm Cortex-A53 cores. The Cortex-A57 and A53 cores
+support Armv8-A, executing both 64-bit Aarch64 code, and 32-bit Aarch32 code
+including legacy Armv7-A applications. The Cortex-A57 processors each have
+48 KB Instruction and 32 KB Data Level 1 caches; and have a 2 MB shared
+Level 2 unified cache. The Cortex-A53 processors each have 32 KB Instruction
+and 32 KB Data Level 1 caches; and have a 512 KB shared Level 2 unified cache.
+
+Directory structure
+-------------------
+
+-  plat/nvidia/tegra/common - Common code for all Tegra SoCs
+-  plat/nvidia/tegra/soc/txxx - Chip specific code
+
+Trusted OS dispatcher
+---------------------
+
+Tegra supports multiple Trusted OS'.
+
+- Trusted Little Kernel (TLK): In order to include the 'tlkd' dispatcher in
+  the image, pass 'SPD=tlkd' on the command line while preparing a bl31 image.
+- Trusty: In order to include the 'trusty' dispatcher in the image, pass
+  'SPD=trusty' on the command line while preparing a bl31 image.
+
+This allows other Trusted OS vendors to use the upstream code and include
+their dispatchers in the image without changing any makefiles.
+
+These are the supported Trusted OS' by Tegra platforms.
+
+- Tegra210: TLK and Trusty
+- Tegra186: Trusty
+- Tegra194: Trusty
+
+Scatter files
+-------------
+
+Tegra platforms currently support scatter files and ld.S scripts. The scatter
+files help support ARMLINK linker to generate BL31 binaries. For now, there
+exists a common scatter file, plat/nvidia/tegra/scat/bl31.scat, for all Tegra
+SoCs. The `LINKER` build variable needs to point to the ARMLINK binary for
+the scatter file to be used. Tegra platforms have verified BL31 image generation
+with ARMCLANG (compilation) and ARMLINK (linking) for the Tegra186 platforms.
+
+Preparing the BL31 image to run on Tegra SoCs
+---------------------------------------------
+
+.. code:: shell
+
+    CROSS_COMPILE=<path-to-aarch64-gcc>/bin/aarch64-none-elf- make PLAT=tegra \
+    TARGET_SOC=<target-soc e.g. t194|t186|t210> SPD=<dispatcher e.g. trusty|tlkd>
+    bl31
+
+Platforms wanting to use different TZDRAM\_BASE, can add ``TZDRAM_BASE=<value>``
+to the build command line.
+
+The Tegra platform code expects a pointer to the following platform specific
+structure via 'x1' register from the BL2 layer which is used by the
+bl31\_early\_platform\_setup() handler to extract the TZDRAM carveout base and
+size for loading the Trusted OS and the UART port ID to be used. The Tegra
+memory controller driver programs this base/size in order to restrict NS
+accesses.
+
+typedef struct plat\_params\_from\_bl2 {
+/\* TZ memory size */
+uint64\_t tzdram\_size;
+/* TZ memory base */
+uint64\_t tzdram\_base;
+/* UART port ID \*/
+int uart\_id;
+/* L2 ECC parity protection disable flag \*/
+int l2\_ecc\_parity\_prot\_dis;
+/* SHMEM base address for storing the boot logs \*/
+uint64\_t boot\_profiler\_shmem\_base;
+} plat\_params\_from\_bl2\_t;
+
+Power Management
+----------------
+
+The PSCI implementation expects each platform to expose the 'power state'
+parameter to be used during the 'SYSTEM SUSPEND' call. The state-id field
+is implementation defined on Tegra SoCs and is preferably defined by
+tegra\_def.h.
+
+Tegra configs
+-------------
+
+-  'tegra\_enable\_l2\_ecc\_parity\_prot': This flag enables the L2 ECC and Parity
+   Protection bit, for Arm Cortex-A57 CPUs, during CPU boot. This flag will
+   be enabled by Tegrs SoCs during 'Cluster power up' or 'System Suspend' exit.
-- 
cgit v1.2.3