diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-28 09:13:47 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-28 09:13:47 +0000 |
commit | 102b0d2daa97dae68d3eed54d8fe37a9cc38a892 (patch) | |
tree | bcf648efac40ca6139842707f0eba5a4496a6dd2 /docs/design | |
parent | Initial commit. (diff) | |
download | arm-trusted-firmware-102b0d2daa97dae68d3eed54d8fe37a9cc38a892.tar.xz arm-trusted-firmware-102b0d2daa97dae68d3eed54d8fe37a9cc38a892.zip |
Adding upstream version 2.8.0+dfsg.upstream/2.8.0+dfsgupstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'docs/design')
-rw-r--r-- | docs/design/alt-boot-flows.rst | 84 | ||||
-rw-r--r-- | docs/design/auth-framework.rst | 980 | ||||
-rw-r--r-- | docs/design/cpu-specific-build-macros.rst | 742 | ||||
-rw-r--r-- | docs/design/firmware-design.rst | 2766 | ||||
-rw-r--r-- | docs/design/index.rst | 20 | ||||
-rw-r--r-- | docs/design/interrupt-framework-design.rst | 1021 | ||||
-rw-r--r-- | docs/design/psci-pd-tree.rst | 304 | ||||
-rw-r--r-- | docs/design/reset-design.rst | 168 | ||||
-rw-r--r-- | docs/design/trusted-board-boot-build.rst | 122 | ||||
-rw-r--r-- | docs/design/trusted-board-boot.rst | 263 |
10 files changed, 6470 insertions, 0 deletions
diff --git a/docs/design/alt-boot-flows.rst b/docs/design/alt-boot-flows.rst new file mode 100644 index 0000000..b44c061 --- /dev/null +++ b/docs/design/alt-boot-flows.rst @@ -0,0 +1,84 @@ +Alternative Boot Flows +====================== + +EL3 payloads alternative boot flow +---------------------------------- + +On a pre-production system, the ability to execute arbitrary, bare-metal code at +the highest exception level is required. It allows full, direct access to the +hardware, for example to run silicon soak tests. + +Although it is possible to implement some baremetal secure firmware from +scratch, this is a complex task on some platforms, depending on the level of +configuration required to put the system in the expected state. + +Rather than booting a baremetal application, a possible compromise is to boot +``EL3 payloads`` through TF-A instead. This is implemented as an alternative +boot flow, where a modified BL2 boots an EL3 payload, instead of loading the +other BL images and passing control to BL31. It reduces the complexity of +developing EL3 baremetal code by: + +- putting the system into a known architectural state; +- taking care of platform secure world initialization; +- loading the SCP_BL2 image if required by the platform. + +When booting an EL3 payload on Arm standard platforms, the configuration of the +TrustZone controller is simplified such that only region 0 is enabled and is +configured to permit secure access only. This gives full access to the whole +DRAM to the EL3 payload. + +The system is left in the same state as when entering BL31 in the default boot +flow. In particular: + +- Running in EL3; +- Current state is AArch64; +- Little-endian data access; +- All exceptions disabled; +- MMU disabled; +- Caches disabled. + +.. _alt_boot_flows_el3_payload: + +Booting an EL3 payload +~~~~~~~~~~~~~~~~~~~~~~ + +The EL3 payload image is a standalone image and is not part of the FIP. It is +not loaded by TF-A. Therefore, there are 2 possible scenarios: + +- The EL3 payload may reside in non-volatile memory (NVM) and execute in + place. In this case, booting it is just a matter of specifying the right + address in NVM through ``EL3_PAYLOAD_BASE`` when building TF-A. + +- The EL3 payload needs to be loaded in volatile memory (e.g. DRAM) at + run-time. + +To help in the latter scenario, the ``SPIN_ON_BL1_EXIT=1`` build option can be +used. The infinite loop that it introduces in BL1 stops execution at the right +moment for a debugger to take control of the target and load the payload (for +example, over JTAG). + +It is expected that this loading method will work in most cases, as a debugger +connection is usually available in a pre-production system. The user is free to +use any other platform-specific mechanism to load the EL3 payload, though. + + +Preloaded BL33 alternative boot flow +------------------------------------ + +Some platforms have the ability to preload BL33 into memory instead of relying +on TF-A to load it. This may simplify packaging of the normal world code and +improve performance in a development environment. When secure world cold boot +is complete, TF-A simply jumps to a BL33 base address provided at build time. + +For this option to be used, the ``PRELOADED_BL33_BASE`` build option has to be +used when compiling TF-A. For example, the following command will create a FIP +without a BL33 and prepare to jump to a BL33 image loaded at address +0x80000000: + +.. code:: shell + + make PRELOADED_BL33_BASE=0x80000000 PLAT=fvp all fip + +-------------- + +*Copyright (c) 2019, Arm Limited. All rights reserved.* diff --git a/docs/design/auth-framework.rst b/docs/design/auth-framework.rst new file mode 100644 index 0000000..6913e66 --- /dev/null +++ b/docs/design/auth-framework.rst @@ -0,0 +1,980 @@ +Authentication Framework & Chain of Trust +========================================= + +The aim of this document is to describe the authentication framework +implemented in Trusted Firmware-A (TF-A). This framework fulfills the +following requirements: + +#. It should be possible for a platform port to specify the Chain of Trust in + terms of certificate hierarchy and the mechanisms used to verify a + particular image/certificate. + +#. The framework should distinguish between: + + - The mechanism used to encode and transport information, e.g. DER encoded + X.509v3 certificates to ferry Subject Public Keys, hashes and non-volatile + counters. + + - The mechanism used to verify the transported information i.e. the + cryptographic libraries. + +The framework has been designed following a modular approach illustrated in the +next diagram: + +:: + + +---------------+---------------+------------+ + | Trusted | Trusted | Trusted | + | Firmware | Firmware | Firmware | + | Generic | IO Framework | Platform | + | Code i.e. | (IO) | Port | + | BL1/BL2 (GEN) | | (PP) | + +---------------+---------------+------------+ + ^ ^ ^ + | | | + v v v + +-----------+ +-----------+ +-----------+ + | | | | | Image | + | Crypto | | Auth | | Parser | + | Module |<->| Module |<->| Module | + | (CM) | | (AM) | | (IPM) | + | | | | | | + +-----------+ +-----------+ +-----------+ + ^ ^ + | | + v v + +----------------+ +-----------------+ + | Cryptographic | | Image Parser | + | Libraries (CL) | | Libraries (IPL) | + +----------------+ +-----------------+ + | | + | | + | | + v v + +-----------------+ + | Misc. Libs e.g. | + | ASN.1 decoder | + | | + +-----------------+ + + DIAGRAM 1. + +This document describes the inner details of the authentication framework and +the abstraction mechanisms available to specify a Chain of Trust. + +Framework design +---------------- + +This section describes some aspects of the framework design and the rationale +behind them. These aspects are key to verify a Chain of Trust. + +Chain of Trust +~~~~~~~~~~~~~~ + +A CoT is basically a sequence of authentication images which usually starts with +a root of trust and culminates in a single data image. The following diagram +illustrates how this maps to a CoT for the BL31 image described in the +`TBBR-Client specification`_. + +:: + + +------------------+ +-------------------+ + | ROTPK/ROTPK Hash |------>| Trusted Key | + +------------------+ | Certificate | + | (Auth Image) | + /+-------------------+ + / | + / | + / | + / | + L v + +------------------+ +-------------------+ + | Trusted World |------>| BL31 Key | + | Public Key | | Certificate | + +------------------+ | (Auth Image) | + +-------------------+ + / | + / | + / | + / | + / v + +------------------+ L +-------------------+ + | BL31 Content |------>| BL31 Content | + | Certificate PK | | Certificate | + +------------------+ | (Auth Image) | + +-------------------+ + / | + / | + / | + / | + / v + +------------------+ L +-------------------+ + | BL31 Hash |------>| BL31 Image | + | | | (Data Image) | + +------------------+ | | + +-------------------+ + + DIAGRAM 2. + +The root of trust is usually a public key (ROTPK) that has been burnt in the +platform and cannot be modified. + +Image types +~~~~~~~~~~~ + +Images in a CoT are categorised as authentication and data images. An +authentication image contains information to authenticate a data image or +another authentication image. A data image is usually a boot loader binary, but +it could be any other data that requires authentication. + +Component responsibilities +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For every image in a Chain of Trust, the following high level operations are +performed to verify it: + +#. Allocate memory for the image either statically or at runtime. + +#. Identify the image and load it in the allocated memory. + +#. Check the integrity of the image as per its type. + +#. Authenticate the image as per the cryptographic algorithms used. + +#. If the image is an authentication image, extract the information that will + be used to authenticate the next image in the CoT. + +In Diagram 1, each component is responsible for one or more of these operations. +The responsibilities are briefly described below. + +TF-A Generic code and IO framework (GEN/IO) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +These components are responsible for initiating the authentication process for a +particular image in BL1 or BL2. For each BL image that requires authentication, +the Generic code asks recursively the Authentication module what is the parent +image until either an authenticated image or the ROT is reached. Then the +Generic code calls the IO framework to load the image and calls the +Authentication module to authenticate it, following the CoT from ROT to Image. + +TF-A Platform Port (PP) +^^^^^^^^^^^^^^^^^^^^^^^ + +The platform is responsible for: + +#. Specifying the CoT for each image that needs to be authenticated. Details of + how a CoT can be specified by the platform are explained later. The platform + also specifies the authentication methods and the parsing method used for + each image. + +#. Statically allocating memory for each parameter in each image which is + used for verifying the CoT, e.g. memory for public keys, hashes etc. + +#. Providing the ROTPK or a hash of it. + +#. Providing additional information to the IPM to enable it to identify and + extract authentication parameters contained in an image, e.g. if the + parameters are stored as X509v3 extensions, the corresponding OID must be + provided. + +#. Fulfill any other memory requirements of the IPM and the CM (not currently + described in this document). + +#. Export functions to verify an image which uses an authentication method that + cannot be interpreted by the CM, e.g. if an image has to be verified using a + NV counter, then the value of the counter to compare with can only be + provided by the platform. + +#. Export a custom IPM if a proprietary image format is being used (described + later). + +Authentication Module (AM) +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +It is responsible for: + +#. Providing the necessary abstraction mechanisms to describe a CoT. Amongst + other things, the authentication and image parsing methods must be specified + by the PP in the CoT. + +#. Verifying the CoT passed by GEN by utilising functionality exported by the + PP, IPM and CM. + +#. Tracking which images have been verified. In case an image is a part of + multiple CoTs then it should be verified only once e.g. the Trusted World + Key Certificate in the TBBR-Client spec. contains information to verify + SCP_BL2, BL31, BL32 each of which have a separate CoT. (This + responsibility has not been described in this document but should be + trivial to implement). + +#. Reusing memory meant for a data image to verify authentication images e.g. + in the CoT described in Diagram 2, each certificate can be loaded and + verified in the memory reserved by the platform for the BL31 image. By the + time BL31 (the data image) is loaded, all information to authenticate it + will have been extracted from the parent image i.e. BL31 content + certificate. It is assumed that the size of an authentication image will + never exceed the size of a data image. It should be possible to verify this + at build time using asserts. + +Cryptographic Module (CM) +^^^^^^^^^^^^^^^^^^^^^^^^^ + +The CM is responsible for providing an API to: + +#. Verify a digital signature. +#. Verify a hash. + +The CM does not include any cryptography related code, but it relies on an +external library to perform the cryptographic operations. A Crypto-Library (CL) +linking the CM and the external library must be implemented. The following +functions must be provided by the CL: + +.. code:: c + + void (*init)(void); + int (*verify_signature)(void *data_ptr, unsigned int data_len, + void *sig_ptr, unsigned int sig_len, + void *sig_alg, unsigned int sig_alg_len, + void *pk_ptr, unsigned int pk_len); + int (*verify_hash)(void *data_ptr, unsigned int data_len, + void *digest_info_ptr, unsigned int digest_info_len); + +These functions are registered in the CM using the macro: + +.. code:: c + + REGISTER_CRYPTO_LIB(_name, _init, _verify_signature, _verify_hash); + +``_name`` must be a string containing the name of the CL. This name is used for +debugging purposes. + +Image Parser Module (IPM) +^^^^^^^^^^^^^^^^^^^^^^^^^ + +The IPM is responsible for: + +#. Checking the integrity of each image loaded by the IO framework. +#. Extracting parameters used for authenticating an image based upon a + description provided by the platform in the CoT descriptor. + +Images may have different formats (for example, authentication images could be +x509v3 certificates, signed ELF files or any other platform specific format). +The IPM allows to register an Image Parser Library (IPL) for every image format +used in the CoT. This library must implement the specific methods to parse the +image. The IPM obtains the image format from the CoT and calls the right IPL to +check the image integrity and extract the authentication parameters. + +See Section "Describing the image parsing methods" for more details about the +mechanism the IPM provides to define and register IPLs. + +Authentication methods +~~~~~~~~~~~~~~~~~~~~~~ + +The AM supports the following authentication methods: + +#. Hash +#. Digital signature + +The platform may specify these methods in the CoT in case it decides to define +a custom CoT instead of reusing a predefined one. + +If a data image uses multiple methods, then all the methods must be a part of +the same CoT. The number and type of parameters are method specific. These +parameters should be obtained from the parent image using the IPM. + +#. Hash + + Parameters: + + #. A pointer to data to hash + #. Length of the data + #. A pointer to the hash + #. Length of the hash + + The hash will be represented by the DER encoding of the following ASN.1 + type: + + :: + + DigestInfo ::= SEQUENCE { + digestAlgorithm DigestAlgorithmIdentifier, + digest Digest + } + + This ASN.1 structure makes it possible to remove any assumption about the + type of hash algorithm used as this information accompanies the hash. This + should allow the Cryptography Library (CL) to support multiple hash + algorithm implementations. + +#. Digital Signature + + Parameters: + + #. A pointer to data to sign + #. Length of the data + #. Public Key Algorithm + #. Public Key value + #. Digital Signature Algorithm + #. Digital Signature value + + The Public Key parameters will be represented by the DER encoding of the + following ASN.1 type: + + :: + + SubjectPublicKeyInfo ::= SEQUENCE { + algorithm AlgorithmIdentifier{PUBLIC-KEY,{PublicKeyAlgorithms}}, + subjectPublicKey BIT STRING } + + The Digital Signature Algorithm will be represented by the DER encoding of + the following ASN.1 types. + + :: + + AlgorithmIdentifier {ALGORITHM:IOSet } ::= SEQUENCE { + algorithm ALGORITHM.&id({IOSet}), + parameters ALGORITHM.&Type({IOSet}{@algorithm}) OPTIONAL + } + + The digital signature will be represented by: + + :: + + signature ::= BIT STRING + +The authentication framework will use the image descriptor to extract all the +information related to authentication. + +Specifying a Chain of Trust +--------------------------- + +A CoT can be described as a set of image descriptors linked together in a +particular order. The order dictates the sequence in which they must be +verified. Each image has a set of properties which allow the AM to verify it. +These properties are described below. + +The PP is responsible for defining a single or multiple CoTs for a data image. +Unless otherwise specified, the data structures described in the following +sections are populated by the PP statically. + +Describing the image parsing methods +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The parsing method refers to the format of a particular image. For example, an +authentication image that represents a certificate could be in the X.509v3 +format. A data image that represents a boot loader stage could be in raw binary +or ELF format. The IPM supports three parsing methods. An image has to use one +of the three methods described below. An IPL is responsible for interpreting a +single parsing method. There has to be one IPL for every method used by the +platform. + +#. Raw format: This format is effectively a nop as an image using this method + is treated as being in raw binary format e.g. boot loader images used by + TF-A. This method should only be used by data images. + +#. X509V3 method: This method uses industry standards like X.509 to represent + PKI certificates (authentication images). It is expected that open source + libraries will be available which can be used to parse an image represented + by this method. Such libraries can be used to write the corresponding IPL + e.g. the X.509 parsing library code in mbed TLS. + +#. Platform defined method: This method caters for platform specific + proprietary standards to represent authentication or data images. For + example, The signature of a data image could be appended to the data image + raw binary. A header could be prepended to the combined blob to specify the + extents of each component. The platform will have to implement the + corresponding IPL to interpret such a format. + +The following enum can be used to define these three methods. + +.. code:: c + + typedef enum img_type_enum { + IMG_RAW, /* Binary image */ + IMG_PLAT, /* Platform specific format */ + IMG_CERT, /* X509v3 certificate */ + IMG_MAX_TYPES, + } img_type_t; + +An IPL must provide functions with the following prototypes: + +.. code:: c + + void init(void); + int check_integrity(void *img, unsigned int img_len); + int get_auth_param(const auth_param_type_desc_t *type_desc, + void *img, unsigned int img_len, + void **param, unsigned int *param_len); + +An IPL for each type must be registered using the following macro: + +.. code:: c + + REGISTER_IMG_PARSER_LIB(_type, _name, _init, _check_int, _get_param) + +- ``_type``: one of the types described above. +- ``_name``: a string containing the IPL name for debugging purposes. +- ``_init``: initialization function pointer. +- ``_check_int``: check image integrity function pointer. +- ``_get_param``: extract authentication parameter function pointer. + +The ``init()`` function will be used to initialize the IPL. + +The ``check_integrity()`` function is passed a pointer to the memory where the +image has been loaded by the IO framework and the image length. It should ensure +that the image is in the format corresponding to the parsing method and has not +been tampered with. For example, RFC-2459 describes a validation sequence for an +X.509 certificate. + +The ``get_auth_param()`` function is passed a parameter descriptor containing +information about the parameter (``type_desc`` and ``cookie``) to identify and +extract the data corresponding to that parameter from an image. This data will +be used to verify either the current or the next image in the CoT sequence. + +Each image in the CoT will specify the parsing method it uses. This information +will be used by the IPM to find the right parser descriptor for the image. + +Describing the authentication method(s) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +As part of the CoT, each image has to specify one or more authentication methods +which will be used to verify it. As described in the Section "Authentication +methods", there are three methods supported by the AM. + +.. code:: c + + typedef enum { + AUTH_METHOD_NONE, + AUTH_METHOD_HASH, + AUTH_METHOD_SIG, + AUTH_METHOD_NUM + } auth_method_type_t; + +The AM defines the type of each parameter used by an authentication method. It +uses this information to: + +#. Specify to the ``get_auth_param()`` function exported by the IPM, which + parameter should be extracted from an image. + +#. Correctly marshall the parameters while calling the verification function + exported by the CM and PP. + +#. Extract authentication parameters from a parent image in order to verify a + child image e.g. to verify the certificate image, the public key has to be + obtained from the parent image. + +.. code:: c + + typedef enum { + AUTH_PARAM_NONE, + AUTH_PARAM_RAW_DATA, /* Raw image data */ + AUTH_PARAM_SIG, /* The image signature */ + AUTH_PARAM_SIG_ALG, /* The image signature algorithm */ + AUTH_PARAM_HASH, /* A hash (including the algorithm) */ + AUTH_PARAM_PUB_KEY, /* A public key */ + } auth_param_type_t; + +The AM defines the following structure to identify an authentication parameter +required to verify an image. + +.. code:: c + + typedef struct auth_param_type_desc_s { + auth_param_type_t type; + void *cookie; + } auth_param_type_desc_t; + +``cookie`` is used by the platform to specify additional information to the IPM +which enables it to uniquely identify the parameter that should be extracted +from an image. For example, the hash of a BL3x image in its corresponding +content certificate is stored in an X509v3 custom extension field. An extension +field can only be identified using an OID. In this case, the ``cookie`` could +contain the pointer to the OID defined by the platform for the hash extension +field while the ``type`` field could be set to ``AUTH_PARAM_HASH``. A value of 0 for +the ``cookie`` field means that it is not used. + +For each method, the AM defines a structure with the parameters required to +verify the image. + +.. code:: c + + /* + * Parameters for authentication by hash matching + */ + typedef struct auth_method_param_hash_s { + auth_param_type_desc_t *data; /* Data to hash */ + auth_param_type_desc_t *hash; /* Hash to match with */ + } auth_method_param_hash_t; + + /* + * Parameters for authentication by signature + */ + typedef struct auth_method_param_sig_s { + auth_param_type_desc_t *pk; /* Public key */ + auth_param_type_desc_t *sig; /* Signature to check */ + auth_param_type_desc_t *alg; /* Signature algorithm */ + auth_param_type_desc_t *tbs; /* Data signed */ + } auth_method_param_sig_t; + +The AM defines the following structure to describe an authentication method for +verifying an image + +.. code:: c + + /* + * Authentication method descriptor + */ + typedef struct auth_method_desc_s { + auth_method_type_t type; + union { + auth_method_param_hash_t hash; + auth_method_param_sig_t sig; + } param; + } auth_method_desc_t; + +Using the method type specified in the ``type`` field, the AM finds out what field +needs to access within the ``param`` union. + +Storing Authentication parameters +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A parameter described by ``auth_param_type_desc_t`` to verify an image could be +obtained from either the image itself or its parent image. The memory allocated +for loading the parent image will be reused for loading the child image. Hence +parameters which are obtained from the parent for verifying a child image need +to have memory allocated for them separately where they can be stored. This +memory must be statically allocated by the platform port. + +The AM defines the following structure to store the data corresponding to an +authentication parameter. + +.. code:: c + + typedef struct auth_param_data_desc_s { + void *auth_param_ptr; + unsigned int auth_param_len; + } auth_param_data_desc_t; + +The ``auth_param_ptr`` field is initialized by the platform. The ``auth_param_len`` +field is used to specify the length of the data in the memory. + +For parameters that can be obtained from the child image itself, the IPM is +responsible for populating the ``auth_param_ptr`` and ``auth_param_len`` fields +while executing the ``img_get_auth_param()`` function. + +The AM defines the following structure to enable an image to describe the +parameters that should be extracted from it and used to verify the next image +(child) in a CoT. + +.. code:: c + + typedef struct auth_param_desc_s { + auth_param_type_desc_t type_desc; + auth_param_data_desc_t data; + } auth_param_desc_t; + +Describing an image in a CoT +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +An image in a CoT is a consolidation of the following aspects of a CoT described +above. + +#. A unique identifier specified by the platform which allows the IO framework + to locate the image in a FIP and load it in the memory reserved for the data + image in the CoT. + +#. A parsing method which is used by the AM to find the appropriate IPM. + +#. Authentication methods and their parameters as described in the previous + section. These are used to verify the current image. + +#. Parameters which are used to verify the next image in the current CoT. These + parameters are specified only by authentication images and can be extracted + from the current image once it has been verified. + +The following data structure describes an image in a CoT. + +.. code:: c + + typedef struct auth_img_desc_s { + unsigned int img_id; + const struct auth_img_desc_s *parent; + img_type_t img_type; + const auth_method_desc_t *const img_auth_methods; + const auth_param_desc_t *const authenticated_data; + } auth_img_desc_t; + +A CoT is defined as an array of pointers to ``auth_image_desc_t`` structures +linked together by the ``parent`` field. Those nodes with no parent must be +authenticated using the ROTPK stored in the platform. + +Implementation example +---------------------- + +This section is a detailed guide explaining a trusted boot implementation using +the authentication framework. This example corresponds to the Applicative +Functional Mode (AFM) as specified in the TBBR-Client document. It is +recommended to read this guide along with the source code. + +The TBBR CoT +~~~~~~~~~~~~ + +CoT specific to BL1 and BL2 can be found in ``drivers/auth/tbbr/tbbr_cot_bl1.c`` +and ``drivers/auth/tbbr/tbbr_cot_bl2.c`` respectively. The common CoT used across +BL1 and BL2 can be found in ``drivers/auth/tbbr/tbbr_cot_common.c``. +This CoT consists of an array of pointers to image descriptors and it is +registered in the framework using the macro ``REGISTER_COT(cot_desc)``, where +``cot_desc`` must be the name of the array (passing a pointer or any other +type of indirection will cause the registration process to fail). + +The number of images participating in the boot process depends on the CoT. +There is, however, a minimum set of images that are mandatory in TF-A and thus +all CoTs must present: + +- ``BL2`` +- ``SCP_BL2`` (platform specific) +- ``BL31`` +- ``BL32`` (optional) +- ``BL33`` + +The TBBR specifies the additional certificates that must accompany these images +for a proper authentication. Details about the TBBR CoT may be found in the +:ref:`Trusted Board Boot` document. + +Following the :ref:`Porting Guide`, a platform must provide unique +identifiers for all the images and certificates that will be loaded during the +boot process. If a platform is using the TBBR as a reference for trusted boot, +these identifiers can be obtained from ``include/common/tbbr/tbbr_img_def.h``. +Arm platforms include this file in ``include/plat/arm/common/arm_def.h``. Other +platforms may also include this file or provide their own identifiers. + +**Important**: the authentication module uses these identifiers to index the +CoT array, so the descriptors location in the array must match the identifiers. + +Each image descriptor must specify: + +- ``img_id``: the corresponding image unique identifier defined by the platform. +- ``img_type``: the image parser module uses the image type to call the proper + parsing library to check the image integrity and extract the required + authentication parameters. Three types of images are currently supported: + + - ``IMG_RAW``: image is a raw binary. No parsing functions are available, + other than reading the whole image. + - ``IMG_PLAT``: image format is platform specific. The platform may use this + type for custom images not directly supported by the authentication + framework. + - ``IMG_CERT``: image is an x509v3 certificate. + +- ``parent``: pointer to the parent image descriptor. The parent will contain + the information required to authenticate the current image. If the parent + is NULL, the authentication parameters will be obtained from the platform + (i.e. the BL2 and Trusted Key certificates are signed with the ROT private + key, whose public part is stored in the platform). +- ``img_auth_methods``: this points to an array which defines the + authentication methods that must be checked to consider an image + authenticated. Each method consists of a type and a list of parameter + descriptors. A parameter descriptor consists of a type and a cookie which + will point to specific information required to extract that parameter from + the image (i.e. if the parameter is stored in an x509v3 extension, the + cookie will point to the extension OID). Depending on the method type, a + different number of parameters must be specified. This pointer should not be + NULL. + Supported methods are: + + - ``AUTH_METHOD_HASH``: the hash of the image must match the hash extracted + from the parent image. The following parameter descriptors must be + specified: + + - ``data``: data to be hashed (obtained from current image) + - ``hash``: reference hash (obtained from parent image) + + - ``AUTH_METHOD_SIG``: the image (usually a certificate) must be signed with + the private key whose public part is extracted from the parent image (or + the platform if the parent is NULL). The following parameter descriptors + must be specified: + + - ``pk``: the public key (obtained from parent image) + - ``sig``: the digital signature (obtained from current image) + - ``alg``: the signature algorithm used (obtained from current image) + - ``data``: the data to be signed (obtained from current image) + +- ``authenticated_data``: this array pointer indicates what authentication + parameters must be extracted from an image once it has been authenticated. + Each parameter consists of a parameter descriptor and the buffer + address/size to store the parameter. The CoT is responsible for allocating + the required memory to store the parameters. This pointer may be NULL. + +In the ``tbbr_cot*.c`` file, a set of buffers are allocated to store the parameters +extracted from the certificates. In the case of the TBBR CoT, these parameters +are hashes and public keys. In DER format, an RSA-4096 public key requires 550 +bytes, and a hash requires 51 bytes. Depending on the CoT and the authentication +process, some of the buffers may be reused at different stages during the boot. + +Next in that file, the parameter descriptors are defined. These descriptors will +be used to extract the parameter data from the corresponding image. + +Example: the BL31 Chain of Trust +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Four image descriptors form the BL31 Chain of Trust: + +.. code:: c + + static const auth_img_desc_t trusted_key_cert = { + .img_id = TRUSTED_KEY_CERT_ID, + .img_type = IMG_CERT, + .parent = NULL, + .img_auth_methods = (const auth_method_desc_t[AUTH_METHOD_NUM]) { + [0] = { + .type = AUTH_METHOD_SIG, + .param.sig = { + .pk = &subject_pk, + .sig = &sig, + .alg = &sig_alg, + .data = &raw_data + } + }, + [1] = { + .type = AUTH_METHOD_NV_CTR, + .param.nv_ctr = { + .cert_nv_ctr = &trusted_nv_ctr, + .plat_nv_ctr = &trusted_nv_ctr + } + } + }, + .authenticated_data = (const auth_param_desc_t[COT_MAX_VERIFIED_PARAMS]) { + [0] = { + .type_desc = &trusted_world_pk, + .data = { + .ptr = (void *)trusted_world_pk_buf, + .len = (unsigned int)PK_DER_LEN + } + }, + [1] = { + .type_desc = &non_trusted_world_pk, + .data = { + .ptr = (void *)non_trusted_world_pk_buf, + .len = (unsigned int)PK_DER_LEN + } + } + } + }; + static const auth_img_desc_t soc_fw_key_cert = { + .img_id = SOC_FW_KEY_CERT_ID, + .img_type = IMG_CERT, + .parent = &trusted_key_cert, + .img_auth_methods = (const auth_method_desc_t[AUTH_METHOD_NUM]) { + [0] = { + .type = AUTH_METHOD_SIG, + .param.sig = { + .pk = &trusted_world_pk, + .sig = &sig, + .alg = &sig_alg, + .data = &raw_data + } + }, + [1] = { + .type = AUTH_METHOD_NV_CTR, + .param.nv_ctr = { + .cert_nv_ctr = &trusted_nv_ctr, + .plat_nv_ctr = &trusted_nv_ctr + } + } + }, + .authenticated_data = (const auth_param_desc_t[COT_MAX_VERIFIED_PARAMS]) { + [0] = { + .type_desc = &soc_fw_content_pk, + .data = { + .ptr = (void *)content_pk_buf, + .len = (unsigned int)PK_DER_LEN + } + } + } + }; + static const auth_img_desc_t soc_fw_content_cert = { + .img_id = SOC_FW_CONTENT_CERT_ID, + .img_type = IMG_CERT, + .parent = &soc_fw_key_cert, + .img_auth_methods = (const auth_method_desc_t[AUTH_METHOD_NUM]) { + [0] = { + .type = AUTH_METHOD_SIG, + .param.sig = { + .pk = &soc_fw_content_pk, + .sig = &sig, + .alg = &sig_alg, + .data = &raw_data + } + }, + [1] = { + .type = AUTH_METHOD_NV_CTR, + .param.nv_ctr = { + .cert_nv_ctr = &trusted_nv_ctr, + .plat_nv_ctr = &trusted_nv_ctr + } + } + }, + .authenticated_data = (const auth_param_desc_t[COT_MAX_VERIFIED_PARAMS]) { + [0] = { + .type_desc = &soc_fw_hash, + .data = { + .ptr = (void *)soc_fw_hash_buf, + .len = (unsigned int)HASH_DER_LEN + } + }, + [1] = { + .type_desc = &soc_fw_config_hash, + .data = { + .ptr = (void *)soc_fw_config_hash_buf, + .len = (unsigned int)HASH_DER_LEN + } + } + } + }; + static const auth_img_desc_t bl31_image = { + .img_id = BL31_IMAGE_ID, + .img_type = IMG_RAW, + .parent = &soc_fw_content_cert, + .img_auth_methods = (const auth_method_desc_t[AUTH_METHOD_NUM]) { + [0] = { + .type = AUTH_METHOD_HASH, + .param.hash = { + .data = &raw_data, + .hash = &soc_fw_hash + } + } + } + }; + +The **Trusted Key certificate** is signed with the ROT private key and contains +the Trusted World public key and the Non-Trusted World public key as x509v3 +extensions. This must be specified in the image descriptor using the +``img_auth_methods`` and ``authenticated_data`` arrays, respectively. + +The Trusted Key certificate is authenticated by checking its digital signature +using the ROTPK. Four parameters are required to check a signature: the public +key, the algorithm, the signature and the data that has been signed. Therefore, +four parameter descriptors must be specified with the authentication method: + +- ``subject_pk``: parameter descriptor of type ``AUTH_PARAM_PUB_KEY``. This type + is used to extract a public key from the parent image. If the cookie is an + OID, the key is extracted from the corresponding x509v3 extension. If the + cookie is NULL, the subject public key is retrieved. In this case, because + the parent image is NULL, the public key is obtained from the platform + (this key will be the ROTPK). +- ``sig``: parameter descriptor of type ``AUTH_PARAM_SIG``. It is used to extract + the signature from the certificate. +- ``sig_alg``: parameter descriptor of type ``AUTH_PARAM_SIG``. It is used to + extract the signature algorithm from the certificate. +- ``raw_data``: parameter descriptor of type ``AUTH_PARAM_RAW_DATA``. It is used + to extract the data to be signed from the certificate. + +Once the signature has been checked and the certificate authenticated, the +Trusted World public key needs to be extracted from the certificate. A new entry +is created in the ``authenticated_data`` array for that purpose. In that entry, +the corresponding parameter descriptor must be specified along with the buffer +address to store the parameter value. In this case, the ``trusted_world_pk`` +descriptor is used to extract the public key from an x509v3 extension with OID +``TRUSTED_WORLD_PK_OID``. The BL31 key certificate will use this descriptor as +parameter in the signature authentication method. The key is stored in the +``trusted_world_pk_buf`` buffer. + +The **BL31 Key certificate** is authenticated by checking its digital signature +using the Trusted World public key obtained previously from the Trusted Key +certificate. In the image descriptor, we specify a single authentication method +by signature whose public key is the ``trusted_world_pk``. Once this certificate +has been authenticated, we have to extract the BL31 public key, stored in the +extension specified by ``soc_fw_content_pk``. This key will be copied to the +``content_pk_buf`` buffer. + +The **BL31 certificate** is authenticated by checking its digital signature +using the BL31 public key obtained previously from the BL31 Key certificate. +We specify the authentication method using ``soc_fw_content_pk`` as public key. +After authentication, we need to extract the BL31 hash, stored in the extension +specified by ``soc_fw_hash``. This hash will be copied to the +``soc_fw_hash_buf`` buffer. + +The **BL31 image** is authenticated by calculating its hash and matching it +with the hash obtained from the BL31 certificate. The image descriptor contains +a single authentication method by hash. The parameters to the hash method are +the reference hash, ``soc_fw_hash``, and the data to be hashed. In this case, +it is the whole image, so we specify ``raw_data``. + +The image parser library +~~~~~~~~~~~~~~~~~~~~~~~~ + +The image parser module relies on libraries to check the image integrity and +extract the authentication parameters. The number and type of parser libraries +depend on the images used in the CoT. Raw images do not need a library, so +only an x509v3 library is required for the TBBR CoT. + +Arm platforms will use an x509v3 library based on mbed TLS. This library may be +found in ``drivers/auth/mbedtls/mbedtls_x509_parser.c``. It exports three +functions: + +.. code:: c + + void init(void); + int check_integrity(void *img, unsigned int img_len); + int get_auth_param(const auth_param_type_desc_t *type_desc, + void *img, unsigned int img_len, + void **param, unsigned int *param_len); + +The library is registered in the framework using the macro +``REGISTER_IMG_PARSER_LIB()``. Each time the image parser module needs to access +an image of type ``IMG_CERT``, it will call the corresponding function exported +in this file. + +The build system must be updated to include the corresponding library and +mbed TLS sources. Arm platforms use the ``arm_common.mk`` file to pull the +sources. + +The cryptographic library +~~~~~~~~~~~~~~~~~~~~~~~~~ + +The cryptographic module relies on a library to perform the required operations, +i.e. verify a hash or a digital signature. Arm platforms will use a library +based on mbed TLS, which can be found in +``drivers/auth/mbedtls/mbedtls_crypto.c``. This library is registered in the +authentication framework using the macro ``REGISTER_CRYPTO_LIB()`` and exports +four functions: + +.. code:: c + + void init(void); + int verify_signature(void *data_ptr, unsigned int data_len, + void *sig_ptr, unsigned int sig_len, + void *sig_alg, unsigned int sig_alg_len, + void *pk_ptr, unsigned int pk_len); + int verify_hash(void *data_ptr, unsigned int data_len, + void *digest_info_ptr, unsigned int digest_info_len); + int auth_decrypt(enum crypto_dec_algo dec_algo, void *data_ptr, + size_t len, const void *key, unsigned int key_len, + unsigned int key_flags, const void *iv, + unsigned int iv_len, const void *tag, + unsigned int tag_len) + +The mbedTLS library algorithm support is configured by both the +``TF_MBEDTLS_KEY_ALG`` and ``TF_MBEDTLS_KEY_SIZE`` variables. + +- ``TF_MBEDTLS_KEY_ALG`` can take in 3 values: `rsa`, `ecdsa` or `rsa+ecdsa`. + This variable allows the Makefile to include the corresponding sources in + the build for the various algorithms. Setting the variable to `rsa+ecdsa` + enables support for both rsa and ecdsa algorithms in the mbedTLS library. + +- ``TF_MBEDTLS_KEY_SIZE`` sets the supported RSA key size for TFA. Valid values + include 1024, 2048, 3072 and 4096. + +- ``TF_MBEDTLS_USE_AES_GCM`` enables the authenticated decryption support based + on AES-GCM algorithm. Valid values are 0 and 1. + +.. note:: + If code size is a concern, the build option ``MBEDTLS_SHA256_SMALLER`` can + be defined in the platform Makefile. It will make mbed TLS use an + implementation of SHA-256 with smaller memory footprint (~1.5 KB less) but + slower (~30%). + +-------------- + +*Copyright (c) 2017-2020, Arm Limited and Contributors. All rights reserved.* + +.. _TBBR-Client specification: https://developer.arm.com/docs/den0006/latest/trusted-board-boot-requirements-client-tbbr-client-armv8-a diff --git a/docs/design/cpu-specific-build-macros.rst b/docs/design/cpu-specific-build-macros.rst new file mode 100644 index 0000000..55e265c --- /dev/null +++ b/docs/design/cpu-specific-build-macros.rst @@ -0,0 +1,742 @@ +Arm CPU Specific Build Macros +============================= + +This document describes the various build options present in the CPU specific +operations framework to enable errata workarounds and to enable optimizations +for a specific CPU on a platform. + +Security Vulnerability Workarounds +---------------------------------- + +TF-A exports a series of build flags which control which security +vulnerability workarounds should be applied at runtime. + +- ``WORKAROUND_CVE_2017_5715``: Enables the security workaround for + `CVE-2017-5715`_. This flag can be set to 0 by the platform if none + of the PEs in the system need the workaround. Setting this flag to 0 provides + no performance benefit for non-affected platforms, it just helps to comply + with the recommendation in the spec regarding workaround discovery. + Defaults to 1. + +- ``WORKAROUND_CVE_2018_3639``: Enables the security workaround for + `CVE-2018-3639`_. Defaults to 1. The TF-A project recommends to keep + the default value of 1 even on platforms that are unaffected by + CVE-2018-3639, in order to comply with the recommendation in the spec + regarding workaround discovery. + +- ``DYNAMIC_WORKAROUND_CVE_2018_3639``: Enables dynamic mitigation for + `CVE-2018-3639`_. This build option should be set to 1 if the target + platform contains at least 1 CPU that requires dynamic mitigation. + Defaults to 0. + +- ``WORKAROUND_CVE_2022_23960``: Enables mitigation for `CVE-2022-23960`_. + This build option should be set to 1 if the target platform contains at + least 1 CPU that requires this mitigation. Defaults to 1. + +.. _arm_cpu_macros_errata_workarounds: + +CPU Errata Workarounds +---------------------- + +TF-A exports a series of build flags which control the errata workarounds that +are applied to each CPU by the reset handler. The errata details can be found +in the CPU specific errata documents published by Arm: + +- `Cortex-A53 MPCore Software Developers Errata Notice`_ +- `Cortex-A57 MPCore Software Developers Errata Notice`_ +- `Cortex-A72 MPCore Software Developers Errata Notice`_ + +The errata workarounds are implemented for a particular revision or a set of +processor revisions. This is checked by the reset handler at runtime. Each +errata workaround is identified by its ``ID`` as specified in the processor's +errata notice document. The format of the define used to enable/disable the +errata workaround is ``ERRATA_<Processor name>_<ID>``, where the ``Processor name`` +is for example ``A57`` for the ``Cortex_A57`` CPU. + +Refer to :ref:`firmware_design_cpu_errata_reporting` for information on how to +write errata workaround functions. + +All workarounds are disabled by default. The platform is responsible for +enabling these workarounds according to its requirement by defining the +errata workaround build flags in the platform specific makefile. In case +these workarounds are enabled for the wrong CPU revision then the errata +workaround is not applied. In the DEBUG build, this is indicated by +printing a warning to the crash console. + +In the current implementation, a platform which has more than 1 variant +with different revisions of a processor has no runtime mechanism available +for it to specify which errata workarounds should be enabled or not. + +The value of the build flags is 0 by default, that is, disabled. A value of 1 +will enable it. + +For Cortex-A9, the following errata build flags are defined : + +- ``ERRATA_A9_794073``: This applies errata 794073 workaround to Cortex-A9 + CPU. This needs to be enabled for all revisions of the CPU. + +For Cortex-A15, the following errata build flags are defined : + +- ``ERRATA_A15_816470``: This applies errata 816470 workaround to Cortex-A15 + CPU. This needs to be enabled only for revision >= r3p0 of the CPU. + +- ``ERRATA_A15_827671``: This applies errata 827671 workaround to Cortex-A15 + CPU. This needs to be enabled only for revision >= r3p0 of the CPU. + +For Cortex-A17, the following errata build flags are defined : + +- ``ERRATA_A17_852421``: This applies errata 852421 workaround to Cortex-A17 + CPU. This needs to be enabled only for revision <= r1p2 of the CPU. + +- ``ERRATA_A17_852423``: This applies errata 852423 workaround to Cortex-A17 + CPU. This needs to be enabled only for revision <= r1p2 of the CPU. + +For Cortex-A35, the following errata build flags are defined : + +- ``ERRATA_A35_855472``: This applies errata 855472 workaround to Cortex-A35 + CPUs. This needs to be enabled only for revision r0p0 of Cortex-A35. + +For Cortex-A53, the following errata build flags are defined : + +- ``ERRATA_A53_819472``: This applies errata 819472 workaround to all + CPUs. This needs to be enabled only for revision <= r0p1 of Cortex-A53. + +- ``ERRATA_A53_824069``: This applies errata 824069 workaround to all + CPUs. This needs to be enabled only for revision <= r0p2 of Cortex-A53. + +- ``ERRATA_A53_826319``: This applies errata 826319 workaround to Cortex-A53 + CPU. This needs to be enabled only for revision <= r0p2 of the CPU. + +- ``ERRATA_A53_827319``: This applies errata 827319 workaround to all + CPUs. This needs to be enabled only for revision <= r0p2 of Cortex-A53. + +- ``ERRATA_A53_835769``: This applies erratum 835769 workaround at compile and + link time to Cortex-A53 CPU. This needs to be enabled for some variants of + revision <= r0p4. This workaround can lead the linker to create ``*.stub`` + sections. + +- ``ERRATA_A53_836870``: This applies errata 836870 workaround to Cortex-A53 + CPU. This needs to be enabled only for revision <= r0p3 of the CPU. From + r0p4 and onwards, this errata is enabled by default in hardware. + +- ``ERRATA_A53_843419``: This applies erratum 843419 workaround at link time + to Cortex-A53 CPU. This needs to be enabled for some variants of revision + <= r0p4. This workaround can lead the linker to emit ``*.stub`` sections + which are 4kB aligned. + +- ``ERRATA_A53_855873``: This applies errata 855873 workaround to Cortex-A53 + CPUs. Though the erratum is present in every revision of the CPU, + this workaround is only applied to CPUs from r0p3 onwards, which feature + a chicken bit in CPUACTLR_EL1 to enable a hardware workaround. + Earlier revisions of the CPU have other errata which require the same + workaround in software, so they should be covered anyway. + +- ``ERRATA_A53_1530924``: This applies errata 1530924 workaround to all + revisions of Cortex-A53 CPU. + +For Cortex-A55, the following errata build flags are defined : + +- ``ERRATA_A55_768277``: This applies errata 768277 workaround to Cortex-A55 + CPU. This needs to be enabled only for revision r0p0 of the CPU. + +- ``ERRATA_A55_778703``: This applies errata 778703 workaround to Cortex-A55 + CPU. This needs to be enabled only for revision r0p0 of the CPU. + +- ``ERRATA_A55_798797``: This applies errata 798797 workaround to Cortex-A55 + CPU. This needs to be enabled only for revision r0p0 of the CPU. + +- ``ERRATA_A55_846532``: This applies errata 846532 workaround to Cortex-A55 + CPU. This needs to be enabled only for revision <= r0p1 of the CPU. + +- ``ERRATA_A55_903758``: This applies errata 903758 workaround to Cortex-A55 + CPU. This needs to be enabled only for revision <= r0p1 of the CPU. + +- ``ERRATA_A55_1221012``: This applies errata 1221012 workaround to Cortex-A55 + CPU. This needs to be enabled only for revision <= r1p0 of the CPU. + +- ``ERRATA_A55_1530923``: This applies errata 1530923 workaround to all + revisions of Cortex-A55 CPU. + +For Cortex-A57, the following errata build flags are defined : + +- ``ERRATA_A57_806969``: This applies errata 806969 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision r0p0 of the CPU. + +- ``ERRATA_A57_813419``: This applies errata 813419 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision r0p0 of the CPU. + +- ``ERRATA_A57_813420``: This applies errata 813420 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision r0p0 of the CPU. + +- ``ERRATA_A57_814670``: This applies errata 814670 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision r0p0 of the CPU. + +- ``ERRATA_A57_817169``: This applies errata 817169 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision <= r0p1 of the CPU. + +- ``ERRATA_A57_826974``: This applies errata 826974 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision <= r1p1 of the CPU. + +- ``ERRATA_A57_826977``: This applies errata 826977 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision <= r1p1 of the CPU. + +- ``ERRATA_A57_828024``: This applies errata 828024 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision <= r1p1 of the CPU. + +- ``ERRATA_A57_829520``: This applies errata 829520 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision <= r1p2 of the CPU. + +- ``ERRATA_A57_833471``: This applies errata 833471 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision <= r1p2 of the CPU. + +- ``ERRATA_A57_859972``: This applies errata 859972 workaround to Cortex-A57 + CPU. This needs to be enabled only for revision <= r1p3 of the CPU. + +- ``ERRATA_A57_1319537``: This applies errata 1319537 workaround to all + revisions of Cortex-A57 CPU. + +For Cortex-A72, the following errata build flags are defined : + +- ``ERRATA_A72_859971``: This applies errata 859971 workaround to Cortex-A72 + CPU. This needs to be enabled only for revision <= r0p3 of the CPU. + +- ``ERRATA_A72_1319367``: This applies errata 1319367 workaround to all + revisions of Cortex-A72 CPU. + +For Cortex-A73, the following errata build flags are defined : + +- ``ERRATA_A73_852427``: This applies errata 852427 workaround to Cortex-A73 + CPU. This needs to be enabled only for revision r0p0 of the CPU. + +- ``ERRATA_A73_855423``: This applies errata 855423 workaround to Cortex-A73 + CPU. This needs to be enabled only for revision <= r0p1 of the CPU. + +For Cortex-A75, the following errata build flags are defined : + +- ``ERRATA_A75_764081``: This applies errata 764081 workaround to Cortex-A75 + CPU. This needs to be enabled only for revision r0p0 of the CPU. + +- ``ERRATA_A75_790748``: This applies errata 790748 workaround to Cortex-A75 + CPU. This needs to be enabled only for revision r0p0 of the CPU. + +For Cortex-A76, the following errata build flags are defined : + +- ``ERRATA_A76_1073348``: This applies errata 1073348 workaround to Cortex-A76 + CPU. This needs to be enabled only for revision <= r1p0 of the CPU. + +- ``ERRATA_A76_1130799``: This applies errata 1130799 workaround to Cortex-A76 + CPU. This needs to be enabled only for revision <= r2p0 of the CPU. + +- ``ERRATA_A76_1220197``: This applies errata 1220197 workaround to Cortex-A76 + CPU. This needs to be enabled only for revision <= r2p0 of the CPU. + +- ``ERRATA_A76_1257314``: This applies errata 1257314 workaround to Cortex-A76 + CPU. This needs to be enabled only for revision <= r3p0 of the CPU. + +- ``ERRATA_A76_1262606``: This applies errata 1262606 workaround to Cortex-A76 + CPU. This needs to be enabled only for revision <= r3p0 of the CPU. + +- ``ERRATA_A76_1262888``: This applies errata 1262888 workaround to Cortex-A76 + CPU. This needs to be enabled only for revision <= r3p0 of the CPU. + +- ``ERRATA_A76_1275112``: This applies errata 1275112 workaround to Cortex-A76 + CPU. This needs to be enabled only for revision <= r3p0 of the CPU. + +- ``ERRATA_A76_1791580``: This applies errata 1791580 workaround to Cortex-A76 + CPU. This needs to be enabled only for revision <= r4p0 of the CPU. + +- ``ERRATA_A76_1165522``: This applies errata 1165522 workaround to all + revisions of Cortex-A76 CPU. This errata is fixed in r3p0 but due to + limitation of errata framework this errata is applied to all revisions + of Cortex-A76 CPU. + +- ``ERRATA_A76_1868343``: This applies errata 1868343 workaround to Cortex-A76 + CPU. This needs to be enabled only for revision <= r4p0 of the CPU. + +- ``ERRATA_A76_1946160``: This applies errata 1946160 workaround to Cortex-A76 + CPU. This needs to be enabled only for revisions r3p0 - r4p1 of the CPU. + +- ``ERRATA_A76_2743102``: This applies errata 2743102 workaround to Cortex-A76 + CPU. This needs to be enabled for all revisions <= r4p1 of the CPU and is + still open. + +For Cortex-A77, the following errata build flags are defined : + +- ``ERRATA_A77_1508412``: This applies errata 1508412 workaround to Cortex-A77 + CPU. This needs to be enabled only for revision <= r1p0 of the CPU. + +- ``ERRATA_A77_1925769``: This applies errata 1925769 workaround to Cortex-A77 + CPU. This needs to be enabled only for revision <= r1p1 of the CPU. + +- ``ERRATA_A77_1946167``: This applies errata 1946167 workaround to Cortex-A77 + CPU. This needs to be enabled only for revision <= r1p1 of the CPU. + +- ``ERRATA_A77_1791578``: This applies errata 1791578 workaround to Cortex-A77 + CPU. This needs to be enabled for r0p0, r1p0, and r1p1, it is still open. + +- ``ERRATA_A77_2356587``: This applies errata 2356587 workaround to Cortex-A77 + CPU. This needs to be enabled for r0p0, r1p0, and r1p1, it is still open. + + - ``ERRATA_A77_1800714``: This applies errata 1800714 workaround to Cortex-A77 + CPU. This needs to be enabled for revisions <= r1p1 of the CPU. + + - ``ERRATA_A77_2743100``: This applies errata 2743100 workaround to Cortex-A77 + CPU. This needs to be enabled for r0p0, r1p0, and r1p1, it is still open. + +For Cortex-A78, the following errata build flags are defined : + +- ``ERRATA_A78_1688305``: This applies errata 1688305 workaround to Cortex-A78 + CPU. This needs to be enabled only for revision r0p0 - r1p0 of the CPU. + +- ``ERRATA_A78_1941498``: This applies errata 1941498 workaround to Cortex-A78 + CPU. This needs to be enabled for revisions r0p0, r1p0, and r1p1 of the CPU. + +- ``ERRATA_A78_1951500``: This applies errata 1951500 workaround to Cortex-A78 + CPU. This needs to be enabled for revisions r1p0 and r1p1, r0p0 has the same + issue but there is no workaround for that revision. + +- ``ERRATA_A78_1821534``: This applies errata 1821534 workaround to Cortex-A78 + CPU. This needs to be enabled for revisions r0p0 and r1p0. + +- ``ERRATA_A78_1952683``: This applies errata 1952683 workaround to Cortex-A78 + CPU. This needs to be enabled for revision r0p0, it is fixed in r1p0. + +- ``ERRATA_A78_2132060``: This applies errata 2132060 workaround to Cortex-A78 + CPU. This needs to be enabled for revisions r0p0, r1p0, r1p1, and r1p2. It + is still open. + +- ``ERRATA_A78_2242635``: This applies errata 2242635 workaround to Cortex-A78 + CPU. This needs to be enabled for revisions r1p0, r1p1, and r1p2. The issue + is present in r0p0 but there is no workaround. It is still open. + +- ``ERRATA_A78_2376745``: This applies errata 2376745 workaround to Cortex-A78 + CPU. This needs to be enabled for revisions r0p0, r1p0, r1p1, and r1p2, and + it is still open. + +- ``ERRATA_A78_2395406``: This applies errata 2395406 workaround to Cortex-A78 + CPU. This needs to be enabled for revisions r0p0, r1p0, r1p1, and r1p2, and + it is still open. + +For Cortex-A78 AE, the following errata build flags are defined : + +- ``ERRATA_A78_AE_1941500`` : This applies errata 1941500 workaround to + Cortex-A78 AE CPU. This needs to be enabled for revisions r0p0 and r0p1. + This erratum is still open. + +- ``ERRATA_A78_AE_1951502`` : This applies errata 1951502 workaround to + Cortex-A78 AE CPU. This needs to be enabled for revisions r0p0 and r0p1. This + erratum is still open. + +- ``ERRATA_A78_AE_2376748`` : This applies errata 2376748 workaround to + Cortex-A78 AE CPU. This needs to be enabled for revisions r0p0 and r0p1. This + erratum is still open. + +- ``ERRATA_A78_AE_2395408`` : This applies errata 2395408 workaround to + Cortex-A78 AE CPU. This needs to be enabled for revisions r0p0 and r0p1. This + erratum is still open. + +For Cortex-A78C, the following errata build flags are defined : + +- ``ERRATA_A78C_2132064`` : This applies errata 2132064 workaround to + Cortex-A78C CPU. This needs to be enabled for revisions r0p1, r0p2 and + it is still open. + +- ``ERRATA_A78C_2242638`` : This applies errata 2242638 workaround to + Cortex-A78C CPU. This needs to be enabled for revisions r0p1, r0p2 and + it is still open. + +- ``ERRATA_A78C_2376749`` : This applies errata 2376749 workaround to + Cortex-A78C CPU. This needs to be enabled for revisions r0p1 and r0p2. This + erratum is still open. + +- ``ERRATA_A78C_2395411`` : This applies errata 2395411 workaround to + Cortex-A78C CPU. This needs to be enabled for revisions r0p1 and r0p2. This + erratum is still open. + +For Cortex-X1 CPU, the following errata build flags are defined: + +- ``ERRATA_X1_1821534`` : This applies errata 1821534 workaround to Cortex-X1 + CPU. This needs to be enabled only for revision <= r1p0 of the CPU. + +- ``ERRATA_X1_1688305`` : This applies errata 1688305 workaround to Cortex-X1 + CPU. This needs to be enabled only for revision <= r1p0 of the CPU. + +- ``ERRATA_X1_1827429`` : This applies errata 1827429 workaround to Cortex-X1 + CPU. This needs to be enabled only for revision <= r1p0 of the CPU. + +For Neoverse N1, the following errata build flags are defined : + +- ``ERRATA_N1_1073348``: This applies errata 1073348 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision r0p0 and r1p0 of the CPU. + +- ``ERRATA_N1_1130799``: This applies errata 1130799 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision <= r2p0 of the CPU. + +- ``ERRATA_N1_1165347``: This applies errata 1165347 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision <= r2p0 of the CPU. + +- ``ERRATA_N1_1207823``: This applies errata 1207823 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision <= r2p0 of the CPU. + +- ``ERRATA_N1_1220197``: This applies errata 1220197 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision <= r2p0 of the CPU. + +- ``ERRATA_N1_1257314``: This applies errata 1257314 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision <= r3p0 of the CPU. + +- ``ERRATA_N1_1262606``: This applies errata 1262606 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision <= r3p0 of the CPU. + +- ``ERRATA_N1_1262888``: This applies errata 1262888 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision <= r3p0 of the CPU. + +- ``ERRATA_N1_1275112``: This applies errata 1275112 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision <= r3p0 of the CPU. + +- ``ERRATA_N1_1315703``: This applies errata 1315703 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision <= r3p0 of the CPU. + +- ``ERRATA_N1_1542419``: This applies errata 1542419 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revisions r3p0 - r4p0 of the CPU. + +- ``ERRATA_N1_1868343``: This applies errata 1868343 workaround to Neoverse-N1 + CPU. This needs to be enabled only for revision <= r4p0 of the CPU. + +- ``ERRATA_N1_1946160``: This applies errata 1946160 workaround to Neoverse-N1 + CPU. This needs to be enabled for revisions r3p0, r3p1, r4p0, and r4p1, for + revisions r0p0, r1p0, and r2p0 there is no workaround. + +- ``ERRATA_N1_2743102``: This applies errata 2743102 workaround to Neoverse-N1 + CPU. This needs to be enabled for all revisions <= r4p1 of the CPU and is + still open. + +For Neoverse V1, the following errata build flags are defined : + +- ``ERRATA_V1_1618635``: This applies errata 1618635 workaround to Neoverse-V1 + CPU. This needs to be enabled for revision r0p0 of the CPU, it is fixed in + r1p0. + +- ``ERRATA_V1_1774420``: This applies errata 1774420 workaround to Neoverse-V1 + CPU. This needs to be enabled only for revisions r0p0 and r1p0, it is fixed + in r1p1. + +- ``ERRATA_V1_1791573``: This applies errata 1791573 workaround to Neoverse-V1 + CPU. This needs to be enabled only for revisions r0p0 and r1p0, it is fixed + in r1p1. + +- ``ERRATA_V1_1852267``: This applies errata 1852267 workaround to Neoverse-V1 + CPU. This needs to be enabled only for revisions r0p0 and r1p0, it is fixed + in r1p1. + +- ``ERRATA_V1_1925756``: This applies errata 1925756 workaround to Neoverse-V1 + CPU. This needs to be enabled for r0p0, r1p0, and r1p1, it is still open. + +- ``ERRATA_V1_1940577``: This applies errata 1940577 workaround to Neoverse-V1 + CPU. This needs to be enabled only for revision r1p0 and r1p1 of the + CPU. + +- ``ERRATA_V1_1966096``: This applies errata 1966096 workaround to Neoverse-V1 + CPU. This needs to be enabled for revisions r1p0 and r1p1 of the CPU, the + issue is present in r0p0 as well but there is no workaround for that + revision. It is still open. + +- ``ERRATA_V1_2139242``: This applies errata 2139242 workaround to Neoverse-V1 + CPU. This needs to be enabled for revisions r0p0, r1p0, and r1p1 of the + CPU. It is still open. + +- ``ERRATA_V1_2108267``: This applies errata 2108267 workaround to Neoverse-V1 + CPU. This needs to be enabled for revisions r0p0, r1p0, and r1p1 of the CPU. + It is still open. + +- ``ERRATA_V1_2216392``: This applies errata 2216392 workaround to Neoverse-V1 + CPU. This needs to be enabled for revisions r1p0 and r1p1 of the CPU, the + issue is present in r0p0 as well but there is no workaround for that + revision. It is still open. + +- ``ERRATA_V1_2294912``: This applies errata 2294912 workaround to Neoverse-V1 + CPU. This needs to be enabled for revisions r0p0, r1p0, and r1p1 of the CPU. + +- ``ERRATA_V1_2372203``: This applies errata 2372203 workaround to Neoverse-V1 + CPU. This needs to be enabled for revisions r0p0, r1p0 and r1p1 of the CPU. + It is still open. + +For Cortex-A710, the following errata build flags are defined : + +- ``ERRATA_A710_1987031``: This applies errata 1987031 workaround to + Cortex-A710 CPU. This needs to be enabled only for revisions r0p0, r1p0 and + r2p0 of the CPU. It is still open. + +- ``ERRATA_A710_2081180``: This applies errata 2081180 workaround to + Cortex-A710 CPU. This needs to be enabled only for revisions r0p0, r1p0 and + r2p0 of the CPU. It is still open. + +- ``ERRATA_A710_2055002``: This applies errata 2055002 workaround to + Cortex-A710 CPU. This needs to be enabled for revisions r1p0, r2p0 of the CPU + and is still open. + +- ``ERRATA_A710_2017096``: This applies errata 2017096 workaround to + Cortex-A710 CPU. This needs to be enabled for revisions r0p0, r1p0 and r2p0 + of the CPU and is still open. + +- ``ERRATA_A710_2083908``: This applies errata 2083908 workaround to + Cortex-A710 CPU. This needs to be enabled for revision r2p0 of the CPU and + is still open. + +- ``ERRATA_A710_2058056``: This applies errata 2058056 workaround to + Cortex-A710 CPU. This needs to be enabled for revisions r0p0, r1p0 and r2p0 + of the CPU and is still open. + +- ``ERRATA_A710_2267065``: This applies errata 2267065 workaround to + Cortex-A710 CPU. This needs to be enabled for revisions r0p0, r1p0 and r2p0 + of the CPU and is fixed in r2p1. + +- ``ERRATA_A710_2136059``: This applies errata 2136059 workaround to + Cortex-A710 CPU. This needs to be enabled for revisions r0p0, r1p0 and r2p0 + of the CPU and is fixed in r2p1. + +- ``ERRATA_A710_2147715``: This applies errata 2147715 workaround to + Cortex-A710 CPU. This needs to be enabled for revision r2p0 of the CPU + and is fixed in r2p1. + +- ``ERRATA_A710_2216384``: This applies errata 2216384 workaround to + Cortex-A710 CPU. This needs to be enabled for revisions r0p0, r1p0 and r2p0 + of the CPU and is fixed in r2p1. + +- ``ERRATA_A710_2282622``: This applies errata 2282622 workaround to + Cortex-A710 CPU. This needs to be enabled for revisions r0p0, r1p0 and r2p0 + of the CPU and is fixed in r2p1. + +- ``ERRATA_A710_2291219``: This applies errata 2291219 workaround to + Cortex-A710 CPU. This needs to be enabled for revisions r0p0, r1p0 and r2p0 + of the CPU and is fixed in r2p1. + +- ``ERRATA_A710_2008768``: This applies errata 2008768 workaround to + Cortex-A710 CPU. This needs to be enabled for revisions r0p0, r1p0 and r2p0 + of the CPU and is fixed in r2p1. + +- ``ERRATA_A710_2371105``: This applies errata 2371105 workaround to + Cortex-A710 CPU. This needs to be enabled for revisions r0p0, r1p0 and r2p0 + of the CPU and is fixed in r2p1. + +For Neoverse N2, the following errata build flags are defined : + +- ``ERRATA_N2_2002655``: This applies errata 2002655 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU, it is still open. + +- ``ERRATA_N2_2067956``: This applies errata 2067956 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU and is still open. + +- ``ERRATA_N2_2025414``: This applies errata 2025414 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU and is still open. + +- ``ERRATA_N2_2189731``: This applies errata 2189731 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU and is still open. + +- ``ERRATA_N2_2138956``: This applies errata 2138956 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU and is still open. + +- ``ERRATA_N2_2138953``: This applies errata 2138953 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU and is still open. + +- ``ERRATA_N2_2242415``: This applies errata 2242415 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU and is still open. + +- ``ERRATA_N2_2138958``: This applies errata 2138958 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU and is still open. + +- ``ERRATA_N2_2242400``: This applies errata 2242400 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU and is still open. + +- ``ERRATA_N2_2280757``: This applies errata 2280757 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU and is still open. + +- ``ERRATA_N2_2326639``: This applies errata 2326639 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU, it is fixed in + r0p1. + +- ``ERRATA_N2_2376738``: This applies errata 2376738 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU, it is fixed in + r0p1. + +- ``ERRATA_N2_2388450``: This applies errata 2388450 workaround to Neoverse-N2 + CPU. This needs to be enabled for revision r0p0 of the CPU, it is fixed in + r0p1. + +For Cortex-X2, the following errata build flags are defined : + +- ``ERRATA_X2_2002765``: This applies errata 2002765 workaround to Cortex-X2 + CPU. This needs to be enabled for revisions r0p0, r1p0, and r2p0 of the CPU, + it is still open. + +- ``ERRATA_X2_2058056``: This applies errata 2058056 workaround to Cortex-X2 + CPU. This needs to be enabled for revisions r0p0, r1p0, and r2p0 of the CPU, + it is still open. + +- ``ERRATA_X2_2083908``: This applies errata 2083908 workaround to Cortex-X2 + CPU. This needs to be enabled for revision r2p0 of the CPU, it is still open. + +- ``ERRATA_X2_2017096``: This applies errata 2017096 workaround to + Cortex-X2 CPU. This needs to be enabled only for revisions r0p0, r1p0 and + r2p0 of the CPU, it is fixed in r2p1. + +- ``ERRATA_X2_2081180``: This applies errata 2081180 workaround to + Cortex-X2 CPU. This needs to be enabled only for revisions r0p0, r1p0 and + r2p0 of the CPU, it is fixed in r2p1. + +- ``ERRATA_X2_2216384``: This applies errata 2216384 workaround to + Cortex-X2 CPU. This needs to be enabled only for revisions r0p0, r1p0 and + r2p0 of the CPU, it is fixed in r2p1. + +- ``ERRATA_X2_2147715``: This applies errata 2147715 workaround to + Cortex-X2 CPU. This needs to be enabled only for revision r2p0 of the CPU, + it is fixed in r2p1. + +- ``ERRATA_X2_2371105``: This applies errata 2371105 workaround to + Cortex-X2 CPU. This needs to be enabled for revisions r0p0, r1p0 and r2p0 + of the CPU and is fixed in r2p1. + +For Cortex-X3, the following errata build flags are defined : + +- ``ERRATA_X3_2313909``: This applies errata 2313909 workaround to + Cortex-X3 CPU. This needs to be enabled only for revisions r0p0 and r1p0 + of the CPU, it is fixed in r1p1. + +For Cortex-A510, the following errata build flags are defined : + +- ``ERRATA_A510_1922240``: This applies errata 1922240 workaround to + Cortex-A510 CPU. This needs to be enabled only for revision r0p0, it is + fixed in r0p1. + +- ``ERRATA_A510_2288014``: This applies errata 2288014 workaround to + Cortex-A510 CPU. This needs to be enabled only for revisions r0p0, r0p1, + r0p2, r0p3 and r1p0, it is fixed in r1p1. + +- ``ERRATA_A510_2042739``: This applies errata 2042739 workaround to + Cortex-A510 CPU. This needs to be enabled only for revisions r0p0, r0p1 and + r0p2, it is fixed in r0p3. + +- ``ERRATA_A510_2041909``: This applies errata 2041909 workaround to + Cortex-A510 CPU. This needs to be enabled only for revision r0p2 and is fixed + in r0p3. The issue is also present in r0p0 and r0p1 but there is no + workaround for those revisions. + +- ``ERRATA_A510_2250311``: This applies errata 2250311 workaround to + Cortex-A510 CPU. This needs to be enabled for revisions r0p0, r0p1, r0p2, + r0p3 and r1p0, it is fixed in r1p1. This workaround disables MPMM even if + ENABLE_MPMM=1. + +- ``ERRATA_A510_2218950``: This applies errata 2218950 workaround to + Cortex-A510 CPU. This needs to be enabled for revisions r0p0, r0p1, r0p2, + r0p3 and r1p0, it is fixed in r1p1. + +- ``ERRATA_A510_2172148``: This applies errata 2172148 workaround to + Cortex-A510 CPU. This needs to be enabled for revisions r0p0, r0p1, r0p2, + r0p3 and r1p0, it is fixed in r1p1. + +- ``ERRATA_A510_2347730``: This applies errata 2347730 workaround to + Cortex-A510 CPU. This needs to be enabled for revisions r0p0, r0p1, r0p2, + r0p3, r1p0 and r1p1. It is fixed in r1p2. + +- ``ERRATA_A510_2371937``: This applies errata 2371937 workaround to + Cortex-A510 CPU. This needs to applied for revisions r0p0, r0p1, r0p2, + r0p3, r1p0, r1p1, and is fixed in r1p2. + +- ``ERRATA_A510_2666669``: This applies errata 2666669 workaround to + Cortex-A510 CPU. This needs to applied for revisions r0p0, r0p1, r0p2, + r0p3, r1p0, r1p1. It is fixed in r1p2. + +DSU Errata Workarounds +---------------------- + +Similar to CPU errata, TF-A also implements workarounds for DSU (DynamIQ +Shared Unit) errata. The DSU errata details can be found in the respective Arm +documentation: + +- `Arm DSU Software Developers Errata Notice`_. + +Each erratum is identified by an ``ID``, as defined in the DSU errata notice +document. Thus, the build flags which enable/disable the errata workarounds +have the format ``ERRATA_DSU_<ID>``. The implementation and application logic +of DSU errata workarounds are similar to `CPU errata workarounds`_. + +For DSU errata, the following build flags are defined: + +- ``ERRATA_DSU_798953``: This applies errata 798953 workaround for the + affected DSU configurations. This errata applies only for those DSUs that + revision is r0p0 (on r0p1 it is fixed). However, please note that this + workaround results in increased DSU power consumption on idle. + +- ``ERRATA_DSU_936184``: This applies errata 936184 workaround for the + affected DSU configurations. This errata applies only for those DSUs that + contain the ACP interface **and** the DSU revision is older than r2p0 (on + r2p0 it is fixed). However, please note that this workaround results in + increased DSU power consumption on idle. + +- ``ERRATA_DSU_2313941``: This applies errata 2313941 workaround for the + affected DSU configurations. This errata applies for those DSUs with + revisions r0p0, r1p0, r2p0, r2p1, r3p0, r3p1 and is still open. However, + please note that this workaround results in increased DSU power consumption + on idle. + +CPU Specific optimizations +-------------------------- + +This section describes some of the optimizations allowed by the CPU micro +architecture that can be enabled by the platform as desired. + +- ``SKIP_A57_L1_FLUSH_PWR_DWN``: This flag enables an optimization in the + Cortex-A57 cluster power down sequence by not flushing the Level 1 data + cache. The L1 data cache and the L2 unified cache are inclusive. A flush + of the L2 by set/way flushes any dirty lines from the L1 as well. This + is a known safe deviation from the Cortex-A57 TRM defined power down + sequence. Each Cortex-A57 based platform must make its own decision on + whether to use the optimization. + +- ``A53_DISABLE_NON_TEMPORAL_HINT``: This flag disables the cache non-temporal + hint. The LDNP/STNP instructions as implemented on Cortex-A53 do not behave + in a way most programmers expect, and will most probably result in a + significant speed degradation to any code that employs them. The Armv8-A + architecture (see Arm DDI 0487A.h, section D3.4.3) allows cores to ignore + the non-temporal hint and treat LDNP/STNP as LDP/STP instead. Enabling this + flag enforces this behaviour. This needs to be enabled only for revisions + <= r0p3 of the CPU and is enabled by default. + +- ``A57_DISABLE_NON_TEMPORAL_HINT``: This flag has the same behaviour as + ``A53_DISABLE_NON_TEMPORAL_HINT`` but for Cortex-A57. This needs to be + enabled only for revisions <= r1p2 of the CPU and is enabled by default, + as recommended in section "4.7 Non-Temporal Loads/Stores" of the + `Cortex-A57 Software Optimization Guide`_. + +- ''A57_ENABLE_NON_CACHEABLE_LOAD_FWD'': This flag enables non-cacheable + streaming enhancement feature for Cortex-A57 CPUs. Platforms can set + this bit only if their memory system meets the requirement that cache + line fill requests from the Cortex-A57 processor are atomic. Each + Cortex-A57 based platform must make its own decision on whether to use + the optimization. This flag is disabled by default. + +- ``NEOVERSE_Nx_EXTERNAL_LLC``: This flag indicates that an external last + level cache(LLC) is present in the system, and that the DataSource field + on the master CHI interface indicates when data is returned from the LLC. + This is used to control how the LL_CACHE* PMU events count. + Default value is 0 (Disabled). + +GIC Errata Workarounds +---------------------- +- ``GIC600_ERRATA_WA_2384374``: This flag applies part 2 of errata 2384374 + workaround for the affected GIC600 and GIC600-AE implementations. It applies + to implementations of GIC600 and GIC600-AE with revisions less than or equal + to r1p6 and r0p2 respectively. If the platform sets GICV3_SUPPORT_GIC600, + then this flag is enabled; otherwise, it is 0 (Disabled). + +-------------- + +*Copyright (c) 2014-2022, Arm Limited and Contributors. All rights reserved.* + +.. _CVE-2017-5715: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2017-5715 +.. _CVE-2018-3639: http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2018-3639 +.. _CVE-2022-23960: https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-23960 +.. _Cortex-A53 MPCore Software Developers Errata Notice: http://infocenter.arm.com/help/topic/com.arm.doc.epm048406/index.html +.. _Cortex-A57 MPCore Software Developers Errata Notice: http://infocenter.arm.com/help/topic/com.arm.doc.epm049219/index.html +.. _Cortex-A72 MPCore Software Developers Errata Notice: http://infocenter.arm.com/help/topic/com.arm.doc.epm012079/index.html +.. _Cortex-A57 Software Optimization Guide: http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf +.. _Arm DSU Software Developers Errata Notice: http://infocenter.arm.com/help/topic/com.arm.doc.epm138168/index.html diff --git a/docs/design/firmware-design.rst b/docs/design/firmware-design.rst new file mode 100644 index 0000000..84bba18 --- /dev/null +++ b/docs/design/firmware-design.rst @@ -0,0 +1,2766 @@ +Firmware Design +=============== + +Trusted Firmware-A (TF-A) implements a subset of the Trusted Board Boot +Requirements (TBBR) Platform Design Document (PDD) for Arm reference +platforms. + +The TBB sequence starts when the platform is powered on and runs up +to the stage where it hands-off control to firmware running in the normal +world in DRAM. This is the cold boot path. + +TF-A also implements the `Power State Coordination Interface PDD`_ as a +runtime service. PSCI is the interface from normal world software to firmware +implementing power management use-cases (for example, secondary CPU boot, +hotplug and idle). Normal world software can access TF-A runtime services via +the Arm SMC (Secure Monitor Call) instruction. The SMC instruction must be +used as mandated by the SMC Calling Convention (`SMCCC`_). + +TF-A implements a framework for configuring and managing interrupts generated +in either security state. The details of the interrupt management framework +and its design can be found in :ref:`Interrupt Management Framework`. + +TF-A also implements a library for setting up and managing the translation +tables. The details of this library can be found in +:ref:`Translation (XLAT) Tables Library`. + +TF-A can be built to support either AArch64 or AArch32 execution state. + +.. note:: + + The descriptions in this chapter are for the Arm TrustZone architecture. + For changes to the firmware design for the + `Arm Confidential Compute Architecture (Arm CCA)`_ please refer to the + chapter :ref:`Realm Management Extension (RME)`. + +Cold boot +--------- + +The cold boot path starts when the platform is physically turned on. If +``COLD_BOOT_SINGLE_CPU=0``, one of the CPUs released from reset is chosen as the +primary CPU, and the remaining CPUs are considered secondary CPUs. The primary +CPU is chosen through platform-specific means. The cold boot path is mainly +executed by the primary CPU, other than essential CPU initialization executed by +all CPUs. The secondary CPUs are kept in a safe platform-specific state until +the primary CPU has performed enough initialization to boot them. + +Refer to the :ref:`CPU Reset` for more information on the effect of the +``COLD_BOOT_SINGLE_CPU`` platform build option. + +The cold boot path in this implementation of TF-A depends on the execution +state. For AArch64, it is divided into five steps (in order of execution): + +- Boot Loader stage 1 (BL1) *AP Trusted ROM* +- Boot Loader stage 2 (BL2) *Trusted Boot Firmware* +- Boot Loader stage 3-1 (BL31) *EL3 Runtime Software* +- Boot Loader stage 3-2 (BL32) *Secure-EL1 Payload* (optional) +- Boot Loader stage 3-3 (BL33) *Non-trusted Firmware* + +For AArch32, it is divided into four steps (in order of execution): + +- Boot Loader stage 1 (BL1) *AP Trusted ROM* +- Boot Loader stage 2 (BL2) *Trusted Boot Firmware* +- Boot Loader stage 3-2 (BL32) *EL3 Runtime Software* +- Boot Loader stage 3-3 (BL33) *Non-trusted Firmware* + +Arm development platforms (Fixed Virtual Platforms (FVPs) and Juno) implement a +combination of the following types of memory regions. Each bootloader stage uses +one or more of these memory regions. + +- Regions accessible from both non-secure and secure states. For example, + non-trusted SRAM, ROM and DRAM. +- Regions accessible from only the secure state. For example, trusted SRAM and + ROM. The FVPs also implement the trusted DRAM which is statically + configured. Additionally, the Base FVPs and Juno development platform + configure the TrustZone Controller (TZC) to create a region in the DRAM + which is accessible only from the secure state. + +The sections below provide the following details: + +- dynamic configuration of Boot Loader stages +- initialization and execution of the first three stages during cold boot +- specification of the EL3 Runtime Software (BL31 for AArch64 and BL32 for + AArch32) entrypoint requirements for use by alternative Trusted Boot + Firmware in place of the provided BL1 and BL2 + +Dynamic Configuration during cold boot +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Each of the Boot Loader stages may be dynamically configured if required by the +platform. The Boot Loader stage may optionally specify a firmware +configuration file and/or hardware configuration file as listed below: + +- FW_CONFIG - The firmware configuration file. Holds properties shared across + all BLx images. + An example is the "dtb-registry" node, which contains the information about + the other device tree configurations (load-address, size, image_id). +- HW_CONFIG - The hardware configuration file. Can be shared by all Boot Loader + stages and also by the Normal World Rich OS. +- TB_FW_CONFIG - Trusted Boot Firmware configuration file. Shared between BL1 + and BL2. +- SOC_FW_CONFIG - SoC Firmware configuration file. Used by BL31. +- TOS_FW_CONFIG - Trusted OS Firmware configuration file. Used by Trusted OS + (BL32). +- NT_FW_CONFIG - Non Trusted Firmware configuration file. Used by Non-trusted + firmware (BL33). + +The Arm development platforms use the Flattened Device Tree format for the +dynamic configuration files. + +Each Boot Loader stage can pass up to 4 arguments via registers to the next +stage. BL2 passes the list of the next images to execute to the *EL3 Runtime +Software* (BL31 for AArch64 and BL32 for AArch32) via `arg0`. All the other +arguments are platform defined. The Arm development platforms use the following +convention: + +- BL1 passes the address of a meminfo_t structure to BL2 via ``arg1``. This + structure contains the memory layout available to BL2. +- When dynamic configuration files are present, the firmware configuration for + the next Boot Loader stage is populated in the first available argument and + the generic hardware configuration is passed the next available argument. + For example, + + - FW_CONFIG is loaded by BL1, then its address is passed in ``arg0`` to BL2. + - TB_FW_CONFIG address is retrieved by BL2 from FW_CONFIG device tree. + - If HW_CONFIG is loaded by BL1, then its address is passed in ``arg2`` to + BL2. Note, ``arg1`` is already used for meminfo_t. + - If SOC_FW_CONFIG is loaded by BL2, then its address is passed in ``arg1`` + to BL31. Note, ``arg0`` is used to pass the list of executable images. + - Similarly, if HW_CONFIG is loaded by BL1 or BL2, then its address is + passed in ``arg2`` to BL31. + - For other BL3x images, if the firmware configuration file is loaded by + BL2, then its address is passed in ``arg0`` and if HW_CONFIG is loaded + then its address is passed in ``arg1``. + - In case of the Arm FVP platform, FW_CONFIG address passed in ``arg1`` to + BL31/SP_MIN, and the SOC_FW_CONFIG and HW_CONFIG details are retrieved + from FW_CONFIG device tree. + +BL1 +~~~ + +This stage begins execution from the platform's reset vector at EL3. The reset +address is platform dependent but it is usually located in a Trusted ROM area. +The BL1 data section is copied to trusted SRAM at runtime. + +On the Arm development platforms, BL1 code starts execution from the reset +vector defined by the constant ``BL1_RO_BASE``. The BL1 data section is copied +to the top of trusted SRAM as defined by the constant ``BL1_RW_BASE``. + +The functionality implemented by this stage is as follows. + +Determination of boot path +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Whenever a CPU is released from reset, BL1 needs to distinguish between a warm +boot and a cold boot. This is done using platform-specific mechanisms (see the +``plat_get_my_entrypoint()`` function in the :ref:`Porting Guide`). In the case +of a warm boot, a CPU is expected to continue execution from a separate +entrypoint. In the case of a cold boot, the secondary CPUs are placed in a safe +platform-specific state (see the ``plat_secondary_cold_boot_setup()`` function in +the :ref:`Porting Guide`) while the primary CPU executes the remaining cold boot +path as described in the following sections. + +This step only applies when ``PROGRAMMABLE_RESET_ADDRESS=0``. Refer to the +:ref:`CPU Reset` for more information on the effect of the +``PROGRAMMABLE_RESET_ADDRESS`` platform build option. + +Architectural initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL1 performs minimal architectural initialization as follows. + +- Exception vectors + + BL1 sets up simple exception vectors for both synchronous and asynchronous + exceptions. The default behavior upon receiving an exception is to populate + a status code in the general purpose register ``X0/R0`` and call the + ``plat_report_exception()`` function (see the :ref:`Porting Guide`). The + status code is one of: + + For AArch64: + + :: + + 0x0 : Synchronous exception from Current EL with SP_EL0 + 0x1 : IRQ exception from Current EL with SP_EL0 + 0x2 : FIQ exception from Current EL with SP_EL0 + 0x3 : System Error exception from Current EL with SP_EL0 + 0x4 : Synchronous exception from Current EL with SP_ELx + 0x5 : IRQ exception from Current EL with SP_ELx + 0x6 : FIQ exception from Current EL with SP_ELx + 0x7 : System Error exception from Current EL with SP_ELx + 0x8 : Synchronous exception from Lower EL using aarch64 + 0x9 : IRQ exception from Lower EL using aarch64 + 0xa : FIQ exception from Lower EL using aarch64 + 0xb : System Error exception from Lower EL using aarch64 + 0xc : Synchronous exception from Lower EL using aarch32 + 0xd : IRQ exception from Lower EL using aarch32 + 0xe : FIQ exception from Lower EL using aarch32 + 0xf : System Error exception from Lower EL using aarch32 + + For AArch32: + + :: + + 0x10 : User mode + 0x11 : FIQ mode + 0x12 : IRQ mode + 0x13 : SVC mode + 0x16 : Monitor mode + 0x17 : Abort mode + 0x1a : Hypervisor mode + 0x1b : Undefined mode + 0x1f : System mode + + The ``plat_report_exception()`` implementation on the Arm FVP port programs + the Versatile Express System LED register in the following format to + indicate the occurrence of an unexpected exception: + + :: + + SYS_LED[0] - Security state (Secure=0/Non-Secure=1) + SYS_LED[2:1] - Exception Level (EL3=0x3, EL2=0x2, EL1=0x1, EL0=0x0) + For AArch32 it is always 0x0 + SYS_LED[7:3] - Exception Class (Sync/Async & origin). This is the value + of the status code + + A write to the LED register reflects in the System LEDs (S6LED0..7) in the + CLCD window of the FVP. + + BL1 does not expect to receive any exceptions other than the SMC exception. + For the latter, BL1 installs a simple stub. The stub expects to receive a + limited set of SMC types (determined by their function IDs in the general + purpose register ``X0/R0``): + + - ``BL1_SMC_RUN_IMAGE``: This SMC is raised by BL2 to make BL1 pass control + to EL3 Runtime Software. + - All SMCs listed in section "BL1 SMC Interface" in the :ref:`Firmware Update (FWU)` + Design Guide are supported for AArch64 only. These SMCs are currently + not supported when BL1 is built for AArch32. + + Any other SMC leads to an assertion failure. + +- CPU initialization + + BL1 calls the ``reset_handler()`` function which in turn calls the CPU + specific reset handler function (see the section: "CPU specific operations + framework"). + +- Control register setup (for AArch64) + + - ``SCTLR_EL3``. Instruction cache is enabled by setting the ``SCTLR_EL3.I`` + bit. Alignment and stack alignment checking is enabled by setting the + ``SCTLR_EL3.A`` and ``SCTLR_EL3.SA`` bits. Exception endianness is set to + little-endian by clearing the ``SCTLR_EL3.EE`` bit. + + - ``SCR_EL3``. The register width of the next lower exception level is set + to AArch64 by setting the ``SCR.RW`` bit. The ``SCR.EA`` bit is set to trap + both External Aborts and SError Interrupts in EL3. The ``SCR.SIF`` bit is + also set to disable instruction fetches from Non-secure memory when in + secure state. + + - ``CPTR_EL3``. Accesses to the ``CPACR_EL1`` register from EL1 or EL2, or the + ``CPTR_EL2`` register from EL2 are configured to not trap to EL3 by + clearing the ``CPTR_EL3.TCPAC`` bit. Access to the trace functionality is + configured not to trap to EL3 by clearing the ``CPTR_EL3.TTA`` bit. + Instructions that access the registers associated with Floating Point + and Advanced SIMD execution are configured to not trap to EL3 by + clearing the ``CPTR_EL3.TFP`` bit. + + - ``DAIF``. The SError interrupt is enabled by clearing the SError interrupt + mask bit. + + - ``MDCR_EL3``. The trap controls, ``MDCR_EL3.TDOSA``, ``MDCR_EL3.TDA`` and + ``MDCR_EL3.TPM``, are set so that accesses to the registers they control + do not trap to EL3. AArch64 Secure self-hosted debug is disabled by + setting the ``MDCR_EL3.SDD`` bit. Also ``MDCR_EL3.SPD32`` is set to + disable AArch32 Secure self-hosted privileged debug from S-EL1. + +- Control register setup (for AArch32) + + - ``SCTLR``. Instruction cache is enabled by setting the ``SCTLR.I`` bit. + Alignment checking is enabled by setting the ``SCTLR.A`` bit. + Exception endianness is set to little-endian by clearing the + ``SCTLR.EE`` bit. + + - ``SCR``. The ``SCR.SIF`` bit is set to disable instruction fetches from + Non-secure memory when in secure state. + + - ``CPACR``. Allow execution of Advanced SIMD instructions at PL0 and PL1, + by clearing the ``CPACR.ASEDIS`` bit. Access to the trace functionality + is configured not to trap to undefined mode by clearing the + ``CPACR.TRCDIS`` bit. + + - ``NSACR``. Enable non-secure access to Advanced SIMD functionality and + system register access to implemented trace registers. + + - ``FPEXC``. Enable access to the Advanced SIMD and floating-point + functionality from all Exception levels. + + - ``CPSR.A``. The Asynchronous data abort interrupt is enabled by clearing + the Asynchronous data abort interrupt mask bit. + + - ``SDCR``. The ``SDCR.SPD`` field is set to disable AArch32 Secure + self-hosted privileged debug. + +Platform initialization +^^^^^^^^^^^^^^^^^^^^^^^ + +On Arm platforms, BL1 performs the following platform initializations: + +- Enable the Trusted Watchdog. +- Initialize the console. +- Configure the Interconnect to enable hardware coherency. +- Enable the MMU and map the memory it needs to access. +- Configure any required platform storage to load the next bootloader image + (BL2). +- If the BL1 dynamic configuration file, ``TB_FW_CONFIG``, is available, then + load it to the platform defined address and make it available to BL2 via + ``arg0``. +- Configure the system timer and program the `CNTFRQ_EL0` for use by NS-BL1U + and NS-BL2U firmware update images. + +Firmware Update detection and execution +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +After performing platform setup, BL1 common code calls +``bl1_plat_get_next_image_id()`` to determine if :ref:`Firmware Update (FWU)` is +required or to proceed with the normal boot process. If the platform code +returns ``BL2_IMAGE_ID`` then the normal boot sequence is executed as described +in the next section, else BL1 assumes that :ref:`Firmware Update (FWU)` is +required and execution passes to the first image in the +:ref:`Firmware Update (FWU)` process. In either case, BL1 retrieves a descriptor +of the next image by calling ``bl1_plat_get_image_desc()``. The image descriptor +contains an ``entry_point_info_t`` structure, which BL1 uses to initialize the +execution state of the next image. + +BL2 image load and execution +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In the normal boot flow, BL1 execution continues as follows: + +#. BL1 prints the following string from the primary CPU to indicate successful + execution of the BL1 stage: + + :: + + "Booting Trusted Firmware" + +#. BL1 loads a BL2 raw binary image from platform storage, at a + platform-specific base address. Prior to the load, BL1 invokes + ``bl1_plat_handle_pre_image_load()`` which allows the platform to update or + use the image information. If the BL2 image file is not present or if + there is not enough free trusted SRAM the following error message is + printed: + + :: + + "Failed to load BL2 firmware." + +#. BL1 invokes ``bl1_plat_handle_post_image_load()`` which again is intended + for platforms to take further action after image load. This function must + populate the necessary arguments for BL2, which may also include the memory + layout. Further description of the memory layout can be found later + in this document. + +#. BL1 passes control to the BL2 image at Secure EL1 (for AArch64) or at + Secure SVC mode (for AArch32), starting from its load address. + +BL2 +~~~ + +BL1 loads and passes control to BL2 at Secure-EL1 (for AArch64) or at Secure +SVC mode (for AArch32) . BL2 is linked against and loaded at a platform-specific +base address (more information can be found later in this document). +The functionality implemented by BL2 is as follows. + +Architectural initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For AArch64, BL2 performs the minimal architectural initialization required +for subsequent stages of TF-A and normal world software. EL1 and EL0 are given +access to Floating Point and Advanced SIMD registers by setting the +``CPACR.FPEN`` bits. + +For AArch32, the minimal architectural initialization required for subsequent +stages of TF-A and normal world software is taken care of in BL1 as both BL1 +and BL2 execute at PL1. + +Platform initialization +^^^^^^^^^^^^^^^^^^^^^^^ + +On Arm platforms, BL2 performs the following platform initializations: + +- Initialize the console. +- Configure any required platform storage to allow loading further bootloader + images. +- Enable the MMU and map the memory it needs to access. +- Perform platform security setup to allow access to controlled components. +- Reserve some memory for passing information to the next bootloader image + EL3 Runtime Software and populate it. +- Define the extents of memory available for loading each subsequent + bootloader image. +- If BL1 has passed TB_FW_CONFIG dynamic configuration file in ``arg0``, + then parse it. + +Image loading in BL2 +^^^^^^^^^^^^^^^^^^^^ + +BL2 generic code loads the images based on the list of loadable images +provided by the platform. BL2 passes the list of executable images +provided by the platform to the next handover BL image. + +The list of loadable images provided by the platform may also contain +dynamic configuration files. The files are loaded and can be parsed as +needed in the ``bl2_plat_handle_post_image_load()`` function. These +configuration files can be passed to next Boot Loader stages as arguments +by updating the corresponding entrypoint information in this function. + +SCP_BL2 (System Control Processor Firmware) image load +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Some systems have a separate System Control Processor (SCP) for power, clock, +reset and system control. BL2 loads the optional SCP_BL2 image from platform +storage into a platform-specific region of secure memory. The subsequent +handling of SCP_BL2 is platform specific. For example, on the Juno Arm +development platform port the image is transferred into SCP's internal memory +using the Boot Over MHU (BOM) protocol after being loaded in the trusted SRAM +memory. The SCP executes SCP_BL2 and signals to the Application Processor (AP) +for BL2 execution to continue. + +EL3 Runtime Software image load +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL2 loads the EL3 Runtime Software image from platform storage into a platform- +specific address in trusted SRAM. If there is not enough memory to load the +image or image is missing it leads to an assertion failure. + +AArch64 BL32 (Secure-EL1 Payload) image load +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL2 loads the optional BL32 image from platform storage into a platform- +specific region of secure memory. The image executes in the secure world. BL2 +relies on BL31 to pass control to the BL32 image, if present. Hence, BL2 +populates a platform-specific area of memory with the entrypoint/load-address +of the BL32 image. The value of the Saved Processor Status Register (``SPSR``) +for entry into BL32 is not determined by BL2, it is initialized by the +Secure-EL1 Payload Dispatcher (see later) within BL31, which is responsible for +managing interaction with BL32. This information is passed to BL31. + +BL33 (Non-trusted Firmware) image load +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL2 loads the BL33 image (e.g. UEFI or other test or boot software) from +platform storage into non-secure memory as defined by the platform. + +BL2 relies on EL3 Runtime Software to pass control to BL33 once secure state +initialization is complete. Hence, BL2 populates a platform-specific area of +memory with the entrypoint and Saved Program Status Register (``SPSR``) of the +normal world software image. The entrypoint is the load address of the BL33 +image. The ``SPSR`` is determined as specified in Section 5.13 of the +`Power State Coordination Interface PDD`_. This information is passed to the +EL3 Runtime Software. + +AArch64 BL31 (EL3 Runtime Software) execution +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL2 execution continues as follows: + +#. BL2 passes control back to BL1 by raising an SMC, providing BL1 with the + BL31 entrypoint. The exception is handled by the SMC exception handler + installed by BL1. + +#. BL1 turns off the MMU and flushes the caches. It clears the + ``SCTLR_EL3.M/I/C`` bits, flushes the data cache to the point of coherency + and invalidates the TLBs. + +#. BL1 passes control to BL31 at the specified entrypoint at EL3. + +Running BL2 at EL3 execution level +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Some platforms have a non-TF-A Boot ROM that expects the next boot stage +to execute at EL3. On these platforms, TF-A BL1 is a waste of memory +as its only purpose is to ensure TF-A BL2 is entered at S-EL1. To avoid +this waste, a special mode enables BL2 to execute at EL3, which allows +a non-TF-A Boot ROM to load and jump directly to BL2. This mode is selected +when the build flag BL2_AT_EL3 is enabled. The main differences in this +mode are: + +#. BL2 includes the reset code and the mailbox mechanism to differentiate + cold boot and warm boot. It runs at EL3 doing the arch + initialization required for EL3. + +#. BL2 does not receive the meminfo information from BL1 anymore. This + information can be passed by the Boot ROM or be internal to the + BL2 image. + +#. Since BL2 executes at EL3, BL2 jumps directly to the next image, + instead of invoking the RUN_IMAGE SMC call. + + +We assume 3 different types of BootROM support on the platform: + +#. The Boot ROM always jumps to the same address, for both cold + and warm boot. In this case, we will need to keep a resident part + of BL2 whose memory cannot be reclaimed by any other image. The + linker script defines the symbols __TEXT_RESIDENT_START__ and + __TEXT_RESIDENT_END__ that allows the platform to configure + correctly the memory map. +#. The platform has some mechanism to indicate the jump address to the + Boot ROM. Platform code can then program the jump address with + psci_warmboot_entrypoint during cold boot. +#. The platform has some mechanism to program the reset address using + the PROGRAMMABLE_RESET_ADDRESS feature. Platform code can then + program the reset address with psci_warmboot_entrypoint during + cold boot, bypassing the boot ROM for warm boot. + +In the last 2 cases, no part of BL2 needs to remain resident at +runtime. In the first 2 cases, we expect the Boot ROM to be able to +differentiate between warm and cold boot, to avoid loading BL2 again +during warm boot. + +This functionality can be tested with FVP loading the image directly +in memory and changing the address where the system jumps at reset. +For example: + + -C cluster0.cpu0.RVBAR=0x4022000 + --data cluster0.cpu0=bl2.bin@0x4022000 + +With this configuration, FVP is like a platform of the first case, +where the Boot ROM jumps always to the same address. For simplification, +BL32 is loaded in DRAM in this case, to avoid other images reclaiming +BL2 memory. + + +AArch64 BL31 +~~~~~~~~~~~~ + +The image for this stage is loaded by BL2 and BL1 passes control to BL31 at +EL3. BL31 executes solely in trusted SRAM. BL31 is linked against and +loaded at a platform-specific base address (more information can be found later +in this document). The functionality implemented by BL31 is as follows. + +Architectural initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Currently, BL31 performs a similar architectural initialization to BL1 as +far as system register settings are concerned. Since BL1 code resides in ROM, +architectural initialization in BL31 allows override of any previous +initialization done by BL1. + +BL31 initializes the per-CPU data framework, which provides a cache of +frequently accessed per-CPU data optimised for fast, concurrent manipulation +on different CPUs. This buffer includes pointers to per-CPU contexts, crash +buffer, CPU reset and power down operations, PSCI data, platform data and so on. + +It then replaces the exception vectors populated by BL1 with its own. BL31 +exception vectors implement more elaborate support for handling SMCs since this +is the only mechanism to access the runtime services implemented by BL31 (PSCI +for example). BL31 checks each SMC for validity as specified by the +`SMC Calling Convention`_ before passing control to the required SMC +handler routine. + +BL31 programs the ``CNTFRQ_EL0`` register with the clock frequency of the system +counter, which is provided by the platform. + +Platform initialization +^^^^^^^^^^^^^^^^^^^^^^^ + +BL31 performs detailed platform initialization, which enables normal world +software to function correctly. + +On Arm platforms, this consists of the following: + +- Initialize the console. +- Configure the Interconnect to enable hardware coherency. +- Enable the MMU and map the memory it needs to access. +- Initialize the generic interrupt controller. +- Initialize the power controller device. +- Detect the system topology. + +Runtime services initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +BL31 is responsible for initializing the runtime services. One of them is PSCI. + +As part of the PSCI initializations, BL31 detects the system topology. It also +initializes the data structures that implement the state machine used to track +the state of power domain nodes. The state can be one of ``OFF``, ``RUN`` or +``RETENTION``. All secondary CPUs are initially in the ``OFF`` state. The cluster +that the primary CPU belongs to is ``ON``; any other cluster is ``OFF``. It also +initializes the locks that protect them. BL31 accesses the state of a CPU or +cluster immediately after reset and before the data cache is enabled in the +warm boot path. It is not currently possible to use 'exclusive' based spinlocks, +therefore BL31 uses locks based on Lamport's Bakery algorithm instead. + +The runtime service framework and its initialization is described in more +detail in the "EL3 runtime services framework" section below. + +Details about the status of the PSCI implementation are provided in the +"Power State Coordination Interface" section below. + +AArch64 BL32 (Secure-EL1 Payload) image initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If a BL32 image is present then there must be a matching Secure-EL1 Payload +Dispatcher (SPD) service (see later for details). During initialization +that service must register a function to carry out initialization of BL32 +once the runtime services are fully initialized. BL31 invokes such a +registered function to initialize BL32 before running BL33. This initialization +is not necessary for AArch32 SPs. + +Details on BL32 initialization and the SPD's role are described in the +:ref:`firmware_design_sel1_spd` section below. + +BL33 (Non-trusted Firmware) execution +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +EL3 Runtime Software initializes the EL2 or EL1 processor context for normal- +world cold boot, ensuring that no secure state information finds its way into +the non-secure execution state. EL3 Runtime Software uses the entrypoint +information provided by BL2 to jump to the Non-trusted firmware image (BL33) +at the highest available Exception Level (EL2 if available, otherwise EL1). + +Using alternative Trusted Boot Firmware in place of BL1 & BL2 (AArch64 only) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Some platforms have existing implementations of Trusted Boot Firmware that +would like to use TF-A BL31 for the EL3 Runtime Software. To enable this +firmware architecture it is important to provide a fully documented and stable +interface between the Trusted Boot Firmware and BL31. + +Future changes to the BL31 interface will be done in a backwards compatible +way, and this enables these firmware components to be independently enhanced/ +updated to develop and exploit new functionality. + +Required CPU state when calling ``bl31_entrypoint()`` during cold boot +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This function must only be called by the primary CPU. + +On entry to this function the calling primary CPU must be executing in AArch64 +EL3, little-endian data access, and all interrupt sources masked: + +:: + + PSTATE.EL = 3 + PSTATE.RW = 1 + PSTATE.DAIF = 0xf + SCTLR_EL3.EE = 0 + +X0 and X1 can be used to pass information from the Trusted Boot Firmware to the +platform code in BL31: + +:: + + X0 : Reserved for common TF-A information + X1 : Platform specific information + +BL31 zero-init sections (e.g. ``.bss``) should not contain valid data on entry, +these will be zero filled prior to invoking platform setup code. + +Use of the X0 and X1 parameters +''''''''''''''''''''''''''''''' + +The parameters are platform specific and passed from ``bl31_entrypoint()`` to +``bl31_early_platform_setup()``. The value of these parameters is never directly +used by the common BL31 code. + +The convention is that ``X0`` conveys information regarding the BL31, BL32 and +BL33 images from the Trusted Boot firmware and ``X1`` can be used for other +platform specific purpose. This convention allows platforms which use TF-A's +BL1 and BL2 images to transfer additional platform specific information from +Secure Boot without conflicting with future evolution of TF-A using ``X0`` to +pass a ``bl31_params`` structure. + +BL31 common and SPD initialization code depends on image and entrypoint +information about BL33 and BL32, which is provided via BL31 platform APIs. +This information is required until the start of execution of BL33. This +information can be provided in a platform defined manner, e.g. compiled into +the platform code in BL31, or provided in a platform defined memory location +by the Trusted Boot firmware, or passed from the Trusted Boot Firmware via the +Cold boot Initialization parameters. This data may need to be cleaned out of +the CPU caches if it is provided by an earlier boot stage and then accessed by +BL31 platform code before the caches are enabled. + +TF-A's BL2 implementation passes a ``bl31_params`` structure in +``X0`` and the Arm development platforms interpret this in the BL31 platform +code. + +MMU, Data caches & Coherency +'''''''''''''''''''''''''''' + +BL31 does not depend on the enabled state of the MMU, data caches or +interconnect coherency on entry to ``bl31_entrypoint()``. If these are disabled +on entry, these should be enabled during ``bl31_plat_arch_setup()``. + +Data structures used in the BL31 cold boot interface +'''''''''''''''''''''''''''''''''''''''''''''''''''' + +These structures are designed to support compatibility and independent +evolution of the structures and the firmware images. For example, a version of +BL31 that can interpret the BL3x image information from different versions of +BL2, a platform that uses an extended entry_point_info structure to convey +additional register information to BL31, or a ELF image loader that can convey +more details about the firmware images. + +To support these scenarios the structures are versioned and sized, which enables +BL31 to detect which information is present and respond appropriately. The +``param_header`` is defined to capture this information: + +.. code:: c + + typedef struct param_header { + uint8_t type; /* type of the structure */ + uint8_t version; /* version of this structure */ + uint16_t size; /* size of this structure in bytes */ + uint32_t attr; /* attributes: unused bits SBZ */ + } param_header_t; + +The structures using this format are ``entry_point_info``, ``image_info`` and +``bl31_params``. The code that allocates and populates these structures must set +the header fields appropriately, and the ``SET_PARAM_HEAD()`` a macro is defined +to simplify this action. + +Required CPU state for BL31 Warm boot initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When requesting a CPU power-on, or suspending a running CPU, TF-A provides +the platform power management code with a Warm boot initialization +entry-point, to be invoked by the CPU immediately after the reset handler. +On entry to the Warm boot initialization function the calling CPU must be in +AArch64 EL3, little-endian data access and all interrupt sources masked: + +:: + + PSTATE.EL = 3 + PSTATE.RW = 1 + PSTATE.DAIF = 0xf + SCTLR_EL3.EE = 0 + +The PSCI implementation will initialize the processor state and ensure that the +platform power management code is then invoked as required to initialize all +necessary system, cluster and CPU resources. + +AArch32 EL3 Runtime Software entrypoint interface +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To enable this firmware architecture it is important to provide a fully +documented and stable interface between the Trusted Boot Firmware and the +AArch32 EL3 Runtime Software. + +Future changes to the entrypoint interface will be done in a backwards +compatible way, and this enables these firmware components to be independently +enhanced/updated to develop and exploit new functionality. + +Required CPU state when entering during cold boot +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This function must only be called by the primary CPU. + +On entry to this function the calling primary CPU must be executing in AArch32 +EL3, little-endian data access, and all interrupt sources masked: + +:: + + PSTATE.AIF = 0x7 + SCTLR.EE = 0 + +R0 and R1 are used to pass information from the Trusted Boot Firmware to the +platform code in AArch32 EL3 Runtime Software: + +:: + + R0 : Reserved for common TF-A information + R1 : Platform specific information + +Use of the R0 and R1 parameters +''''''''''''''''''''''''''''''' + +The parameters are platform specific and the convention is that ``R0`` conveys +information regarding the BL3x images from the Trusted Boot firmware and ``R1`` +can be used for other platform specific purpose. This convention allows +platforms which use TF-A's BL1 and BL2 images to transfer additional platform +specific information from Secure Boot without conflicting with future +evolution of TF-A using ``R0`` to pass a ``bl_params`` structure. + +The AArch32 EL3 Runtime Software is responsible for entry into BL33. This +information can be obtained in a platform defined manner, e.g. compiled into +the AArch32 EL3 Runtime Software, or provided in a platform defined memory +location by the Trusted Boot firmware, or passed from the Trusted Boot Firmware +via the Cold boot Initialization parameters. This data may need to be cleaned +out of the CPU caches if it is provided by an earlier boot stage and then +accessed by AArch32 EL3 Runtime Software before the caches are enabled. + +When using AArch32 EL3 Runtime Software, the Arm development platforms pass a +``bl_params`` structure in ``R0`` from BL2 to be interpreted by AArch32 EL3 Runtime +Software platform code. + +MMU, Data caches & Coherency +'''''''''''''''''''''''''''' + +AArch32 EL3 Runtime Software must not depend on the enabled state of the MMU, +data caches or interconnect coherency in its entrypoint. They must be explicitly +enabled if required. + +Data structures used in cold boot interface +''''''''''''''''''''''''''''''''''''''''''' + +The AArch32 EL3 Runtime Software cold boot interface uses ``bl_params`` instead +of ``bl31_params``. The ``bl_params`` structure is based on the convention +described in AArch64 BL31 cold boot interface section. + +Required CPU state for warm boot initialization +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When requesting a CPU power-on, or suspending a running CPU, AArch32 EL3 +Runtime Software must ensure execution of a warm boot initialization entrypoint. +If TF-A BL1 is used and the PROGRAMMABLE_RESET_ADDRESS build flag is false, +then AArch32 EL3 Runtime Software must ensure that BL1 branches to the warm +boot entrypoint by arranging for the BL1 platform function, +plat_get_my_entrypoint(), to return a non-zero value. + +In this case, the warm boot entrypoint must be in AArch32 EL3, little-endian +data access and all interrupt sources masked: + +:: + + PSTATE.AIF = 0x7 + SCTLR.EE = 0 + +The warm boot entrypoint may be implemented by using TF-A +``psci_warmboot_entrypoint()`` function. In that case, the platform must fulfil +the pre-requisites mentioned in the +:ref:`PSCI Library Integration guide for Armv8-A AArch32 systems`. + +EL3 runtime services framework +------------------------------ + +Software executing in the non-secure state and in the secure state at exception +levels lower than EL3 will request runtime services using the Secure Monitor +Call (SMC) instruction. These requests will follow the convention described in +the SMC Calling Convention PDD (`SMCCC`_). The `SMCCC`_ assigns function +identifiers to each SMC request and describes how arguments are passed and +returned. + +The EL3 runtime services framework enables the development of services by +different providers that can be easily integrated into final product firmware. +The following sections describe the framework which facilitates the +registration, initialization and use of runtime services in EL3 Runtime +Software (BL31). + +The design of the runtime services depends heavily on the concepts and +definitions described in the `SMCCC`_, in particular SMC Function IDs, Owning +Entity Numbers (OEN), Fast and Yielding calls, and the SMC32 and SMC64 calling +conventions. Please refer to that document for more detailed explanation of +these terms. + +The following runtime services are expected to be implemented first. They have +not all been instantiated in the current implementation. + +#. Standard service calls + + This service is for management of the entire system. The Power State + Coordination Interface (`PSCI`_) is the first set of standard service calls + defined by Arm (see PSCI section later). + +#. Secure-EL1 Payload Dispatcher service + + If a system runs a Trusted OS or other Secure-EL1 Payload (SP) then + it also requires a *Secure Monitor* at EL3 to switch the EL1 processor + context between the normal world (EL1/EL2) and trusted world (Secure-EL1). + The Secure Monitor will make these world switches in response to SMCs. The + `SMCCC`_ provides for such SMCs with the Trusted OS Call and Trusted + Application Call OEN ranges. + + The interface between the EL3 Runtime Software and the Secure-EL1 Payload is + not defined by the `SMCCC`_ or any other standard. As a result, each + Secure-EL1 Payload requires a specific Secure Monitor that runs as a runtime + service - within TF-A this service is referred to as the Secure-EL1 Payload + Dispatcher (SPD). + + TF-A provides a Test Secure-EL1 Payload (TSP) and its associated Dispatcher + (TSPD). Details of SPD design and TSP/TSPD operation are described in the + :ref:`firmware_design_sel1_spd` section below. + +#. CPU implementation service + + This service will provide an interface to CPU implementation specific + services for a given platform e.g. access to processor errata workarounds. + This service is currently unimplemented. + +Additional services for Arm Architecture, SiP and OEM calls can be implemented. +Each implemented service handles a range of SMC function identifiers as +described in the `SMCCC`_. + +Registration +~~~~~~~~~~~~ + +A runtime service is registered using the ``DECLARE_RT_SVC()`` macro, specifying +the name of the service, the range of OENs covered, the type of service and +initialization and call handler functions. This macro instantiates a ``const struct rt_svc_desc`` for the service with these details (see ``runtime_svc.h``). +This structure is allocated in a special ELF section ``rt_svc_descs``, enabling +the framework to find all service descriptors included into BL31. + +The specific service for a SMC Function is selected based on the OEN and call +type of the Function ID, and the framework uses that information in the service +descriptor to identify the handler for the SMC Call. + +The service descriptors do not include information to identify the precise set +of SMC function identifiers supported by this service implementation, the +security state from which such calls are valid nor the capability to support +64-bit and/or 32-bit callers (using SMC32 or SMC64). Responding appropriately +to these aspects of a SMC call is the responsibility of the service +implementation, the framework is focused on integration of services from +different providers and minimizing the time taken by the framework before the +service handler is invoked. + +Details of the parameters, requirements and behavior of the initialization and +call handling functions are provided in the following sections. + +Initialization +~~~~~~~~~~~~~~ + +``runtime_svc_init()`` in ``runtime_svc.c`` initializes the runtime services +framework running on the primary CPU during cold boot as part of the BL31 +initialization. This happens prior to initializing a Trusted OS and running +Normal world boot firmware that might in turn use these services. +Initialization involves validating each of the declared runtime service +descriptors, calling the service initialization function and populating the +index used for runtime lookup of the service. + +The BL31 linker script collects all of the declared service descriptors into a +single array and defines symbols that allow the framework to locate and traverse +the array, and determine its size. + +The framework does basic validation of each descriptor to halt firmware +initialization if service declaration errors are detected. The framework does +not check descriptors for the following error conditions, and may behave in an +unpredictable manner under such scenarios: + +#. Overlapping OEN ranges +#. Multiple descriptors for the same range of OENs and ``call_type`` +#. Incorrect range of owning entity numbers for a given ``call_type`` + +Once validated, the service ``init()`` callback is invoked. This function carries +out any essential EL3 initialization before servicing requests. The ``init()`` +function is only invoked on the primary CPU during cold boot. If the service +uses per-CPU data this must either be initialized for all CPUs during this call, +or be done lazily when a CPU first issues an SMC call to that service. If +``init()`` returns anything other than ``0``, this is treated as an initialization +error and the service is ignored: this does not cause the firmware to halt. + +The OEN and call type fields present in the SMC Function ID cover a total of +128 distinct services, but in practice a single descriptor can cover a range of +OENs, e.g. SMCs to call a Trusted OS function. To optimize the lookup of a +service handler, the framework uses an array of 128 indices that map every +distinct OEN/call-type combination either to one of the declared services or to +indicate the service is not handled. This ``rt_svc_descs_indices[]`` array is +populated for all of the OENs covered by a service after the service ``init()`` +function has reported success. So a service that fails to initialize will never +have it's ``handle()`` function invoked. + +The following figure shows how the ``rt_svc_descs_indices[]`` index maps the SMC +Function ID call type and OEN onto a specific service handler in the +``rt_svc_descs[]`` array. + +|Image 1| + +.. _handling-an-smc: + +Handling an SMC +~~~~~~~~~~~~~~~ + +When the EL3 runtime services framework receives a Secure Monitor Call, the SMC +Function ID is passed in W0 from the lower exception level (as per the +`SMCCC`_). If the calling register width is AArch32, it is invalid to invoke an +SMC Function which indicates the SMC64 calling convention: such calls are +ignored and return the Unknown SMC Function Identifier result code ``0xFFFFFFFF`` +in R0/X0. + +Bit[31] (fast/yielding call) and bits[29:24] (owning entity number) of the SMC +Function ID are combined to index into the ``rt_svc_descs_indices[]`` array. The +resulting value might indicate a service that has no handler, in this case the +framework will also report an Unknown SMC Function ID. Otherwise, the value is +used as a further index into the ``rt_svc_descs[]`` array to locate the required +service and handler. + +The service's ``handle()`` callback is provided with five of the SMC parameters +directly, the others are saved into memory for retrieval (if needed) by the +handler. The handler is also provided with an opaque ``handle`` for use with the +supporting library for parameter retrieval, setting return values and context +manipulation. The ``flags`` parameter indicates the security state of the caller +and the state of the SVE hint bit per the SMCCCv1.3. The framework finally sets +up the execution stack for the handler, and invokes the services ``handle()`` +function. + +On return from the handler the result registers are populated in X0-X7 as needed +before restoring the stack and CPU state and returning from the original SMC. + +Exception Handling Framework +---------------------------- + +Please refer to the :ref:`Exception Handling Framework` document. + +Power State Coordination Interface +---------------------------------- + +TODO: Provide design walkthrough of PSCI implementation. + +The PSCI v1.1 specification categorizes APIs as optional and mandatory. All the +mandatory APIs in PSCI v1.1, PSCI v1.0 and in PSCI v0.2 draft specification +`Power State Coordination Interface PDD`_ are implemented. The table lists +the PSCI v1.1 APIs and their support in generic code. + +An API implementation might have a dependency on platform code e.g. CPU_SUSPEND +requires the platform to export a part of the implementation. Hence the level +of support of the mandatory APIs depends upon the support exported by the +platform port as well. The Juno and FVP (all variants) platforms export all the +required support. + ++-----------------------------+-------------+-------------------------------+ +| PSCI v1.1 API | Supported | Comments | ++=============================+=============+===============================+ +| ``PSCI_VERSION`` | Yes | The version returned is 1.1 | ++-----------------------------+-------------+-------------------------------+ +| ``CPU_SUSPEND`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``CPU_OFF`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``CPU_ON`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``AFFINITY_INFO`` | Yes | | ++-----------------------------+-------------+-------------------------------+ +| ``MIGRATE`` | Yes\*\* | | ++-----------------------------+-------------+-------------------------------+ +| ``MIGRATE_INFO_TYPE`` | Yes\*\* | | ++-----------------------------+-------------+-------------------------------+ +| ``MIGRATE_INFO_CPU`` | Yes\*\* | | ++-----------------------------+-------------+-------------------------------+ +| ``SYSTEM_OFF`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``SYSTEM_RESET`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``PSCI_FEATURES`` | Yes | | ++-----------------------------+-------------+-------------------------------+ +| ``CPU_FREEZE`` | No | | ++-----------------------------+-------------+-------------------------------+ +| ``CPU_DEFAULT_SUSPEND`` | No | | ++-----------------------------+-------------+-------------------------------+ +| ``NODE_HW_STATE`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``SYSTEM_SUSPEND`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``PSCI_SET_SUSPEND_MODE`` | No | | ++-----------------------------+-------------+-------------------------------+ +| ``PSCI_STAT_RESIDENCY`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``PSCI_STAT_COUNT`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``SYSTEM_RESET2`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``MEM_PROTECT`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ +| ``MEM_PROTECT_CHECK_RANGE`` | Yes\* | | ++-----------------------------+-------------+-------------------------------+ + +\*Note : These PSCI APIs require platform power management hooks to be +registered with the generic PSCI code to be supported. + +\*\*Note : These PSCI APIs require appropriate Secure Payload Dispatcher +hooks to be registered with the generic PSCI code to be supported. + +The PSCI implementation in TF-A is a library which can be integrated with +AArch64 or AArch32 EL3 Runtime Software for Armv8-A systems. A guide to +integrating PSCI library with AArch32 EL3 Runtime Software can be found +at :ref:`PSCI Library Integration guide for Armv8-A AArch32 systems`. + +.. _firmware_design_sel1_spd: + +Secure-EL1 Payloads and Dispatchers +----------------------------------- + +On a production system that includes a Trusted OS running in Secure-EL1/EL0, +the Trusted OS is coupled with a companion runtime service in the BL31 +firmware. This service is responsible for the initialisation of the Trusted +OS and all communications with it. The Trusted OS is the BL32 stage of the +boot flow in TF-A. The firmware will attempt to locate, load and execute a +BL32 image. + +TF-A uses a more general term for the BL32 software that runs at Secure-EL1 - +the *Secure-EL1 Payload* - as it is not always a Trusted OS. + +TF-A provides a Test Secure-EL1 Payload (TSP) and a Test Secure-EL1 Payload +Dispatcher (TSPD) service as an example of how a Trusted OS is supported on a +production system using the Runtime Services Framework. On such a system, the +Test BL32 image and service are replaced by the Trusted OS and its dispatcher +service. The TF-A build system expects that the dispatcher will define the +build flag ``NEED_BL32`` to enable it to include the BL32 in the build either +as a binary or to compile from source depending on whether the ``BL32`` build +option is specified or not. + +The TSP runs in Secure-EL1. It is designed to demonstrate synchronous +communication with the normal-world software running in EL1/EL2. Communication +is initiated by the normal-world software + +- either directly through a Fast SMC (as defined in the `SMCCC`_) + +- or indirectly through a `PSCI`_ SMC. The `PSCI`_ implementation in turn + informs the TSPD about the requested power management operation. This allows + the TSP to prepare for or respond to the power state change + +The TSPD service is responsible for. + +- Initializing the TSP + +- Routing requests and responses between the secure and the non-secure + states during the two types of communications just described + +Initializing a BL32 Image +~~~~~~~~~~~~~~~~~~~~~~~~~ + +The Secure-EL1 Payload Dispatcher (SPD) service is responsible for initializing +the BL32 image. It needs access to the information passed by BL2 to BL31 to do +so. This is provided by: + +.. code:: c + + entry_point_info_t *bl31_plat_get_next_image_ep_info(uint32_t); + +which returns a reference to the ``entry_point_info`` structure corresponding to +the image which will be run in the specified security state. The SPD uses this +API to get entry point information for the SECURE image, BL32. + +In the absence of a BL32 image, BL31 passes control to the normal world +bootloader image (BL33). When the BL32 image is present, it is typical +that the SPD wants control to be passed to BL32 first and then later to BL33. + +To do this the SPD has to register a BL32 initialization function during +initialization of the SPD service. The BL32 initialization function has this +prototype: + +.. code:: c + + int32_t init(void); + +and is registered using the ``bl31_register_bl32_init()`` function. + +TF-A supports two approaches for the SPD to pass control to BL32 before +returning through EL3 and running the non-trusted firmware (BL33): + +#. In the BL32 setup function, use ``bl31_set_next_image_type()`` to + request that the exit from ``bl31_main()`` is to the BL32 entrypoint in + Secure-EL1. BL31 will exit to BL32 using the asynchronous method by + calling ``bl31_prepare_next_image_entry()`` and ``el3_exit()``. + + When the BL32 has completed initialization at Secure-EL1, it returns to + BL31 by issuing an SMC, using a Function ID allocated to the SPD. On + receipt of this SMC, the SPD service handler should switch the CPU context + from trusted to normal world and use the ``bl31_set_next_image_type()`` and + ``bl31_prepare_next_image_entry()`` functions to set up the initial return to + the normal world firmware BL33. On return from the handler the framework + will exit to EL2 and run BL33. + +#. The BL32 setup function registers an initialization function using + ``bl31_register_bl32_init()`` which provides a SPD-defined mechanism to + invoke a 'world-switch synchronous call' to Secure-EL1 to run the BL32 + entrypoint. + + .. note:: + The Test SPD service included with TF-A provides one implementation + of such a mechanism. + + On completion BL32 returns control to BL31 via a SMC, and on receipt the + SPD service handler invokes the synchronous call return mechanism to return + to the BL32 initialization function. On return from this function, + ``bl31_main()`` will set up the return to the normal world firmware BL33 and + continue the boot process in the normal world. + +Crash Reporting in BL31 +----------------------- + +BL31 implements a scheme for reporting the processor state when an unhandled +exception is encountered. The reporting mechanism attempts to preserve all the +register contents and report it via a dedicated UART (PL011 console). BL31 +reports the general purpose, EL3, Secure EL1 and some EL2 state registers. + +A dedicated per-CPU crash stack is maintained by BL31 and this is retrieved via +the per-CPU pointer cache. The implementation attempts to minimise the memory +required for this feature. The file ``crash_reporting.S`` contains the +implementation for crash reporting. + +The sample crash output is shown below. + +:: + + x0 = 0x000000002a4a0000 + x1 = 0x0000000000000001 + x2 = 0x0000000000000002 + x3 = 0x0000000000000003 + x4 = 0x0000000000000004 + x5 = 0x0000000000000005 + x6 = 0x0000000000000006 + x7 = 0x0000000000000007 + x8 = 0x0000000000000008 + x9 = 0x0000000000000009 + x10 = 0x0000000000000010 + x11 = 0x0000000000000011 + x12 = 0x0000000000000012 + x13 = 0x0000000000000013 + x14 = 0x0000000000000014 + x15 = 0x0000000000000015 + x16 = 0x0000000000000016 + x17 = 0x0000000000000017 + x18 = 0x0000000000000018 + x19 = 0x0000000000000019 + x20 = 0x0000000000000020 + x21 = 0x0000000000000021 + x22 = 0x0000000000000022 + x23 = 0x0000000000000023 + x24 = 0x0000000000000024 + x25 = 0x0000000000000025 + x26 = 0x0000000000000026 + x27 = 0x0000000000000027 + x28 = 0x0000000000000028 + x29 = 0x0000000000000029 + x30 = 0x0000000088000b78 + scr_el3 = 0x000000000003073d + sctlr_el3 = 0x00000000b0cd183f + cptr_el3 = 0x0000000000000000 + tcr_el3 = 0x000000008080351c + daif = 0x00000000000002c0 + mair_el3 = 0x00000000004404ff + spsr_el3 = 0x0000000060000349 + elr_el3 = 0x0000000088000114 + ttbr0_el3 = 0x0000000004018201 + esr_el3 = 0x00000000be000000 + far_el3 = 0x0000000000000000 + spsr_el1 = 0x0000000000000000 + elr_el1 = 0x0000000000000000 + spsr_abt = 0x0000000000000000 + spsr_und = 0x0000000000000000 + spsr_irq = 0x0000000000000000 + spsr_fiq = 0x0000000000000000 + sctlr_el1 = 0x0000000030d00800 + actlr_el1 = 0x0000000000000000 + cpacr_el1 = 0x0000000000000000 + csselr_el1 = 0x0000000000000000 + sp_el1 = 0x0000000000000000 + esr_el1 = 0x0000000000000000 + ttbr0_el1 = 0x0000000000000000 + ttbr1_el1 = 0x0000000000000000 + mair_el1 = 0x0000000000000000 + amair_el1 = 0x0000000000000000 + tcr_el1 = 0x0000000000000000 + tpidr_el1 = 0x0000000000000000 + tpidr_el0 = 0x0000000000000000 + tpidrro_el0 = 0x0000000000000000 + par_el1 = 0x0000000000000000 + mpidr_el1 = 0x0000000080000000 + afsr0_el1 = 0x0000000000000000 + afsr1_el1 = 0x0000000000000000 + contextidr_el1 = 0x0000000000000000 + vbar_el1 = 0x0000000000000000 + cntp_ctl_el0 = 0x0000000000000000 + cntp_cval_el0 = 0x0000000000000000 + cntv_ctl_el0 = 0x0000000000000000 + cntv_cval_el0 = 0x0000000000000000 + cntkctl_el1 = 0x0000000000000000 + sp_el0 = 0x0000000004014940 + isr_el1 = 0x0000000000000000 + dacr32_el2 = 0x0000000000000000 + ifsr32_el2 = 0x0000000000000000 + icc_hppir0_el1 = 0x00000000000003ff + icc_hppir1_el1 = 0x00000000000003ff + icc_ctlr_el3 = 0x0000000000080400 + gicd_ispendr regs (Offsets 0x200-0x278) + Offset Value + 0x200: 0x0000000000000000 + 0x208: 0x0000000000000000 + 0x210: 0x0000000000000000 + 0x218: 0x0000000000000000 + 0x220: 0x0000000000000000 + 0x228: 0x0000000000000000 + 0x230: 0x0000000000000000 + 0x238: 0x0000000000000000 + 0x240: 0x0000000000000000 + 0x248: 0x0000000000000000 + 0x250: 0x0000000000000000 + 0x258: 0x0000000000000000 + 0x260: 0x0000000000000000 + 0x268: 0x0000000000000000 + 0x270: 0x0000000000000000 + 0x278: 0x0000000000000000 + +Guidelines for Reset Handlers +----------------------------- + +TF-A implements a framework that allows CPU and platform ports to perform +actions very early after a CPU is released from reset in both the cold and warm +boot paths. This is done by calling the ``reset_handler()`` function in both +the BL1 and BL31 images. It in turn calls the platform and CPU specific reset +handling functions. + +Details for implementing a CPU specific reset handler can be found in +Section 8. Details for implementing a platform specific reset handler can be +found in the :ref:`Porting Guide` (see the ``plat_reset_handler()`` function). + +When adding functionality to a reset handler, keep in mind that if a different +reset handling behavior is required between the first and the subsequent +invocations of the reset handling code, this should be detected at runtime. +In other words, the reset handler should be able to detect whether an action has +already been performed and act as appropriate. Possible courses of actions are, +e.g. skip the action the second time, or undo/redo it. + +.. _configuring-secure-interrupts: + +Configuring secure interrupts +----------------------------- + +The GIC driver is responsible for performing initial configuration of secure +interrupts on the platform. To this end, the platform is expected to provide the +GIC driver (either GICv2 or GICv3, as selected by the platform) with the +interrupt configuration during the driver initialisation. + +Secure interrupt configuration are specified in an array of secure interrupt +properties. In this scheme, in both GICv2 and GICv3 driver data structures, the +``interrupt_props`` member points to an array of interrupt properties. Each +element of the array specifies the interrupt number and its attributes +(priority, group, configuration). Each element of the array shall be populated +by the macro ``INTR_PROP_DESC()``. The macro takes the following arguments: + +- 10-bit interrupt number, + +- 8-bit interrupt priority, + +- Interrupt type (one of ``INTR_TYPE_EL3``, ``INTR_TYPE_S_EL1``, + ``INTR_TYPE_NS``), + +- Interrupt configuration (either ``GIC_INTR_CFG_LEVEL`` or + ``GIC_INTR_CFG_EDGE``). + +.. _firmware_design_cpu_ops_fwk: + +CPU specific operations framework +--------------------------------- + +Certain aspects of the Armv8-A architecture are implementation defined, +that is, certain behaviours are not architecturally defined, but must be +defined and documented by individual processor implementations. TF-A +implements a framework which categorises the common implementation defined +behaviours and allows a processor to export its implementation of that +behaviour. The categories are: + +#. Processor specific reset sequence. + +#. Processor specific power down sequences. + +#. Processor specific register dumping as a part of crash reporting. + +#. Errata status reporting. + +Each of the above categories fulfils a different requirement. + +#. allows any processor specific initialization before the caches and MMU + are turned on, like implementation of errata workarounds, entry into + the intra-cluster coherency domain etc. + +#. allows each processor to implement the power down sequence mandated in + its Technical Reference Manual (TRM). + +#. allows a processor to provide additional information to the developer + in the event of a crash, for example Cortex-A53 has registers which + can expose the data cache contents. + +#. allows a processor to define a function that inspects and reports the status + of all errata workarounds on that processor. + +Please note that only 2. is mandated by the TRM. + +The CPU specific operations framework scales to accommodate a large number of +different CPUs during power down and reset handling. The platform can specify +any CPU optimization it wants to enable for each CPU. It can also specify +the CPU errata workarounds to be applied for each CPU type during reset +handling by defining CPU errata compile time macros. Details on these macros +can be found in the :ref:`Arm CPU Specific Build Macros` document. + +The CPU specific operations framework depends on the ``cpu_ops`` structure which +needs to be exported for each type of CPU in the platform. It is defined in +``include/lib/cpus/aarch64/cpu_macros.S`` and has the following fields : ``midr``, +``reset_func()``, ``cpu_pwr_down_ops`` (array of power down functions) and +``cpu_reg_dump()``. + +The CPU specific files in ``lib/cpus`` export a ``cpu_ops`` data structure with +suitable handlers for that CPU. For example, ``lib/cpus/aarch64/cortex_a53.S`` +exports the ``cpu_ops`` for Cortex-A53 CPU. According to the platform +configuration, these CPU specific files must be included in the build by +the platform makefile. The generic CPU specific operations framework code exists +in ``lib/cpus/aarch64/cpu_helpers.S``. + +CPU specific Reset Handling +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +After a reset, the state of the CPU when it calls generic reset handler is: +MMU turned off, both instruction and data caches turned off and not part +of any coherency domain. + +The BL entrypoint code first invokes the ``plat_reset_handler()`` to allow +the platform to perform any system initialization required and any system +errata workarounds that needs to be applied. The ``get_cpu_ops_ptr()`` reads +the current CPU midr, finds the matching ``cpu_ops`` entry in the ``cpu_ops`` +array and returns it. Note that only the part number and implementer fields +in midr are used to find the matching ``cpu_ops`` entry. The ``reset_func()`` in +the returned ``cpu_ops`` is then invoked which executes the required reset +handling for that CPU and also any errata workarounds enabled by the platform. +This function must preserve the values of general purpose registers x20 to x29. + +Refer to Section "Guidelines for Reset Handlers" for general guidelines +regarding placement of code in a reset handler. + +CPU specific power down sequence +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +During the BL31 initialization sequence, the pointer to the matching ``cpu_ops`` +entry is stored in per-CPU data by ``init_cpu_ops()`` so that it can be quickly +retrieved during power down sequences. + +Various CPU drivers register handlers to perform power down at certain power +levels for that specific CPU. The PSCI service, upon receiving a power down +request, determines the highest power level at which to execute power down +sequence for a particular CPU. It uses the ``prepare_cpu_pwr_dwn()`` function to +pick the right power down handler for the requested level. The function +retrieves ``cpu_ops`` pointer member of per-CPU data, and from that, further +retrieves ``cpu_pwr_down_ops`` array, and indexes into the required level. If the +requested power level is higher than what a CPU driver supports, the handler +registered for highest level is invoked. + +At runtime the platform hooks for power down are invoked by the PSCI service to +perform platform specific operations during a power down sequence, for example +turning off CCI coherency during a cluster power down. + +CPU specific register reporting during crash +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If the crash reporting is enabled in BL31, when a crash occurs, the crash +reporting framework calls ``do_cpu_reg_dump`` which retrieves the matching +``cpu_ops`` using ``get_cpu_ops_ptr()`` function. The ``cpu_reg_dump()`` in +``cpu_ops`` is invoked, which then returns the CPU specific register values to +be reported and a pointer to the ASCII list of register names in a format +expected by the crash reporting framework. + +.. _firmware_design_cpu_errata_reporting: + +CPU errata status reporting +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Errata workarounds for CPUs supported in TF-A are applied during both cold and +warm boots, shortly after reset. Individual Errata workarounds are enabled as +build options. Some errata workarounds have potential run-time implications; +therefore some are enabled by default, others not. Platform ports shall +override build options to enable or disable errata as appropriate. The CPU +drivers take care of applying errata workarounds that are enabled and applicable +to a given CPU. Refer to :ref:`arm_cpu_macros_errata_workarounds` for more +information. + +Functions in CPU drivers that apply errata workaround must follow the +conventions listed below. + +The errata workaround must be authored as two separate functions: + +- One that checks for errata. This function must determine whether that errata + applies to the current CPU. Typically this involves matching the current + CPUs revision and variant against a value that's known to be affected by the + errata. If the function determines that the errata applies to this CPU, it + must return ``ERRATA_APPLIES``; otherwise, it must return + ``ERRATA_NOT_APPLIES``. The utility functions ``cpu_get_rev_var`` and + ``cpu_rev_var_ls`` functions may come in handy for this purpose. + +For an errata identified as ``E``, the check function must be named +``check_errata_E``. + +This function will be invoked at different times, both from assembly and from +C run time. Therefore it must follow AAPCS, and must not use stack. + +- Another one that applies the errata workaround. This function would call the + check function described above, and applies errata workaround if required. + +CPU drivers that apply errata workaround can optionally implement an assembly +function that report the status of errata workarounds pertaining to that CPU. +For a driver that registers the CPU, for example, ``cpux`` via ``declare_cpu_ops`` +macro, the errata reporting function, if it exists, must be named +``cpux_errata_report``. This function will always be called with MMU enabled; it +must follow AAPCS and may use stack. + +In a debug build of TF-A, on a CPU that comes out of reset, both BL1 and the +runtime firmware (BL31 in AArch64, and BL32 in AArch32) will invoke errata +status reporting function, if one exists, for that type of CPU. + +To report the status of each errata workaround, the function shall use the +assembler macro ``report_errata``, passing it: + +- The build option that enables the errata; + +- The name of the CPU: this must be the same identifier that CPU driver + registered itself with, using ``declare_cpu_ops``; + +- And the errata identifier: the identifier must match what's used in the + errata's check function described above. + +The errata status reporting function will be called once per CPU type/errata +combination during the software's active life time. + +It's expected that whenever an errata workaround is submitted to TF-A, the +errata reporting function is appropriately extended to report its status as +well. + +Reporting the status of errata workaround is for informational purpose only; it +has no functional significance. + +Memory layout of BL images +-------------------------- + +Each bootloader image can be divided in 2 parts: + +- the static contents of the image. These are data actually stored in the + binary on the disk. In the ELF terminology, they are called ``PROGBITS`` + sections; + +- the run-time contents of the image. These are data that don't occupy any + space in the binary on the disk. The ELF binary just contains some + metadata indicating where these data will be stored at run-time and the + corresponding sections need to be allocated and initialized at run-time. + In the ELF terminology, they are called ``NOBITS`` sections. + +All PROGBITS sections are grouped together at the beginning of the image, +followed by all NOBITS sections. This is true for all TF-A images and it is +governed by the linker scripts. This ensures that the raw binary images are +as small as possible. If a NOBITS section was inserted in between PROGBITS +sections then the resulting binary file would contain zero bytes in place of +this NOBITS section, making the image unnecessarily bigger. Smaller images +allow faster loading from the FIP to the main memory. + +For BL31, a platform can specify an alternate location for NOBITS sections +(other than immediately following PROGBITS sections) by setting +``SEPARATE_NOBITS_REGION`` to 1 and defining ``BL31_NOBITS_BASE`` and +``BL31_NOBITS_LIMIT``. + +Linker scripts and symbols +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Each bootloader stage image layout is described by its own linker script. The +linker scripts export some symbols into the program symbol table. Their values +correspond to particular addresses. TF-A code can refer to these symbols to +figure out the image memory layout. + +Linker symbols follow the following naming convention in TF-A. + +- ``__<SECTION>_START__`` + + Start address of a given section named ``<SECTION>``. + +- ``__<SECTION>_END__`` + + End address of a given section named ``<SECTION>``. If there is an alignment + constraint on the section's end address then ``__<SECTION>_END__`` corresponds + to the end address of the section's actual contents, rounded up to the right + boundary. Refer to the value of ``__<SECTION>_UNALIGNED_END__`` to know the + actual end address of the section's contents. + +- ``__<SECTION>_UNALIGNED_END__`` + + End address of a given section named ``<SECTION>`` without any padding or + rounding up due to some alignment constraint. + +- ``__<SECTION>_SIZE__`` + + Size (in bytes) of a given section named ``<SECTION>``. If there is an + alignment constraint on the section's end address then ``__<SECTION>_SIZE__`` + corresponds to the size of the section's actual contents, rounded up to the + right boundary. In other words, ``__<SECTION>_SIZE__ = __<SECTION>_END__ - _<SECTION>_START__``. Refer to the value of ``__<SECTION>_UNALIGNED_SIZE__`` + to know the actual size of the section's contents. + +- ``__<SECTION>_UNALIGNED_SIZE__`` + + Size (in bytes) of a given section named ``<SECTION>`` without any padding or + rounding up due to some alignment constraint. In other words, + ``__<SECTION>_UNALIGNED_SIZE__ = __<SECTION>_UNALIGNED_END__ - __<SECTION>_START__``. + +Some of the linker symbols are mandatory as TF-A code relies on them to be +defined. They are listed in the following subsections. Some of them must be +provided for each bootloader stage and some are specific to a given bootloader +stage. + +The linker scripts define some extra, optional symbols. They are not actually +used by any code but they help in understanding the bootloader images' memory +layout as they are easy to spot in the link map files. + +Common linker symbols +^^^^^^^^^^^^^^^^^^^^^ + +All BL images share the following requirements: + +- The BSS section must be zero-initialised before executing any C code. +- The coherent memory section (if enabled) must be zero-initialised as well. +- The MMU setup code needs to know the extents of the coherent and read-only + memory regions to set the right memory attributes. When + ``SEPARATE_CODE_AND_RODATA=1``, it needs to know more specifically how the + read-only memory region is divided between code and data. + +The following linker symbols are defined for this purpose: + +- ``__BSS_START__`` +- ``__BSS_SIZE__`` +- ``__COHERENT_RAM_START__`` Must be aligned on a page-size boundary. +- ``__COHERENT_RAM_END__`` Must be aligned on a page-size boundary. +- ``__COHERENT_RAM_UNALIGNED_SIZE__`` +- ``__RO_START__`` +- ``__RO_END__`` +- ``__TEXT_START__`` +- ``__TEXT_END__`` +- ``__RODATA_START__`` +- ``__RODATA_END__`` + +BL1's linker symbols +^^^^^^^^^^^^^^^^^^^^ + +BL1 being the ROM image, it has additional requirements. BL1 resides in ROM and +it is entirely executed in place but it needs some read-write memory for its +mutable data. Its ``.data`` section (i.e. its allocated read-write data) must be +relocated from ROM to RAM before executing any C code. + +The following additional linker symbols are defined for BL1: + +- ``__BL1_ROM_END__`` End address of BL1's ROM contents, covering its code + and ``.data`` section in ROM. +- ``__DATA_ROM_START__`` Start address of the ``.data`` section in ROM. Must be + aligned on a 16-byte boundary. +- ``__DATA_RAM_START__`` Address in RAM where the ``.data`` section should be + copied over. Must be aligned on a 16-byte boundary. +- ``__DATA_SIZE__`` Size of the ``.data`` section (in ROM or RAM). +- ``__BL1_RAM_START__`` Start address of BL1 read-write data. +- ``__BL1_RAM_END__`` End address of BL1 read-write data. + +How to choose the right base addresses for each bootloader stage image +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +There is currently no support for dynamic image loading in TF-A. This means +that all bootloader images need to be linked against their ultimate runtime +locations and the base addresses of each image must be chosen carefully such +that images don't overlap each other in an undesired way. As the code grows, +the base addresses might need adjustments to cope with the new memory layout. + +The memory layout is completely specific to the platform and so there is no +general recipe for choosing the right base addresses for each bootloader image. +However, there are tools to aid in understanding the memory layout. These are +the link map files: ``build/<platform>/<build-type>/bl<x>/bl<x>.map``, with ``<x>`` +being the stage bootloader. They provide a detailed view of the memory usage of +each image. Among other useful information, they provide the end address of +each image. + +- ``bl1.map`` link map file provides ``__BL1_RAM_END__`` address. +- ``bl2.map`` link map file provides ``__BL2_END__`` address. +- ``bl31.map`` link map file provides ``__BL31_END__`` address. +- ``bl32.map`` link map file provides ``__BL32_END__`` address. + +For each bootloader image, the platform code must provide its start address +as well as a limit address that it must not overstep. The latter is used in the +linker scripts to check that the image doesn't grow past that address. If that +happens, the linker will issue a message similar to the following: + +:: + + aarch64-none-elf-ld: BLx has exceeded its limit. + +Additionally, if the platform memory layout implies some image overlaying like +on FVP, BL31 and TSP need to know the limit address that their PROGBITS +sections must not overstep. The platform code must provide those. + +TF-A does not provide any mechanism to verify at boot time that the memory +to load a new image is free to prevent overwriting a previously loaded image. +The platform must specify the memory available in the system for all the +relevant BL images to be loaded. + +For example, in the case of BL1 loading BL2, ``bl1_plat_sec_mem_layout()`` will +return the region defined by the platform where BL1 intends to load BL2. The +``load_image()`` function performs bounds check for the image size based on the +base and maximum image size provided by the platforms. Platforms must take +this behaviour into account when defining the base/size for each of the images. + +Memory layout on Arm development platforms +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following list describes the memory layout on the Arm development platforms: + +- A 4KB page of shared memory is used for communication between Trusted + Firmware and the platform's power controller. This is located at the base of + Trusted SRAM. The amount of Trusted SRAM available to load the bootloader + images is reduced by the size of the shared memory. + + The shared memory is used to store the CPUs' entrypoint mailbox. On Juno, + this is also used for the MHU payload when passing messages to and from the + SCP. + +- Another 4 KB page is reserved for passing memory layout between BL1 and BL2 + and also the dynamic firmware configurations. + +- On FVP, BL1 is originally sitting in the Trusted ROM at address ``0x0``. On + Juno, BL1 resides in flash memory at address ``0x0BEC0000``. BL1 read-write + data are relocated to the top of Trusted SRAM at runtime. + +- BL2 is loaded below BL1 RW + +- EL3 Runtime Software, BL31 for AArch64 and BL32 for AArch32 (e.g. SP_MIN), + is loaded at the top of the Trusted SRAM, such that its NOBITS sections will + overwrite BL1 R/W data and BL2. This implies that BL1 global variables + remain valid only until execution reaches the EL3 Runtime Software entry + point during a cold boot. + +- On Juno, SCP_BL2 is loaded temporarily into the EL3 Runtime Software memory + region and transferred to the SCP before being overwritten by EL3 Runtime + Software. + +- BL32 (for AArch64) can be loaded in one of the following locations: + + - Trusted SRAM + - Trusted DRAM (FVP only) + - Secure region of DRAM (top 16MB of DRAM configured by the TrustZone + controller) + + When BL32 (for AArch64) is loaded into Trusted SRAM, it is loaded below + BL31. + +The location of the BL32 image will result in different memory maps. This is +illustrated for both FVP and Juno in the following diagrams, using the TSP as +an example. + +.. note:: + Loading the BL32 image in TZC secured DRAM doesn't change the memory + layout of the other images in Trusted SRAM. + +CONFIG section in memory layouts shown below contains: + +:: + + +--------------------+ + |bl2_mem_params_descs| + |--------------------| + | fw_configs | + +--------------------+ + +``bl2_mem_params_descs`` contains parameters passed from BL2 to next the +BL image during boot. + +``fw_configs`` includes soc_fw_config, tos_fw_config, tb_fw_config and fw_config. + +**FVP with TSP in Trusted SRAM with firmware configs :** +(These diagrams only cover the AArch64 case) + +:: + + DRAM + 0xffffffff +----------+ + : : + 0x82100000 |----------| + |HW_CONFIG | + 0x82000000 |----------| (non-secure) + | | + 0x80000000 +----------+ + + Trusted DRAM + 0x08000000 +----------+ + |HW_CONFIG | + 0x07f00000 |----------| + : : + | | + 0x06000000 +----------+ + + Trusted SRAM + 0x04040000 +----------+ loaded by BL2 +----------------+ + | BL1 (rw) | <<<<<<<<<<<<< | | + |----------| <<<<<<<<<<<<< | BL31 NOBITS | + | BL2 | <<<<<<<<<<<<< | | + |----------| <<<<<<<<<<<<< |----------------| + | | <<<<<<<<<<<<< | BL31 PROGBITS | + | | <<<<<<<<<<<<< |----------------| + | | <<<<<<<<<<<<< | BL32 | + 0x04003000 +----------+ +----------------+ + | CONFIG | + 0x04001000 +----------+ + | Shared | + 0x04000000 +----------+ + + Trusted ROM + 0x04000000 +----------+ + | BL1 (ro) | + 0x00000000 +----------+ + +**FVP with TSP in Trusted DRAM with firmware configs (default option):** + +:: + + DRAM + 0xffffffff +--------------+ + : : + 0x82100000 |--------------| + | HW_CONFIG | + 0x82000000 |--------------| (non-secure) + | | + 0x80000000 +--------------+ + + Trusted DRAM + 0x08000000 +--------------+ + | HW_CONFIG | + 0x07f00000 |--------------| + : : + | BL32 | + 0x06000000 +--------------+ + + Trusted SRAM + 0x04040000 +--------------+ loaded by BL2 +----------------+ + | BL1 (rw) | <<<<<<<<<<<<< | | + |--------------| <<<<<<<<<<<<< | BL31 NOBITS | + | BL2 | <<<<<<<<<<<<< | | + |--------------| <<<<<<<<<<<<< |----------------| + | | <<<<<<<<<<<<< | BL31 PROGBITS | + | | +----------------+ + 0x04003000 +--------------+ + | CONFIG | + 0x04001000 +--------------+ + | Shared | + 0x04000000 +--------------+ + + Trusted ROM + 0x04000000 +--------------+ + | BL1 (ro) | + 0x00000000 +--------------+ + +**FVP with TSP in TZC-Secured DRAM with firmware configs :** + +:: + + DRAM + 0xffffffff +----------+ + | BL32 | (secure) + 0xff000000 +----------+ + | | + 0x82100000 |----------| + |HW_CONFIG | + 0x82000000 |----------| (non-secure) + | | + 0x80000000 +----------+ + + Trusted DRAM + 0x08000000 +----------+ + |HW_CONFIG | + 0x7f000000 |----------| + : : + | | + 0x06000000 +----------+ + + Trusted SRAM + 0x04040000 +----------+ loaded by BL2 +----------------+ + | BL1 (rw) | <<<<<<<<<<<<< | | + |----------| <<<<<<<<<<<<< | BL31 NOBITS | + | BL2 | <<<<<<<<<<<<< | | + |----------| <<<<<<<<<<<<< |----------------| + | | <<<<<<<<<<<<< | BL31 PROGBITS | + | | +----------------+ + 0x04003000 +----------+ + | CONFIG | + 0x04001000 +----------+ + | Shared | + 0x04000000 +----------+ + + Trusted ROM + 0x04000000 +----------+ + | BL1 (ro) | + 0x00000000 +----------+ + +**Juno with BL32 in Trusted SRAM :** + +:: + + Flash0 + 0x0C000000 +----------+ + : : + 0x0BED0000 |----------| + | BL1 (ro) | + 0x0BEC0000 |----------| + : : + 0x08000000 +----------+ BL31 is loaded + after SCP_BL2 has + Trusted SRAM been sent to SCP + 0x04040000 +----------+ loaded by BL2 +----------------+ + | BL1 (rw) | <<<<<<<<<<<<< | | + |----------| <<<<<<<<<<<<< | BL31 NOBITS | + | BL2 | <<<<<<<<<<<<< | | + |----------| <<<<<<<<<<<<< |----------------| + | SCP_BL2 | <<<<<<<<<<<<< | BL31 PROGBITS | + | | <<<<<<<<<<<<< |----------------| + | | <<<<<<<<<<<<< | BL32 | + | | +----------------+ + | | + 0x04001000 +----------+ + | MHU | + 0x04000000 +----------+ + +**Juno with BL32 in TZC-secured DRAM :** + +:: + + DRAM + 0xFFE00000 +----------+ + | BL32 | (secure) + 0xFF000000 |----------| + | | + : : (non-secure) + | | + 0x80000000 +----------+ + + Flash0 + 0x0C000000 +----------+ + : : + 0x0BED0000 |----------| + | BL1 (ro) | + 0x0BEC0000 |----------| + : : + 0x08000000 +----------+ BL31 is loaded + after SCP_BL2 has + Trusted SRAM been sent to SCP + 0x04040000 +----------+ loaded by BL2 +----------------+ + | BL1 (rw) | <<<<<<<<<<<<< | | + |----------| <<<<<<<<<<<<< | BL31 NOBITS | + | BL2 | <<<<<<<<<<<<< | | + |----------| <<<<<<<<<<<<< |----------------| + | SCP_BL2 | <<<<<<<<<<<<< | BL31 PROGBITS | + | | +----------------+ + 0x04001000 +----------+ + | MHU | + 0x04000000 +----------+ + +.. _firmware_design_fip: + +Firmware Image Package (FIP) +---------------------------- + +Using a Firmware Image Package (FIP) allows for packing bootloader images (and +potentially other payloads) into a single archive that can be loaded by TF-A +from non-volatile platform storage. A driver to load images from a FIP has +been added to the storage layer and allows a package to be read from supported +platform storage. A tool to create Firmware Image Packages is also provided +and described below. + +Firmware Image Package layout +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The FIP layout consists of a table of contents (ToC) followed by payload data. +The ToC itself has a header followed by one or more table entries. The ToC is +terminated by an end marker entry, and since the size of the ToC is 0 bytes, +the offset equals the total size of the FIP file. All ToC entries describe some +payload data that has been appended to the end of the binary package. With the +information provided in the ToC entry the corresponding payload data can be +retrieved. + +:: + + ------------------ + | ToC Header | + |----------------| + | ToC Entry 0 | + |----------------| + | ToC Entry 1 | + |----------------| + | ToC End Marker | + |----------------| + | | + | Data 0 | + | | + |----------------| + | | + | Data 1 | + | | + ------------------ + +The ToC header and entry formats are described in the header file +``include/tools_share/firmware_image_package.h``. This file is used by both the +tool and TF-A. + +The ToC header has the following fields: + +:: + + `name`: The name of the ToC. This is currently used to validate the header. + `serial_number`: A non-zero number provided by the creation tool + `flags`: Flags associated with this data. + Bits 0-31: Reserved + Bits 32-47: Platform defined + Bits 48-63: Reserved + +A ToC entry has the following fields: + +:: + + `uuid`: All files are referred to by a pre-defined Universally Unique + IDentifier [UUID] . The UUIDs are defined in + `include/tools_share/firmware_image_package.h`. The platform translates + the requested image name into the corresponding UUID when accessing the + package. + `offset_address`: The offset address at which the corresponding payload data + can be found. The offset is calculated from the ToC base address. + `size`: The size of the corresponding payload data in bytes. + `flags`: Flags associated with this entry. None are yet defined. + +Firmware Image Package creation tool +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The FIP creation tool can be used to pack specified images into a binary +package that can be loaded by TF-A from platform storage. The tool currently +only supports packing bootloader images. Additional image definitions can be +added to the tool as required. + +The tool can be found in ``tools/fiptool``. + +Loading from a Firmware Image Package (FIP) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The Firmware Image Package (FIP) driver can load images from a binary package on +non-volatile platform storage. For the Arm development platforms, this is +currently NOR FLASH. + +Bootloader images are loaded according to the platform policy as specified by +the function ``plat_get_image_source()``. For the Arm development platforms, this +means the platform will attempt to load images from a Firmware Image Package +located at the start of NOR FLASH0. + +The Arm development platforms' policy is to only allow loading of a known set of +images. The platform policy can be modified to allow additional images. + +Use of coherent memory in TF-A +------------------------------ + +There might be loss of coherency when physical memory with mismatched +shareability, cacheability and memory attributes is accessed by multiple CPUs +(refer to section B2.9 of `Arm ARM`_ for more details). This possibility occurs +in TF-A during power up/down sequences when coherency, MMU and caches are +turned on/off incrementally. + +TF-A defines coherent memory as a region of memory with Device nGnRE attributes +in the translation tables. The translation granule size in TF-A is 4KB. This +is the smallest possible size of the coherent memory region. + +By default, all data structures which are susceptible to accesses with +mismatched attributes from various CPUs are allocated in a coherent memory +region (refer to section 2.1 of :ref:`Porting Guide`). The coherent memory +region accesses are Outer Shareable, non-cacheable and they can be accessed with +the Device nGnRE attributes when the MMU is turned on. Hence, at the expense of +at least an extra page of memory, TF-A is able to work around coherency issues +due to mismatched memory attributes. + +The alternative to the above approach is to allocate the susceptible data +structures in Normal WriteBack WriteAllocate Inner shareable memory. This +approach requires the data structures to be designed so that it is possible to +work around the issue of mismatched memory attributes by performing software +cache maintenance on them. + +Disabling the use of coherent memory in TF-A +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +It might be desirable to avoid the cost of allocating coherent memory on +platforms which are memory constrained. TF-A enables inclusion of coherent +memory in firmware images through the build flag ``USE_COHERENT_MEM``. +This flag is enabled by default. It can be disabled to choose the second +approach described above. + +The below sections analyze the data structures allocated in the coherent memory +region and the changes required to allocate them in normal memory. + +Coherent memory usage in PSCI implementation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``psci_non_cpu_pd_nodes`` data structure stores the platform's power domain +tree information for state management of power domains. By default, this data +structure is allocated in the coherent memory region in TF-A because it can be +accessed by multiple CPUs, either with caches enabled or disabled. + +.. code:: c + + typedef struct non_cpu_pwr_domain_node { + /* + * Index of the first CPU power domain node level 0 which has this node + * as its parent. + */ + unsigned int cpu_start_idx; + + /* + * Number of CPU power domains which are siblings of the domain indexed + * by 'cpu_start_idx' i.e. all the domains in the range 'cpu_start_idx + * -> cpu_start_idx + ncpus' have this node as their parent. + */ + unsigned int ncpus; + + /* + * Index of the parent power domain node. + */ + unsigned int parent_node; + + plat_local_state_t local_state; + + unsigned char level; + + /* For indexing the psci_lock array*/ + unsigned char lock_index; + } non_cpu_pd_node_t; + +In order to move this data structure to normal memory, the use of each of its +fields must be analyzed. Fields like ``cpu_start_idx``, ``ncpus``, ``parent_node`` +``level`` and ``lock_index`` are only written once during cold boot. Hence removing +them from coherent memory involves only doing a clean and invalidate of the +cache lines after these fields are written. + +The field ``local_state`` can be concurrently accessed by multiple CPUs in +different cache states. A Lamport's Bakery lock ``psci_locks`` is used to ensure +mutual exclusion to this field and a clean and invalidate is needed after it +is written. + +Bakery lock data +~~~~~~~~~~~~~~~~ + +The bakery lock data structure ``bakery_lock_t`` is allocated in coherent memory +and is accessed by multiple CPUs with mismatched attributes. ``bakery_lock_t`` is +defined as follows: + +.. code:: c + + typedef struct bakery_lock { + /* + * The lock_data is a bit-field of 2 members: + * Bit[0] : choosing. This field is set when the CPU is + * choosing its bakery number. + * Bits[1 - 15] : number. This is the bakery number allocated. + */ + volatile uint16_t lock_data[BAKERY_LOCK_MAX_CPUS]; + } bakery_lock_t; + +It is a characteristic of Lamport's Bakery algorithm that the volatile per-CPU +fields can be read by all CPUs but only written to by the owning CPU. + +Depending upon the data cache line size, the per-CPU fields of the +``bakery_lock_t`` structure for multiple CPUs may exist on a single cache line. +These per-CPU fields can be read and written during lock contention by multiple +CPUs with mismatched memory attributes. Since these fields are a part of the +lock implementation, they do not have access to any other locking primitive to +safeguard against the resulting coherency issues. As a result, simple software +cache maintenance is not enough to allocate them in coherent memory. Consider +the following example. + +CPU0 updates its per-CPU field with data cache enabled. This write updates a +local cache line which contains a copy of the fields for other CPUs as well. Now +CPU1 updates its per-CPU field of the ``bakery_lock_t`` structure with data cache +disabled. CPU1 then issues a DCIVAC operation to invalidate any stale copies of +its field in any other cache line in the system. This operation will invalidate +the update made by CPU0 as well. + +To use bakery locks when ``USE_COHERENT_MEM`` is disabled, the lock data structure +has been redesigned. The changes utilise the characteristic of Lamport's Bakery +algorithm mentioned earlier. The bakery_lock structure only allocates the memory +for a single CPU. The macro ``DEFINE_BAKERY_LOCK`` allocates all the bakery locks +needed for a CPU into a section ``bakery_lock``. The linker allocates the memory +for other cores by using the total size allocated for the bakery_lock section +and multiplying it with (PLATFORM_CORE_COUNT - 1). This enables software to +perform software cache maintenance on the lock data structure without running +into coherency issues associated with mismatched attributes. + +The bakery lock data structure ``bakery_info_t`` is defined for use when +``USE_COHERENT_MEM`` is disabled as follows: + +.. code:: c + + typedef struct bakery_info { + /* + * The lock_data is a bit-field of 2 members: + * Bit[0] : choosing. This field is set when the CPU is + * choosing its bakery number. + * Bits[1 - 15] : number. This is the bakery number allocated. + */ + volatile uint16_t lock_data; + } bakery_info_t; + +The ``bakery_info_t`` represents a single per-CPU field of one lock and +the combination of corresponding ``bakery_info_t`` structures for all CPUs in the +system represents the complete bakery lock. The view in memory for a system +with n bakery locks are: + +:: + + bakery_lock section start + |----------------| + | `bakery_info_t`| <-- Lock_0 per-CPU field + | Lock_0 | for CPU0 + |----------------| + | `bakery_info_t`| <-- Lock_1 per-CPU field + | Lock_1 | for CPU0 + |----------------| + | .... | + |----------------| + | `bakery_info_t`| <-- Lock_N per-CPU field + | Lock_N | for CPU0 + ------------------ + | XXXXX | + | Padding to | + | next Cache WB | <--- Calculate PERCPU_BAKERY_LOCK_SIZE, allocate + | Granule | continuous memory for remaining CPUs. + ------------------ + | `bakery_info_t`| <-- Lock_0 per-CPU field + | Lock_0 | for CPU1 + |----------------| + | `bakery_info_t`| <-- Lock_1 per-CPU field + | Lock_1 | for CPU1 + |----------------| + | .... | + |----------------| + | `bakery_info_t`| <-- Lock_N per-CPU field + | Lock_N | for CPU1 + ------------------ + | XXXXX | + | Padding to | + | next Cache WB | + | Granule | + ------------------ + +Consider a system of 2 CPUs with 'N' bakery locks as shown above. For an +operation on Lock_N, the corresponding ``bakery_info_t`` in both CPU0 and CPU1 +``bakery_lock`` section need to be fetched and appropriate cache operations need +to be performed for each access. + +On Arm Platforms, bakery locks are used in psci (``psci_locks``) and power controller +driver (``arm_lock``). + +Non Functional Impact of removing coherent memory +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Removal of the coherent memory region leads to the additional software overhead +of performing cache maintenance for the affected data structures. However, since +the memory where the data structures are allocated is cacheable, the overhead is +mostly mitigated by an increase in performance. + +There is however a performance impact for bakery locks, due to: + +- Additional cache maintenance operations, and +- Multiple cache line reads for each lock operation, since the bakery locks + for each CPU are distributed across different cache lines. + +The implementation has been optimized to minimize this additional overhead. +Measurements indicate that when bakery locks are allocated in Normal memory, the +minimum latency of acquiring a lock is on an average 3-4 micro seconds whereas +in Device memory the same is 2 micro seconds. The measurements were done on the +Juno Arm development platform. + +As mentioned earlier, almost a page of memory can be saved by disabling +``USE_COHERENT_MEM``. Each platform needs to consider these trade-offs to decide +whether coherent memory should be used. If a platform disables +``USE_COHERENT_MEM`` and needs to use bakery locks in the porting layer, it can +optionally define macro ``PLAT_PERCPU_BAKERY_LOCK_SIZE`` (see the +:ref:`Porting Guide`). Refer to the reference platform code for examples. + +Isolating code and read-only data on separate memory pages +---------------------------------------------------------- + +In the Armv8-A VMSA, translation table entries include fields that define the +properties of the target memory region, such as its access permissions. The +smallest unit of memory that can be addressed by a translation table entry is +a memory page. Therefore, if software needs to set different permissions on two +memory regions then it needs to map them using different memory pages. + +The default memory layout for each BL image is as follows: + +:: + + | ... | + +-------------------+ + | Read-write data | + +-------------------+ Page boundary + | <Padding> | + +-------------------+ + | Exception vectors | + +-------------------+ 2 KB boundary + | <Padding> | + +-------------------+ + | Read-only data | + +-------------------+ + | Code | + +-------------------+ BLx_BASE + +.. note:: + The 2KB alignment for the exception vectors is an architectural + requirement. + +The read-write data start on a new memory page so that they can be mapped with +read-write permissions, whereas the code and read-only data below are configured +as read-only. + +However, the read-only data are not aligned on a page boundary. They are +contiguous to the code. Therefore, the end of the code section and the beginning +of the read-only data one might share a memory page. This forces both to be +mapped with the same memory attributes. As the code needs to be executable, this +means that the read-only data stored on the same memory page as the code are +executable as well. This could potentially be exploited as part of a security +attack. + +TF provides the build flag ``SEPARATE_CODE_AND_RODATA`` to isolate the code and +read-only data on separate memory pages. This in turn allows independent control +of the access permissions for the code and read-only data. In this case, +platform code gets a finer-grained view of the image layout and can +appropriately map the code region as executable and the read-only data as +execute-never. + +This has an impact on memory footprint, as padding bytes need to be introduced +between the code and read-only data to ensure the segregation of the two. To +limit the memory cost, this flag also changes the memory layout such that the +code and exception vectors are now contiguous, like so: + +:: + + | ... | + +-------------------+ + | Read-write data | + +-------------------+ Page boundary + | <Padding> | + +-------------------+ + | Read-only data | + +-------------------+ Page boundary + | <Padding> | + +-------------------+ + | Exception vectors | + +-------------------+ 2 KB boundary + | <Padding> | + +-------------------+ + | Code | + +-------------------+ BLx_BASE + +With this more condensed memory layout, the separation of read-only data will +add zero or one page to the memory footprint of each BL image. Each platform +should consider the trade-off between memory footprint and security. + +This build flag is disabled by default, minimising memory footprint. On Arm +platforms, it is enabled. + +Publish and Subscribe Framework +------------------------------- + +The Publish and Subscribe Framework allows EL3 components to define and publish +events, to which other EL3 components can subscribe. + +The following macros are provided by the framework: + +- ``REGISTER_PUBSUB_EVENT(event)``: Defines an event, and takes one argument, + the event name, which must be a valid C identifier. All calls to + ``REGISTER_PUBSUB_EVENT`` macro must be placed in the file + ``pubsub_events.h``. + +- ``PUBLISH_EVENT_ARG(event, arg)``: Publishes a defined event, by iterating + subscribed handlers and calling them in turn. The handlers will be passed the + parameter ``arg``. The expected use-case is to broadcast an event. + +- ``PUBLISH_EVENT(event)``: Like ``PUBLISH_EVENT_ARG``, except that the value + ``NULL`` is passed to subscribed handlers. + +- ``SUBSCRIBE_TO_EVENT(event, handler)``: Registers the ``handler`` to + subscribe to ``event``. The handler will be executed whenever the ``event`` + is published. + +- ``for_each_subscriber(event, subscriber)``: Iterates through all handlers + subscribed for ``event``. ``subscriber`` must be a local variable of type + ``pubsub_cb_t *``, and will point to each subscribed handler in turn during + iteration. This macro can be used for those patterns that none of the + ``PUBLISH_EVENT_*()`` macros cover. + +Publishing an event that wasn't defined using ``REGISTER_PUBSUB_EVENT`` will +result in build error. Subscribing to an undefined event however won't. + +Subscribed handlers must be of type ``pubsub_cb_t``, with following function +signature: + +.. code:: c + + typedef void* (*pubsub_cb_t)(const void *arg); + +There may be arbitrary number of handlers registered to the same event. The +order in which subscribed handlers are notified when that event is published is +not defined. Subscribed handlers may be executed in any order; handlers should +not assume any relative ordering amongst them. + +Publishing an event on a PE will result in subscribed handlers executing on that +PE only; it won't cause handlers to execute on a different PE. + +Note that publishing an event on a PE blocks until all the subscribed handlers +finish executing on the PE. + +TF-A generic code publishes and subscribes to some events within. Platform +ports are discouraged from subscribing to them. These events may be withdrawn, +renamed, or have their semantics altered in the future. Platforms may however +register, publish, and subscribe to platform-specific events. + +Publish and Subscribe Example +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A publisher that wants to publish event ``foo`` would: + +- Define the event ``foo`` in the ``pubsub_events.h``. + + .. code:: c + + REGISTER_PUBSUB_EVENT(foo); + +- Depending on the nature of event, use one of ``PUBLISH_EVENT_*()`` macros to + publish the event at the appropriate path and time of execution. + +A subscriber that wants to subscribe to event ``foo`` published above would +implement: + +.. code:: c + + void *foo_handler(const void *arg) + { + void *result; + + /* Do handling ... */ + + return result; + } + + SUBSCRIBE_TO_EVENT(foo, foo_handler); + + +Reclaiming the BL31 initialization code +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A significant amount of the code used for the initialization of BL31 is never +needed again after boot time. In order to reduce the runtime memory +footprint, the memory used for this code can be reclaimed after initialization +has finished and be used for runtime data. + +The build option ``RECLAIM_INIT_CODE`` can be set to mark this boot time code +with a ``.text.init.*`` attribute which can be filtered and placed suitably +within the BL image for later reclamation by the platform. The platform can +specify the filter and the memory region for this init section in BL31 via the +plat.ld.S linker script. For example, on the FVP, this section is placed +overlapping the secondary CPU stacks so that after the cold boot is done, this +memory can be reclaimed for the stacks. The init memory section is initially +mapped with ``RO``, ``EXECUTE`` attributes. After BL31 initialization has +completed, the FVP changes the attributes of this section to ``RW``, +``EXECUTE_NEVER`` allowing it to be used for runtime data. The memory attributes +are changed within the ``bl31_plat_runtime_setup`` platform hook. The init +section section can be reclaimed for any data which is accessed after cold +boot initialization and it is upto the platform to make the decision. + +.. _firmware_design_pmf: + +Performance Measurement Framework +--------------------------------- + +The Performance Measurement Framework (PMF) facilitates collection of +timestamps by registered services and provides interfaces to retrieve them +from within TF-A. A platform can choose to expose appropriate SMCs to +retrieve these collected timestamps. + +By default, the global physical counter is used for the timestamp +value and is read via ``CNTPCT_EL0``. The framework allows to retrieve +timestamps captured by other CPUs. + +Timestamp identifier format +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A PMF timestamp is uniquely identified across the system via the +timestamp ID or ``tid``. The ``tid`` is composed as follows: + +:: + + Bits 0-7: The local timestamp identifier. + Bits 8-9: Reserved. + Bits 10-15: The service identifier. + Bits 16-31: Reserved. + +#. The service identifier. Each PMF service is identified by a + service name and a service identifier. Both the service name and + identifier are unique within the system as a whole. + +#. The local timestamp identifier. This identifier is unique within a given + service. + +Registering a PMF service +~~~~~~~~~~~~~~~~~~~~~~~~~ + +To register a PMF service, the ``PMF_REGISTER_SERVICE()`` macro from ``pmf.h`` +is used. The arguments required are the service name, the service ID, +the total number of local timestamps to be captured and a set of flags. + +The ``flags`` field can be specified as a bitwise-OR of the following values: + +:: + + PMF_STORE_ENABLE: The timestamp is stored in memory for later retrieval. + PMF_DUMP_ENABLE: The timestamp is dumped on the serial console. + +The ``PMF_REGISTER_SERVICE()`` reserves memory to store captured +timestamps in a PMF specific linker section at build time. +Additionally, it defines necessary functions to capture and +retrieve a particular timestamp for the given service at runtime. + +The macro ``PMF_REGISTER_SERVICE()`` only enables capturing PMF timestamps +from within TF-A. In order to retrieve timestamps from outside of TF-A, the +``PMF_REGISTER_SERVICE_SMC()`` macro must be used instead. This macro +accepts the same set of arguments as the ``PMF_REGISTER_SERVICE()`` +macro but additionally supports retrieving timestamps using SMCs. + +Capturing a timestamp +~~~~~~~~~~~~~~~~~~~~~ + +PMF timestamps are stored in a per-service timestamp region. On a +system with multiple CPUs, each timestamp is captured and stored +in a per-CPU cache line aligned memory region. + +Having registered the service, the ``PMF_CAPTURE_TIMESTAMP()`` macro can be +used to capture a timestamp at the location where it is used. The macro +takes the service name, a local timestamp identifier and a flag as arguments. + +The ``flags`` field argument can be zero, or ``PMF_CACHE_MAINT`` which +instructs PMF to do cache maintenance following the capture. Cache +maintenance is required if any of the service's timestamps are captured +with data cache disabled. + +To capture a timestamp in assembly code, the caller should use +``pmf_calc_timestamp_addr`` macro (defined in ``pmf_asm_macros.S``) to +calculate the address of where the timestamp would be stored. The +caller should then read ``CNTPCT_EL0`` register to obtain the timestamp +and store it at the determined address for later retrieval. + +Retrieving a timestamp +~~~~~~~~~~~~~~~~~~~~~~ + +From within TF-A, timestamps for individual CPUs can be retrieved using either +``PMF_GET_TIMESTAMP_BY_MPIDR()`` or ``PMF_GET_TIMESTAMP_BY_INDEX()`` macros. +These macros accept the CPU's MPIDR value, or its ordinal position +respectively. + +From outside TF-A, timestamps for individual CPUs can be retrieved by calling +into ``pmf_smc_handler()``. + +:: + + Interface : pmf_smc_handler() + Argument : unsigned int smc_fid, u_register_t x1, + u_register_t x2, u_register_t x3, + u_register_t x4, void *cookie, + void *handle, u_register_t flags + Return : uintptr_t + + smc_fid: Holds the SMC identifier which is either `PMF_SMC_GET_TIMESTAMP_32` + when the caller of the SMC is running in AArch32 mode + or `PMF_SMC_GET_TIMESTAMP_64` when the caller is running in AArch64 mode. + x1: Timestamp identifier. + x2: The `mpidr` of the CPU for which the timestamp has to be retrieved. + This can be the `mpidr` of a different core to the one initiating + the SMC. In that case, service specific cache maintenance may be + required to ensure the updated copy of the timestamp is returned. + x3: A flags value that is either 0 or `PMF_CACHE_MAINT`. If + `PMF_CACHE_MAINT` is passed, then the PMF code will perform a + cache invalidate before reading the timestamp. This ensures + an updated copy is returned. + +The remaining arguments, ``x4``, ``cookie``, ``handle`` and ``flags`` are unused +in this implementation. + +PMF code structure +~~~~~~~~~~~~~~~~~~ + +#. ``pmf_main.c`` consists of core functions that implement service registration, + initialization, storing, dumping and retrieving timestamps. + +#. ``pmf_smc.c`` contains the SMC handling for registered PMF services. + +#. ``pmf.h`` contains the public interface to Performance Measurement Framework. + +#. ``pmf_asm_macros.S`` consists of macros to facilitate capturing timestamps in + assembly code. + +#. ``pmf_helpers.h`` is an internal header used by ``pmf.h``. + +Armv8-A Architecture Extensions +------------------------------- + +TF-A makes use of Armv8-A Architecture Extensions where applicable. This +section lists the usage of Architecture Extensions, and build flags +controlling them. + +In general, and unless individually mentioned, the build options +``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` select the Architecture Extension to +target when building TF-A. Subsequent Arm Architecture Extensions are backward +compatible with previous versions. + +The build system only requires that ``ARM_ARCH_MAJOR`` and ``ARM_ARCH_MINOR`` have a +valid numeric value. These build options only control whether or not +Architecture Extension-specific code is included in the build. Otherwise, TF-A +targets the base Armv8.0-A architecture; i.e. as if ``ARM_ARCH_MAJOR`` == 8 +and ``ARM_ARCH_MINOR`` == 0, which are also their respective default values. + +.. seealso:: :ref:`Build Options` + +For details on the Architecture Extension and available features, please refer +to the respective Architecture Extension Supplement. + +Armv8.1-A +~~~~~~~~~ + +This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` >= 8, or when +``ARM_ARCH_MAJOR`` == 8 and ``ARM_ARCH_MINOR`` >= 1. + +- By default, a load-/store-exclusive instruction pair is used to implement + spinlocks. The ``USE_SPINLOCK_CAS`` build option when set to 1 selects the + spinlock implementation using the ARMv8.1-LSE Compare and Swap instruction. + Notice this instruction is only available in AArch64 execution state, so + the option is only available to AArch64 builds. + +Armv8.2-A +~~~~~~~~~ + +- The presence of ARMv8.2-TTCNP is detected at runtime. When it is present, the + Common not Private (TTBRn_ELx.CnP) bit is enabled to indicate that multiple + Processing Elements in the same Inner Shareable domain use the same + translation table entries for a given stage of translation for a particular + translation regime. + +Armv8.3-A +~~~~~~~~~ + +- Pointer authentication features of Armv8.3-A are unconditionally enabled in + the Non-secure world so that lower ELs are allowed to use them without + causing a trap to EL3. + + In order to enable the Secure world to use it, ``CTX_INCLUDE_PAUTH_REGS`` + must be set to 1. This will add all pointer authentication system registers + to the context that is saved when doing a world switch. + + The TF-A itself has support for pointer authentication at runtime + that can be enabled by setting ``BRANCH_PROTECTION`` option to non-zero and + ``CTX_INCLUDE_PAUTH_REGS`` to 1. This enables pointer authentication in BL1, + BL2, BL31, and the TSP if it is used. + + Note that Pointer Authentication is enabled for Non-secure world irrespective + of the value of these build flags if the CPU supports it. + + If ``ARM_ARCH_MAJOR == 8`` and ``ARM_ARCH_MINOR >= 3`` the code footprint of + enabling PAuth is lower because the compiler will use the optimized + PAuth instructions rather than the backwards-compatible ones. + +Armv8.5-A +~~~~~~~~~ + +- Branch Target Identification feature is selected by ``BRANCH_PROTECTION`` + option set to 1. This option defaults to 0. + +- Memory Tagging Extension feature is unconditionally enabled for both worlds + (at EL0 and S-EL0) if it is only supported at EL0. If instead it is + implemented at all ELs, it is unconditionally enabled for only the normal + world. To enable it for the secure world as well, the build option + ``CTX_INCLUDE_MTE_REGS`` is required. If the hardware does not implement + MTE support at all, it is always disabled, no matter what build options + are used. + +Armv7-A +~~~~~~~ + +This Architecture Extension is targeted when ``ARM_ARCH_MAJOR`` == 7. + +There are several Armv7-A extensions available. Obviously the TrustZone +extension is mandatory to support the TF-A bootloader and runtime services. + +Platform implementing an Armv7-A system can to define from its target +Cortex-A architecture through ``ARM_CORTEX_A<X> = yes`` in their +``platform.mk`` script. For example ``ARM_CORTEX_A15=yes`` for a +Cortex-A15 target. + +Platform can also set ``ARM_WITH_NEON=yes`` to enable neon support. +Note that using neon at runtime has constraints on non secure world context. +TF-A does not yet provide VFP context management. + +Directive ``ARM_CORTEX_A<x>`` and ``ARM_WITH_NEON`` are used to set +the toolchain target architecture directive. + +Platform may choose to not define straight the toolchain target architecture +directive by defining ``MARCH32_DIRECTIVE``. +I.e: + +.. code:: make + + MARCH32_DIRECTIVE := -mach=armv7-a + +Code Structure +-------------- + +TF-A code is logically divided between the three boot loader stages mentioned +in the previous sections. The code is also divided into the following +categories (present as directories in the source code): + +- **Platform specific.** Choice of architecture specific code depends upon + the platform. +- **Common code.** This is platform and architecture agnostic code. +- **Library code.** This code comprises of functionality commonly used by all + other code. The PSCI implementation and other EL3 runtime frameworks reside + as Library components. +- **Stage specific.** Code specific to a boot stage. +- **Drivers.** +- **Services.** EL3 runtime services (eg: SPD). Specific SPD services + reside in the ``services/spd`` directory (e.g. ``services/spd/tspd``). + +Each boot loader stage uses code from one or more of the above mentioned +categories. Based upon the above, the code layout looks like this: + +:: + + Directory Used by BL1? Used by BL2? Used by BL31? + bl1 Yes No No + bl2 No Yes No + bl31 No No Yes + plat Yes Yes Yes + drivers Yes No Yes + common Yes Yes Yes + lib Yes Yes Yes + services No No Yes + +The build system provides a non configurable build option IMAGE_BLx for each +boot loader stage (where x = BL stage). e.g. for BL1 , IMAGE_BL1 will be +defined by the build system. This enables TF-A to compile certain code only +for specific boot loader stages + +All assembler files have the ``.S`` extension. The linker source files for each +boot stage have the extension ``.ld.S``. These are processed by GCC to create the +linker scripts which have the extension ``.ld``. + +FDTs provide a description of the hardware platform and are used by the Linux +kernel at boot time. These can be found in the ``fdts`` directory. + +.. rubric:: References + +- `Trusted Board Boot Requirements CLIENT (TBBR-CLIENT) Armv8-A (ARM DEN0006D)`_ + +- `Power State Coordination Interface PDD`_ + +- `SMC Calling Convention`_ + +- :ref:`Interrupt Management Framework` + +-------------- + +*Copyright (c) 2013-2022, Arm Limited and Contributors. All rights reserved.* + +.. _Power State Coordination Interface PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf +.. _SMCCC: https://developer.arm.com/docs/den0028/latest +.. _PSCI: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf +.. _Power State Coordination Interface PDD: http://infocenter.arm.com/help/topic/com.arm.doc.den0022d/Power_State_Coordination_Interface_PDD_v1_1_DEN0022D.pdf +.. _Arm ARM: https://developer.arm.com/docs/ddi0487/latest +.. _SMC Calling Convention: https://developer.arm.com/docs/den0028/latest +.. _Trusted Board Boot Requirements CLIENT (TBBR-CLIENT) Armv8-A (ARM DEN0006D): https://developer.arm.com/docs/den0006/latest/trusted-board-boot-requirements-client-tbbr-client-armv8-a +.. _Arm Confidential Compute Architecture (Arm CCA): https://www.arm.com/why-arm/architecture/security-features/arm-confidential-compute-architecture + +.. |Image 1| image:: ../resources/diagrams/rt-svc-descs-layout.png diff --git a/docs/design/index.rst b/docs/design/index.rst new file mode 100644 index 0000000..17ef756 --- /dev/null +++ b/docs/design/index.rst @@ -0,0 +1,20 @@ +System Design +============= + +.. toctree:: + :maxdepth: 1 + :caption: Contents + + alt-boot-flows + auth-framework + cpu-specific-build-macros + firmware-design + interrupt-framework-design + psci-pd-tree + reset-design + trusted-board-boot + trusted-board-boot-build + +-------------- + +*Copyright (c) 2019, Arm Limited. All rights reserved.* diff --git a/docs/design/interrupt-framework-design.rst b/docs/design/interrupt-framework-design.rst new file mode 100644 index 0000000..dfb2eac --- /dev/null +++ b/docs/design/interrupt-framework-design.rst @@ -0,0 +1,1021 @@ +Interrupt Management Framework +============================== + +This framework is responsible for managing interrupts routed to EL3. It also +allows EL3 software to configure the interrupt routing behavior. Its main +objective is to implement the following two requirements. + +#. It should be possible to route interrupts meant to be handled by secure + software (Secure interrupts) to EL3, when execution is in non-secure state + (normal world). The framework should then take care of handing control of + the interrupt to either software in EL3 or Secure-EL1 depending upon the + software configuration and the GIC implementation. This requirement ensures + that secure interrupts are under the control of the secure software with + respect to their delivery and handling without the possibility of + intervention from non-secure software. + +#. It should be possible to route interrupts meant to be handled by + non-secure software (Non-secure interrupts) to the last executed exception + level in the normal world when the execution is in secure world at + exception levels lower than EL3. This could be done with or without the + knowledge of software executing in Secure-EL1/Secure-EL0. The choice of + approach should be governed by the secure software. This requirement + ensures that non-secure software is able to execute in tandem with the + secure software without overriding it. + +Concepts +-------- + +Interrupt types +~~~~~~~~~~~~~~~ + +The framework categorises an interrupt to be one of the following depending upon +the exception level(s) it is handled in. + +#. Secure EL1 interrupt. This type of interrupt can be routed to EL3 or + Secure-EL1 depending upon the security state of the current execution + context. It is always handled in Secure-EL1. + +#. Non-secure interrupt. This type of interrupt can be routed to EL3, + Secure-EL1, Non-secure EL1 or EL2 depending upon the security state of the + current execution context. It is always handled in either Non-secure EL1 + or EL2. + +#. EL3 interrupt. This type of interrupt can be routed to EL3 or Secure-EL1 + depending upon the security state of the current execution context. It is + always handled in EL3. + +The following constants define the various interrupt types in the framework +implementation. + +.. code:: c + + #define INTR_TYPE_S_EL1 0 + #define INTR_TYPE_EL3 1 + #define INTR_TYPE_NS 2 + +Routing model +~~~~~~~~~~~~~ + +A type of interrupt can be either generated as an FIQ or an IRQ. The target +exception level of an interrupt type is configured through the FIQ and IRQ bits +in the Secure Configuration Register at EL3 (``SCR_EL3.FIQ`` and ``SCR_EL3.IRQ`` +bits). When ``SCR_EL3.FIQ``\ =1, FIQs are routed to EL3. Otherwise they are routed +to the First Exception Level (FEL) capable of handling interrupts. When +``SCR_EL3.IRQ``\ =1, IRQs are routed to EL3. Otherwise they are routed to the +FEL. This register is configured independently by EL3 software for each security +state prior to entry into a lower exception level in that security state. + +A routing model for a type of interrupt (generated as FIQ or IRQ) is defined as +its target exception level for each security state. It is represented by a +single bit for each security state. A value of ``0`` means that the interrupt +should be routed to the FEL. A value of ``1`` means that the interrupt should be +routed to EL3. A routing model is applicable only when execution is not in EL3. + +The default routing model for an interrupt type is to route it to the FEL in +either security state. + +Valid routing models +~~~~~~~~~~~~~~~~~~~~ + +The framework considers certain routing models for each type of interrupt to be +incorrect as they conflict with the requirements mentioned in Section 1. The +following sub-sections describe all the possible routing models and specify +which ones are valid or invalid. EL3 interrupts are currently supported only +for GIC version 3.0 (Arm GICv3) and only the Secure-EL1 and Non-secure interrupt +types are supported for GIC version 2.0 (Arm GICv2) (see `Assumptions in +Interrupt Management Framework`_). The terminology used in the following +sub-sections is explained below. + +#. **CSS**. Current Security State. ``0`` when secure and ``1`` when non-secure + +#. **TEL3**. Target Exception Level 3. ``0`` when targeted to the FEL. ``1`` when + targeted to EL3. + +Secure-EL1 interrupts +^^^^^^^^^^^^^^^^^^^^^ + +#. **CSS=0, TEL3=0**. Interrupt is routed to the FEL when execution is in + secure state. This is a valid routing model as secure software is in + control of handling secure interrupts. + +#. **CSS=0, TEL3=1**. Interrupt is routed to EL3 when execution is in secure + state. This is a valid routing model as secure software in EL3 can + handover the interrupt to Secure-EL1 for handling. + +#. **CSS=1, TEL3=0**. Interrupt is routed to the FEL when execution is in + non-secure state. This is an invalid routing model as a secure interrupt + is not visible to the secure software which violates the motivation behind + the Arm Security Extensions. + +#. **CSS=1, TEL3=1**. Interrupt is routed to EL3 when execution is in + non-secure state. This is a valid routing model as secure software in EL3 + can handover the interrupt to Secure-EL1 for handling. + +Non-secure interrupts +^^^^^^^^^^^^^^^^^^^^^ + +#. **CSS=0, TEL3=0**. Interrupt is routed to the FEL when execution is in + secure state. This allows the secure software to trap non-secure + interrupts, perform its book-keeping and hand the interrupt to the + non-secure software through EL3. This is a valid routing model as secure + software is in control of how its execution is preempted by non-secure + interrupts. + +#. **CSS=0, TEL3=1**. Interrupt is routed to EL3 when execution is in secure + state. This is a valid routing model as secure software in EL3 can save + the state of software in Secure-EL1/Secure-EL0 before handing the + interrupt to non-secure software. This model requires additional + coordination between Secure-EL1 and EL3 software to ensure that the + former's state is correctly saved by the latter. + +#. **CSS=1, TEL3=0**. Interrupt is routed to FEL when execution is in + non-secure state. This is a valid routing model as a non-secure interrupt + is handled by non-secure software. + +#. **CSS=1, TEL3=1**. Interrupt is routed to EL3 when execution is in + non-secure state. This is an invalid routing model as there is no valid + reason to route the interrupt to EL3 software and then hand it back to + non-secure software for handling. + +.. _EL3 interrupts: + +EL3 interrupts +^^^^^^^^^^^^^^ + +#. **CSS=0, TEL3=0**. Interrupt is routed to the FEL when execution is in + Secure-EL1/Secure-EL0. This is a valid routing model as secure software + in Secure-EL1/Secure-EL0 is in control of how its execution is preempted + by EL3 interrupt and can handover the interrupt to EL3 for handling. + + However, when ``EL3_EXCEPTION_HANDLING`` is ``1``, this routing model is + invalid as EL3 interrupts are unconditionally routed to EL3, and EL3 + interrupts will always preempt Secure EL1/EL0 execution. See :ref:`exception + handling<interrupt-handling>` documentation. + +#. **CSS=0, TEL3=1**. Interrupt is routed to EL3 when execution is in + Secure-EL1/Secure-EL0. This is a valid routing model as secure software + in EL3 can handle the interrupt. + +#. **CSS=1, TEL3=0**. Interrupt is routed to the FEL when execution is in + non-secure state. This is an invalid routing model as a secure interrupt + is not visible to the secure software which violates the motivation behind + the Arm Security Extensions. + +#. **CSS=1, TEL3=1**. Interrupt is routed to EL3 when execution is in + non-secure state. This is a valid routing model as secure software in EL3 + can handle the interrupt. + +Mapping of interrupt type to signal +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The framework is meant to work with any interrupt controller implemented by a +platform. A interrupt controller could generate a type of interrupt as either an +FIQ or IRQ signal to the CPU depending upon the current security state. The +mapping between the type and signal is known only to the platform. The framework +uses this information to determine whether the IRQ or the FIQ bit should be +programmed in ``SCR_EL3`` while applying the routing model for a type of +interrupt. The platform provides this information through the +``plat_interrupt_type_to_line()`` API (described in the +:ref:`Porting Guide`). For example, on the FVP port when the platform uses an +Arm GICv2 interrupt controller, Secure-EL1 interrupts are signaled through the +FIQ signal while Non-secure interrupts are signaled through the IRQ signal. +This applies when execution is in either security state. + +Effect of mapping of several interrupt types to one signal +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +It should be noted that if more than one interrupt type maps to a single +interrupt signal, and if any one of the interrupt type sets **TEL3=1** for a +particular security state, then interrupt signal will be routed to EL3 when in +that security state. This means that all the other interrupt types using the +same interrupt signal will be forced to the same routing model. This should be +borne in mind when choosing the routing model for an interrupt type. + +For example, in Arm GICv3, when the execution context is Secure-EL1/ +Secure-EL0, both the EL3 and the non secure interrupt types map to the FIQ +signal. So if either one of the interrupt type sets the routing model so +that **TEL3=1** when **CSS=0**, the FIQ bit in ``SCR_EL3`` will be programmed to +route the FIQ signal to EL3 when executing in Secure-EL1/Secure-EL0, thereby +effectively routing the other interrupt type also to EL3. + +Assumptions in Interrupt Management Framework +--------------------------------------------- + +The framework makes the following assumptions to simplify its implementation. + +#. Although the framework has support for 2 types of secure interrupts (EL3 + and Secure-EL1 interrupt), only interrupt controller architectures + like Arm GICv3 has architectural support for EL3 interrupts in the form of + Group 0 interrupts. In Arm GICv2, all secure interrupts are assumed to be + handled in Secure-EL1. They can be delivered to Secure-EL1 via EL3 but they + cannot be handled in EL3. + +#. Interrupt exceptions (``PSTATE.I`` and ``F`` bits) are masked during execution + in EL3. + +#. Interrupt management: the following sections describe how interrupts are + managed by the interrupt handling framework. This entails: + + #. Providing an interface to allow registration of a handler and + specification of the routing model for a type of interrupt. + + #. Implementing support to hand control of an interrupt type to its + registered handler when the interrupt is generated. + +Both aspects of interrupt management involve various components in the secure +software stack spanning from EL3 to Secure-EL1. These components are described +in the section `Software components`_. The framework stores information +associated with each type of interrupt in the following data structure. + +.. code:: c + + typedef struct intr_type_desc { + interrupt_type_handler_t handler; + uint32_t flags; + uint32_t scr_el3[2]; + } intr_type_desc_t; + +The ``flags`` field stores the routing model for the interrupt type in +bits[1:0]. Bit[0] stores the routing model when execution is in the secure +state. Bit[1] stores the routing model when execution is in the non-secure +state. As mentioned in Section `Routing model`_, a value of ``0`` implies that +the interrupt should be targeted to the FEL. A value of ``1`` implies that it +should be targeted to EL3. The remaining bits are reserved and SBZ. The helper +macro ``set_interrupt_rm_flag()`` should be used to set the bits in the +``flags`` parameter. + +The ``scr_el3[2]`` field also stores the routing model but as a mapping of the +model in the ``flags`` field to the corresponding bit in the ``SCR_EL3`` for each +security state. + +The framework also depends upon the platform port to configure the interrupt +controller to distinguish between secure and non-secure interrupts. The platform +is expected to be aware of the secure devices present in the system and their +associated interrupt numbers. It should configure the interrupt controller to +enable the secure interrupts, ensure that their priority is always higher than +the non-secure interrupts and target them to the primary CPU. It should also +export the interface described in the :ref:`Porting Guide` to enable +handling of interrupts. + +In the remainder of this document, for the sake of simplicity a Arm GICv2 system +is considered and it is assumed that the FIQ signal is used to generate Secure-EL1 +interrupts and the IRQ signal is used to generate non-secure interrupts in either +security state. EL3 interrupts are not considered. + +Software components +------------------- + +Roles and responsibilities for interrupt management are sub-divided between the +following components of software running in EL3 and Secure-EL1. Each component is +briefly described below. + +#. EL3 Runtime Firmware. This component is common to all ports of TF-A. + +#. Secure Payload Dispatcher (SPD) service. This service interfaces with the + Secure Payload (SP) software which runs in Secure-EL1/Secure-EL0 and is + responsible for switching execution between secure and non-secure states. + A switch is triggered by a Secure Monitor Call and it uses the APIs + exported by the Context management library to implement this functionality. + Switching execution between the two security states is a requirement for + interrupt management as well. This results in a significant dependency on + the SPD service. TF-A implements an example Test Secure Payload Dispatcher + (TSPD) service. + + An SPD service plugs into the EL3 runtime firmware and could be common to + some ports of TF-A. + +#. Secure Payload (SP). On a production system, the Secure Payload corresponds + to a Secure OS which runs in Secure-EL1/Secure-EL0. It interfaces with the + SPD service to manage communication with non-secure software. TF-A + implements an example secure payload called Test Secure Payload (TSP) + which runs only in Secure-EL1. + + A Secure payload implementation could be common to some ports of TF-A, + just like the SPD service. + +Interrupt registration +---------------------- + +This section describes in detail the role of each software component (see +`Software components`_) during the registration of a handler for an interrupt +type. + +.. _el3-runtime-firmware: + +EL3 runtime firmware +~~~~~~~~~~~~~~~~~~~~ + +This component declares the following prototype for a handler of an interrupt type. + +.. code:: c + + typedef uint64_t (*interrupt_type_handler_t)(uint32_t id, + uint32_t flags, + void *handle, + void *cookie); + +The ``id`` is parameter is reserved and could be used in the future for passing +the interrupt id of the highest pending interrupt only if there is a foolproof +way of determining the id. Currently it contains ``INTR_ID_UNAVAILABLE``. + +The ``flags`` parameter contains miscellaneous information as follows. + +#. Security state, bit[0]. This bit indicates the security state of the lower + exception level when the interrupt was generated. A value of ``1`` means + that it was in the non-secure state. A value of ``0`` indicates that it was + in the secure state. This bit can be used by the handler to ensure that + interrupt was generated and routed as per the routing model specified + during registration. + +#. Reserved, bits[31:1]. The remaining bits are reserved for future use. + +The ``handle`` parameter points to the ``cpu_context`` structure of the current CPU +for the security state specified in the ``flags`` parameter. + +Once the handler routine completes, execution will return to either the secure +or non-secure state. The handler routine must return a pointer to +``cpu_context`` structure of the current CPU for the target security state. On +AArch64, this return value is currently ignored by the caller as the +appropriate ``cpu_context`` to be used is expected to be set by the handler +via the context management library APIs. +A portable interrupt handler implementation must set the target context both in +the structure pointed to by the returned pointer and via the context management +library APIs. The handler should treat all error conditions as critical errors +and take appropriate action within its implementation e.g. use assertion +failures. + +The runtime firmware provides the following API for registering a handler for a +particular type of interrupt. A Secure Payload Dispatcher service should use +this API to register a handler for Secure-EL1 and optionally for non-secure +interrupts. This API also requires the caller to specify the routing model for +the type of interrupt. + +.. code:: c + + int32_t register_interrupt_type_handler(uint32_t type, + interrupt_type_handler handler, + uint64_t flags); + +The ``type`` parameter can be one of the three interrupt types listed above i.e. +``INTR_TYPE_S_EL1``, ``INTR_TYPE_NS`` & ``INTR_TYPE_EL3``. The ``flags`` parameter +is as described in Section 2. + +The function will return ``0`` upon a successful registration. It will return +``-EALREADY`` in case a handler for the interrupt type has already been +registered. If the ``type`` is unrecognised or the ``flags`` or the ``handler`` are +invalid it will return ``-EINVAL``. + +Interrupt routing is governed by the configuration of the ``SCR_EL3.FIQ/IRQ`` bits +prior to entry into a lower exception level in either security state. The +context management library maintains a copy of the ``SCR_EL3`` system register for +each security state in the ``cpu_context`` structure of each CPU. It exports the +following APIs to let EL3 Runtime Firmware program and retrieve the routing +model for each security state for the current CPU. The value of ``SCR_EL3`` stored +in the ``cpu_context`` is used by the ``el3_exit()`` function to program the +``SCR_EL3`` register prior to returning from the EL3 exception level. + +.. code:: c + + uint32_t cm_get_scr_el3(uint32_t security_state); + void cm_write_scr_el3_bit(uint32_t security_state, + uint32_t bit_pos, + uint32_t value); + +``cm_get_scr_el3()`` returns the value of the ``SCR_EL3`` register for the specified +security state of the current CPU. ``cm_write_scr_el3_bit()`` writes a ``0`` or ``1`` +to the bit specified by ``bit_pos``. ``register_interrupt_type_handler()`` invokes +``set_routing_model()`` API which programs the ``SCR_EL3`` according to the routing +model using the ``cm_get_scr_el3()`` and ``cm_write_scr_el3_bit()`` APIs. + +It is worth noting that in the current implementation of the framework, the EL3 +runtime firmware is responsible for programming the routing model. The SPD is +responsible for ensuring that the routing model has been adhered to upon +receiving an interrupt. + +.. _spd-int-registration: + +Secure payload dispatcher +~~~~~~~~~~~~~~~~~~~~~~~~~ + +A SPD service is responsible for determining and maintaining the interrupt +routing model supported by itself and the Secure Payload. It is also responsible +for ferrying interrupts between secure and non-secure software depending upon +the routing model. It could determine the routing model at build time or at +runtime. It must use this information to register a handler for each interrupt +type using the ``register_interrupt_type_handler()`` API in EL3 runtime firmware. + +If the routing model is not known to the SPD service at build time, then it must +be provided by the SP as the result of its initialisation. The SPD should +program the routing model only after SP initialisation has completed e.g. in the +SPD initialisation function pointed to by the ``bl32_init`` variable. + +The SPD should determine the mechanism to pass control to the Secure Payload +after receiving an interrupt from the EL3 runtime firmware. This information +could either be provided to the SPD service at build time or by the SP at +runtime. + +Test secure payload dispatcher behavior +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. note:: + Where this document discusses ``TSP_NS_INTR_ASYNC_PREEMPT`` as being + ``1``, the same results also apply when ``EL3_EXCEPTION_HANDLING`` is ``1``. + +The TSPD only handles Secure-EL1 interrupts and is provided with the following +routing model at build time. + +- Secure-EL1 interrupts are routed to EL3 when execution is in non-secure + state and are routed to the FEL when execution is in the secure state + i.e **CSS=0, TEL3=0** & **CSS=1, TEL3=1** for Secure-EL1 interrupts + +- When the build flag ``TSP_NS_INTR_ASYNC_PREEMPT`` is zero, the default routing + model is used for non-secure interrupts. They are routed to the FEL in + either security state i.e **CSS=0, TEL3=0** & **CSS=1, TEL3=0** for + Non-secure interrupts. + +- When the build flag ``TSP_NS_INTR_ASYNC_PREEMPT`` is defined to 1, then the + non secure interrupts are routed to EL3 when execution is in secure state + i.e **CSS=0, TEL3=1** for non-secure interrupts. This effectively preempts + Secure-EL1. The default routing model is used for non secure interrupts in + non-secure state. i.e **CSS=1, TEL3=0**. + +It performs the following actions in the ``tspd_init()`` function to fulfill the +requirements mentioned earlier. + +#. It passes control to the Test Secure Payload to perform its + initialisation. The TSP provides the address of the vector table + ``tsp_vectors`` in the SP which also includes the handler for Secure-EL1 + interrupts in the ``sel1_intr_entry`` field. The TSPD passes control to the TSP at + this address when it receives a Secure-EL1 interrupt. + + The handover agreement between the TSP and the TSPD requires that the TSPD + masks all interrupts (``PSTATE.DAIF`` bits) when it calls + ``tsp_sel1_intr_entry()``. The TSP has to preserve the callee saved general + purpose, SP_EL1/Secure-EL0, LR, VFP and system registers. It can use + ``x0-x18`` to enable its C runtime. + +#. The TSPD implements a handler function for Secure-EL1 interrupts. This + function is registered with the EL3 runtime firmware using the + ``register_interrupt_type_handler()`` API as follows + + .. code:: c + + /* Forward declaration */ + interrupt_type_handler tspd_secure_el1_interrupt_handler; + int32_t rc, flags = 0; + set_interrupt_rm_flag(flags, NON_SECURE); + rc = register_interrupt_type_handler(INTR_TYPE_S_EL1, + tspd_secure_el1_interrupt_handler, + flags); + if (rc) + panic(); + +#. When the build flag ``TSP_NS_INTR_ASYNC_PREEMPT`` is defined to 1, the TSPD + implements a handler function for non-secure interrupts. This function is + registered with the EL3 runtime firmware using the + ``register_interrupt_type_handler()`` API as follows + + .. code:: c + + /* Forward declaration */ + interrupt_type_handler tspd_ns_interrupt_handler; + int32_t rc, flags = 0; + set_interrupt_rm_flag(flags, SECURE); + rc = register_interrupt_type_handler(INTR_TYPE_NS, + tspd_ns_interrupt_handler, + flags); + if (rc) + panic(); + +.. _sp-int-registration: + +Secure payload +~~~~~~~~~~~~~~ + +A Secure Payload must implement an interrupt handling framework at Secure-EL1 +(Secure-EL1 IHF) to support its chosen interrupt routing model. Secure payload +execution will alternate between the below cases. + +#. In the code where IRQ, FIQ or both interrupts are enabled, if an interrupt + type is targeted to the FEL, then it will be routed to the Secure-EL1 + exception vector table. This is defined as the **asynchronous mode** of + handling interrupts. This mode applies to both Secure-EL1 and non-secure + interrupts. + +#. In the code where both interrupts are disabled, if an interrupt type is + targeted to the FEL, then execution will eventually migrate to the + non-secure state. Any non-secure interrupts will be handled as described + in the routing model where **CSS=1 and TEL3=0**. Secure-EL1 interrupts + will be routed to EL3 (as per the routing model where **CSS=1 and + TEL3=1**) where the SPD service will hand them to the SP. This is defined + as the **synchronous mode** of handling interrupts. + +The interrupt handling framework implemented by the SP should support one or +both these interrupt handling models depending upon the chosen routing model. + +The following list briefly describes how the choice of a valid routing model +(see `Valid routing models`_) effects the implementation of the Secure-EL1 +IHF. If the choice of the interrupt routing model is not known to the SPD +service at compile time, then the SP should pass this information to the SPD +service at runtime during its initialisation phase. + +As mentioned earlier, an Arm GICv2 system is considered and it is assumed that +the FIQ signal is used to generate Secure-EL1 interrupts and the IRQ signal +is used to generate non-secure interrupts in either security state. + +Secure payload IHF design w.r.t secure-EL1 interrupts +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +#. **CSS=0, TEL3=0**. If ``PSTATE.F=0``, Secure-EL1 interrupts will be + triggered at one of the Secure-EL1 FIQ exception vectors. The Secure-EL1 + IHF should implement support for handling FIQ interrupts asynchronously. + + If ``PSTATE.F=1`` then Secure-EL1 interrupts will be handled as per the + synchronous interrupt handling model. The SP could implement this scenario + by exporting a separate entrypoint for Secure-EL1 interrupts to the SPD + service during the registration phase. The SPD service would also need to + know the state of the system, general purpose and the ``PSTATE`` registers + in which it should arrange to return execution to the SP. The SP should + provide this information in an implementation defined way during the + registration phase if it is not known to the SPD service at build time. + +#. **CSS=1, TEL3=1**. Interrupts are routed to EL3 when execution is in + non-secure state. They should be handled through the synchronous interrupt + handling model as described in 1. above. + +#. **CSS=0, TEL3=1**. Secure-EL1 interrupts are routed to EL3 when execution + is in secure state. They will not be visible to the SP. The ``PSTATE.F`` bit + in Secure-EL1/Secure-EL0 will not mask FIQs. The EL3 runtime firmware will + call the handler registered by the SPD service for Secure-EL1 interrupts. + Secure-EL1 IHF should then handle all Secure-EL1 interrupt through the + synchronous interrupt handling model described in 1. above. + +Secure payload IHF design w.r.t non-secure interrupts +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +#. **CSS=0, TEL3=0**. If ``PSTATE.I=0``, non-secure interrupts will be + triggered at one of the Secure-EL1 IRQ exception vectors . The Secure-EL1 + IHF should co-ordinate with the SPD service to transfer execution to the + non-secure state where the interrupt should be handled e.g the SP could + allocate a function identifier to issue a SMC64 or SMC32 to the SPD + service which indicates that the SP execution has been preempted by a + non-secure interrupt. If this function identifier is not known to the SPD + service at compile time then the SP could provide it during the + registration phase. + + If ``PSTATE.I=1`` then the non-secure interrupt will pend until execution + resumes in the non-secure state. + +#. **CSS=0, TEL3=1**. Non-secure interrupts are routed to EL3. They will not + be visible to the SP. The ``PSTATE.I`` bit in Secure-EL1/Secure-EL0 will + have not effect. The SPD service should register a non-secure interrupt + handler which should save the SP state correctly and resume execution in + the non-secure state where the interrupt will be handled. The Secure-EL1 + IHF does not need to take any action. + +#. **CSS=1, TEL3=0**. Non-secure interrupts are handled in the FEL in + non-secure state (EL1/EL2) and are not visible to the SP. This routing + model does not affect the SP behavior. + +A Secure Payload must also ensure that all Secure-EL1 interrupts are correctly +configured at the interrupt controller by the platform port of the EL3 runtime +firmware. It should configure any additional Secure-EL1 interrupts which the EL3 +runtime firmware is not aware of through its platform port. + +Test secure payload behavior +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The routing model for Secure-EL1 and non-secure interrupts chosen by the TSP is +described in Section `Secure Payload Dispatcher`__. It is known to the TSPD +service at build time. + +.. __: #spd-int-registration + +The TSP implements an entrypoint (``tsp_sel1_intr_entry()``) for handling Secure-EL1 +interrupts taken in non-secure state and routed through the TSPD service +(synchronous handling model). It passes the reference to this entrypoint via +``tsp_vectors`` to the TSPD service. + +The TSP also replaces the default exception vector table referenced through the +``early_exceptions`` variable, with a vector table capable of handling FIQ and IRQ +exceptions taken at the same (Secure-EL1) exception level. This table is +referenced through the ``tsp_exceptions`` variable and programmed into the +VBAR_EL1. It caters for the asynchronous handling model. + +The TSP also programs the Secure Physical Timer in the Arm Generic Timer block +to raise a periodic interrupt (every half a second) for the purpose of testing +interrupt management across all the software components listed in `Software +components`_. + +Interrupt handling +------------------ + +This section describes in detail the role of each software component (see +Section `Software components`_) in handling an interrupt of a particular type. + +EL3 runtime firmware +~~~~~~~~~~~~~~~~~~~~ + +The EL3 runtime firmware populates the IRQ and FIQ exception vectors referenced +by the ``runtime_exceptions`` variable as follows. + +#. IRQ and FIQ exceptions taken from the current exception level with + ``SP_EL0`` or ``SP_EL3`` are reported as irrecoverable error conditions. As + mentioned earlier, EL3 runtime firmware always executes with the + ``PSTATE.I`` and ``PSTATE.F`` bits set. + +#. The following text describes how the IRQ and FIQ exceptions taken from a + lower exception level using AArch64 or AArch32 are handled. + +When an interrupt is generated, the vector for each interrupt type is +responsible for: + +#. Saving the entire general purpose register context (x0-x30) immediately + upon exception entry. The registers are saved in the per-cpu ``cpu_context`` + data structure referenced by the ``SP_EL3``\ register. + +#. Saving the ``ELR_EL3``, ``SP_EL0`` and ``SPSR_EL3`` system registers in the + per-cpu ``cpu_context`` data structure referenced by the ``SP_EL3`` register. + +#. Switching to the C runtime stack by restoring the ``CTX_RUNTIME_SP`` value + from the per-cpu ``cpu_context`` data structure in ``SP_EL0`` and + executing the ``msr spsel, #0`` instruction. + +#. Determining the type of interrupt. Secure-EL1 interrupts will be signaled + at the FIQ vector. Non-secure interrupts will be signaled at the IRQ + vector. The platform should implement the following API to determine the + type of the pending interrupt. + + .. code:: c + + uint32_t plat_ic_get_interrupt_type(void); + + It should return either ``INTR_TYPE_S_EL1`` or ``INTR_TYPE_NS``. + +#. Determining the handler for the type of interrupt that has been generated. + The following API has been added for this purpose. + + .. code:: c + + interrupt_type_handler get_interrupt_type_handler(uint32_t interrupt_type); + + It returns the reference to the registered handler for this interrupt + type. The ``handler`` is retrieved from the ``intr_type_desc_t`` structure as + described in Section 2. ``NULL`` is returned if no handler has been + registered for this type of interrupt. This scenario is reported as an + irrecoverable error condition. + +#. Calling the registered handler function for the interrupt type generated. + The ``id`` parameter is set to ``INTR_ID_UNAVAILABLE`` currently. The id along + with the current security state and a reference to the ``cpu_context_t`` + structure for the current security state are passed to the handler function + as its arguments. + + The handler function returns a reference to the per-cpu ``cpu_context_t`` + structure for the target security state. + +#. Calling ``el3_exit()`` to return from EL3 into a lower exception level in + the security state determined by the handler routine. The ``el3_exit()`` + function is responsible for restoring the register context from the + ``cpu_context_t`` data structure for the target security state. + +Secure payload dispatcher +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Interrupt entry +^^^^^^^^^^^^^^^ + +The SPD service begins handling an interrupt when the EL3 runtime firmware calls +the handler function for that type of interrupt. The SPD service is responsible +for the following: + +#. Validating the interrupt. This involves ensuring that the interrupt was + generated according to the interrupt routing model specified by the SPD + service during registration. It should use the security state of the + exception level (passed in the ``flags`` parameter of the handler) where + the interrupt was taken from to determine this. If the interrupt is not + recognised then the handler should treat it as an irrecoverable error + condition. + + An SPD service can register a handler for Secure-EL1 and/or Non-secure + interrupts. A non-secure interrupt should never be routed to EL3 from + from non-secure state. Also if a routing model is chosen where Secure-EL1 + interrupts are routed to S-EL1 when execution is in Secure state, then a + S-EL1 interrupt should never be routed to EL3 from secure state. The handler + could use the security state flag to check this. + +#. Determining whether a context switch is required. This depends upon the + routing model and interrupt type. For non secure and S-EL1 interrupt, + if the security state of the execution context where the interrupt was + generated is not the same as the security state required for handling + the interrupt, a context switch is required. The following 2 cases + require a context switch from secure to non-secure or vice-versa: + + #. A Secure-EL1 interrupt taken from the non-secure state should be + routed to the Secure Payload. + + #. A non-secure interrupt taken from the secure state should be routed + to the last known non-secure exception level. + + The SPD service must save the system register context of the current + security state. It must then restore the system register context of the + target security state. It should use the ``cm_set_next_eret_context()`` API + to ensure that the next ``cpu_context`` to be restored is of the target + security state. + + If the target state is secure then execution should be handed to the SP as + per the synchronous interrupt handling model it implements. A Secure-EL1 + interrupt can be routed to EL3 while execution is in the SP. This implies + that SP execution can be preempted while handling an interrupt by a + another higher priority Secure-EL1 interrupt or a EL3 interrupt. The SPD + service should be able to handle this preemption or manage secure interrupt + priorities before handing control to the SP. + +#. Setting the return value of the handler to the per-cpu ``cpu_context`` if + the interrupt has been successfully validated and ready to be handled at a + lower exception level. + +The routing model allows non-secure interrupts to interrupt Secure-EL1 when in +secure state if it has been configured to do so. The SPD service and the SP +should implement a mechanism for routing these interrupts to the last known +exception level in the non-secure state. The former should save the SP context, +restore the non-secure context and arrange for entry into the non-secure state +so that the interrupt can be handled. + +Interrupt exit +^^^^^^^^^^^^^^ + +When the Secure Payload has finished handling a Secure-EL1 interrupt, it could +return control back to the SPD service through a SMC32 or SMC64. The SPD service +should handle this secure monitor call so that execution resumes in the +exception level and the security state from where the Secure-EL1 interrupt was +originally taken. + +Test secure payload dispatcher Secure-EL1 interrupt handling +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The example TSPD service registers a handler for Secure-EL1 interrupts taken +from the non-secure state. During execution in S-EL1, the TSPD expects that the +Secure-EL1 interrupts are handled in S-EL1 by TSP. Its handler +``tspd_secure_el1_interrupt_handler()`` expects only to be invoked for Secure-EL1 +originating from the non-secure state. It takes the following actions upon being +invoked. + +#. It uses the security state provided in the ``flags`` parameter to ensure + that the secure interrupt originated from the non-secure state. It asserts + if this is not the case. + +#. It saves the system register context for the non-secure state by calling + ``cm_el1_sysregs_context_save(NON_SECURE);``. + +#. It sets the ``ELR_EL3`` system register to ``tsp_sel1_intr_entry`` and sets the + ``SPSR_EL3.DAIF`` bits in the secure CPU context. It sets ``x0`` to + ``TSP_HANDLE_SEL1_INTR_AND_RETURN``. If the TSP was preempted earlier by a non + secure interrupt during ``yielding`` SMC processing, save the registers that + will be trashed, which is the ``ELR_EL3`` and ``SPSR_EL3``, in order to be able + to re-enter TSP for Secure-EL1 interrupt processing. It does not need to + save any other secure context since the TSP is expected to preserve it + (see section `Test secure payload dispatcher behavior`_). + +#. It restores the system register context for the secure state by calling + ``cm_el1_sysregs_context_restore(SECURE);``. + +#. It ensures that the secure CPU context is used to program the next + exception return from EL3 by calling ``cm_set_next_eret_context(SECURE);``. + +#. It returns the per-cpu ``cpu_context`` to indicate that the interrupt can + now be handled by the SP. ``x1`` is written with the value of ``elr_el3`` + register for the non-secure state. This information is used by the SP for + debugging purposes. + +The figure below describes how the interrupt handling is implemented by the TSPD +when a Secure-EL1 interrupt is generated when execution is in the non-secure +state. + +|Image 1| + +The TSP issues an SMC with ``TSP_HANDLED_S_EL1_INTR`` as the function identifier to +signal completion of interrupt handling. + +The TSPD service takes the following actions in ``tspd_smc_handler()`` function +upon receiving an SMC with ``TSP_HANDLED_S_EL1_INTR`` as the function identifier: + +#. It ensures that the call originated from the secure state otherwise + execution returns to the non-secure state with ``SMC_UNK`` in ``x0``. + +#. It restores the saved ``ELR_EL3`` and ``SPSR_EL3`` system registers back to + the secure CPU context (see step 3 above) in case the TSP had been preempted + by a non secure interrupt earlier. + +#. It restores the system register context for the non-secure state by + calling ``cm_el1_sysregs_context_restore(NON_SECURE)``. + +#. It ensures that the non-secure CPU context is used to program the next + exception return from EL3 by calling ``cm_set_next_eret_context(NON_SECURE)``. + +#. ``tspd_smc_handler()`` returns a reference to the non-secure ``cpu_context`` + as the return value. + +Test secure payload dispatcher non-secure interrupt handling +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The TSP in Secure-EL1 can be preempted by a non-secure interrupt during +``yielding`` SMC processing or by a higher priority EL3 interrupt during +Secure-EL1 interrupt processing. When ``EL3_EXCEPTION_HANDLING`` is ``0``, only +non-secure interrupts can cause preemption of TSP since there are no EL3 +interrupts in the system. With ``EL3_EXCEPTION_HANDLING=1`` however, any EL3 +interrupt may preempt Secure execution. + +It should be noted that while TSP is preempted, the TSPD only allows entry into +the TSP either for Secure-EL1 interrupt handling or for resuming the preempted +``yielding`` SMC in response to the ``TSP_FID_RESUME`` SMC from the normal world. +(See Section `Implication of preempted SMC on Non-Secure Software`_). + +The non-secure interrupt triggered in Secure-EL1 during ``yielding`` SMC +processing can be routed to either EL3 or Secure-EL1 and is controlled by build +option ``TSP_NS_INTR_ASYNC_PREEMPT`` (see Section `Test secure payload +dispatcher behavior`_). If the build option is set, the TSPD will set the +routing model for the non-secure interrupt to be routed to EL3 from secure state +i.e. **TEL3=1, CSS=0** and registers ``tspd_ns_interrupt_handler()`` as the +non-secure interrupt handler. The ``tspd_ns_interrupt_handler()`` on being +invoked ensures that the interrupt originated from the secure state and disables +routing of non-secure interrupts from secure state to EL3. This is to prevent +further preemption (by a non-secure interrupt) when TSP is reentered for +handling Secure-EL1 interrupts that triggered while execution was in the normal +world. The ``tspd_ns_interrupt_handler()`` then invokes +``tspd_handle_sp_preemption()`` for further handling. + +If the ``TSP_NS_INTR_ASYNC_PREEMPT`` build option is zero (default), the default +routing model for non-secure interrupt in secure state is in effect +i.e. **TEL3=0, CSS=0**. During ``yielding`` SMC processing, the IRQ +exceptions are unmasked i.e. ``PSTATE.I=0``, and a non-secure interrupt will +trigger at Secure-EL1 IRQ exception vector. The TSP saves the general purpose +register context and issues an SMC with ``TSP_PREEMPTED`` as the function +identifier to signal preemption of TSP. The TSPD SMC handler, +``tspd_smc_handler()``, ensures that the SMC call originated from the +secure state otherwise execution returns to the non-secure state with +``SMC_UNK`` in ``x0``. It then invokes ``tspd_handle_sp_preemption()`` for +further handling. + +The ``tspd_handle_sp_preemption()`` takes the following actions upon being +invoked: + +#. It saves the system register context for the secure state by calling + ``cm_el1_sysregs_context_save(SECURE)``. + +#. It restores the system register context for the non-secure state by + calling ``cm_el1_sysregs_context_restore(NON_SECURE)``. + +#. It ensures that the non-secure CPU context is used to program the next + exception return from EL3 by calling ``cm_set_next_eret_context(NON_SECURE)``. + +#. ``SMC_PREEMPTED`` is set in x0 and return to non secure state after + restoring non secure context. + +The Normal World is expected to resume the TSP after the ``yielding`` SMC +preemption by issuing an SMC with ``TSP_FID_RESUME`` as the function identifier +(see section `Implication of preempted SMC on Non-Secure Software`_). The TSPD +service takes the following actions in ``tspd_smc_handler()`` function upon +receiving this SMC: + +#. It ensures that the call originated from the non secure state. An + assertion is raised otherwise. + +#. Checks whether the TSP needs a resume i.e check if it was preempted. It + then saves the system register context for the non-secure state by calling + ``cm_el1_sysregs_context_save(NON_SECURE)``. + +#. Restores the secure context by calling + ``cm_el1_sysregs_context_restore(SECURE)`` + +#. It ensures that the secure CPU context is used to program the next + exception return from EL3 by calling ``cm_set_next_eret_context(SECURE)``. + +#. ``tspd_smc_handler()`` returns a reference to the secure ``cpu_context`` as the + return value. + +The figure below describes how the TSP/TSPD handle a non-secure interrupt when +it is generated during execution in the TSP with ``PSTATE.I`` = 0 when the +``TSP_NS_INTR_ASYNC_PREEMPT`` build flag is 0. + +|Image 2| + +.. _sp-synchronous-int: + +Secure payload interrupt handling +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The SP should implement one or both of the synchronous and asynchronous +interrupt handling models depending upon the interrupt routing model it has +chosen (as described in section :ref:`Secure Payload <sp-int-registration>`). + +In the synchronous model, it should begin handling a Secure-EL1 interrupt after +receiving control from the SPD service at an entrypoint agreed upon during build +time or during the registration phase. Before handling the interrupt, the SP +should save any Secure-EL1 system register context which is needed for resuming +normal execution in the SP later e.g. ``SPSR_EL1``, ``ELR_EL1``. After handling +the interrupt, the SP could return control back to the exception level and +security state where the interrupt was originally taken from. The SP should use +an SMC32 or SMC64 to ask the SPD service to do this. + +In the asynchronous model, the Secure Payload is responsible for handling +non-secure and Secure-EL1 interrupts at the IRQ and FIQ vectors in its exception +vector table when ``PSTATE.I`` and ``PSTATE.F`` bits are 0. As described earlier, +when a non-secure interrupt is generated, the SP should coordinate with the SPD +service to pass control back to the non-secure state in the last known exception +level. This will allow the non-secure interrupt to be handled in the non-secure +state. + +Test secure payload behavior +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The TSPD hands control of a Secure-EL1 interrupt to the TSP at the +``tsp_sel1_intr_entry()``. The TSP handles the interrupt while ensuring that the +handover agreement described in Section `Test secure payload dispatcher +behavior`_ is maintained. It updates some statistics by calling +``tsp_update_sync_sel1_intr_stats()``. It then calls +``tsp_common_int_handler()`` which. + +#. Checks whether the interrupt is the secure physical timer interrupt. It + uses the platform API ``plat_ic_get_pending_interrupt_id()`` to get the + interrupt number. If it is not the secure physical timer interrupt, then + that means that a higher priority interrupt has preempted it. Invoke + ``tsp_handle_preemption()`` to handover control back to EL3 by issuing + an SMC with ``TSP_PREEMPTED`` as the function identifier. + +#. Handles the secure timer interrupt interrupt by acknowledging it using the + ``plat_ic_acknowledge_interrupt()`` platform API, calling + ``tsp_generic_timer_handler()`` to reprogram the secure physical generic + timer and calling the ``plat_ic_end_of_interrupt()`` platform API to signal + end of interrupt processing. + +The TSP passes control back to the TSPD by issuing an SMC64 with +``TSP_HANDLED_S_EL1_INTR`` as the function identifier. + +The TSP handles interrupts under the asynchronous model as follows. + +#. Secure-EL1 interrupts are handled by calling the ``tsp_common_int_handler()`` + function. The function has been described above. + +#. Non-secure interrupts are handled by calling the ``tsp_common_int_handler()`` + function which ends up invoking ``tsp_handle_preemption()`` and issuing an + SMC64 with ``TSP_PREEMPTED`` as the function identifier. Execution resumes at + the instruction that follows this SMC instruction when the TSPD hands control + to the TSP in response to an SMC with ``TSP_FID_RESUME`` as the function + identifier from the non-secure state (see section `Test secure payload + dispatcher non-secure interrupt handling`_). + +Other considerations +-------------------- + +Implication of preempted SMC on Non-Secure Software +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A ``yielding`` SMC call to Secure payload can be preempted by a non-secure +interrupt and the execution can return to the non-secure world for handling +the interrupt (For details on ``yielding`` SMC refer `SMC calling convention`_). +In this case, the SMC call has not completed its execution and the execution +must return back to the secure payload to resume the preempted SMC call. +This can be achieved by issuing an SMC call which instructs to resume the +preempted SMC. + +A ``fast`` SMC cannot be preempted and hence this case will not happen for +a fast SMC call. + +In the Test Secure Payload implementation, ``TSP_FID_RESUME`` is designated +as the resume SMC FID. It is important to note that ``TSP_FID_RESUME`` is a +``yielding`` SMC which means it too can be be preempted. The typical non +secure software sequence for issuing a ``yielding`` SMC would look like this, +assuming ``P.STATE.I=0`` in the non secure state : + +.. code:: c + + int rc; + rc = smc(TSP_YIELD_SMC_FID, ...); /* Issue a Yielding SMC call */ + /* The pending non-secure interrupt is handled by the interrupt handler + and returns back here. */ + while (rc == SMC_PREEMPTED) { /* Check if the SMC call is preempted */ + rc = smc(TSP_FID_RESUME); /* Issue resume SMC call */ + } + +The ``TSP_YIELD_SMC_FID`` is any ``yielding`` SMC function identifier and the smc() +function invokes a SMC call with the required arguments. The pending non-secure +interrupt causes an IRQ exception and the IRQ handler registered at the +exception vector handles the non-secure interrupt and returns. The return value +from the SMC call is tested for ``SMC_PREEMPTED`` to check whether it is +preempted. If it is, then the resume SMC call ``TSP_FID_RESUME`` is issued. The +return value of the SMC call is tested again to check if it is preempted. +This is done in a loop till the SMC call succeeds or fails. If a ``yielding`` +SMC is preempted, until it is resumed using ``TSP_FID_RESUME`` SMC and +completed, the current TSPD prevents any other SMC call from re-entering +TSP by returning ``SMC_UNK`` error. + +-------------- + +*Copyright (c) 2014-2020, Arm Limited and Contributors. All rights reserved.* + +.. _SMC calling convention: https://developer.arm.com/docs/den0028/latest + +.. |Image 1| image:: ../resources/diagrams/sec-int-handling.png +.. |Image 2| image:: ../resources/diagrams/non-sec-int-handling.png diff --git a/docs/design/psci-pd-tree.rst b/docs/design/psci-pd-tree.rst new file mode 100644 index 0000000..56a6d6f --- /dev/null +++ b/docs/design/psci-pd-tree.rst @@ -0,0 +1,304 @@ +PSCI Power Domain Tree Structure +================================ + +Requirements +------------ + +#. A platform must export the ``plat_get_aff_count()`` and + ``plat_get_aff_state()`` APIs to enable the generic PSCI code to + populate a tree that describes the hierarchy of power domains in the + system. This approach is inflexible because a change to the topology + requires a change in the code. + + It would be much simpler for the platform to describe its power domain tree + in a data structure. + +#. The generic PSCI code generates MPIDRs in order to populate the power domain + tree. It also uses an MPIDR to find a node in the tree. The assumption that + a platform will use exactly the same MPIDRs as generated by the generic PSCI + code is not scalable. The use of an MPIDR also restricts the number of + levels in the power domain tree to four. + + Therefore, there is a need to decouple allocation of MPIDRs from the + mechanism used to populate the power domain topology tree. + +#. The current arrangement of the power domain tree requires a binary search + over the sibling nodes at a particular level to find a specified power + domain node. During a power management operation, the tree is traversed from + a 'start' to an 'end' power level. The binary search is required to find the + node at each level. The natural way to perform this traversal is to + start from a leaf node and follow the parent node pointer to reach the end + level. + + Therefore, there is a need to define data structures that implement the tree in + a way which facilitates such a traversal. + +#. The attributes of a core power domain differ from the attributes of power + domains at higher levels. For example, only a core power domain can be identified + using an MPIDR. There is no requirement to perform state coordination while + performing a power management operation on the core power domain. + + Therefore, there is a need to implement the tree in a way which facilitates this + distinction between a leaf and non-leaf node and any associated + optimizations. + +-------------- + +Design +------ + +Describing a power domain tree +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To fulfill requirement 1., the existing platform APIs +``plat_get_aff_count()`` and ``plat_get_aff_state()`` have been +removed. A platform must define an array of unsigned chars such that: + +#. The first entry in the array specifies the number of power domains at the + highest power level implemented in the platform. This caters for platforms + where the power domain tree does not have a single root node, for example, + the FVP has two cluster power domains at the highest level (1). + +#. Each subsequent entry corresponds to a power domain and contains the number + of power domains that are its direct children. + +#. The size of the array minus the first entry will be equal to the number of + non-leaf power domains. + +#. The value in each entry in the array is used to find the number of entries + to consider at the next level. The sum of the values (number of children) of + all the entries at a level specifies the number of entries in the array for + the next level. + +The following example power domain topology tree will be used to describe the +above text further. The leaf and non-leaf nodes in this tree have been numbered +separately. + +:: + + +-+ + |0| + +-+ + / \ + / \ + / \ + / \ + / \ + / \ + / \ + / \ + / \ + / \ + +-+ +-+ + |1| |2| + +-+ +-+ + / \ / \ + / \ / \ + / \ / \ + / \ / \ + +-+ +-+ +-+ +-+ + |3| |4| |5| |6| + +-+ +-+ +-+ +-+ + +---+-----+ +----+----| +----+----+ +----+-----+-----+ + | | | | | | | | | | | | | + | | | | | | | | | | | | | + v v v v v v v v v v v v v + +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +--+ +--+ +--+ + |0| |1| |2| |3| |4| |5| |6| |7| |8| |9| |10| |11| |12| + +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +-+ +--+ +--+ +--+ + +This tree is defined by the platform as the array described above as follows: + +.. code:: c + + #define PLAT_NUM_POWER_DOMAINS 20 + #define PLATFORM_CORE_COUNT 13 + #define PSCI_NUM_NON_CPU_PWR_DOMAINS \ + (PLAT_NUM_POWER_DOMAINS - PLATFORM_CORE_COUNT) + + unsigned char plat_power_domain_tree_desc[] = { 1, 2, 2, 2, 3, 3, 3, 4}; + +Removing assumptions about MPIDRs used in a platform +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To fulfill requirement 2., it is assumed that the platform assigns a +unique number (core index) between ``0`` and ``PLAT_CORE_COUNT - 1`` to each core +power domain. MPIDRs could be allocated in any manner and will not be used to +populate the tree. + +``plat_core_pos_by_mpidr(mpidr)`` will return the core index for the core +corresponding to the MPIDR. It will return an error (-1) if an MPIDR is passed +which is not allocated or corresponds to an absent core. The semantics of this +platform API have changed since it is required to validate the passed MPIDR. It +has been made a mandatory API as a result. + +Another mandatory API, ``plat_my_core_pos()`` has been added to return the core +index for the calling core. This API provides a more lightweight mechanism to get +the index since there is no need to validate the MPIDR of the calling core. + +The platform should assign the core indices (as illustrated in the diagram above) +such that, if the core nodes are numbered from left to right, then the index +for a core domain will be the same as the index returned by +``plat_core_pos_by_mpidr()`` or ``plat_my_core_pos()`` for that core. This +relationship allows the core nodes to be allocated in a separate array +(requirement 4.) during ``psci_setup()`` in such an order that the index of the +core in the array is the same as the return value from these APIs. + +Dealing with holes in MPIDR allocation +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +For platforms where the number of allocated MPIDRs is equal to the number of +core power domains, for example, Juno and FVPs, the logic to convert an MPIDR to +a core index should remain unchanged. Both Juno and FVP use a simple collision +proof hash function to do this. + +It is possible that on some platforms, the allocation of MPIDRs is not +contiguous or certain cores have been disabled. This essentially means that the +MPIDRs have been sparsely allocated, that is, the size of the range of MPIDRs +used by the platform is not equal to the number of core power domains. + +The platform could adopt one of the following approaches to deal with this +scenario: + +#. Implement more complex logic to convert a valid MPIDR to a core index while + maintaining the relationship described earlier. This means that the power + domain tree descriptor will not describe any core power domains which are + disabled or absent. Entries will not be allocated in the tree for these + domains. + +#. Treat unallocated MPIDRs and disabled cores as absent but still describe them + in the power domain descriptor, that is, the number of core nodes described + is equal to the size of the range of MPIDRs allocated. This approach will + lead to memory wastage since entries will be allocated in the tree but will + allow use of a simpler logic to convert an MPIDR to a core index. + +Traversing through and distinguishing between core and non-core power domains +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To fulfill requirement 3 and 4, separate data structures have been defined +to represent leaf and non-leaf power domain nodes in the tree. + +.. code:: c + + /******************************************************************************* + * The following two data structures implement the power domain tree. The tree + * is used to track the state of all the nodes i.e. power domain instances + * described by the platform. The tree consists of nodes that describe CPU power + * domains i.e. leaf nodes and all other power domains which are parents of a + * CPU power domain i.e. non-leaf nodes. + ******************************************************************************/ + typedef struct non_cpu_pwr_domain_node { + /* + * Index of the first CPU power domain node level 0 which has this node + * as its parent. + */ + unsigned int cpu_start_idx; + + /* + * Number of CPU power domains which are siblings of the domain indexed + * by 'cpu_start_idx' i.e. all the domains in the range 'cpu_start_idx + * -> cpu_start_idx + ncpus' have this node as their parent. + */ + unsigned int ncpus; + + /* Index of the parent power domain node */ + unsigned int parent_node; + + ----- + } non_cpu_pd_node_t; + + typedef struct cpu_pwr_domain_node { + u_register_t mpidr; + + /* Index of the parent power domain node */ + unsigned int parent_node; + + ----- + } cpu_pd_node_t; + +The power domain tree is implemented as a combination of the following data +structures. + +.. code:: c + + non_cpu_pd_node_t psci_non_cpu_pd_nodes[PSCI_NUM_NON_CPU_PWR_DOMAINS]; + cpu_pd_node_t psci_cpu_pd_nodes[PLATFORM_CORE_COUNT]; + +Populating the power domain tree +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``populate_power_domain_tree()`` function in ``psci_setup.c`` implements the +algorithm to parse the power domain descriptor exported by the platform to +populate the two arrays. It is essentially a breadth-first-search. The nodes for +each level starting from the root are laid out one after another in the +``psci_non_cpu_pd_nodes`` and ``psci_cpu_pd_nodes`` arrays as follows: + +:: + + psci_non_cpu_pd_nodes -> [[Level 3 nodes][Level 2 nodes][Level 1 nodes]] + psci_cpu_pd_nodes -> [Level 0 nodes] + +For the example power domain tree illustrated above, the ``psci_cpu_pd_nodes`` +will be populated as follows. The value in each entry is the index of the parent +node. Other fields have been ignored for simplicity. + +:: + + +-------------+ ^ + CPU0 | 3 | | + +-------------+ | + CPU1 | 3 | | + +-------------+ | + CPU2 | 3 | | + +-------------+ | + CPU3 | 4 | | + +-------------+ | + CPU4 | 4 | | + +-------------+ | + CPU5 | 4 | | PLATFORM_CORE_COUNT + +-------------+ | + CPU6 | 5 | | + +-------------+ | + CPU7 | 5 | | + +-------------+ | + CPU8 | 5 | | + +-------------+ | + CPU9 | 6 | | + +-------------+ | + CPU10 | 6 | | + +-------------+ | + CPU11 | 6 | | + +-------------+ | + CPU12 | 6 | v + +-------------+ + +The ``psci_non_cpu_pd_nodes`` array will be populated as follows. The value in +each entry is the index of the parent node. + +:: + + +-------------+ ^ + PD0 | -1 | | + +-------------+ | + PD1 | 0 | | + +-------------+ | + PD2 | 0 | | + +-------------+ | + PD3 | 1 | | PLAT_NUM_POWER_DOMAINS - + +-------------+ | PLATFORM_CORE_COUNT + PD4 | 1 | | + +-------------+ | + PD5 | 2 | | + +-------------+ | + PD6 | 2 | | + +-------------+ v + +Each core can find its node in the ``psci_cpu_pd_nodes`` array using the +``plat_my_core_pos()`` function. When a core is turned on, the normal world +provides an MPIDR. The ``plat_core_pos_by_mpidr()`` function is used to validate +the MPIDR before using it to find the corresponding core node. The non-core power +domain nodes do not need to be identified. + +-------------- + +*Copyright (c) 2017-2018, Arm Limited and Contributors. All rights reserved.* diff --git a/docs/design/reset-design.rst b/docs/design/reset-design.rst new file mode 100644 index 0000000..666ee4f --- /dev/null +++ b/docs/design/reset-design.rst @@ -0,0 +1,168 @@ +CPU Reset +========= + +This document describes the high-level design of the framework to handle CPU +resets in Trusted Firmware-A (TF-A). It also describes how the platform +integrator can tailor this code to the system configuration to some extent, +resulting in a simplified and more optimised boot flow. + +This document should be used in conjunction with the :ref:`Firmware Design` +document which provides greater implementation details around the reset code, +specifically for the cold boot path. + +General reset code flow +----------------------- + +The TF-A reset code is implemented in BL1 by default. The following high-level +diagram illustrates this: + +|Default reset code flow| + +This diagram shows the default, unoptimised reset flow. Depending on the system +configuration, some of these steps might be unnecessary. The following sections +guide the platform integrator by indicating which build options exclude which +steps, depending on the capability of the platform. + +.. note:: + If BL31 is used as the TF-A entry point instead of BL1, the diagram + above is still relevant, as all these operations will occur in BL31 in + this case. Please refer to section 6 "Using BL31 entrypoint as the reset + address" for more information. + +Programmable CPU reset address +------------------------------ + +By default, TF-A assumes that the CPU reset address is not programmable. +Therefore, all CPUs start at the same address (typically address 0) whenever +they reset. Further logic is then required to identify whether it is a cold or +warm boot to direct CPUs to the right execution path. + +If the reset vector address (reflected in the reset vector base address register +``RVBAR_EL3``) is programmable then it is possible to make each CPU start directly +at the right address, both on a cold and warm reset. Therefore, the boot type +detection can be skipped, resulting in the following boot flow: + +|Reset code flow with programmable reset address| + +To enable this boot flow, compile TF-A with ``PROGRAMMABLE_RESET_ADDRESS=1``. +This option only affects the TF-A reset image, which is BL1 by default or BL31 if +``RESET_TO_BL31=1``. + +On both the FVP and Juno platforms, the reset vector address is not programmable +so both ports use ``PROGRAMMABLE_RESET_ADDRESS=0``. + +Cold boot on a single CPU +------------------------- + +By default, TF-A assumes that several CPUs may be released out of reset. +Therefore, the cold boot code has to arbitrate access to hardware resources +shared amongst CPUs. This is done by nominating one of the CPUs as the primary, +which is responsible for initialising shared hardware and coordinating the boot +flow with the other CPUs. + +If the platform guarantees that only a single CPU will ever be brought up then +no arbitration is required. The notion of primary/secondary CPU itself no longer +applies. This results in the following boot flow: + +|Reset code flow with single CPU released out of reset| + +To enable this boot flow, compile TF-A with ``COLD_BOOT_SINGLE_CPU=1``. This +option only affects the TF-A reset image, which is BL1 by default or BL31 if +``RESET_TO_BL31=1``. + +On both the FVP and Juno platforms, although only one core is powered up by +default, there are platform-specific ways to release any number of cores out of +reset. Therefore, both platform ports use ``COLD_BOOT_SINGLE_CPU=0``. + +Programmable CPU reset address, Cold boot on a single CPU +--------------------------------------------------------- + +It is obviously possible to combine both optimisations on platforms that have +a programmable CPU reset address and which release a single CPU out of reset. +This results in the following boot flow: + + +|Reset code flow with programmable reset address and single CPU released out of reset| + +To enable this boot flow, compile TF-A with both ``COLD_BOOT_SINGLE_CPU=1`` +and ``PROGRAMMABLE_RESET_ADDRESS=1``. These options only affect the TF-A reset +image, which is BL1 by default or BL31 if ``RESET_TO_BL31=1``. + +Using BL31 entrypoint as the reset address +------------------------------------------ + +On some platforms the runtime firmware (BL3x images) for the application +processors are loaded by some firmware running on a secure system processor +on the SoC, rather than by BL1 and BL2 running on the primary application +processor. For this type of SoC it is desirable for the application processor +to always reset to BL31 which eliminates the need for BL1 and BL2. + +TF-A provides a build-time option ``RESET_TO_BL31`` that includes some additional +logic in the BL31 entry point to support this use case. + +In this configuration, the platform's Trusted Boot Firmware must ensure that +BL31 is loaded to its runtime address, which must match the CPU's ``RVBAR_EL3`` +reset vector base address, before the application processor is powered on. +Additionally, platform software is responsible for loading the other BL3x images +required and providing entry point information for them to BL31. Loading these +images might be done by the Trusted Boot Firmware or by platform code in BL31. + +Although the Arm FVP platform does not support programming the reset base +address dynamically at run-time, it is possible to set the initial value of the +``RVBAR_EL3`` register at start-up. This feature is provided on the Base FVP +only. + +It allows the Arm FVP port to support the ``RESET_TO_BL31`` configuration, in +which case the ``bl31.bin`` image must be loaded to its run address in Trusted +SRAM and all CPU reset vectors be changed from the default ``0x0`` to this run +address. See the :ref:`Arm Fixed Virtual Platforms (FVP)` for details of running +the FVP models in this way. + +Although technically it would be possible to program the reset base address with +the right support in the SCP firmware, this is currently not implemented so the +Juno port doesn't support the ``RESET_TO_BL31`` configuration. + +The ``RESET_TO_BL31`` configuration requires some additions and changes in the +BL31 functionality: + +Determination of boot path +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In this configuration, BL31 uses the same reset framework and code as the one +described for BL1 above. Therefore, it is affected by the +``PROGRAMMABLE_RESET_ADDRESS`` and ``COLD_BOOT_SINGLE_CPU`` build options in the +same way. + +In the default, unoptimised BL31 reset flow, on a warm boot a CPU is directed +to the PSCI implementation via a platform defined mechanism. On a cold boot, +the platform must place any secondary CPUs into a safe state while the primary +CPU executes a modified BL31 initialization, as described below. + +Platform initialization +~~~~~~~~~~~~~~~~~~~~~~~ + +In this configuration, when the CPU resets to BL31 there should be no parameters +that can be passed in registers by previous boot stages. Instead, the platform +code in BL31 needs to know, or be able to determine, the location of the BL32 +(if required) and BL33 images and provide this information in response to the +``bl31_plat_get_next_image_ep_info()`` function. + +.. note:: + Some platforms that configure ``RESET_TO_BL31`` might still be able to + receive parameters in registers depending on their actual boot sequence. On + those occasions, and in addition to ``RESET_TO_BL31``, these platforms should + set ``RESET_TO_BL31_WITH_PARAMS`` to avoid the input registers from being + zeroed before entering BL31. + +Additionally, platform software is responsible for carrying out any security +initialisation, for example programming a TrustZone address space controller. +This might be done by the Trusted Boot Firmware or by platform code in BL31. + +-------------- + +*Copyright (c) 2015-2022, Arm Limited and Contributors. All rights reserved.* + +.. |Default reset code flow| image:: ../resources/diagrams/default_reset_code.png +.. |Reset code flow with programmable reset address| image:: ../resources/diagrams/reset_code_no_boot_type_check.png +.. |Reset code flow with single CPU released out of reset| image:: ../resources/diagrams/reset_code_no_cpu_check.png +.. |Reset code flow with programmable reset address and single CPU released out of reset| image:: ../resources/diagrams/reset_code_no_checks.png diff --git a/docs/design/trusted-board-boot-build.rst b/docs/design/trusted-board-boot-build.rst new file mode 100644 index 0000000..c3f3a2f --- /dev/null +++ b/docs/design/trusted-board-boot-build.rst @@ -0,0 +1,122 @@ +Building FIP images with support for Trusted Board Boot +======================================================= + +Trusted Board Boot primarily consists of the following two features: + +- Image Authentication, described in :ref:`Trusted Board Boot`, and +- Firmware Update, described in :ref:`Firmware Update (FWU)` + +The following steps should be followed to build FIP and (optionally) FWU_FIP +images with support for these features: + +#. Fulfill the dependencies of the ``mbedtls`` cryptographic and image parser + modules by checking out a recent version of the `mbed TLS Repository`_. It + is important to use a version that is compatible with TF-A and fixes any + known security vulnerabilities. See `mbed TLS Security Center`_ for more + information. See the :ref:`Prerequisites` document for the appropriate + version of mbed TLS to use. + + The ``drivers/auth/mbedtls/mbedtls_*.mk`` files contain the list of mbed TLS + source files the modules depend upon. + ``include/drivers/auth/mbedtls/mbedtls_config.h`` contains the configuration + options required to build the mbed TLS sources. + + Note that the mbed TLS library is licensed under the Apache version 2.0 + license. Using mbed TLS source code will affect the licensing of TF-A + binaries that are built using this library. + +#. To build the FIP image, ensure the following command line variables are set + while invoking ``make`` to build TF-A: + + - ``MBEDTLS_DIR=<path of the directory containing mbed TLS sources>`` + - ``TRUSTED_BOARD_BOOT=1`` + - ``GENERATE_COT=1`` + + By default, this will use the Chain of Trust described in the TBBR-client + document. To select a different one, use the ``COT`` build option. + + If using a custom build of OpenSSL, set the ``OPENSSL_DIR`` variable + accordingly so it points at the OpenSSL installation path, as explained in + :ref:`Build Options`. In addition, set the ``LD_LIBRARY_PATH`` variable + when running to point at the custom OpenSSL path, so the OpenSSL libraries + are loaded from that path instead of the default OS path. Export this + variable if necessary. + + In the case of Arm platforms, the location of the ROTPK hash must also be + specified at build time. The following locations are currently supported (see + ``ARM_ROTPK_LOCATION`` build option): + + - ``ARM_ROTPK_LOCATION=regs``: the ROTPK hash is obtained from the Trusted + root-key storage registers present in the platform. On Juno, these + registers are read-only. On FVP Base and Cortex models, the registers + are also read-only, but the value can be specified using the command line + option ``bp.trusted_key_storage.public_key`` when launching the model. + On Juno board, the default value corresponds to an ECDSA-SECP256R1 public + key hash, whose private part is not currently available. + + - ``ARM_ROTPK_LOCATION=devel_rsa``: use the default hash located in + ``plat/arm/board/common/rotpk/arm_rotpk_rsa_sha256.bin``. Enforce + generation of the new hash if ``ROT_KEY`` is specified. + + - ``ARM_ROTPK_LOCATION=devel_ecdsa``: use the default hash located in + ``plat/arm/board/common/rotpk/arm_rotpk_ecdsa_sha256.bin``. Enforce + generation of the new hash if ``ROT_KEY`` is specified. + + Example of command line using RSA development keys: + + .. code:: shell + + MBEDTLS_DIR=<path of the directory containing mbed TLS sources> \ + make PLAT=<platform> TRUSTED_BOARD_BOOT=1 GENERATE_COT=1 \ + ARM_ROTPK_LOCATION=devel_rsa \ + ROT_KEY=plat/arm/board/common/rotpk/arm_rotprivk_rsa.pem \ + BL33=<path-to>/<bl33_image> OPENSSL_DIR=<path-to>/<openssl> \ + all fip + + The result of this build will be the bl1.bin and the fip.bin binaries. This + FIP will include the certificates corresponding to the selected Chain of + Trust. These certificates can also be found in the output build directory. + +#. The optional FWU_FIP contains any additional images to be loaded from + Non-Volatile storage during the :ref:`Firmware Update (FWU)` process. To build the + FWU_FIP, any FWU images required by the platform must be specified on the + command line. On Arm development platforms like Juno, these are: + + - NS_BL2U. The AP non-secure Firmware Updater image. + - SCP_BL2U. The SCP Firmware Update Configuration image. + + Example of Juno command line for generating both ``fwu`` and ``fwu_fip`` + targets using RSA development: + + :: + + MBEDTLS_DIR=<path of the directory containing mbed TLS sources> \ + make PLAT=juno TRUSTED_BOARD_BOOT=1 GENERATE_COT=1 \ + ARM_ROTPK_LOCATION=devel_rsa \ + ROT_KEY=plat/arm/board/common/rotpk/arm_rotprivk_rsa.pem \ + BL33=<path-to>/<bl33_image> OPENSSL_DIR=<path-to>/<openssl> \ + SCP_BL2=<path-to>/<scp_bl2_image> \ + SCP_BL2U=<path-to>/<scp_bl2u_image> \ + NS_BL2U=<path-to>/<ns_bl2u_image> \ + all fip fwu_fip + + .. note:: + The BL2U image will be built by default and added to the FWU_FIP. + The user may override this by adding ``BL2U=<path-to>/<bl2u_image>`` + to the command line above. + + .. note:: + Building and installing the non-secure and SCP FWU images (NS_BL1U, + NS_BL2U and SCP_BL2U) is outside the scope of this document. + + The result of this build will be bl1.bin, fip.bin and fwu_fip.bin binaries. + Both the FIP and FWU_FIP will include the certificates corresponding to the + selected Chain of Trust. These certificates can also be found in the output + build directory. + +-------------- + +*Copyright (c) 2019-2022, Arm Limited. All rights reserved.* + +.. _mbed TLS Repository: https://github.com/ARMmbed/mbedtls.git +.. _mbed TLS Security Center: https://tls.mbed.org/security diff --git a/docs/design/trusted-board-boot.rst b/docs/design/trusted-board-boot.rst new file mode 100644 index 0000000..46177d7 --- /dev/null +++ b/docs/design/trusted-board-boot.rst @@ -0,0 +1,263 @@ +Trusted Board Boot +================== + +The Trusted Board Boot (TBB) feature prevents malicious firmware from running on +the platform by authenticating all firmware images up to and including the +normal world bootloader. It does this by establishing a Chain of Trust using +Public-Key-Cryptography Standards (PKCS). + +This document describes the design of Trusted Firmware-A (TF-A) TBB, which is an +implementation of the `Trusted Board Boot Requirements (TBBR)`_ specification, +Arm DEN0006D. It should be used in conjunction with the +:ref:`Firmware Update (FWU)` design document, which implements a specific aspect +of the TBBR. + +Chain of Trust +-------------- + +A Chain of Trust (CoT) starts with a set of implicitly trusted components. On +the Arm development platforms, these components are: + +- A SHA-256 hash of the Root of Trust Public Key (ROTPK). It is stored in the + trusted root-key storage registers. Alternatively, a development ROTPK might + be used and its hash embedded into the BL1 and BL2 images (only for + development purposes). + +- The BL1 image, on the assumption that it resides in ROM so cannot be + tampered with. + +The remaining components in the CoT are either certificates or boot loader +images. The certificates follow the `X.509 v3`_ standard. This standard +enables adding custom extensions to the certificates, which are used to store +essential information to establish the CoT. + +In the TBB CoT all certificates are self-signed. There is no need for a +Certificate Authority (CA) because the CoT is not established by verifying the +validity of a certificate's issuer but by the content of the certificate +extensions. To sign the certificates, different signature schemes are available, +please refer to the :ref:`Build Options` for more details. + +The certificates are categorised as "Key" and "Content" certificates. Key +certificates are used to verify public keys which have been used to sign content +certificates. Content certificates are used to store the hash of a boot loader +image. An image can be authenticated by calculating its hash and matching it +with the hash extracted from the content certificate. Various hash algorithms +are supported to calculate all hashes, please refer to the :ref:`Build Options` +for more details.. The public keys and hashes are included as non-standard +extension fields in the `X.509 v3`_ certificates. + +The keys used to establish the CoT are: + +- **Root of trust key** + + The private part of this key is used to sign the BL2 content certificate and + the trusted key certificate. The public part is the ROTPK. + +- **Trusted world key** + + The private part is used to sign the key certificates corresponding to the + secure world images (SCP_BL2, BL31 and BL32). The public part is stored in + one of the extension fields in the trusted world certificate. + +- **Non-trusted world key** + + The private part is used to sign the key certificate corresponding to the + non secure world image (BL33). The public part is stored in one of the + extension fields in the trusted world certificate. + +- **BL3X keys** + + For each of SCP_BL2, BL31, BL32 and BL33, the private part is used to + sign the content certificate for the BL3X image. The public part is stored + in one of the extension fields in the corresponding key certificate. + +The following images are included in the CoT: + +- BL1 +- BL2 +- SCP_BL2 (optional) +- BL31 +- BL33 +- BL32 (optional) + +The following certificates are used to authenticate the images. + +- **BL2 content certificate** + + It is self-signed with the private part of the ROT key. It contains a hash + of the BL2 image. + +- **Trusted key certificate** + + It is self-signed with the private part of the ROT key. It contains the + public part of the trusted world key and the public part of the non-trusted + world key. + +- **SCP_BL2 key certificate** + + It is self-signed with the trusted world key. It contains the public part of + the SCP_BL2 key. + +- **SCP_BL2 content certificate** + + It is self-signed with the SCP_BL2 key. It contains a hash of the SCP_BL2 + image. + +- **BL31 key certificate** + + It is self-signed with the trusted world key. It contains the public part of + the BL31 key. + +- **BL31 content certificate** + + It is self-signed with the BL31 key. It contains a hash of the BL31 image. + +- **BL32 key certificate** + + It is self-signed with the trusted world key. It contains the public part of + the BL32 key. + +- **BL32 content certificate** + + It is self-signed with the BL32 key. It contains a hash of the BL32 image. + +- **BL33 key certificate** + + It is self-signed with the non-trusted world key. It contains the public + part of the BL33 key. + +- **BL33 content certificate** + + It is self-signed with the BL33 key. It contains a hash of the BL33 image. + +The SCP_BL2 and BL32 certificates are optional, but they must be present if the +corresponding SCP_BL2 or BL32 images are present. + +Trusted Board Boot Sequence +--------------------------- + +The CoT is verified through the following sequence of steps. The system panics +if any of the steps fail. + +- BL1 loads and verifies the BL2 content certificate. The issuer public key is + read from the verified certificate. A hash of that key is calculated and + compared with the hash of the ROTPK read from the trusted root-key storage + registers. If they match, the BL2 hash is read from the certificate. + + .. note:: + The matching operation is platform specific and is currently + unimplemented on the Arm development platforms. + +- BL1 loads the BL2 image. Its hash is calculated and compared with the hash + read from the certificate. Control is transferred to the BL2 image if all + the comparisons succeed. + +- BL2 loads and verifies the trusted key certificate. The issuer public key is + read from the verified certificate. A hash of that key is calculated and + compared with the hash of the ROTPK read from the trusted root-key storage + registers. If the comparison succeeds, BL2 reads and saves the trusted and + non-trusted world public keys from the verified certificate. + +The next two steps are executed for each of the SCP_BL2, BL31 & BL32 images. +The steps for the optional SCP_BL2 and BL32 images are skipped if these images +are not present. + +- BL2 loads and verifies the BL3x key certificate. The certificate signature + is verified using the trusted world public key. If the signature + verification succeeds, BL2 reads and saves the BL3x public key from the + certificate. + +- BL2 loads and verifies the BL3x content certificate. The signature is + verified using the BL3x public key. If the signature verification succeeds, + BL2 reads and saves the BL3x image hash from the certificate. + +The next two steps are executed only for the BL33 image. + +- BL2 loads and verifies the BL33 key certificate. If the signature + verification succeeds, BL2 reads and saves the BL33 public key from the + certificate. + +- BL2 loads and verifies the BL33 content certificate. If the signature + verification succeeds, BL2 reads and saves the BL33 image hash from the + certificate. + +The next step is executed for all the boot loader images. + +- BL2 calculates the hash of each image. It compares it with the hash obtained + from the corresponding content certificate. The image authentication succeeds + if the hashes match. + +The Trusted Board Boot implementation spans both generic and platform-specific +BL1 and BL2 code, and in tool code on the host build machine. The feature is +enabled through use of specific build flags as described in +:ref:`Build Options`. + +On the host machine, a tool generates the certificates, which are included in +the FIP along with the boot loader images. These certificates are loaded in +Trusted SRAM using the IO storage framework. They are then verified by an +Authentication module included in TF-A. + +The mechanism used for generating the FIP and the Authentication module are +described in the following sections. + +Authentication Framework +------------------------ + +The authentication framework included in TF-A provides support to implement +the desired trusted boot sequence. Arm platforms use this framework to +implement the boot requirements specified in the +`Trusted Board Boot Requirements (TBBR)`_ document. + +More information about the authentication framework can be found in the +:ref:`Authentication Framework & Chain of Trust` document. + +Certificate Generation Tool +--------------------------- + +The ``cert_create`` tool is built and runs on the host machine as part of the +TF-A build process when ``GENERATE_COT=1``. It takes the boot loader images +and keys as inputs (keys must be in PEM format) and generates the +certificates (in DER format) required to establish the CoT. New keys can be +generated by the tool in case they are not provided. The certificates are then +passed as inputs to the ``fiptool`` utility for creating the FIP. + +The certificates are also stored individually in the output build directory. + +The tool resides in the ``tools/cert_create`` directory. It uses the OpenSSL SSL +library version to generate the X.509 certificates. The specific version of the +library that is required is given in the :ref:`Prerequisites` document. + +Instructions for building and using the tool can be found at +:ref:`tools_build_cert_create`. + +Authenticated Encryption Framework +---------------------------------- + +The authenticated encryption framework included in TF-A provides support to +implement the optional firmware encryption feature. This feature can be +optionally enabled on platforms to implement the optional requirement: +R060_TBBR_FUNCTION as specified in the `Trusted Board Boot Requirements (TBBR)`_ +document. + +Firmware Encryption Tool +------------------------ + +The ``encrypt_fw`` tool is built and runs on the host machine as part of the +TF-A build process when ``DECRYPTION_SUPPORT != none``. It takes the plain +firmware image as input and generates the encrypted firmware image which can +then be passed as input to the ``fiptool`` utility for creating the FIP. + +The encrypted firmwares are also stored individually in the output build +directory. + +The tool resides in the ``tools/encrypt_fw`` directory. It uses OpenSSL SSL +library version 1.0.1 or later to do authenticated encryption operation. +Instructions for building and using the tool can be found in the +:ref:`tools_build_enctool`. + +-------------- + +*Copyright (c) 2015-2020, Arm Limited and Contributors. All rights reserved.* + +.. _X.509 v3: https://tools.ietf.org/rfc/rfc5280.txt +.. _Trusted Board Boot Requirements (TBBR): https://developer.arm.com/docs/den0006/latest/trusted-board-boot-requirements-client-tbbr-client-armv8-a |