Diffstat (limited to 'third_party/heimdal/lib/asn1/README-X681.md')
-rw-r--r-- | third_party/heimdal/lib/asn1/README-X681.md | 1124 |
1 files changed, 1124 insertions, 0 deletions
diff --git a/third_party/heimdal/lib/asn1/README-X681.md b/third_party/heimdal/lib/asn1/README-X681.md new file mode 100644 index 0000000..e0d270b --- /dev/null +++ b/third_party/heimdal/lib/asn1/README-X681.md @@ -0,0 +1,1124 @@ +# Automatic Open Type Handling via X.68x Support in Heimdal's ASN.1 Compiler + +## Table of Contents + + 1. [Introduction](#Introduction) + 2. [Typed Holes / Open Types](#typed-holes--open-types) + 3. [ASN.1 IOS, Constraint, and Parameterization](#asn1-ios-constraint-and-parameterization) + - [IOS Crash Course](#ios-crash-course) + 4. [Usage](#Usage) + 5. [Limitations](#Limitations) + 6. [Implementation Design](#implementation-design) + 7. [Moving From C](#moving-from-c) + +## Introduction + +ASN.1 is a set of specifications for "syntax" for defining data schemas, and +"encoding rules" for encoding values of data of types defined in those schemas. +There are many encoding rules, but one syntax. + +The base of ASN.1 _syntax_ is specified by X.680, an ITU-T standard. The +encoding rules are specified by the X.69x series (X.690 through X.697). + +This README is concerned primarily with the X.68x series. + +While X.680 is essential for implementing many Internet (and other) protocols, +and sufficient for implementing all of those, there are extensions in the +remainder of the X.68x series that can make life a lot easier for developers +who have to use ASN.1 for interoperability reasons. + +Various syntax extensions are specified in X.68x series documents: + + - X.681: Information Object specification + - X.682: Constraint specification + - X.683: Parameterization of ASN.1 specifications + +The intent of X.681, X.682, and X.683 is to add ways to formally express +constraints that would otherwise require natural language to express. Give a +compiler more formally-expressed constraints and it can do more labor-saving +than it could otherwise. 
+ +A subset of these three extensions, X.681, X.682, and X.683, can enable some +rather magical features. These magical features are generally not the focus of +those ITU-T specifications nor of many RFCs that make use of them, but +nonetheless they are of interest to us. + +This README covers some ideas for what this magic is, and implementation of it. + +RFC 6025 does an excellent job of elucidating X.681, which otherwise most +readers unfamiliar with it will no doubt find inscrutable. Hopefully this +README improves that further. + +The magic that we're after is simply the *automatic and recursive handling of +open types by an ASN.1 compiler*. + +Combined with eventual support for the ASN.1 JSON Encoding Rules (JER) [X.697], +this feature could give us unprecedented visibility into really complex data +structures, such as Endorsement Key Certificates (EKcerts) for Trusted Platform +Module (TPM) applications. + +Support for JER and automatic handling of open types should allow us to +trivially implement a command-line tool that can parse any DER or JER (JSON) +encoding of any value whose type is known and compiled, and which could +transcode to the other encoding rules. I.e., dump DER to JSON, and parse JSON +to output DER. + +Indeed, Heimdal's `asn1_print` program currently supports transcoding of DER to +JSON, though it's not quite X.697-compliant JSON! Heimdal does not currently +support parsing JSON-encoded values of ASN.1 types. + +Combined with transcoders for JSON/CBOR and other binary-JSON formats, we could +support those encodings too. + +We could really see how much space OER/JER/CBOR save over DER for Kerberos +tickets, PKIX certificates, and much else. + +We especially want this for PKIX, and more than anything for certificates, as +the TBSCertificate type is full of deeply nested open types: DNs and +subjectDirectory attributes, otherName SAN types, and certificate extensions. 
+ +Besides a magical ASN.1 DER/JER dumper/transcoder utility, we want to replace +DN attribute and subject alternative name (SAN) `otherName` tables and much +hand-coded handling of certificate extensions in `lib/hx509/`. + +The reader should already be familiar with ASN.1, which anyways is a set of two +things: + + - an abstract syntax for specifying schemas for data interchange + + - a set of encoding rules + +A very common thing to see in projects that use ASN.1, as well as projects that +use alternatives to ASN.1, is a pattern known as the "typed hole" or "open +type". + +The ASN.1 Information Object System (IOS) [X.681] is all about automating the +otherwise very annoying task of dealing with "typed holes" / "open types". + +The ASN.1 IOS is not sufficient to implement the magic we're after. Also +needed is constraint specification and parameterization of types. + +ITU-T references: + +https://www.itu.int/rec/T-REC-X.680-201508-I/en +https://www.itu.int/rec/T-REC-X.681-201508-I/en +https://www.itu.int/rec/T-REC-X.682-201508-I/en +https://www.itu.int/rec/T-REC-X.683-201508-I/en + + +## Typed Holes / Open Types + +A typed hole or open type is a pattern of data structure that generally looks +like: + +``` + { type_id, bytes_encoding_a_value_of_a_type_identified_by_type_id } +``` + +I.e., an opaque datum and an identifier of what kind of datum that is. This +happens because the structure with the typed hole is used in contexts where it +can't know all possible things that can go in it. In many cases we do know +what all possible things are that can go in a typed hole, but many years ago +didn't, say, or anyways, had a reason to use a typed hole. + +These are used not only in protocols that use ASN.1, but in many protocols that +use syntaxes and encodings unrelated to ASN.1. I.e., these concepts are *not* +ASN.1-specific. + +Many Internet protocols use typed holes, and many use typed holes in ASN.1 +types. 
For example, PKIX, Kerberos, LDAP, and others, use ASN.1 and typed +holes. + +For examples of Internet protocols that do not use ASN.1 but which still +have typed holes, see IP, MIME, SSHv2, IKEv2, and others. Most quintessentially, +IP itself, since IP packet payloads are for some upper layer protocol +identified in the IP packet header. + +In ASN.1 these generally look like: + +```ASN.1 + TypedHole ::= SEQUENCE { + typeId INTEGER, + opaque OCTET STRING + } +``` + +or + +```ASN.1 + -- Old ASN.1 style + TypedHole ::= SEQUENCE { + typeId OBJECT IDENTIFIER, + opaque ANY DEFINED BY typeId + } +``` + +or + +```ASN.1 + -- Old ASN.1 style + TypedHole ::= SEQUENCE { + typeId OBJECT IDENTIFIER, + opaque ANY -- DEFINED BY typeId + } +``` + +or any number of variations. + + Note: the `ANY` variations are no longer conformant to X.680 (the base + ASN.1 specification). + +The pattern is `{ id, hole }` where the `hole` is ultimately an opaque sequence +of bytes whose content's schema is identified by the `id` in the same data +structure. The pattern does not require just two fields, and it does not +require any particular type for the hole, nor for the type ID. Sometimes the +"hole" is an `OCTET STRING`, sometimes it's a `BIT STRING`, sometimes it's an +`ANY` or `ANY DEFINED BY`. Sometimes the hole is even an array of (`SET OF` or +`SEQUENCE OF`, in ASN.1) values of the type identified by the id field. + +An example from PKIX: + +```ASN.1 +Extension ::= SEQUENCE { + extnID OBJECT IDENTIFIER, -- <- type ID + critical BOOLEAN OPTIONAL, + extnValue OCTET STRING -- <- hole +} +``` + +which shows that typed holes don't always have just two fields, and the type +identifier isn't always an integer. 
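The IP case mentioned above makes a compact illustration of a typed hole outside ASN.1. The following is a minimal sketch, not Heimdal code: `ip_hole`, `ipv4_typed_hole()`, and `ip_protocol_name()` are hypothetical names, and the only thing assumed is the fixed IPv4 header layout (header length in the low nibble of byte 0, total length in bytes 2 and 3, protocol number in byte 9).

```C
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of an IPv4 packet as a typed hole: { id, hole }. */
typedef struct ip_hole {
    uint8_t protocol;        /* the type ID (IANA protocol number) */
    const uint8_t *payload;  /* the hole: opaque upper-layer bytes */
    size_t payload_len;
} ip_hole;

/* Split a packet into type ID and hole; returns 0 on success, -1 if malformed. */
int
ipv4_typed_hole(const uint8_t *pkt, size_t len, ip_hole *out)
{
    size_t ihl, total;

    if (pkt == NULL || out == NULL || len < 20 || (pkt[0] >> 4) != 4)
        return -1;
    ihl = (size_t)(pkt[0] & 0x0f) * 4;       /* header length in bytes */
    total = ((size_t)pkt[2] << 8) | pkt[3];  /* total length, big-endian */
    if (ihl < 20 || total < ihl || total > len)
        return -1;
    out->protocol = pkt[9];
    out->payload = pkt + ihl;
    out->payload_len = total - ihl;
    return 0;
}

/* The consumer still has to map the ID to a decoder by hand. */
const char *
ip_protocol_name(uint8_t protocol)
{
    switch (protocol) {
    case 1:  return "ICMP";
    case 6:  return "TCP";
    case 17: return "UDP";
    default: return "unknown";  /* the hole stays opaque */
    }
}
```

The header parser knows nothing about what is in the payload; it only hands back the ID and the bytes, which is exactly the division of labor a typed hole imposes.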
+ +Now, Heimdal's ASN.1 compiler generates the obvious C data structure for PKIX's +`Extension` type: + +```C + typedef struct Extension { + heim_oid extnID; + int *critical; + heim_octet_string extnValue; + } Extension; +``` + +and applications using this compiler have to inspect the `extnID` field, +comparing it to any number of OIDs, to determine the type of `extnValue`, then +must call `decode_ThatType()` to decode whatever that octet string has. + +This is very inconvenient. + +Compare this to the handling of discriminated unions (what ASN.1 calls a +`CHOICE`): + +```C + /* + * ASN.1 definition: + * + * DistributionPointName ::= CHOICE { + * fullName [0] IMPLICIT SEQUENCE OF GeneralName, + * nameRelativeToCRLIssuer [1] RelativeDistinguishedName, + * } + */ + + /* C equivalent */ + typedef struct DistributionPointName { + enum DistributionPointName_enum { + choice_DistributionPointName_fullName = 1, + choice_DistributionPointName_nameRelativeToCRLIssuer + } element; + union { + struct DistributionPointName_fullName { + unsigned int len; + GeneralName *val; + } fullName; + RelativeDistinguishedName nameRelativeToCRLIssuer; + } u; + } DistributionPointName; +``` + +The ASN.1 encoding on the wire of a `CHOICE` value, almost no matter the +encoding rules, looks... remarkably like the encoding of a typed hole. Though +generally the alternatives of a discriminated union have to all be encoded with +the same encoding rules, whereas with typed holes the encoded data could be +encoded in radically different encoding rules than the structure containing it +in a typed hole. 
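To see why a `CHOICE` on the wire resembles a typed hole, note that in DER the tag of the chosen alternative plays the role of the type ID. Here is a minimal sketch, assuming the `DistributionPointName` definition quoted above; `dpn_arm` and `dpn_peek_arm()` are hypothetical names, not compiler output.

```C
#include <stddef.h>

enum dpn_arm {
    DPN_ARM_UNKNOWN = 0,
    DPN_ARM_FULL_NAME,                   /* [0] */
    DPN_ARM_NAME_RELATIVE_TO_CRL_ISSUER  /* [1] */
};

/* Inspect the identifier octet of a DER-encoded DistributionPointName. */
enum dpn_arm
dpn_peek_arm(const unsigned char *der, size_t len)
{
    if (der == NULL || len == 0)
        return DPN_ARM_UNKNOWN;
    /*
     * 0x80 = context-specific class, 0x20 = constructed encoding,
     * low five bits = tag number. Both alternatives are constructed.
     */
    switch (der[0]) {
    case 0xA0: return DPN_ARM_FULL_NAME;
    case 0xA1: return DPN_ARM_NAME_RELATIVE_TO_CRL_ISSUER;
    default:   return DPN_ARM_UNKNOWN;
    }
}
```

The difference from a typed hole is only that the "ID" here is a tag the compiler already understands, rather than a value in a sibling member.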
+ +In fact, extensible `CHOICE`s are handled by our compiler as a discriminated +union one of whose alternatives is a typed hole when the `CHOICE` is +extensible: + +```C + typedef struct DigestRepInner { + enum DigestRepInner_enum { + choice_DigestRepInner_asn1_ellipsis = 0, /* <--- unknown CHOICE arm */ + choice_DigestRepInner_error, + choice_DigestRepInner_initReply, + choice_DigestRepInner_response, + choice_DigestRepInner_ntlmInitReply, + choice_DigestRepInner_ntlmResponse, + choice_DigestRepInner_supportedMechs + /* ... */ + } element; + union { + DigestError error; + DigestInitReply initReply; + DigestResponse response; + NTLMInitReply ntlmInitReply; + NTLMResponse ntlmResponse; + DigestTypes supportedMechs; + heim_octet_string asn1_ellipsis; /* <--- unknown CHOICE arm */ + } u; + } DigestRepInner; +``` + +The critical thing to understand is that our compiler automatically decodes +(and encodes) `CHOICE`s' alternatives, but it used to NOT do that for typed +holes because it knows nothing about them. Now, however, our compiler can +do this for typed holes provided the module specifies what the alternatives +are. + +It would be nice if we could treat *all* typed holes like `CHOICE`s whenever +the compiler knows the alternatives! + +And that's exactly what the ASN.1 IOS system makes possible. With ASN.1 IOS +support, our compiler can automatically decode all the `Certificate` +extensions, and all the distinguished name extensions it knows about. + +There is a fair bit of code in `lib/hx509/` that deals with encoding and +decoding things in typed holes where the compiler could just handle that +automatically for us, allowing us to delete a lot of code. 
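What "knowing the alternatives" amounts to can be pictured as a lookup table from type ID to alternative. The sketch below is illustrative only: `sketch_oid`, `ext_element`, and `classify_extension()` are hypothetical names rather than anything the compiler generates, and only two well-known certificate extension OIDs are listed.

```C
#include <stddef.h>
#include <string.h>

/* Hypothetical OID representation: an array of arcs. */
typedef struct sketch_oid {
    size_t length;
    const unsigned *components;
} sketch_oid;

enum ext_element {
    EXT_UNKNOWN = 0,        /* not in the table: the hole stays opaque */
    EXT_KEY_USAGE,
    EXT_BASIC_CONSTRAINTS
};

static const unsigned oid_key_usage[] = { 2, 5, 29, 15 };
static const unsigned oid_basic_constraints[] = { 2, 5, 29, 19 };

/* The "object set": pairs of { type ID, alternative }. */
static const struct {
    sketch_oid id;
    enum ext_element element;
} known_extensions[] = {
    { { 4, oid_key_usage }, EXT_KEY_USAGE },
    { { 4, oid_basic_constraints }, EXT_BASIC_CONSTRAINTS },
};

/* Map an extnID to a CHOICE-like discriminant. */
enum ext_element
classify_extension(const sketch_oid *id)
{
    size_t i;

    for (i = 0; i < sizeof(known_extensions) / sizeof(known_extensions[0]); i++) {
        const sketch_oid *k = &known_extensions[i].id;

        if (k->length == id->length &&
            memcmp(k->components, id->components,
                   id->length * sizeof(id->components[0])) == 0)
            return known_extensions[i].element;
    }
    return EXT_UNKNOWN;
}
```

Reserving zero for the unknown case mirrors the `asn1_ellipsis` arm of the extensible `CHOICE` shown earlier: an unrecognized ID is not an error, it just leaves the bytes undecoded.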
+ +Even more importantly, if we ever add support for visual encoding rules of +ASN.1, such as JSON Encoding Rules (JER) [X.697] or Generic String Encoding +Rules (GSER) [RFC3641], we could have a utility program to automatically +display or compile DER (and other encodings) of certificates and many other +interesting data structures. + +Indeed, we do now have such a utility (`asn1_print`), able to transcode DER to +JSON. + +## ASN.1 IOS, Constraint, and Parameterization + +The ASN.1 IOS is additional syntax that allows ASN.1 module authors to express +all the details about typed holes that ASN.1 compilers need to make developers' +lives much easier. + +RFC5912 has lots of examples, such as this `CLASS` corresponding to the +`Extension` type from PKIX: + +```ASN.1 + -- A class that provides some of the details of the PKIX Extension typed + -- hole: + EXTENSION ::= CLASS { + -- The following are fields of a class (as opposed to "members" of + -- SEQUENCE or SET types): + &id OBJECT IDENTIFIER UNIQUE, -- This is a fixed-type value field. + -- UNIQUE -> There can be only one + -- object with this OID + -- in any object set of + -- this class. + -- I.e., this is like a + -- PRIMARY KEY in a SQL + -- TABLE spec. + &ExtnType, -- This is a type field (the hole). + &Critical BOOLEAN DEFAULT {TRUE | FALSE } -- fixed-type value set field. + } WITH SYNTAX { + -- This is a specification of easy to use (but hard-to-parse) syntax for + -- specifying instances of this CLASS: + SYNTAX &ExtnType IDENTIFIED BY &id + [CRITICALITY &Critical] + } + + -- Here's a parameterized Extension type. The formal parameter is an as-yet + -- unspecified set of valid things this hole can carry for some particular + -- instance of this type. The actual parameter will be specified later (see + -- below). + Extension{EXTENSION:ExtensionSet} ::= SEQUENCE { + -- The type ID has to be the &id field of the EXTENSION CLASS of the + -- ExtensionSet object set parameter. 
+ extnID EXTENSION.&id({ExtensionSet}), + -- This is the critical field, whose DEFAULT value should be that of + -- the &Critical field of the EXTENSION CLASS of the ExtensionSet object + -- set parameter. + critical BOOLEAN + -- (EXTENSION.&Critical({ExtensionSet}{@extnID})) + DEFAULT FALSE, + -- Finally, the hole is an OCTET STRING constrained to hold the encoding + -- of the type named by the &ExtnType field of the EXTENSION CLASS of the + -- ExtensionSet object set parameter. + -- + -- Note that for all members of this SEQUENCE, the fields of the object + -- referenced must be of the same object in the ExtensionSet object set + -- parameter. That's how we get to say that some OID implies some type + -- for the hole. + extnValue OCTET STRING (CONTAINING + EXTENSION.&ExtnType({ExtensionSet}{@extnID})) + -- contains the DER encoding of the ASN.1 value + -- corresponding to the extension type identified + -- by extnID + } + + -- This is just a SEQUENCE of Extensions, the parameterized version. + Extensions{EXTENSION:ExtensionSet} ::= + SEQUENCE SIZE (1..MAX) OF Extension{{ExtensionSet}} +``` + +and these uses of it in RFC5280 (PKIX base) where the actual parameter is +given: + +```ASN.1 + -- Here we have an individual "object" specifying that the OID + -- id-ce-authorityKeyIdentifier implies AuthorityKeyIdentifier as the hole + -- type: + ext-AuthorityKeyIdentifier EXTENSION ::= { SYNTAX + AuthorityKeyIdentifier IDENTIFIED BY + id-ce-authorityKeyIdentifier } + + -- And here's the OID, for completeness: + id-ce-authorityKeyIdentifier OBJECT IDENTIFIER ::= { id-ce 35 } + ... 
+ + -- And Here's an object set for the EXTENSION CLASS collecting a bunch of + -- related extensions (here they are the extensions that certificates can + -- carry in their extensions member): + CertExtensions EXTENSION ::= { + ext-AuthorityKeyIdentifier | ext-SubjectKeyIdentifier | + ext-KeyUsage | ext-PrivateKeyUsagePeriod | + ext-CertificatePolicies | ext-PolicyMappings | + ext-SubjectAltName | ext-IssuerAltName | + ext-SubjectDirectoryAttributes | + ext-BasicConstraints | ext-NameConstraints | + ext-PolicyConstraints | ext-ExtKeyUsage | + ext-CRLDistributionPoints | ext-InhibitAnyPolicy | + ext-FreshestCRL | ext-AuthorityInfoAccess | + ext-SubjectInfoAccessSyntax, ... } + ... + + -- Lastly, we have a Certificate, and the place where the Extensions type's + -- actual parameter is specified. + -- + -- This is where the rubber meets the road: + + Certificate ::= SIGNED{TBSCertificate} + + TBSCertificate ::= SEQUENCE { + version [0] Version DEFAULT v1, + serialNumber CertificateSerialNumber, + signature AlgorithmIdentifier{SIGNATURE-ALGORITHM, + {SignatureAlgorithms}}, + issuer Name, + validity Validity, + subject Name, + subjectPublicKeyInfo SubjectPublicKeyInfo, + ... , + [[2: -- If present, version MUST be v2 + issuerUniqueID [1] IMPLICIT UniqueIdentifier OPTIONAL, + subjectUniqueID [2] IMPLICIT UniqueIdentifier OPTIONAL + ]], + [[3: -- If present, version MUST be v3 -- + extensions [3] Extensions{{CertExtensions}} OPTIONAL + -- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + -- The rubber meets the road *here*. + -- + -- This says that the set of *known* certificate + -- extensions are those for which there are "objects" + -- in the "object set" named CertExtensions. + ]], ... } +``` + +Notice that the `extensions` field of `TBSCertificate` is of type `Extensions` +parametrized by the `CertExtensions` "information object set". 
+ +This allows the compiler to know that if any of the OIDs listed in the +`CertExtensions` object set appear as the actual value of the `extnID` member +of an `Extension` value, then the `extnValue` member of the same `Extension` +value must be an instance of the type associated with that OID. For example, +an `Extension` with `extnID` value of `id-ce-authorityKeyIdentifier` must have +an `extnValue` of type `AuthorityKeyIdentifier`. + + +### IOS Crash Course + +The ASN.1 IOS may be... a bit difficult to understand -- the syntax isn't +pretty. And X.681 has a lot of strange terminology, like "variable type value +set field". + +An IOS "class" has fields, and those fields are of kind +`[Fixed]Type[Value[Set]]` or `Object[Set]`. Then there's "objects" and "object +sets". Hopefully this section will make all of that comprehensible. + +_Classes_ have fields of various kinds. More on this below. + +_Classes_ can also have zero, one, or more _object sets_ associated with them, +and each object set has zero, one, or more _objects_ that are also themselves +associated with classes. Each object has a setting for each required field of +a class, and possibly also for optional/defaulted fields as well. + +As X.681 explains, IOS object sets really are akin to relational database +tables, while objects are akin to rows of the same, with columns specified by +classes. + +Or one can think of _classes_ as relational tables with one predefined column +naming object sets, and rows being objects grouped into object sets by that +column. IOS supports complex path expressions across these objects (but we +won't need to support that yet). + +These relational entities are immutable in that they are defined in ASN.1 +modules that are compiled and there is no way to change them at run-time, only +query them (although perhaps object sets marked as extensible are intended to +be extensible at run-time?). To mutate them one must edit the ASN.1 module +that defines them and recompile it. 
IOS entities also have no on-the-wire +representation. + +So far, the IOS seems just so useless to us: we have some, but no urgent, need +to specify immutable relational data. For example, cryptosystem parameters, +which PKIX does define using IOS, but again: not urgent. + +The magic for us lies in being able to document and constrain actual datatypes +using the IOS [X.681], constraint specification [X.682], and type +parameterization [X.683]. We can express the following things: + + - that some _member_ of a `SET` or `SEQUENCE` is of open type + + - that some _member_ of a `SET` or `SEQUENCE` identifies a type encoded into + an open type member of the same (or related) `SET` or `SEQUENCE` + + - what pairs of `{type ID value, type}` are allowed for some `SET`'s or + `SEQUENCE`'s open type members + +With this our ASN.1 compiler has the metadata it needs in order to +auto-generate decoding and encoding of values of open types. + +A terminology point: `CHOICE`, `SET`, and `SEQUENCE` types have "members", but +_classes_ and _objects_ have "fields", and _object sets_ have "elements". + +Objects must have "_settings_" for all the required fields of the object's +class and none, some, or all of the `OPTIONAL` or `DEFAULT` fields of the +class. This is very similar to `SET`/`SEQUENCE` members, which can be +`OPTIONAL` or `DEFAULT`ed. + +The _members_ (we call them fields in C, instance variables in C++, Java, ...) +of a `SET` or `SEQUENCE` type are typed, just as in C, C++, Java, etc. for +struct or object types. + +There are several kinds of fields of classes. These can be confusing, so it is +useful that we explain them by reference to how they relate to the members of +`SEQUENCE` types constrained by object sets: + + - A `type field` of a class is one that specifies a `SET` or `SEQUENCE` member + of unknown (i.e., open) type. 
+ + The type of that `SET` or `SEQUENCE` member will not be truly unknown, + but determined by some other member of the SET or SEQUENCE, and that will be + specified in a "value field" (or "value set" field) of an "object" in an + "object set" of that class. + + This is essentially a "type variable", akin to those seen in high-level + languages like Haskell. + + - A `fixed type value field` of a class is one that specifies a SET or + SEQUENCE member of fixed type. Being of fixed-type, this is not a type + variable, naturally. + + - A `fixed type value set field` of a class is like a `fixed type value + field`, but where object sets should provide a set of values with which to + constrain `SET`/`SEQUENCE` members corresponding to the field. + + - A `variable type value [set] field` is one where the type of the `SET` or + `SEQUENCE` member corresponding to the field will vary according to some + specified `type field` of the same class. + + - An `object field` will be a field that names another class (possibly the + same class), which can be used to provide rich hierarchical type semantics + that... we mostly don't need for now. + + These define relations between classes, much like `FOREIGN KEY`s in SQL. + + These are also known as `link fields`. + + - Similarly for `object set field`s. + +As usual for ASN.1, the case of the first letter of a field name is meaningful: + + - value and object field names start with a lower-case letter; + - type, value set, and object set fields start with an upper-case letter. + +The form of a `fixed type value` field and a `fixed type value set` field is +the same, differing only in the case of the first letter of the field name. +Similarly for `variable type value` and `variable type value set` fields. +Similarly, again, for `object` and `object set` fields. 
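The first-letter rule above is mechanical enough to sketch in a few lines of C. This is only an illustration of the rule, not code from the compiler; `ios_field_case` and `ios_field_case_of()` are hypothetical names.

```C
#include <ctype.h>
#include <stddef.h>

enum ios_field_case {
    IOS_FIELD_INVALID = 0,
    IOS_FIELD_VALUE_OR_OBJECT,              /* &id, &minCount, ... */
    IOS_FIELD_TYPE_VALUESET_OR_OBJECTSET    /* &ExtnType, &Critical, ... */
};

/* Classify a class field name such as "&id" by the case of its first letter. */
enum ios_field_case
ios_field_case_of(const char *name)
{
    if (name == NULL || name[0] != '&')
        return IOS_FIELD_INVALID;
    if (islower((unsigned char)name[1]))
        return IOS_FIELD_VALUE_OR_OBJECT;
    if (isupper((unsigned char)name[1]))
        return IOS_FIELD_TYPE_VALUESET_OR_OBJECTSET;
    return IOS_FIELD_INVALID;
}
```

The case alone narrows a field to one of two groups; telling a type field from a value set field, say, still requires looking at whether the declaration names a type.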
+ +Here's a simple example from PKIX: + +```ASN.1 + -- An IOS class used to impose constraints on the PKIX Extension type: + EXTENSION ::= CLASS { + &id OBJECT IDENTIFIER UNIQUE, + &ExtnType, + &Critical BOOLEAN DEFAULT {TRUE | FALSE } + } WITH SYNTAX { + SYNTAX &ExtnType IDENTIFIED BY &id + [CRITICALITY &Critical] + } +``` + + - The `&id` field of `EXTENSION` is a fixed-type value field. It's not a + fixed-type value _set_ field because its identifier (`id`) starts with a + lower-case letter. + + The `&id` field is intended to make the `extnID` member of the `Extension` + `SEQUENCE` type identify the actual type of the `extnValue` member of + the same `SEQUENCE` type. + + Note that the `UNIQUE` keyword tells us there can be only one object with any + given value of this field in any object set of this class. (There is no way + to specify the equivalent of a multi-column `PRIMARY KEY` from SQL, only + single-column primary/unique keys. Note that the `&id` field is not marked + `OPTIONAL` or `DEFAULT`, which is like saying it's `NOT NULL` in SQL.) + + - The `&ExtnType` field is a type field. We can tell because no type is named + in its declaration! + + - The `&Critical` field is a fixed-type value set field. We can tell because + it specifies a type (`BOOLEAN`) and starts with an upper-case letter. + + In-tree we could avoid having to implement fixed-type value set fields by + renaming this one to `&critical` and eliding its `DEFAULT <ValueSet>` given + that we know there are only two possible values for a `BOOLEAN` field. + + - Ignore the `WITH SYNTAX` clause for now. All it does is specify a + user-friendly but implementor-hostile syntax for specifying objects. + +Note that none of the `Extension` extensions in PKIX actually specify +`CRITICALITY`/`&Critical`, so... we just don't need fixed-type value set +fields. We could elide the `&Critical` field of the `EXTENSION` class +altogether. 
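The `UNIQUE` rule for `&id` can be read as a primary-key check over an object set. The following is a rough sketch of such a check, not a description of what the compiler does: `ext_object` and `object_set_ids_unique()` are hypothetical, and OIDs are kept as dotted strings purely for brevity.

```C
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one EXTENSION object. */
typedef struct ext_object {
    const char *id;         /* &id setting, as a dotted-decimal OID */
    const char *extn_type;  /* &ExtnType setting, as a type name */
} ext_object;

/* Returns 1 if no two objects in the set share an &id, 0 otherwise. */
int
object_set_ids_unique(const ext_object *set, size_t n)
{
    size_t i, j;

    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (strcmp(set[i].id, set[j].id) == 0)
                return 0;  /* two "rows" with the same primary key */
    return 1;
}
```

A set that fails this check would be ambiguous for decoding, since one `extnID` value would name two candidate types for the hole.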
+ +Here's another, much more complex example from PKIX: + +```ASN.1 + ATTRIBUTE ::= CLASS { + &id OBJECT IDENTIFIER UNIQUE, + &Type OPTIONAL, + &equality-match MATCHING-RULE OPTIONAL, + &minCount INTEGER DEFAULT 1, + &maxCount INTEGER OPTIONAL + } + MATCHING-RULE ::= CLASS { + &ParentMatchingRules MATCHING-RULE OPTIONAL, + &AssertionType OPTIONAL, + &uniqueMatchIndicator ATTRIBUTE OPTIONAL, + &id OBJECT IDENTIFIER UNIQUE + } +``` + + - For `ATTRIBUTE` the fields are: + - The `&id` field is a fixed-type value field (intended to name the type of + members linked to the `&Type` field). + - The `&Type` field is a type field (open type). + - The `&equality-match` field is an object field linking to object sets of the + `MATCHING-RULE` class. + - The `&minCount` and `&maxCount` fields are fixed-type value fields. + - For `MATCHING-RULE` the fields are: + - The `&ParentMatchingRules` field is an object set field linking to more + `MATCHING-RULE`s. + - The `&AssertionType` field is a type field (open type). + - The `&uniqueMatchIndicator` field is an object field linking back to some + object of the `ATTRIBUTE` class that indicates whether the match is + unique (presumably). + - The `&id` field is a fixed-type value field (intended to name the type of + members linked to the `&AssertionType` field). + +No `Attribute`s in PKIX (at least RFC 5912) specify matching rules, so we +really don't need support for object or object set fields. 
+ +Because + - no objects in object sets of `EXTENSION` in PKIX specify "criticality", + - and no objects in object sets of `ATTRIBUTE` in PKIX specify matching rules, + - and no matching rules are specified in PKIX (or maybe just one), +we can drop `MATCHING-RULE` and simplify `ATTRIBUTE` and `EXTENSION` as: + +```ASN.1 + EXTENSION ::= CLASS { + &id OBJECT IDENTIFIER UNIQUE, + &ExtnType + } + ATTRIBUTE ::= CLASS { + &id OBJECT IDENTIFIER UNIQUE, + &Type OPTIONAL, + &minCount INTEGER DEFAULT 1, + &maxCount INTEGER OPTIONAL + } +``` + +X.681 has an example in appendix D.2 that has at least one field of every kind. + +Again, the rubber (IOS classes and object sets) meets the road when +defining types: + +```ASN.1 + -- Define the Extension type but link it to the EXTENSION class so that + -- an object set for that class can constrain it: + Extension{EXTENSION:ExtensionSet} ::= SEQUENCE { + extnID EXTENSION.&id({ExtensionSet}), + critical BOOLEAN + (EXTENSION.&Critical({ExtensionSet}{@extnID})) + DEFAULT FALSE, + extnValue OCTET STRING (CONTAINING + EXTENSION.&ExtnType({ExtensionSet}{@extnID})) + } + -- Most members of TBSCertificate elided for brevity: + TBSCertificate ::= SEQUENCE { + ..., + extensions [3] Extensions{{CertExtensions}} OPTIONAL + -- ^^^^^^^^^^^^^^^^ + -- the rubber meets the road here!! + ... + } + + OTHER-NAME ::= TYPE-IDENTIFIER + -- Most members of GeneralName elided for brevity: + GeneralName ::= CHOICE { + otherName [0] INSTANCE OF OTHER-NAME({KnownOtherNames}), + -- ^^^^^^^^^^^^^^^^^ + -- rubber & road meet! + ... + } +``` + +(The `CertExtensions` and `KnownOtherNames` object sets are not shown here for +brevity. PKIX doesn't even define a `KnownOtherNames` object set, though it +well could.) + +The above demonstrates two ways to create `SEQUENCE` types that are constrained +by IOS classes. One is by defining the types of the members of a `SEQUENCE` +type by reference to class fields. 
The other is by using `INSTANCE OF` to say +that the class defines the type directly. The first lets us do things like +have a mix of members of a `SEQUENCE` type where some are defined by relation to a +class and others are not, or where multiple classes are used. + +In the case of `INSTANCE OF`, what shall the names of the members of the +derived type be? Well, such types can _only_ be instances of `TYPE-IDENTIFIER` +or classes copied from and isomorphic to it (as `OTHER-NAME` is in the above +example), and so the names of their two members are just baked in by X.681 +annex C.1 as: + +```ASN.1 + SEQUENCE { + type-id <DefinedObjectClass>.&id, + value[0] <DefinedObjectClass>.&Type + } + -- where <DefinedObjectClass> is the name of the class, which has to be + -- `TYPE-IDENTIFIER` or exactly like it. +``` + +(This means we can't use `INSTANCE OF` with `EXTENSION`, though we can for +`OTHER-NAME`.) + +PKIX has much more complex classes for relating and constraining cryptographic +algorithms and their parameters: + + - `DIGEST-ALGORITHM`, + - `SIGNATURE-ALGORITHM`, + - `PUBLIC-KEY`, + - `KEY-TRANSPORT`, + - `KEY-AGREE`, + - `KEY-WRAP`, + - `KEY-DERIVATION`, + - `MAC-ALGORITHM`, + - `CONTENT-ENCRYPTION`, + - `ALGORITHM`, + - `SMIME-CAPS`, + - and `CURVE`. + +These show the value of just the relational data aspect of IOS. They can not +only be used by the codecs at run-time to perform validation of, e.g., +cryptographic algorithm parameters, but also to provide those rules to other +code in the application so that the programmer doesn't have to manually write +the same in C, C++, Java, etc., and can refer to them when applying those +cryptographic algorithms. And, of course, the object sets for the above +classes can be and are specified in standards documents, making it very easy to +import them into projects that have an IOS-capable ASN.1 compiler. + +Still, for Heimdal we won't bother with the full power of X.681/X.682/X.683 for +now. 
+ +## Usage + +To use this feature you must use the `--template` and `--one-code-file` +arguments to `asn1_compile`. C types are generated from ASN.1 types as +described above. + +Note that failure to decode open type values does not cause decoding to fail +altogether. It is important that applications check for undecoded open types. +Open type decoding failures manifest as `NULL` values for the `u` field of the +decoded open type structures (see above). + +For examples of X.681/X.682/X.683 usage, look at `lib/asn1/rfc2459.asn1`. + +## Limitations + + - `AtNotation` support is very limited. + + - Object set extensibility is not supported. + + - Only one formal (and actual) type parameter is supported at this time. + + - `TYPE-IDENTIFIER` is not built-in at this time. (But users can define it as + specified.) + + - `CLASS` "copying" is not supported at this time. + + - Link fields are not supported. + + - `Information from objects` constructs are not supported. + + - `IMPORTS` of IOS entities are not supported at this time. + + - ... + +## Implementation Design + +NOTE: This has already been implemented in the `master` branch of Heimdal. + + - The required specifications, X.681, X.682, and X.683, are fairly large and + non-trivial. We can implement just the subset of those three that we need + to implement PKIX, just as we already implement just the subset of X.680 + that we need to implement PKIX and Kerberos. + + For dealing with PKIX, the bare minimum of IOS classes we want are: + + - `ATTRIBUTE` (used for `DN` attributes in RFC5280, specifically for the + `SingleAttribute` and `AttributeSet` types, RDNs, and the + `subjectDirectoryAttributes` extension) + - `EXTENSION` (used for `Extension`, i.e., certificate extensions in + RFC5280) + - `TYPE-IDENTIFIER` (used for `OtherName` and for CMS' `Content-Type`) + + The minimal subset of X.681, X.682, and X.683 needed to implement those + three is all we need. 
+ + _Eventually_ we may want to increase that subset so as to implement other + IOS classes from PKIX, such as `DIGEST-ALGORITHM`, and to provide object + sets and query functionality for them to applications so that we can use + standard modules to encode information about cryptosystems. But not right + now. + + Note that there's no object set specified for OTHER-NAME instances, but we + can and have created our own. We want magic open type decoding to recurse + all the way down and handle DN attributes, extensions, SANs, policy + qualifiers, the works. + + - We'll really want to do this mainly for the template compiler and begin + abandoning the original compiler. The codegen backend generates the same C + types, but no code for automatic, recursive handling of open types. + + Maintaining two compiler backends is difficult enough; adding complex + features beyond X.680 to both is too much work. The template compiler is + simply superior just on account of its output size scaling as `O(N)` instead + of `O(M * N)` where `M` is the number of encoding rules supported and `N` is + the size of an ASN.1 module (or all modules). + + - Also, to make the transition to using IOS in-tree, we'll want to keep + existing fields of C structures as generated by the compiler today, only + adding new ones; that way code that hasn't been updated to use the automatic + encoding/decoding can still work and we can then update Heimdal in-tree + slowly to take advantage of the new magic. + + Below are the C types for the ASN.1 PKIX types we care about, as generated + by the current prototype. 
+ + `Extension` compiles to: + +```C +typedef struct Extension { + heim_oid extnID; + int critical; + heim_octet_string extnValue; + /* NEW: */ + struct { + enum { + choice_Extension_iosnumunknown = 0, + choice_Extension_iosnum_id_x509_ce_authorityKeyIdentifier, + choice_Extension_iosnum_id_x509_ce_subjectKeyIdentifier, + choice_Extension_iosnum_id_x509_ce_keyUsage, + choice_Extension_iosnum_id_x509_ce_privateKeyUsagePeriod, + choice_Extension_iosnum_id_x509_ce_certificatePolicies, + choice_Extension_iosnum_id_x509_ce_policyMappings, + choice_Extension_iosnum_id_x509_ce_subjectAltName, + choice_Extension_iosnum_id_x509_ce_issuerAltName, + choice_Extension_iosnum_id_x509_ce_basicConstraints, + choice_Extension_iosnum_id_x509_ce_nameConstraints, + choice_Extension_iosnum_id_x509_ce_policyConstraints, + choice_Extension_iosnum_id_x509_ce_extKeyUsage, + choice_Extension_iosnum_id_x509_ce_cRLDistributionPoints, + choice_Extension_iosnum_id_x509_ce_inhibitAnyPolicy, + choice_Extension_iosnum_id_x509_ce_freshestCRL, + choice_Extension_iosnum_id_pkix_pe_authorityInfoAccess, + choice_Extension_iosnum_id_pkix_pe_subjectInfoAccess, + } element; + union { + void *_any; + AuthorityKeyIdentifier* ext_AuthorityKeyIdentifier; + SubjectKeyIdentifier* ext_SubjectKeyIdentifier; + KeyUsage* ext_KeyUsage; + PrivateKeyUsagePeriod* ext_PrivateKeyUsagePeriod; + CertificatePolicies* ext_CertificatePolicies; + PolicyMappings* ext_PolicyMappings; + GeneralNames* ext_SubjectAltName; + GeneralNames* ext_IssuerAltName; + BasicConstraints* ext_BasicConstraints; + NameConstraints* ext_NameConstraints; + PolicyConstraints* ext_PolicyConstraints; + ExtKeyUsage* ext_ExtKeyUsage; + CRLDistributionPoints* ext_CRLDistributionPoints; + SkipCerts* ext_InhibitAnyPolicy; + CRLDistributionPoints* ext_FreshestCRL; + AuthorityInfoAccessSyntax* ext_AuthorityInfoAccess; + SubjectInfoAccessSyntax* ext_SubjectInfoAccessSyntax; + } u; + } _ioschoice_extnValue; +} Extension; +``` + + The `SingleAttribute` and 
`AttributeSet` types compile to: + +```C +typedef struct SingleAttribute { + heim_oid type; + HEIM_ANY value; + struct { + enum { + choice_SingleAttribute_iosnumunknown = 0, + choice_SingleAttribute_iosnum_id_at_name, + choice_SingleAttribute_iosnum_id_at_surname, + choice_SingleAttribute_iosnum_id_at_givenName, + choice_SingleAttribute_iosnum_id_at_initials, + choice_SingleAttribute_iosnum_id_at_generationQualifier, + choice_SingleAttribute_iosnum_id_at_commonName, + choice_SingleAttribute_iosnum_id_at_localityName, + choice_SingleAttribute_iosnum_id_at_stateOrProvinceName, + choice_SingleAttribute_iosnum_id_at_organizationName, + choice_SingleAttribute_iosnum_id_at_organizationalUnitName, + choice_SingleAttribute_iosnum_id_at_title, + choice_SingleAttribute_iosnum_id_at_dnQualifier, + choice_SingleAttribute_iosnum_id_at_countryName, + choice_SingleAttribute_iosnum_id_at_serialNumber, + choice_SingleAttribute_iosnum_id_at_pseudonym, + choice_SingleAttribute_iosnum_id_domainComponent, + choice_SingleAttribute_iosnum_id_at_emailAddress, + } element; + union { + void *_any; + X520name* at_name; + X520name* at_surname; + X520name* at_givenName; + X520name* at_initials; + X520name* at_generationQualifier; + X520CommonName* at_x520CommonName; + X520LocalityName* at_x520LocalityName; + DirectoryString* at_x520StateOrProvinceName; + DirectoryString* at_x520OrganizationName; + DirectoryString* at_x520OrganizationalUnitName; + DirectoryString* at_x520Title; + heim_printable_string* at_x520dnQualifier; + heim_printable_string* at_x520countryName; + heim_printable_string* at_x520SerialNumber; + DirectoryString* at_x520Pseudonym; + heim_ia5_string* at_domainComponent; + heim_ia5_string* at_emailAddress; + } u; + } _ioschoice_value; +} SingleAttribute; +``` + + and + +```C +typedef struct AttributeSet { + heim_oid type; + struct AttributeSet_values + { + unsigned int len; + HEIM_ANY* val; + } values; + struct { + enum { + choice_AttributeSet_iosnumunknown = 0, + 
choice_AttributeSet_iosnum_id_at_name, + choice_AttributeSet_iosnum_id_at_surname, + choice_AttributeSet_iosnum_id_at_givenName, + choice_AttributeSet_iosnum_id_at_initials, + choice_AttributeSet_iosnum_id_at_generationQualifier, + choice_AttributeSet_iosnum_id_at_commonName, + choice_AttributeSet_iosnum_id_at_localityName, + choice_AttributeSet_iosnum_id_at_stateOrProvinceName, + choice_AttributeSet_iosnum_id_at_organizationName, + choice_AttributeSet_iosnum_id_at_organizationalUnitName, + choice_AttributeSet_iosnum_id_at_title, + choice_AttributeSet_iosnum_id_at_dnQualifier, + choice_AttributeSet_iosnum_id_at_countryName, + choice_AttributeSet_iosnum_id_at_serialNumber, + choice_AttributeSet_iosnum_id_at_pseudonym, + choice_AttributeSet_iosnum_id_domainComponent, + choice_AttributeSet_iosnum_id_at_emailAddress, + } element; + unsigned int len; + union { + void *_any; + X520name* at_name; + X520name* at_surname; + X520name* at_givenName; + X520name* at_initials; + X520name* at_generationQualifier; + X520CommonName* at_x520CommonName; + X520LocalityName* at_x520LocalityName; + DirectoryString* at_x520StateOrProvinceName; + DirectoryString* at_x520OrganizationName; + DirectoryString* at_x520OrganizationalUnitName; + DirectoryString* at_x520Title; + heim_printable_string* at_x520dnQualifier; + heim_printable_string* at_x520countryName; + heim_printable_string* at_x520SerialNumber; + DirectoryString* at_x520Pseudonym; + heim_ia5_string* at_domainComponent; + heim_ia5_string* at_emailAddress; + } *val; + } _ioschoice_values; +} AttributeSet; +``` + + The `OtherName` type compiles to: + +```C +typedef struct OtherName { + heim_oid type_id; + HEIM_ANY value; + struct { + enum { + choice_OtherName_iosnumunknown = 0, + choice_OtherName_iosnum_id_pkix_on_xmppAddr, + choice_OtherName_iosnum_id_pkix_on_dnsSRV, + choice_OtherName_iosnum_id_pkix_on_hardwareModuleName, + choice_OtherName_iosnum_id_pkix_on_permanentIdentifier, + choice_OtherName_iosnum_id_pkix_on_pkinit_san, + 
choice_OtherName_iosnum_id_pkix_on_pkinit_ms_san, + } element; + union { + void *_any; + heim_utf8_string* on_xmppAddr; + heim_ia5_string* on_dnsSRV; + HardwareModuleName* on_hardwareModuleName; + PermanentIdentifier* on_permanentIdentifier; + KRB5PrincipalName* on_krb5PrincipalName; + heim_utf8_string* on_pkinit_ms_san; + } u; + } _ioschoice_value; +} OtherName; +``` + + If a caller to `encode_Certificate()` passes a certificate object with + extensions with `_ioschoice_extnValue.element == choice_Extension_iosnumunknown` + (or the equivalent, for each open type), then the encoder will use the `extnID` and + `extnValue` fields; otherwise it will use the new `_ioschoice_extnValue` + field and leave `extnID` and `extnValue` cleared. If both the `extnID` and + `extnValue` fields and the new `_ioschoice_extnValue` field are set, then + the encoder will ignore the latter. + + In both cases, the `critical` field gets used as-is. The rule is that we + support *two* special C struct fields for open types: a hole type ID enum + field, and a decoded hole value union. All other fields will map to + normal (possibly constrained) members of the SET/SEQUENCE. + + - Type ID values get mapped to discrete enum values. Object sets get sorted + by object type IDs so that for decoding they can be and are binary-searched. + For encoding and other cases (destructors and copy constructors) we directly + index the object set by the mapped type ID enum. + + - The C header generator remains shared between the two backends. + + - SET and SEQUENCE types containing an open type are represented as follows in + their templates.
+ +```C + extern const struct asn1_template asn1_CertExtensions[]; + /*...*/ + const struct asn1_template asn1_Extension_tag__22[] = { + /* 0 */ { 0, sizeof(struct Extension), ((void*)5) }, + /* 1 */ { A1_TAG_T(ASN1_C_UNIV, PRIM, UT_OID), + offsetof(struct Extension, extnID), + asn1_AttributeType_tag__1 }, + /* 2 */ { A1_OP_DEFVAL | A1_DV_BOOLEAN, ~0, (void*)0 }, + /* 3 */ { A1_TAG_T(ASN1_C_UNIV, PRIM, UT_Boolean) | A1_FLAG_DEFAULT, + offsetof(struct Extension, critical), + asn1_Extension_tag_critical_24 }, + /* 4 */ { A1_TAG_T(ASN1_C_UNIV, PRIM, UT_OctetString), + offsetof(struct Extension, extnValue), + asn1_Extension_tag_extnValue_25 }, + /* NEW: vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv */ + /* 5 */ { A1_OP_OPENTYPE_OBJSET | 0 | (2 << 10) | 0, + offsetof(Extension, _ioschoice_extnValue), + asn1_CertExtensions } + }; + const struct asn1_template asn1_Extension[] = { + /* 0 */ { 0, sizeof(Extension), ((void*)1) }, + /* 1 */ { A1_TAG_T(ASN1_C_UNIV, CONS, UT_Sequence), + 0, asn1_Extension_tag__22 } + }; + + /* NEW: */ + const struct asn1_template asn1_CertExtensions[] = { + /* + * Header template entry bearing the count of objects in + * this object set: + */ + /* 0 */ { 0, 0, ((void*)18) }, + + /* + * Value of object #0 in this set: two entries, one naming + * a type ID field value, and the other naming the type + * that corresponds to that value. + * + * In this case, the first object is for the + * AuthorityKeyIdentifier type as a certificate extension. 
+ */ + /* 1 */ { A1_OP_OPENTYPE_ID, 0, + (const void*)&asn1_oid_id_x509_ce_authorityKeyIdentifier }, + /* 2 */ { A1_OP_OPENTYPE, sizeof(AuthorityKeyIdentifier), + (const void*)&asn1_AuthorityKeyIdentifier }, + + /* Value of object #1 (SubjectKeyIdentifier): */ + + /* 3 */ { A1_OP_OPENTYPE_ID, 0, + (const void*)&asn1_oid_id_x509_ce_subjectKeyIdentifier }, + /* 4 */ { A1_OP_OPENTYPE, sizeof(SubjectKeyIdentifier), + (const void*)&asn1_SubjectKeyIdentifier }, + /* 5 */ + + /* And so on...*/ + + /* Value of object #17 */ + /* 35 */ { A1_OP_OPENTYPE_ID, 0, + (const void*)&asn1_oid_id_pkix_pe_subjectInfoAccess }, + /* 36 */ { A1_OP_OPENTYPE, sizeof(SubjectInfoAccessSyntax), + (const void*)&asn1_SubjectInfoAccessSyntax } + }; +``` + + After the template entries for all the normal fields of a struct there will + be an object set reference entry identifying the indices, in the same + template, of the entries for the type ID field and the open type field. The + object set has a header entry followed by pairs of entries, each pair + representing a single object and all of them together representing the + object set. + + This allows the encoder and decoder to both find the object set quickly, + especially since the objects are sorted by type ID value. + +## Moving From C + + - Generate and output a JSON representation of the compiled ASN.1 module. + + - Code codegen/templategen backends in jq or Haskell or whatever. + + - Code template interpreters in <host> language. + + - Eventually rewrite the compiler itself in Rust or whatever.
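Returning to the Usage note that applications should check for undecoded open types: the following is a minimal sketch of such a check, not code from Heimdal. `MockExtension`, `MockBasicConstraints`, and both helper functions are hypothetical stand-ins that mirror only the shape of the generated `_ioschoice_extnValue` field shown earlier (an `element` enum whose zero value means "unknown", plus a `u` union of pointers with an `_any` member). Real code would use the generated types from the compiled module instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a decoded extension value type. */
typedef struct MockBasicConstraints {
    int cA;
} MockBasicConstraints;

/*
 * Hypothetical, trimmed-down stand-in for the generated Extension type.
 * Only the open type choice field is modeled here.
 */
typedef struct MockExtension {
    int critical;
    struct {
        enum {
            choice_MockExtension_iosnumunknown = 0,
            choice_MockExtension_iosnum_basicConstraints
        } element;
        union {
            void *_any;
            MockBasicConstraints *ext_BasicConstraints;
        } u;
    } _ioschoice_extnValue;
} MockExtension;

/* Returns 1 if the open type value was decoded, 0 otherwise. */
static int
mock_extension_is_decoded(const MockExtension *e)
{
    assert(e != NULL);
    return e->_ioschoice_extnValue.element != choice_MockExtension_iosnumunknown &&
           e->_ioschoice_extnValue.u._any != NULL;
}

/*
 * Returns the cA flag when the extension was decoded as BasicConstraints,
 * or -1 when the open type is undecoded or of some other kind.
 */
static int
mock_extension_ca_flag(const MockExtension *e)
{
    if (!mock_extension_is_decoded(e) ||
        e->_ioschoice_extnValue.element !=
            choice_MockExtension_iosnum_basicConstraints)
        return -1;
    return e->_ioschoice_extnValue.u.ext_BasicConstraints->cA;
}
```

The sketch checks both the `element` discriminant and the `u` pointer before dereferencing, so a caller that receives `-1` can fall back to whatever handling of the raw, undecoded value suits the application.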