# Automatic Open Type Handling via X.68x Support in Heimdal's ASN.1 Compiler

## Table of Contents

 1. [Introduction](#introduction)
 2. [Typed Holes / Open Types](#typed-holes--open-types)
 3. [ASN.1 IOS, Constraint, and Parameterization](#asn1-ios-constraint-and-parameterization)
    - [IOS Crash Course](#ios-crash-course)
 4. [Usage](#usage)
 5. [Limitations](#limitations)
 6. [Implementation Design](#implementation-design)
 7. [Moving From C](#moving-from-c)

## Introduction

ASN.1 is a set of specifications for "syntax" for defining data schemas, and
"encoding rules" for encoding values of data of types defined in those schemas.
There are many encoding rules, but one syntax.

The base of ASN.1 _syntax_ is specified by X.680, an ITU-T standard.  The
encoding rules are specified by the X.69x series (X.690 through X.697).

This README is concerned primarily with the X.68x series.

While X.680 is essential for implementing many Internet (and other) protocols,
and sufficient for implementing all of those, there are extensions in the
remainder of the X.68x series that can make life a lot easier for developers
who have to use ASN.1 for interoperability reasons.

Various syntax extensions are specified in X.68x series documents:

 - X.681: Information Object specification
 - X.682: Constraint specification
 - X.683: Parameterization of ASN.1 specifications

The intent of X.681, X.682, and X.683 is to add ways to formally express
constraints that would otherwise require natural language to express.  Give a
compiler more formally-expressed constraints and it can do more labor-saving
than it could otherwise.

A subset of these three extensions, X.681, X.682, and X.683, can enable some
rather magical features.  These magical features are generally not the focus of
those ITU-T specifications nor of many RFCs that make use of them, but
nonetheless they are of interest to us.

This README describes what that magic is and how it is implemented.

RFC 6025 does an excellent job of elucidating X.681, which otherwise most
readers unfamiliar with it will no doubt find inscrutable.  Hopefully this
README improves that further.

The magic that we're after is simply the *automatic and recursive handling of
open types by an ASN.1 compiler*.

Combined with eventual support for the ASN.1 JSON Encoding Rules (JER) [X.697],
this feature could give us unprecedented visibility into really complex data
structures, such as Endorsement Key Certificates (EKcerts) for Trusted Platform
Module (TPM) applications.

Support for JER and automatic handling of open types should allow us to
trivially implement a command-line tool that can parse any DER or JER (JSON)
encoding of any value whose type is known and compiled, and which could
transcode to the other encoding rules.  I.e., dump DER to JSON, and parse JSON
to output DER.

Indeed, Heimdal's `asn1_print` program currently supports transcoding of DER to
JSON, though it's not quite X.697-compliant JSON!  Heimdal does not currently
support parsing JSON-encoded values of ASN.1 types.

Combined with transcoders for JSON/CBOR and other binary-JSON formats, we could
support those encodings too.

We could really see how much space OER/JER/CBOR save over DER for Kerberos
tickets, PKIX certificates, and much else.

We especially want this for PKIX, and more than anything for certificates, as
the TBSCertificate type is full of deeply nested open types: DNs and
subjectDirectory attributes, otherName SAN types, and certificate extensions.

Besides a magical ASN.1 DER/JER dumper/transcoder utility, we want to replace
DN attribute and subject alternative name (SAN) `otherName` tables and much
hand-coded handling of certificate extensions in `lib/hx509/`.

The reader should already be familiar with ASN.1, which consists of two
things:

 - an abstract syntax for specifying schemas for data interchange

 - a set of encoding rules

A very common thing to see in projects that use ASN.1, as well as projects that
use alternatives to ASN.1, is a pattern known as the "typed hole" or "open
type".

The ASN.1 Information Object System (IOS) [X.681] is all about automating the
otherwise very annoying task of dealing with "typed holes" / "open types".

The ASN.1 IOS is not sufficient to implement the magic we're after.  Also
needed is constraint specification and parameterization of types.

ITU-T references:

 - https://www.itu.int/rec/T-REC-X.680-201508-I/en
 - https://www.itu.int/rec/T-REC-X.681-201508-I/en
 - https://www.itu.int/rec/T-REC-X.682-201508-I/en
 - https://www.itu.int/rec/T-REC-X.683-201508-I/en


## Typed Holes / Open Types

A typed hole or open type is a pattern of data structure that generally looks
like:

```
    { type_id, bytes_encoding_a_value_of_a_type_identified_by_type_id }
```

I.e., an opaque datum and an identifier of what kind of datum that is.  This
happens because the structure with the typed hole is used in contexts where it
can't know all possible things that can go in it.  In many cases we now do know
everything that can go in a given typed hole, but the designers didn't know
that years ago, or had some other reason to use a typed hole.

These are used not only in protocols that use ASN.1, but in many protocols that
use syntaxes and encodings unrelated to ASN.1.  I.e., these concepts are *not*
ASN.1-specific.

Many Internet protocols use typed holes, and many use typed holes in ASN.1
types.  For example, PKIX, Kerberos, LDAP, and others, use ASN.1 and typed
holes.

For examples of Internet protocols that do not use ASN.1 but still have typed
holes, see IP, MIME, SSHv2, IKEv2, and others.  The quintessential example is
IP itself, since IP packet payloads belong to some upper-layer protocol
identified in the IP packet header.
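
The IP case can be sketched in C.  The protocol numbers below are the
IANA-assigned values (1 for ICMP, 6 for TCP, 17 for UDP); the struct and
function names are made up for illustration and do not come from any real
network stack.

```C
#include <stddef.h>
#include <stdint.h>

/* A typed hole in its simplest form: an identifier plus opaque bytes. */
struct ip_payload {
    uint8_t        protocol; /* type ID: the IP header's protocol field */
    const uint8_t *bytes;    /* hole: the upper-layer packet, still encoded */
    size_t         len;
};

/* Name the upper-layer protocol the hole carries, or NULL if unknown. */
static const char *
ip_payload_kind(const struct ip_payload *p)
{
    switch (p->protocol) {
    case 1:  return "ICMP";
    case 6:  return "TCP";
    case 17: return "UDP";
    default: return NULL;    /* unknown type ID: the hole stays opaque */
    }
}
```

An unknown protocol number does not make the IP packet itself unparseable; the
payload simply remains an opaque run of bytes.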

In ASN.1 these generally look like:

```ASN.1
    TypedHole ::= SEQUENCE {
        typeId INTEGER,
        opaque OCTET STRING
    }
```

or

```ASN.1
    -- Old ASN.1 style
    TypedHole ::= SEQUENCE {
        typeId OBJECT IDENTIFIER,
        opaque ANY DEFINED BY typeId
    }
```

or

```ASN.1
    -- Old ASN.1 style
    TypedHole ::= SEQUENCE {
        typeId OBJECT IDENTIFIER,
        opaque ANY -- DEFINED BY typeId
    }
```

or any number of variations.

Note: the `ANY` variations are no longer conformant to X.680 (the base ASN.1
specification).

The pattern is `{ id, hole }` where the `hole` is ultimately an opaque sequence
of bytes whose content's schema is identified by the `id` in the same data
structure.  The pattern does not require just two fields, and it does not
require any particular type for the hole, nor for the type ID.  Sometimes the
"hole" is an `OCTET STRING`, sometimes it's a `BIT STRING`, sometimes it's an
`ANY` or `ANY DEFINED BY`.  Sometimes the hole is even an array of (`SET OF` or
`SEQUENCE OF`, in ASN.1) values of the type identified by the id field.

An example from PKIX:

```ASN.1
Extension ::= SEQUENCE {
  extnID          OBJECT IDENTIFIER, -- <- type ID
  critical        BOOLEAN OPTIONAL,
  extnValue       OCTET STRING       -- <- hole
}
```

which shows that typed holes don't always have just two fields, and the type
identifier isn't always an integer.

Now, Heimdal's ASN.1 compiler generates the obvious C data structure for PKIX's
`Extension` type:

```C
    typedef struct Extension {
      heim_oid extnID;
      int *critical;
      heim_octet_string extnValue;
    } Extension;
```

and applications using this compiler have to inspect the `extnID` field,
comparing it to any number of OIDs, to determine the type of `extnValue`, then
must call `decode_ThatType()` to decode whatever that octet string has.

This is very inconvenient.
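
To make the inconvenience concrete, here is a sketch of the kind of dispatch
code an application ends up writing by hand.  The `oid` type and the
`oid_eq()` and `extension_type_name()` functions are simplified stand-ins
invented for this example, not Heimdal's `heim_oid` API.  The OID arcs are the
ones PKIX assigns (2.5.29.35 for authorityKeyIdentifier, 2.5.29.19 for
basicConstraints).

```C
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the compiler's OID type. */
typedef struct oid {
    size_t          length;
    const unsigned *components;
} oid;

static const unsigned arcs_aki[] = { 2, 5, 29, 35 }; /* authorityKeyIdentifier */
static const unsigned arcs_bc[]  = { 2, 5, 29, 19 }; /* basicConstraints */

static int
oid_eq(const oid *a, const unsigned *arcs, size_t n)
{
    return a->length == n &&
        memcmp(a->components, arcs, n * sizeof(arcs[0])) == 0;
}

/*
 * Hand-written dispatch: compare extnID against every OID the application
 * knows about.  This sketch only returns the name of the type that a real
 * application would then go on to decode from extnValue.
 */
static const char *
extension_type_name(const oid *extnID)
{
    if (oid_eq(extnID, arcs_aki, 4))
        return "AuthorityKeyIdentifier";
    if (oid_eq(extnID, arcs_bc, 4))
        return "BasicConstraints";
    return NULL; /* unknown extension: extnValue stays an opaque blob */
}
```

Every new extension type means another branch here, and the same chain tends
to be repeated wherever extensions are encoded as well as decoded.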

Compare this to the handling of discriminated unions (what ASN.1 calls a
`CHOICE`):

```C
    /*
     * ASN.1 definition:
     *
     *  DistributionPointName ::= CHOICE {
     *    fullName                  [0] IMPLICIT SEQUENCE OF GeneralName,
     *    nameRelativeToCRLIssuer   [1] RelativeDistinguishedName
     *  }
     */

    /* C equivalent */
    typedef struct DistributionPointName {
      enum DistributionPointName_enum {
        choice_DistributionPointName_fullName = 1,
        choice_DistributionPointName_nameRelativeToCRLIssuer
      } element;
      union {
        struct DistributionPointName_fullName {
          unsigned int len;
          GeneralName *val;
        } fullName;
        RelativeDistinguishedName nameRelativeToCRLIssuer;
      } u;
    } DistributionPointName;
```

The ASN.1 encoding on the wire of a `CHOICE` value, almost no matter the
encoding rules, looks... remarkably like the encoding of a typed hole.  Though
generally the alternatives of a discriminated union have to all be encoded with
the same encoding rules, whereas with typed holes the encoded data could be
encoded in radically different encoding rules than the structure containing it
in a typed hole.
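
The resemblance is easy to see in BER/DER, where the tag of the encoded
alternative plays the role of the type ID.  The helper below is an
illustrative sketch (the names are not from Heimdal) that splits a single
identifier octet into class, constructed bit, and tag number, as laid out in
X.690.  It deliberately does not handle the multi-byte form used for tag
numbers of 31 and above.

```C
#include <stdint.h>

struct der_tag {
    unsigned cls;         /* 0 universal, 1 application, 2 context, 3 private */
    unsigned constructed; /* 1 if the contents are themselves TLVs */
    unsigned number;      /* tag number, 0..30 */
};

/* Returns 0 on success, -1 for the high-tag-number form (not handled). */
static int
der_parse_identifier(uint8_t octet, struct der_tag *out)
{
    if ((octet & 0x1f) == 0x1f)
        return -1;
    out->cls = octet >> 6;               /* top two bits */
    out->constructed = (octet >> 5) & 1; /* next bit */
    out->number = octet & 0x1f;          /* low five bits */
    return 0;
}
```

An identifier octet of `0xa0` parses as context class, constructed, tag number
0, which is what a decoder would use to select a `[0]` alternative of a
`CHOICE`.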

In fact, our compiler handles an extensible `CHOICE` as a discriminated union
one of whose alternatives is a typed hole:

```C
    typedef struct DigestRepInner {
      enum DigestRepInner_enum {
        choice_DigestRepInner_asn1_ellipsis = 0, /* <--- unknown CHOICE arm */
        choice_DigestRepInner_error,
        choice_DigestRepInner_initReply,
        choice_DigestRepInner_response,
        choice_DigestRepInner_ntlmInitReply,
        choice_DigestRepInner_ntlmResponse,
        choice_DigestRepInner_supportedMechs
        /* ... */
      } element;
      union {
        DigestError error;
        DigestInitReply initReply;
        DigestResponse response;
        NTLMInitReply ntlmInitReply;
        NTLMResponse ntlmResponse;
        DigestTypes supportedMechs;
        heim_octet_string asn1_ellipsis; /* <--- unknown CHOICE arm */
      } u;
    } DigestRepInner;
```

The critical thing to understand is that our compiler automatically decodes
(and encodes) `CHOICE`s' alternatives, but it used to NOT do that for typed
holes because it knows nothing about them.  Now, however, our compiler can
do this for typed holes provided the module specifies what the alternatives
are.

It would be nice if we could treat *all* typed holes like `CHOICE`s whenever
the compiler knows the alternatives!

And that's exactly what the ASN.1 IOS makes possible.  With ASN.1 IOS support,
our compiler can automatically decode all the `Certificate` extensions, and
all the distinguished name attributes it knows about.

There is a fair bit of code in `lib/hx509/` that deals with encoding and
decoding things in typed holes where the compiler could just handle that
automatically for us, allowing us to delete a lot of code.

Even more importantly, if we ever add support for textual encoding rules of
ASN.1, such as JSON Encoding Rules (JER) [X.697] or Generic String Encoding
Rules (GSER) [RFC3641], we could have a utility program to automatically
display or compile DER (and other encodings) of certificates and many other
interesting data structures.

Indeed, we do now have such a utility (`asn1_print`), able to transcode DER to
JSON.

## ASN.1 IOS, Constraint, and Parameterization

The ASN.1 IOS is additional syntax that allows ASN.1 module authors to express
all the details about typed holes that ASN.1 compilers need to make developers'
lives much easier.

RFC5912 has lots of examples, such as this `CLASS` corresponding to the
`Extension` type from PKIX:

```ASN.1
  -- A class that provides some of the details of the PKIX Extension typed
  -- hole:
  EXTENSION ::= CLASS {
      -- The following are fields of a class (as opposed to "members" of
      -- SEQUENCE or SET types):
      &id  OBJECT IDENTIFIER UNIQUE,    -- This is a fixed-type value field.
                                        -- UNIQUE -> There can be only one
                                        --           object with this OID
                                        --           in any object set of
                                        --           this class.
                                        --           I.e., this is like a
                                        --           PRIMARY KEY in a SQL
                                        --           TABLE spec.
      &ExtnType,                        -- This is a type field (the hole).
      &Critical    BOOLEAN DEFAULT {TRUE | FALSE } -- fixed-type value set field.
  } WITH SYNTAX {
      -- This is a specification of easy to use (but hard-to-parse) syntax for
      -- specifying instances of this CLASS:
      SYNTAX &ExtnType IDENTIFIED BY &id
      [CRITICALITY &Critical]
  }

  -- Here's a parameterized Extension type.  The formal parameter is an as-yet
  -- unspecified set of valid things this hole can carry for some particular
  -- instance of this type.  The actual parameter will be specified later (see
  -- below).
  Extension{EXTENSION:ExtensionSet} ::= SEQUENCE {
      -- The type ID has to be the &id field of the EXTENSION CLASS of the
      -- ExtensionSet object set parameter.
      extnID      EXTENSION.&id({ExtensionSet}),
      -- This is the critical field, whose DEFAULT value should be that of
      -- the &Critical field of the EXTENSION CLASS of the ExtensionSet object
      -- set parameter.
      critical    BOOLEAN
  --                     (EXTENSION.&Critical({ExtensionSet}{@extnID}))
                       DEFAULT FALSE,
      -- Finally, the hole is an OCTET STRING constrained to hold the encoding
      -- of the type named by the &ExtnType field of the EXTENSION CLASS of the
      -- ExtensionSet object set parameter.
      --
      -- Note that for all members of this SEQUENCE, the fields of the object
      -- referenced must be of the same object in the ExtensionSet object set
      -- parameter.  That's how we get to say that some OID implies some type
      -- for the hole.
      extnValue   OCTET STRING (CONTAINING
                  EXTENSION.&ExtnType({ExtensionSet}{@extnID}))
                  --  contains the DER encoding of the ASN.1 value
                  --  corresponding to the extension type identified
                  --  by extnID
  }

  -- This is just a SEQUENCE of Extensions, the parameterized version.
  Extensions{EXTENSION:ExtensionSet} ::=
      SEQUENCE SIZE (1..MAX) OF Extension{{ExtensionSet}}
```

and these uses of it in RFC5280 (PKIX base) where the actual parameter is
given:

```ASN.1
   -- Here we have an individual "object" specifying that the OID
   -- id-ce-authorityKeyIdentifier implies AuthorityKeyIdentifier as the hole
   -- type:
   ext-AuthorityKeyIdentifier EXTENSION ::= { SYNTAX
       AuthorityKeyIdentifier IDENTIFIED BY
       id-ce-authorityKeyIdentifier }

   -- And here's the OID, for completeness:
   id-ce-authorityKeyIdentifier OBJECT IDENTIFIER ::=  { id-ce 35 }
   ...

   -- And Here's an object set for the EXTENSION CLASS collecting a bunch of
   -- related extensions (here they are the extensions that certificates can
   -- carry in their extensions member):
   CertExtensions EXTENSION ::= {
           ext-AuthorityKeyIdentifier | ext-SubjectKeyIdentifier |
           ext-KeyUsage | ext-PrivateKeyUsagePeriod |
           ext-CertificatePolicies | ext-PolicyMappings |
           ext-SubjectAltName | ext-IssuerAltName |
           ext-SubjectDirectoryAttributes |
           ext-BasicConstraints | ext-NameConstraints |
           ext-PolicyConstraints | ext-ExtKeyUsage |
           ext-CRLDistributionPoints | ext-InhibitAnyPolicy |
           ext-FreshestCRL | ext-AuthorityInfoAccess |
           ext-SubjectInfoAccessSyntax, ... }
   ...

   -- Lastly, we have a Certificate, and the place where the Extensions type's
   -- actual parameter is specified.
   --
   -- This is where the rubber meets the road:

   Certificate  ::=  SIGNED{TBSCertificate}

   TBSCertificate  ::=  SEQUENCE  {
       version         [0]  Version DEFAULT v1,
       serialNumber         CertificateSerialNumber,
       signature            AlgorithmIdentifier{SIGNATURE-ALGORITHM,
                                 {SignatureAlgorithms}},
       issuer               Name,
       validity             Validity,
       subject              Name,
       subjectPublicKeyInfo SubjectPublicKeyInfo,
       ... ,
       [[2:               -- If present, version MUST be v2
       issuerUniqueID  [1]  IMPLICIT UniqueIdentifier OPTIONAL,
       subjectUniqueID [2]  IMPLICIT UniqueIdentifier OPTIONAL
       ]],
       [[3:               -- If present, version MUST be v3 --
       extensions      [3]  Extensions{{CertExtensions}} OPTIONAL
                         -- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                         -- The rubber meets the road *here*.
                         --
                         -- This says that the set of *known* certificate
                         -- extensions are those for which there are "objects"
                         -- in the "object set" named CertExtensions.
       ]], ... }
```

Notice that the `extensions` field of `TBSCertificate` is of type `Extensions`
parameterized by the `CertExtensions` "information object set".

This allows the compiler to know that if any of the OIDs listed in the
`CertExtensions` object set appear as the actual value of the `extnID` member
of an `Extension` value, then the `extnValue` member of the same `Extension`
value must be an instance of the type associated with that OID.  For example,
an `Extension` with `extnID` value of `id-ce-authorityKeyIdentifier` must have
an `extnValue` of type `AuthorityKeyIdentifier`.
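
One way to picture what the compiler can do with this information is a lookup
table derived from the object set: one row per object, keyed by `&id`.  The
sketch below is a hypothetical illustration, not the output of `asn1_compile`;
all of the type and function names are invented for the example, and only two
objects are listed.

```C
#include <stddef.h>
#include <string.h>

/* One row per object in the object set: { &id, &ExtnType }. */
struct ext_object {
    const unsigned *id;        /* &id: the OID's arcs */
    size_t          id_len;
    const char     *type_name; /* &ExtnType, by name only in this sketch */
};

static const unsigned id_ce_authorityKeyIdentifier[] = { 2, 5, 29, 35 };
static const unsigned id_ce_keyUsage[]               = { 2, 5, 29, 15 };

/* A two-object stand-in for the CertExtensions object set. */
static const struct ext_object cert_extensions[] = {
    { id_ce_authorityKeyIdentifier, 4, "AuthorityKeyIdentifier" },
    { id_ce_keyUsage,               4, "KeyUsage" },
};

/* Find the object whose &id equals the decoded extnID, or NULL if none. */
static const struct ext_object *
object_set_lookup(const struct ext_object *set, size_t nobjs,
                  const unsigned *oid, size_t oid_len)
{
    size_t i;

    for (i = 0; i < nobjs; i++) {
        if (set[i].id_len == oid_len &&
            memcmp(set[i].id, oid, oid_len * sizeof(oid[0])) == 0)
            return &set[i];
    }
    return NULL; /* not in the set: leave extnValue undecoded */
}
```

With a table like this the codec, rather than the application, can pick the
decoder for `extnValue`, and adding an extension means adding a row instead of
another hand-written branch.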


### IOS Crash Course

The ASN.1 IOS may be... a bit difficult to understand -- the syntax isn't
pretty.  And X.681 has a lot of strange terminology, like "variable type value
set field".

An IOS "class" has fields, and those fields are of kind
`[Fixed]Type[Value[Set]]` or `Object[Set]`.  Then there are "objects" and
"object sets".  Hopefully this section will make all of that comprehensible.

_Classes_ have fields of various kinds.  More on this below.

_Classes_ can also have zero, one, or more _object sets_ associated with them,
and each object set has zero, one, or more _objects_ that are also themselves
associated with classes.  Each object has a setting for each required field of
a class, and possibly also for optional/defaulted fields as well.

As X.681 explains, IOS object sets really are akin to relational database
tables, while objects are akin to rows of the same, with columns specified by
classes.

Or one can think of _classes_ as relational tables with one predefined column
naming object sets, and rows being objects grouped into object sets by that
column.  IOS supports complex path expressions across these objects (but we
won't need to support that yet).

These relational entities are immutable in that they are defined in ASN.1
modules that are compiled and there is no way to change them at run-time, only
query them (although perhaps object sets marked as extensible are intended to
be extensible at run-time?).  To mutate them one must edit the ASN.1 module
that defines them and recompile it.  IOS entities also have no on-the-wire
representation.

So far, the IOS may not seem very useful to us: we have some need to specify
immutable relational data, but it is not urgent.  Cryptosystem parameters,
which PKIX does define using IOS, are one example, but again: not urgent.

The magic for us lies in being able to document and constrain actual datatypes
using the IOS [X.681], constraint specification [X.682], and type
parameterization [X.683].  We can express the following things:

 - that some _member_ of a `SET` or `SEQUENCE` is of open type

 - that some _member_ of a `SET` or `SEQUENCE` identifies a type encoded into
   an open type member of the same (or related) `SET` or `SEQUENCE`

 - what pairs of `{type ID value, type}` are allowed for some `SET`'s or
   `SEQUENCE`'s open type members

With this our ASN.1 compiler has the metadata it needs in order to
auto-generate decoding and encoding of values of open types.

A terminology point: `CHOICE`, `SET`, and `SEQUENCE` types have "members", but
_classes_ and _objects_ have "fields", and _object sets_ have "elements".

Objects must have "_settings_" for all the required fields of the object's
class and none, some, or all of the `OPTIONAL` or `DEFAULT` fields of the
class.  This is very similar to `SET`/`SEQUENCE` members, which can be
`OPTIONAL` or `DEFAULT`ed.

The _members_ (we call them fields in C, instance variables in C++, Java, ...)
of a `SET` or `SEQUENCE` type are typed, just as in C, C++, Java, etc. for
struct or object types.

There are several kinds of fields of classes.  These can be confusing, so it is
useful that we explain them by reference to how they relate to the members of
`SEQUENCE` types constrained by object sets:

 - A `type field` of a class is one that specifies a `SET` or `SEQUENCE` member
   of unknown (i.e., open) type.

   The type of that `SET` or `SEQUENCE` member will not be truly unknown, but
   determined by some other member of the `SET` or `SEQUENCE`, and that will
   be specified in a "value field" (or "value set" field) of an "object" in an
   "object set" of that class.

   This is essentially a "type variable", akin to those seen in high-level
   languages like Haskell.

 - A `fixed type value field` of a class is one that specifies a SET or
   SEQUENCE member of fixed type.  Being of fixed-type, this is not a type
   variable, naturally.

 - A `fixed type value set field` of a class is like a `fixed type value
   field`, but where object sets should provide a set of values with which to
   constrain `SET`/`SEQUENCE` members corresponding to the field.

 - A `variable type value [set] field` is one where the type of the `SET` or
   `SEQUENCE` member corresponding to the field will vary according to some
   specified `type field` of the same class.

 - An `object field` will be a field that names another class (possibly the
   same class), which can be used to provide rich hierarchical type semantics
   that... we mostly don't need for now.

   These define relations between classes, much like `FOREIGN KEY`s in SQL.

   These are also known as `link fields`.

 - Similarly for `object set field`s.

As usual for ASN.1, the case of the first letter of a field name is meaningful:

 - value and object field names start with a lower case letter;
 - type, value set, and object set fields start with an upper-case letter.

The form of a `fixed type value` field and a `fixed type value set` field is
the same, differing only in the case of the first letter of the field name.
Similarly for `variable type value` and `variable type value set` fields.
Similarly, again, for `object` and `object set` fields.
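
These naming rules are mechanical enough to sketch in code.  The function
below is an illustration only (the enum and function names are invented, and
this is not how Heimdal's parser is structured): given a field name and the
kind of thing that follows it in the class definition, it returns the kind of
field.

```C
#include <ctype.h>

/* What follows the field name in the class definition. */
enum governor {
    GOV_NONE,       /* nothing at all */
    GOV_TYPE,       /* a type, such as OBJECT IDENTIFIER or BOOLEAN */
    GOV_TYPE_FIELD, /* a reference to a type field of the same class */
    GOV_CLASS       /* the name of a class */
};

enum field_kind {
    FIELD_INVALID,
    FIELD_TYPE,
    FIELD_FIXED_TYPE_VALUE,
    FIELD_FIXED_TYPE_VALUE_SET,
    FIELD_VARIABLE_TYPE_VALUE,
    FIELD_VARIABLE_TYPE_VALUE_SET,
    FIELD_OBJECT,
    FIELD_OBJECT_SET
};

/* `name` is the field name including its leading '&'. */
static enum field_kind
classify_field(const char *name, enum governor gov)
{
    int upper;

    if (name[0] != '&' || !isalpha((unsigned char)name[1]))
        return FIELD_INVALID;
    upper = isupper((unsigned char)name[1]) != 0;

    switch (gov) {
    case GOV_NONE:       /* only type fields have nothing after the name */
        return upper ? FIELD_TYPE : FIELD_INVALID;
    case GOV_TYPE:
        return upper ? FIELD_FIXED_TYPE_VALUE_SET : FIELD_FIXED_TYPE_VALUE;
    case GOV_TYPE_FIELD:
        return upper ? FIELD_VARIABLE_TYPE_VALUE_SET
                     : FIELD_VARIABLE_TYPE_VALUE;
    case GOV_CLASS:
        return upper ? FIELD_OBJECT_SET : FIELD_OBJECT;
    }
    return FIELD_INVALID;
}
```

For instance, a field named `&id` followed by a type classifies as a
fixed-type value field, while a field named `&Critical` followed by a type
classifies as a fixed-type value set field.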

Here's a simple example from PKIX:

```ASN.1
  -- An IOS class used to impose constraints on the PKIX Extension type:
  EXTENSION ::= CLASS {
      &id  OBJECT IDENTIFIER UNIQUE,
      &ExtnType,
      &Critical    BOOLEAN DEFAULT {TRUE | FALSE }
  } WITH SYNTAX {
      SYNTAX &ExtnType IDENTIFIED BY &id
      [CRITICALITY &Critical]
  }
```

 - The `&id` field of `EXTENSION` is a fixed-type value field.  It's not a
   fixed-type value _set_ field because its identifier (`id`) starts with a
   lower-case letter.

   The `&id` field is intended to make the `extnID` member of the `Extension`
   `SEQUENCE` type identify the actual type of the `extnValue` member of the
   same `SEQUENCE` type.

   Note that the `UNIQUE` keyword tells us there can be only one object with any
   given value of this field in any object set of this class.  (There is no way
   to specify the equivalent of a multi-column `PRIMARY KEY` from SQL, only
   single-column primary/unique keys.  Note that the `&id` field is not marked
   `OPTIONAL` or `DEFAULT`, which is like saying it's `NOT NULL` in SQL.)

 - The `&ExtnType` field is a type field.  We can tell because no type is named
   in its declaration!

 - The `&Critical` field is a fixed-type value set field.  We can tell because
   it specifies a type (`BOOLEAN`) and starts with an upper-case letter.

   In-tree we could avoid having to implement fixed-type value set fields by
   renaming this one to `&critical` and eliding its `DEFAULT <ValueSet>` given
   that we know there are only two possible values for a `BOOLEAN` field.

 - Ignore the `WITH SYNTAX` clause for now.  All it does is specify a
   user-friendly but implementor-hostile syntax for specifying objects.

Note that none of the `EXTENSION` objects in PKIX actually specify
`CRITICALITY`/`&Critical`, so... we just don't need fixed-type value set
fields.  We could elide the `&Critical` field of the `EXTENSION` class
altogether.
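
The `UNIQUE` property of `&id` is something a compiler can check when it
processes an object set.  As a rough sketch of that check (with invented names
and a simplified OID representation, not code taken from Heimdal):

```C
#include <stddef.h>
#include <string.h>

/* Simplified OID: a borrowed array of arcs. */
struct oid {
    const unsigned *arcs;
    size_t          len;
};

static int
oid_equal(const struct oid *a, const struct oid *b)
{
    return a->len == b->len &&
        memcmp(a->arcs, b->arcs, a->len * sizeof(a->arcs[0])) == 0;
}

/*
 * `ids` holds the &id setting of each object in one object set.
 * Returns 1 if no two objects share an &id, 0 otherwise.
 */
static int
object_set_ids_unique(const struct oid *ids, size_t n)
{
    size_t i, j;

    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (oid_equal(&ids[i], &ids[j]))
                return 0;
    return 1;
}
```

The quadratic scan is a simplification chosen for clarity; object sets are
small and fixed at compile time, so nothing here needs to be fast.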

Here's another, much more complex example from PKIX:

```ASN.1
  ATTRIBUTE ::= CLASS {
      &id             OBJECT IDENTIFIER UNIQUE,
      &Type           OPTIONAL,
      &equality-match MATCHING-RULE OPTIONAL,
      &minCount       INTEGER DEFAULT 1,
      &maxCount       INTEGER OPTIONAL
  }
  MATCHING-RULE ::= CLASS {
      &ParentMatchingRules   MATCHING-RULE OPTIONAL,
      &AssertionType         OPTIONAL,
      &uniqueMatchIndicator  ATTRIBUTE OPTIONAL,
      &id                    OBJECT IDENTIFIER UNIQUE
  }
```

 - For `ATTRIBUTE` the fields are:
    - The `&id` field is a fixed-type value field (intended to name the type of
      members linked to the `&Type` field).
    - The `&Type` field is a type field (open type).
    - The `&equality-match` field is an object field linking to an object of
      the `MATCHING-RULE` class.
    - The `&minCount` and `&maxCount` fields are fixed-type value fields.
 - For `MATCHING-RULE` the fields are:
    - The `&ParentMatchingRules` is an object set field linking to more
      `MATCHING-RULE`s.
    - The `&AssertionType` field is a type field (open type).
    - The `&uniqueMatchIndicator` field is an object field linking back to some
      object of the `ATTRIBUTE` class that indicates whether the match is
      unique (presumably).
    - The `&id` field is a fixed-type value field (intended to name the type of
      members linked to the `&AssertionType` field).

No `Attribute`s in PKIX (at least RFC 5912) specify matching rules, so we
really don't need support for object or object set fields.

Because:

 - no objects in object sets of `EXTENSION` in PKIX specify "criticality",
 - and no objects in object sets of `ATTRIBUTE` in PKIX specify matching rules,
 - and no matching rules are specified in PKIX (or maybe just one),

we can drop `MATCHING-RULE` and simplify `ATTRIBUTE` and `EXTENSION` as:

```ASN.1
  EXTENSION ::= CLASS {
      &id  OBJECT IDENTIFIER UNIQUE,
      &ExtnType
  }
  ATTRIBUTE ::= CLASS {
      &id             OBJECT IDENTIFIER UNIQUE,
      &Type           OPTIONAL,
      &minCount       INTEGER DEFAULT 1,
      &maxCount       INTEGER OPTIONAL
  }
```

X.681 has an example in appendix D.2 that has at least one field of every kind.

Again, the rubber that is IOS classes and object sets meets the road when
defining types:

```ASN.1
  -- Define the Extension type but link it to the EXTENSION class so that
  -- an object set for that class can constrain it:
  Extension{EXTENSION:ExtensionSet} ::= SEQUENCE {
      extnID      EXTENSION.&id({ExtensionSet}),
      critical    BOOLEAN
                  (EXTENSION.&Critical({ExtensionSet}{@extnID}))
                  DEFAULT FALSE,
      extnValue   OCTET STRING (CONTAINING
                  EXTENSION.&ExtnType({ExtensionSet}{@extnID}))
  }
  -- Most members of TBSCertificate elided for brevity:
  TBSCertificate  ::=  SEQUENCE  {
      ...,
      extensions      [3]  Extensions{{CertExtensions}} OPTIONAL
                                   -- ^^^^^^^^^^^^^^^^
                                   -- the rubber meets the road here!!
      ...
  }

  OTHER-NAME ::= TYPE-IDENTIFIER
  -- Most members of GeneralName elided for brevity:
  GeneralName ::= CHOICE {
      otherName       [0]  INSTANCE OF OTHER-NAME({KnownOtherNames}),
                                               -- ^^^^^^^^^^^^^^^^^
                                               -- rubber & road meet!
      ...
  }
```

(The `CertExtensions` and `KnownOtherNames` object sets are not shown here for
brevity.  PKIX doesn't even define a `KnownOtherNames` object set, though it
well could.)

The above demonstrates two ways to create `SEQUENCE` types that are constrained
by IOS classes.  One is by defining the types of the members of a `SEQUENCE`
type by reference to class fields.  The other is by using `INSTANCE OF` to say
that the class defines the type directly.  The first lets us do things like
mix members of a `SEQUENCE` type where some are defined by relation to a
class and others are not, or where multiple classes are used.

In the case of `INSTANCE OF`, what shall the names of the members of the
derived type be?  Well, such types can _only_ be instances of `TYPE-IDENTIFIER`
or classes copied from and isomorphic to it (as `OTHER-NAME` is in the above
example), and so the names of their two members are just baked in by X.681
annex C.1 as:

```ASN.1
    SEQUENCE {
        type-id     <DefinedObjectClass>.&id,
        value[0]    <DefinedObjectClass>.&Type
    }
    -- where <DefinedObjectClass> is the name of the class, which has to be
    -- `TYPE-IDENTIFIER` or exactly like it.
```

(This means we can't use `INSTANCE OF` with `EXTENSION`, though we can for
`OTHER-NAME`.)

PKIX has much more complex classes for relating and constraining cryptographic
algorithms and their parameters:

 - `DIGEST-ALGORITHM`,
 - `SIGNATURE-ALGORITHM`,
 - `PUBLIC-KEY`,
 - `KEY-TRANSPORT`,
 - `KEY-AGREE`,
 - `KEY-WRAP`,
 - `KEY-DERIVATION`,
 - `MAC-ALGORITHM`,
 - `CONTENT-ENCRYPTION`,
 - `ALGORITHM`,
 - `SMIME-CAPS`,
 - and `CURVE`.

These show the value of just the relational data aspect of IOS.  They can be
used not only by the codecs at run-time to validate, e.g., cryptographic
algorithm parameters, but also to provide those rules to other code in the
application, so that the programmer doesn't have to write the same rules by
hand in C, C++, Java, etc., and can refer to them when applying those
cryptographic algorithms.  And, of course, the object sets for the above
classes can be and are specified in standards documents, making it very easy to
import them into projects that have an IOS-capable ASN.1 compiler.

Still, for Heimdal we won't bother with the full power of X.681/X.682/X.683 for
now.

## Usage

To use this feature you must use the `--template` and `--one-code-file`
arguments to `asn1_compile`.  C types are generated from ASN.1 types as
described above.

Note that failure to decode open type values does not cause decoding to fail
altogether.  It is important that applications check for undecoded open types.
Open type decoding failures manifest as `NULL` values for the `u` field of the
decoded open type structures (see the Implementation Design section below).
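
A check of this kind might look like the sketch below.  The struct is a
hand-written stand-in for a compiler-generated open type member (an enum
saying which alternative was decoded, plus a union of pointers).  The names
`open_type_hole`, `hole_unknown`, and `hole_is_decoded()` are hypothetical;
the names that `asn1_compile` actually generates differ.

```C
#include <stddef.h>

/* Opaque stand-ins for the decoded value types. */
struct AuthorityKeyIdentifier;
struct KeyUsage;

/* Stand-in for a generated open type member. */
struct open_type_hole {
    enum {
        hole_unknown = 0,            /* type ID not in the object set */
        hole_authorityKeyIdentifier,
        hole_keyUsage
    } element;
    union {
        void *any;                   /* generic view of the decoded value */
        struct AuthorityKeyIdentifier *aki;
        struct KeyUsage *ku;
    } u;
};

/*
 * Returns 1 if the hole's contents were recognized and decoded, 0 if the
 * application must fall back to the raw bytes (or reject the value).
 */
static int
hole_is_decoded(const struct open_type_hole *h)
{
    return h->element != hole_unknown && h->u.any != NULL;
}
```

An application that depends on a particular extension would call something
like `hole_is_decoded()` before dereferencing the union, and decide for itself
whether an undecoded value is an error.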

For examples of X.681/X.682/X.683 usage, look at `lib/asn1/rfc2459.asn1`.

## Limitations

 - `AtNotation` support is very limited.

 - Object set extensibility is not supported.

 - Only one formal (and actual) type parameter is supported at this time.

 - `TYPE-IDENTIFIER` is not built-in at this time.  (But users can define it as
   specified.)

 - `CLASS` "copying" is not supported at this time.

 - Link fields are not supported.

 - `Information from objects` constructs are not supported.

 - `IMPORTS` of IOS entities are not supported at this time.

 - ...

## Implementation Design

NOTE: This has already been implemented in the `master` branch of Heimdal.

 - The required specifications, X.681, X.682, and X.683, are fairly large and
   non-trivial.  We can implement just the subset of those three that we need
   to implement PKIX, just as we already implement just the subset of X.680
   that we need to implement PKIX and Kerberos.

   For dealing with PKIX, the bare minimum of IOS classes we want are:

    - `ATTRIBUTE` (used for `DN` attributes in RFC5280, specifically for the
      `SingleAttribute` and `AttributeSet` types, RDNs, and the
      `subjectDirectoryAttributes` extension)
    - `EXTENSION` (used for `Extension`, i.e., certificate extensions in
      RFC5280)
    - `TYPE-IDENTIFIER` (used for `OtherName` and for CMS' `Content-Type`)

   The minimal subset of X.681, X.682, and X.683 needed to implement those
   three is all we need.

   _Eventually_ we may want to increase that subset so as to implement other
   IOS classes from PKIX, such as `DIGEST-ALGORITHM`, and to provide object
   sets and query functionality for them to applications so that we can use
   standard modules to encode information about cryptosystems.  But not right
   now.

   Note that there's no object set specified for OTHER-NAME instances, but we
   can create our own, and have done so.  We want magic open type decoding to
   recurse all the way down and handle DN attributes, extensions, SANs, policy
   qualifiers, the works.

 - We'll really want to do this mainly for the template compiler and begin
   abandoning the original compiler.  The codegen backend generates the same C
   types, but no code for automatic, recursive handling of open types.

   Maintaining two compiler backends is difficult enough; adding complex
   features beyond X.680 to both is too much work.  The template compiler is
   simply superior just on account of its output size scaling as `O(N)` instead
   of `O(M * N)` where `M` is the number of encoding rules supported and `N` is
   the size of an ASN.1 module (or all modules).

 - Also, to make the transition to using IOS in-tree, we'll want to keep
   existing fields of C structures as generated by the compiler today, only
   adding new ones.  That way, code that hasn't been updated to use the
   automatic encoding/decoding can still work, and we can then update Heimdal
   in-tree slowly to take advantage of the new magic.

   Below are the C types for the ASN.1 PKIX types we care about, as generated
   by the current prototype.

   `Extension` compiles to:

```C
typedef struct Extension {
    heim_oid extnID;
    int critical;
    heim_octet_string extnValue;
    /* NEW: */
    struct {
        enum {
            choice_Extension_iosnumunknown = 0,
            choice_Extension_iosnum_id_x509_ce_authorityKeyIdentifier,
            choice_Extension_iosnum_id_x509_ce_subjectKeyIdentifier,
            choice_Extension_iosnum_id_x509_ce_keyUsage,
            choice_Extension_iosnum_id_x509_ce_privateKeyUsagePeriod,
            choice_Extension_iosnum_id_x509_ce_certificatePolicies,
            choice_Extension_iosnum_id_x509_ce_policyMappings,
            choice_Extension_iosnum_id_x509_ce_subjectAltName,
            choice_Extension_iosnum_id_x509_ce_issuerAltName,
            choice_Extension_iosnum_id_x509_ce_basicConstraints,
            choice_Extension_iosnum_id_x509_ce_nameConstraints,
            choice_Extension_iosnum_id_x509_ce_policyConstraints,
            choice_Extension_iosnum_id_x509_ce_extKeyUsage,
            choice_Extension_iosnum_id_x509_ce_cRLDistributionPoints,
            choice_Extension_iosnum_id_x509_ce_inhibitAnyPolicy,
            choice_Extension_iosnum_id_x509_ce_freshestCRL,
            choice_Extension_iosnum_id_pkix_pe_authorityInfoAccess,
            choice_Extension_iosnum_id_pkix_pe_subjectInfoAccess,
        } element;
        union {
            void *_any;
            AuthorityKeyIdentifier* ext_AuthorityKeyIdentifier;
            SubjectKeyIdentifier* ext_SubjectKeyIdentifier;
            KeyUsage* ext_KeyUsage;
            PrivateKeyUsagePeriod* ext_PrivateKeyUsagePeriod;
            CertificatePolicies* ext_CertificatePolicies;
            PolicyMappings* ext_PolicyMappings;
            GeneralNames* ext_SubjectAltName;
            GeneralNames* ext_IssuerAltName;
            BasicConstraints* ext_BasicConstraints;
            NameConstraints* ext_NameConstraints;
            PolicyConstraints* ext_PolicyConstraints;
            ExtKeyUsage* ext_ExtKeyUsage;
            CRLDistributionPoints* ext_CRLDistributionPoints;
            SkipCerts* ext_InhibitAnyPolicy;
            CRLDistributionPoints* ext_FreshestCRL;
            AuthorityInfoAccessSyntax* ext_AuthorityInfoAccess;
            SubjectInfoAccessSyntax* ext_SubjectInfoAccessSyntax;
        } u;
    } _ioschoice_extnValue;
} Extension;
```

   The `SingleAttribute` and `AttributeSet` types compile to:

```C
typedef struct SingleAttribute {
    heim_oid type;
    HEIM_ANY value;
    struct {
        enum {
            choice_SingleAttribute_iosnumunknown = 0,
            choice_SingleAttribute_iosnum_id_at_name,
            choice_SingleAttribute_iosnum_id_at_surname,
            choice_SingleAttribute_iosnum_id_at_givenName,
            choice_SingleAttribute_iosnum_id_at_initials,
            choice_SingleAttribute_iosnum_id_at_generationQualifier,
            choice_SingleAttribute_iosnum_id_at_commonName,
            choice_SingleAttribute_iosnum_id_at_localityName,
            choice_SingleAttribute_iosnum_id_at_stateOrProvinceName,
            choice_SingleAttribute_iosnum_id_at_organizationName,
            choice_SingleAttribute_iosnum_id_at_organizationalUnitName,
            choice_SingleAttribute_iosnum_id_at_title,
            choice_SingleAttribute_iosnum_id_at_dnQualifier,
            choice_SingleAttribute_iosnum_id_at_countryName,
            choice_SingleAttribute_iosnum_id_at_serialNumber,
            choice_SingleAttribute_iosnum_id_at_pseudonym,
            choice_SingleAttribute_iosnum_id_domainComponent,
            choice_SingleAttribute_iosnum_id_at_emailAddress,
        } element;
        union {
            void *_any;
            X520name* at_name;
            X520name* at_surname;
            X520name* at_givenName;
            X520name* at_initials;
            X520name* at_generationQualifier;
            X520CommonName* at_x520CommonName;
            X520LocalityName* at_x520LocalityName;
            DirectoryString* at_x520StateOrProvinceName;
            DirectoryString* at_x520OrganizationName;
            DirectoryString* at_x520OrganizationalUnitName;
            DirectoryString* at_x520Title;
            heim_printable_string* at_x520dnQualifier;
            heim_printable_string* at_x520countryName;
            heim_printable_string* at_x520SerialNumber;
            DirectoryString* at_x520Pseudonym;
            heim_ia5_string* at_domainComponent;
            heim_ia5_string* at_emailAddress;
        } u;
    } _ioschoice_value;
} SingleAttribute;
```

   and

```C
typedef struct AttributeSet {
    heim_oid type;
    struct AttributeSet_values
    {
        unsigned int len;
        HEIM_ANY* val;
    } values;
    struct {
        enum {
            choice_AttributeSet_iosnumunknown = 0,
            choice_AttributeSet_iosnum_id_at_name,
            choice_AttributeSet_iosnum_id_at_surname,
            choice_AttributeSet_iosnum_id_at_givenName,
            choice_AttributeSet_iosnum_id_at_initials,
            choice_AttributeSet_iosnum_id_at_generationQualifier,
            choice_AttributeSet_iosnum_id_at_commonName,
            choice_AttributeSet_iosnum_id_at_localityName,
            choice_AttributeSet_iosnum_id_at_stateOrProvinceName,
            choice_AttributeSet_iosnum_id_at_organizationName,
            choice_AttributeSet_iosnum_id_at_organizationalUnitName,
            choice_AttributeSet_iosnum_id_at_title,
            choice_AttributeSet_iosnum_id_at_dnQualifier,
            choice_AttributeSet_iosnum_id_at_countryName,
            choice_AttributeSet_iosnum_id_at_serialNumber,
            choice_AttributeSet_iosnum_id_at_pseudonym,
            choice_AttributeSet_iosnum_id_domainComponent,
            choice_AttributeSet_iosnum_id_at_emailAddress,
        } element;
        unsigned int len;
        union {
            void *_any;
            X520name* at_name;
            X520name* at_surname;
            X520name* at_givenName;
            X520name* at_initials;
            X520name* at_generationQualifier;
            X520CommonName* at_x520CommonName;
            X520LocalityName* at_x520LocalityName;
            DirectoryString* at_x520StateOrProvinceName;
            DirectoryString* at_x520OrganizationName;
            DirectoryString* at_x520OrganizationalUnitName;
            DirectoryString* at_x520Title;
            heim_printable_string* at_x520dnQualifier;
            heim_printable_string* at_x520countryName;
            heim_printable_string* at_x520SerialNumber;
            DirectoryString* at_x520Pseudonym;
            heim_ia5_string* at_domainComponent;
            heim_ia5_string* at_emailAddress;
        } *val;
    } _ioschoice_values;
} AttributeSet;
```

   The `OtherName` type compiles to:

```C
typedef struct OtherName {
    heim_oid type_id;
    HEIM_ANY value;
    struct {
        enum {
            choice_OtherName_iosnumunknown = 0,
            choice_OtherName_iosnum_id_pkix_on_xmppAddr,
            choice_OtherName_iosnum_id_pkix_on_dnsSRV,
            choice_OtherName_iosnum_id_pkix_on_hardwareModuleName,
            choice_OtherName_iosnum_id_pkix_on_permanentIdentifier,
            choice_OtherName_iosnum_id_pkix_on_pkinit_san,
            choice_OtherName_iosnum_id_pkix_on_pkinit_ms_san,
        } element;
        union {
            void *_any;
            heim_utf8_string* on_xmppAddr;
            heim_ia5_string* on_dnsSRV;
            HardwareModuleName* on_hardwareModuleName;
            PermanentIdentifier* on_permanentIdentifier;
            KRB5PrincipalName* on_krb5PrincipalName;
            heim_utf8_string* on_pkinit_ms_san;
        } u;
    } _ioschoice_value;
} OtherName;
```

   If a caller to `encode_Certificate()` passes a certificate object with
   extensions whose `_ioschoice_extnValue.element` is
   `choice_Extension_iosnumunknown` (or the equivalent for each open type),
   then the encoder will use the `extnID` and `extnValue` fields; otherwise it
   will use the new `_ioschoice_extnValue` field and leave `extnID` and
   `extnValue` cleared.  If both the `extnID` and `extnValue` fields and the
   new `_ioschoice_extnValue` field are set, then the encoder will ignore the
   latter.

   In both cases, the `critical` field gets used as-is.  The rule is that we
   support *two* special C struct fields for open types: a hole type ID enum
   field, and a decoded hole value union.  All other fields map to normal
   (possibly constrained) members of the SET/SEQUENCE.
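
   The precedence rule for the encoder can be sketched as below.  This is an
   illustration only: `mini_octet_string`, `MiniExt`, `enum ext_source`, and
   `ext_pick_source()` are hypothetical, and treating a non-`NULL` data pointer
   as "set" is an assumption, not necessarily what the generated code does.

```C
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for heim_octet_string. */
typedef struct mini_octet_string {
    size_t length;
    void *data;
} mini_octet_string;

/* Hypothetical, cut-down stand-in for the generated Extension type. */
typedef struct MiniExt {
    mini_octet_string extnValue;    /* raw, already-encoded value */
    struct {
        int element;                /* 0 plays the role of ..._iosnumunknown */
        union { void *_any; } u;    /* decoded value, if any */
    } _ioschoice_extnValue;
} MiniExt;

enum ext_source { EXT_FROM_RAW, EXT_FROM_DECODED };

/*
 * Pick which representation an encoder would serialize.  The raw fields
 * win when both representations are present.
 * Assumption: "set" means a non-NULL data pointer.
 */
enum ext_source
ext_pick_source(const MiniExt *e)
{
    assert(e != NULL);
    if (e->extnValue.data != NULL)
        return EXT_FROM_RAW;
    if (e->_ioschoice_extnValue.element != 0 &&
        e->_ioschoice_extnValue.u._any != NULL)
        return EXT_FROM_DECODED;
    return EXT_FROM_RAW;
}
```

   Keeping the raw fields authoritative means that callers which predate the
   new field keep their current behavior.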

 - Type ID values get mapped to discrete enum values.  Object sets get sorted
   by object type IDs so that for decoding they can be and are binary-searched.
   For encoding and other cases (destructors and copy constructors) we directly
   index the object set by the mapped type ID enum.
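
   A rough sketch of both lookup directions, under stated assumptions:
   `mini_oid`, `mini_object`, and the functions below are hypothetical
   stand-ins rather than Heimdal's types, the comparison order is one plausible
   choice, and the sketch assumes enum value `i + 1` names the object at index
   `i` of the sorted set, with `0` meaning "unknown".

```C
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for heim_oid. */
typedef struct mini_oid {
    size_t length;
    const unsigned *components;
} mini_oid;

/* Hypothetical object: a type ID plus the name of the type it selects. */
typedef struct mini_object {
    mini_oid id;
    const char *type_name;
} mini_object;

/* Order OIDs arc by arc; a proper prefix sorts first. */
static int
mini_oid_cmp(const mini_oid *a, const mini_oid *b)
{
    size_t i, n = a->length < b->length ? a->length : b->length;

    for (i = 0; i < n; i++) {
        if (a->components[i] != b->components[i])
            return a->components[i] < b->components[i] ? -1 : 1;
    }
    if (a->length == b->length)
        return 0;
    return a->length < b->length ? -1 : 1;
}

/* bsearch() callback: the key is an OID, the element is an object. */
static int
object_cmp(const void *key, const void *elem)
{
    return mini_oid_cmp(key, &((const mini_object *)elem)->id);
}

/*
 * Decoding direction: binary-search the sorted set for a type ID.
 * Returns 0 ("unknown") or 1 + the object's index in the set.
 */
size_t
objset_lookup(const mini_object *set, size_t n, const mini_oid *id)
{
    const mini_object *o;

    assert(set != NULL && id != NULL);
    o = bsearch(id, set, n, sizeof(set[0]), object_cmp);
    return o ? (size_t)(o - set) + 1 : 0;
}

/* Encoding direction: index the set directly by the enum-like value. */
const mini_object *
objset_by_element(const mini_object *set, size_t n, size_t element)
{
    return (element >= 1 && element <= n) ? &set[element - 1] : NULL;
}
```

   The decoder only has a type ID to go on, so it pays for a logarithmic
   search; the encoder, destructor, and copy constructor already hold the enum
   value and can index in constant time.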

 - The C header generator remains shared between the two backends.

 - SET and SEQUENCE types containing an open type are represented as follows in
   their templates.

```C
    extern const struct asn1_template asn1_CertExtensions[];
    /*...*/
    const struct asn1_template asn1_Extension_tag__22[] = {
        /* 0 */ { 0, sizeof(struct Extension), ((void*)5) },
        /* 1 */ { A1_TAG_T(ASN1_C_UNIV, PRIM, UT_OID),
                  offsetof(struct Extension, extnID),
                  asn1_AttributeType_tag__1 },
        /* 2 */ { A1_OP_DEFVAL | A1_DV_BOOLEAN, ~0, (void*)0 },
        /* 3 */ { A1_TAG_T(ASN1_C_UNIV, PRIM, UT_Boolean) | A1_FLAG_DEFAULT,
                  offsetof(struct Extension, critical),
                  asn1_Extension_tag_critical_24 },
        /* 4 */ { A1_TAG_T(ASN1_C_UNIV, PRIM, UT_OctetString),
                  offsetof(struct Extension, extnValue),
                  asn1_Extension_tag_extnValue_25 },
        /* NEW: vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv */
        /* 5 */ { A1_OP_OPENTYPE_OBJSET | 0 | (2 << 10) | 0,
                  offsetof(Extension, _ioschoice_extnValue),
                  asn1_CertExtensions }
    };
    const struct asn1_template asn1_Extension[] = {
        /* 0 */ { 0, sizeof(Extension), ((void*)1) },
        /* 1 */ { A1_TAG_T(ASN1_C_UNIV, CONS, UT_Sequence),
                  0, asn1_Extension_tag__22 }
    };

    /* NEW: */
    const struct asn1_template asn1_CertExtensions[] = {
        /*
         * Header template entry bearing the count of objects in
         * this object set:
         */
        /* 0 */ { 0, 0, ((void*)18) },

        /*
         * Value of object #0 in this set: two entries, one naming
         * a type ID field value, and the other naming the type
         * that corresponds to that value.
         *
         * In this case, the first object is for the
         * AuthorityKeyIdentifier type as a certificate extension.
         */
        /* 1 */ { A1_OP_OPENTYPE_ID, 0,
                  (const void*)&asn1_oid_id_x509_ce_authorityKeyIdentifier },
        /* 2 */ { A1_OP_OPENTYPE, sizeof(AuthorityKeyIdentifier),
                  (const void*)&asn1_AuthorityKeyIdentifier },

        /* Value of object #1 (SubjectKeyIdentifier): */

        /* 3 */ { A1_OP_OPENTYPE_ID, 0,
                  (const void*)&asn1_oid_id_x509_ce_subjectKeyIdentifier },
        /* 4 */ { A1_OP_OPENTYPE, sizeof(SubjectKeyIdentifier),
                  (const void*)&asn1_SubjectKeyIdentifier },
        /* 5 */

        /* And so on...*/

        /* Value of object #17 */
        /* 35 */ { A1_OP_OPENTYPE_ID, 0,
                   (const void*)&asn1_oid_id_pkix_pe_subjectInfoAccess },
        /* 36 */ { A1_OP_OPENTYPE, sizeof(SubjectInfoAccessSyntax),
                   (const void*)&asn1_SubjectInfoAccessSyntax }
    };
```

   After the template entries for all the normal fields of a struct there will
   be an object set reference entry identifying the indices, in the same
   template, of the entries for the type ID field and the open type field.  The
   object set has a header entry followed by pairs of entries, each pair
   representing a single object and all of the pairs together representing the
   object set.

   This allows the encoder and decoder to both find the object set quickly,
   especially since the objects are sorted by type ID value.
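
   The index arithmetic implied by that layout can be sketched as follows.
   `struct mini_template` and the `objset_*()` helpers are hypothetical
   stand-ins used only to illustrate the header-plus-pairs layout; they are not
   the template interpreter's actual API.

```C
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct asn1_template. */
struct mini_template {
    uint32_t tt;        /* operation and flags */
    uint32_t offset;    /* offset or size, depending on the entry */
    const void *ptr;    /* count, type ID, or nested template */
};

/* Number of objects, read from the header entry at index 0. */
size_t
objset_count(const struct mini_template *set)
{
    assert(set != NULL);
    return (size_t)(uintptr_t)set[0].ptr;
}

/* Entry naming the type ID of object #i (0-based), or NULL. */
const struct mini_template *
objset_id_entry(const struct mini_template *set, size_t i)
{
    return i < objset_count(set) ? &set[1 + 2 * i] : NULL;
}

/* Entry naming the type of object #i (0-based), or NULL. */
const struct mini_template *
objset_type_entry(const struct mini_template *set, size_t i)
{
    return i < objset_count(set) ? &set[2 + 2 * i] : NULL;
}
```

   With 18 objects, as in `asn1_CertExtensions` above, object #17 lands at
   entries 35 and 36, matching the listing.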

## Moving From C

 - Generate and output a JSON representation of the compiled ASN.1 module.

 - Code codegen/templategen backends in jq or Haskell or whatever.

 - Code template interpreters in &lt;host&gt; language.

 - Eventually rewrite the compiler itself in Rust or whatever.