75.2. System Catalog Initial Data

75.2. System Catalog Initial Data
Prev	Up	Chapter 75. System Catalog Declarations and Initial Contents	Home	Next

75.2. System Catalog Initial Data #

75.2.1. Data File Format
75.2.2. OID Assignment
75.2.3. OID Reference Lookup
75.2.4. Automatic Creation of Array Types
75.2.5. Recipes for Editing Data Files

+ Each catalog that has any manually-created initial data (some do not) + has a corresponding .dat file that contains its + initial data in an editable format. +

75.2.1. Data File Format #

+ Each .dat file contains Perl data structure literals + that are simply eval'd to produce an in-memory data structure consisting + of an array of hash references, one per catalog row. + A slightly modified excerpt from pg_database.dat + will demonstrate the key features: +

+[
+
+# A comment could appear here.
+{ oid => '1', oid_symbol => 'Template1DbOid',
+  descr => 'database\'s default template',
+  datname => 'template1', encoding => 'ENCODING',
+  datlocprovider => 'LOCALE_PROVIDER', datistemplate => 't',
+  datallowconn => 't', datconnlimit => '-1', datfrozenxid => '0',
+  datminmxid => '1', dattablespace => 'pg_default', datcollate => 'LC_COLLATE',
+  datctype => 'LC_CTYPE', daticulocale => 'ICU_LOCALE', datacl => '_null_' },
+
+]
+

+ Points to note: +

75.2.2. OID Assignment #

+ A catalog row appearing in the initial data can be given a + manually-assigned OID by writing an oid + => nnnn metadata field. + Furthermore, if an OID is assigned, a C macro for that OID can be + created by writing an oid_symbol + => name metadata field. +

+ Pre-loaded catalog rows must have preassigned OIDs if there are OID + references to them in other pre-loaded rows. A preassigned OID is + also needed if the row's OID must be referenced from C code. + If neither case applies, the oid metadata field can + be omitted, in which case the bootstrap code assigns an OID + automatically. + In practice we usually preassign OIDs for all or none of the pre-loaded + rows in a given catalog, even if only some of them are actually + cross-referenced. +

+ Writing the actual numeric value of any OID in C code is considered + very bad form; always use a macro, instead. Direct references + to pg_proc OIDs are common enough that there's + a special mechanism to create the necessary macros automatically; + see src/backend/utils/Gen_fmgrtab.pl. Similarly + — but, for historical reasons, not done the same way — + there's an automatic method for creating macros + for pg_type + OIDs. oid_symbol entries are therefore not + necessary in those two catalogs. Likewise, macros for + the pg_class OIDs of system catalogs and + indexes are set up automatically. For all other system catalogs, you + have to manually specify any macros you need + via oid_symbol entries. +

+ To find an available OID for a new pre-loaded row, run the + script src/include/catalog/unused_oids. + It prints inclusive ranges of unused OIDs (e.g., the output + line 45-900 means OIDs 45 through 900 have not been + allocated yet). Currently, OIDs 1–9999 are reserved for manual + assignment; the unused_oids script simply looks + through the catalog headers and .dat files + to see which ones do not appear. You can also use + the duplicate_oids script to check for mistakes. + (genbki.pl will assign OIDs for any rows that + didn't get one hand-assigned to them, and it will also detect duplicate + OIDs at compile time.) +

+ When choosing OIDs for a patch that is not expected to be committed + immediately, best practice is to use a group of more-or-less + consecutive OIDs starting with some random choice in the range + 8000—9999. This minimizes the risk of OID collisions with other + patches being developed concurrently. To keep the 8000—9999 + range free for development purposes, after a patch has been committed + to the master git repository its OIDs should be renumbered into + available space below that range. Typically, this will be done + near the end of each development cycle, moving all OIDs consumed by + patches committed in that cycle at the same time. The script + renumber_oids.pl can be used for this purpose. + If an uncommitted patch is found to have OID conflicts with some + recently-committed patch, renumber_oids.pl may + also be useful for recovering from that situation. +

+ Because of this convention of possibly renumbering OIDs assigned by + patches, the OIDs assigned by a patch should not be considered stable + until the patch has been included in an official release. We do not + change manually-assigned object OIDs once released, however, as that + would create assorted compatibility problems. +

+ If genbki.pl needs to assign an OID to a catalog + entry that does not have a manually-assigned OID, it will use a value in + the range 10000—11999. The server's OID counter is set to 10000 + at the start of a bootstrap run, so that any objects created on-the-fly + during bootstrap processing also receive OIDs in this range. (The + usual OID assignment mechanism takes care of preventing any conflicts.) +

+ Objects with OIDs below FirstUnpinnedObjectId (12000) + are considered “pinned”, preventing them from being + deleted. (There are a small number of exceptions, which are + hard-wired into IsPinnedObject().) + initdb forces the OID counter up + to FirstUnpinnedObjectId as soon as it's ready to + create unpinned objects. Thus objects created during the later phases + of initdb, such as objects created while + running the information_schema.sql script, will + not be pinned, while all objects known + to genbki.pl will be. +

+ OIDs assigned during normal database operation are constrained to be + 16384 or higher. This ensures that the range 10000—16383 is free + for OIDs assigned automatically by genbki.pl or + during initdb. These + automatically-assigned OIDs are not considered stable, and may change + from one installation to another. +

75.2.3. OID Reference Lookup #

+ In principle, cross-references from one initial catalog row to another + could be written just by writing the preassigned OID of the referenced + row in the referencing field. However, that is against project + policy, because it is error-prone, hard to read, and subject to + breakage if a newly-assigned OID is renumbered. Therefore + genbki.pl provides mechanisms to write + symbolic references instead. + The rules are as follows: +

+ genbki.pl resolves all symbolic references while it + runs, and puts simple numeric OIDs into the emitted BKI file. There is + therefore no need for the bootstrap backend to deal with symbolic + references. +

+ It's desirable to mark OID reference columns + with BKI_LOOKUP or BKI_LOOKUP_OPT + even if the catalog has no initial data that requires lookup. This + allows genbki.pl to record the foreign key + relationships that exist in the system catalogs. That information is + used in the regression tests to check for incorrect entries. See also + the macros DECLARE_FOREIGN_KEY, + DECLARE_FOREIGN_KEY_OPT, + DECLARE_ARRAY_FOREIGN_KEY, + and DECLARE_ARRAY_FOREIGN_KEY_OPT, which are + used to declare foreign key relationships that are too complex + for BKI_LOOKUP (typically, multi-column foreign + keys). +

75.2.4. Automatic Creation of Array Types #

+ Most scalar data types should have a corresponding array type (that is, + a standard varlena array type whose element type is the scalar type, and + which is referenced by the typarray field of + the scalar type's pg_type + entry). genbki.pl is able to generate + the pg_type entry for the array type + automatically in most cases. +

+ To use this facility, just write an array_type_oid + => nnnn metadata field in the + scalar type's pg_type entry, specifying the OID + to use for the array type. You may then omit + the typarray field, since it will be filled + automatically with that OID. +

+ The generated array type's name is the scalar type's name with an + underscore prepended. The array entry's other fields are filled from + BKI_ARRAY_DEFAULT(value) + annotations in pg_type.h, or if there isn't one, + copied from the scalar type. (There's also a special case + for typalign.) Then + the typelem + and typarray fields of the two entries are + set to cross-reference each other. +

75.2.5. Recipes for Editing Data Files #

+ Here are some suggestions about the easiest ways to perform common tasks + when updating catalog data files. +

Add a new column with a default to a catalog: + Add the column to the header file with + a BKI_DEFAULT(value) + annotation. The data file need only be adjusted by adding the field + in existing rows where a non-default value is needed. +

Add a default value to an existing column that doesn't have + one: + Add a BKI_DEFAULT annotation to the header file, + then run make reformat-dat-files to remove + now-redundant field entries. +

Remove a column, whether it has a default or not: + Remove the column from the header, then run make + reformat-dat-files to remove now-useless field entries. +

Change or remove an existing default value: + You cannot simply change the header file, since that will cause the + current data to be interpreted incorrectly. First run make + expand-dat-files to rewrite the data files with all + default values inserted explicitly, then change or remove + the BKI_DEFAULT annotation, then run make + reformat-dat-files to remove superfluous fields again. +

Ad-hoc bulk editing: + reformat_dat_file.pl can be adapted to perform + many kinds of bulk changes. Look for its block comments showing where + one-off code can be inserted. In the following example, we are going + to consolidate two Boolean fields in pg_proc + into a char field: + +

+ Add the new column, with a default, + to pg_proc.h: +

++    /* see PROKIND_ categories below */
++    char        prokind BKI_DEFAULT(f);
+

+ Create a new script based on reformat_dat_file.pl + to insert appropriate values on-the-fly: +

+-           # At this point we have the full row in memory as a hash
+-           # and can do any operations we want. As written, it only
+-           # removes default values, but this script can be adapted to
+-           # do one-off bulk-editing.
++           # One-off change to migrate to prokind
++           # Default has already been filled in by now, so change to other
++           # values as appropriate
++           if ($values{proisagg} eq 't')
++           {
++               $values{prokind} = 'a';
++           }
++           elsif ($values{proiswindow} eq 't')
++           {
++               $values{prokind} = 'w';
++           }
+

+ Run the new script: +
```
+$ cd src/include/catalog
+$ perl  rewrite_dat_with_prokind.pl  pg_proc.dat
+
```
+ At this point pg_proc.dat has all three + columns, prokind, + proisagg, + and proiswindow, though they will appear + only in rows where they have non-default values. +

+ Remove the old columns from pg_proc.h: +

+-    /* is it an aggregate? */
+-    bool        proisagg BKI_DEFAULT(f);
+-
+-    /* is it a window function? */
+-    bool        proiswindow BKI_DEFAULT(f);
+

+ Finally, run make reformat-dat-files to remove + the useless old entries from pg_proc.dat. +

+ + For further examples of scripts used for bulk editing, see + convert_oid2name.pl + and remove_pg_type_oid_symbols.pl attached to this + message: + https://www.postgresql.org/message-id/CAJVSVGVX8gXnPm+Xa=DxR7kFYprcQ1tNcCT5D0O3ShfnM6jehA@mail.gmail.com +

Prev	Up	Next
75.1. System Catalog Declaration Rules	Home	75.3. BKI File Format