diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-04 12:15:05 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-04 12:15:05 +0000 |
commit | 46651ce6fe013220ed397add242004d764fc0153 (patch) | |
tree | 6e5299f990f88e60174a1d3ae6e48eedd2688b2b /doc/src/sgml/ref/create_index.sgml | |
parent | Initial commit. (diff) | |
download | postgresql-14-upstream.tar.xz postgresql-14-upstream.zip |
Adding upstream version 14.5.upstream/14.5upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/src/sgml/ref/create_index.sgml')
-rw-r--r-- | doc/src/sgml/ref/create_index.sgml | 982 |
1 files changed, 982 insertions, 0 deletions
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml new file mode 100644 index 0000000..f50c88f --- /dev/null +++ b/doc/src/sgml/ref/create_index.sgml @@ -0,0 +1,982 @@ +<!-- +doc/src/sgml/ref/create_index.sgml +PostgreSQL documentation +--> + +<refentry id="sql-createindex"> + <indexterm zone="sql-createindex"> + <primary>CREATE INDEX</primary> + </indexterm> + + <refmeta> + <refentrytitle>CREATE INDEX</refentrytitle> + <manvolnum>7</manvolnum> + <refmiscinfo>SQL - Language Statements</refmiscinfo> + </refmeta> + + <refnamediv> + <refname>CREATE INDEX</refname> + <refpurpose>define a new index</refpurpose> + </refnamediv> + + <refsynopsisdiv> +<synopsis> +CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class="parameter">name</replaceable> ] ON [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ USING <replaceable class="parameter">method</replaceable> ] + ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> [ ( <replaceable class="parameter">opclass_parameter</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] ) ] ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] ) + [ INCLUDE ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ] + [ WITH ( <replaceable class="parameter">storage_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ] + [ TABLESPACE <replaceable class="parameter">tablespace_name</replaceable> ] + [ WHERE <replaceable class="parameter">predicate</replaceable> ] +</synopsis> + </refsynopsisdiv> + + <refsect1> + <title>Description</title> + + <para> + <command>CREATE INDEX</command> constructs an index on the specified column(s) + of the specified relation, which can be a table or a materialized view. + Indexes are primarily used to enhance database performance (though + inappropriate use can result in slower performance). + </para> + + <para> + The key field(s) for the index are specified as column names, + or alternatively as expressions written in parentheses. + Multiple fields can be specified if the index method supports + multicolumn indexes. + </para> + + <para> + An index field can be an expression computed from the values of + one or more columns of the table row. This feature can be used + to obtain fast access to data based on some transformation of + the basic data. For example, an index computed on + <literal>upper(col)</literal> would allow the clause + <literal>WHERE upper(col) = 'JIM'</literal> to use an index. + </para> + + <para> + <productname>PostgreSQL</productname> provides the index methods + B-tree, hash, GiST, SP-GiST, GIN, and BRIN. Users can also define their own + index methods, but that is fairly complicated. + </para> + + <para> + When the <literal>WHERE</literal> clause is present, a + <firstterm>partial index</firstterm> is created. + A partial index is an index that contains entries for only a portion of + a table, usually a portion that is more useful for indexing than the + rest of the table. For example, if you have a table that contains both + billed and unbilled orders where the unbilled orders take up a small + fraction of the total table and yet that is an often used section, you + can improve performance by creating an index on just that portion. + Another possible application is to use <literal>WHERE</literal> with + <literal>UNIQUE</literal> to enforce uniqueness over a subset of a + table. See <xref linkend="indexes-partial"/> for more discussion. + </para> + + <para> + The expression used in the <literal>WHERE</literal> clause can refer + only to columns of the underlying table, but it can use all columns, + not just the ones being indexed. Presently, subqueries and + aggregate expressions are also forbidden in <literal>WHERE</literal>. + The same restrictions apply to index fields that are expressions. + </para> + + <para> + All functions and operators used in an index definition must be + <quote>immutable</quote>, that is, their results must depend only on + their arguments and never on any outside influence (such as + the contents of another table or the current time). This restriction + ensures that the behavior of the index is well-defined. To use a + user-defined function in an index expression or <literal>WHERE</literal> + clause, remember to mark the function immutable when you create it. + </para> + </refsect1> + + <refsect1> + <title>Parameters</title> + + <variablelist> + <varlistentry> + <term><literal>UNIQUE</literal></term> + <listitem> + <para> + Causes the system to check for + duplicate values in the table when the index is created (if data + already exist) and each time data is added. Attempts to + insert or update data which would result in duplicate entries + will generate an error. + </para> + + <para> + Additional restrictions apply when unique indexes are applied to + partitioned tables; see <xref linkend="sql-createtable" />. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>CONCURRENTLY</literal></term> + <listitem> + <para> + When this option is used, <productname>PostgreSQL</productname> will build the + index without taking any locks that prevent concurrent inserts, + updates, or deletes on the table; whereas a standard index build + locks out writes (but not reads) on the table until it's done. + There are several caveats to be aware of when using this option + — see <xref linkend="sql-createindex-concurrently"/> below. + </para> + <para> + For temporary tables, <command>CREATE INDEX</command> is always + non-concurrent, as no other session can access them, and + non-concurrent index creation is cheaper. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>IF NOT EXISTS</literal></term> + <listitem> + <para> + Do not throw an error if a relation with the same name already exists. + A notice is issued in this case. Note that there is no guarantee that + the existing index is anything like the one that would have been created. + Index name is required when <literal>IF NOT EXISTS</literal> is specified. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>INCLUDE</literal></term> + <listitem> + <para> + The optional <literal>INCLUDE</literal> clause specifies a + list of columns which will be included in the index + as <firstterm>non-key</firstterm> columns. A non-key column cannot + be used in an index scan search qualification, and it is disregarded + for purposes of any uniqueness or exclusion constraint enforced by + the index. However, an index-only scan can return the contents of + non-key columns without having to visit the index's table, since + they are available directly from the index entry. Thus, addition of + non-key columns allows index-only scans to be used for queries that + otherwise could not use them. + </para> + + <para> + It's wise to be conservative about adding non-key columns to an + index, especially wide columns. If an index tuple exceeds the + maximum size allowed for the index type, data insertion will fail. + In any case, non-key columns duplicate data from the index's table + and bloat the size of the index, thus potentially slowing searches. + Furthermore, B-tree deduplication is never used with indexes + that have a non-key column. + </para> + + <para> + Columns listed in the <literal>INCLUDE</literal> clause don't need + appropriate operator classes; the clause can include + columns whose data types don't have operator classes defined for + a given access method. + </para> + + <para> + Expressions are not supported as included columns since they cannot be + used in index-only scans. + </para> + + <para> + Currently, the B-tree, GiST and SP-GiST index access methods support + this feature. In these indexes, the values of columns listed + in the <literal>INCLUDE</literal> clause are included in leaf tuples + which correspond to heap tuples, but are not included in upper-level + index entries used for tree navigation. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">name</replaceable></term> + <listitem> + <para> + The name of the index to be created. No schema name can be included + here; the index is always created in the same schema as its parent + table. If the name is omitted, <productname>PostgreSQL</productname> chooses a + suitable name based on the parent table's name and the indexed column + name(s). + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>ONLY</literal></term> + <listitem> + <para> + Indicates not to recurse creating indexes on partitions, if the + table is partitioned. The default is to recurse. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">table_name</replaceable></term> + <listitem> + <para> + The name (possibly schema-qualified) of the table to be indexed. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">method</replaceable></term> + <listitem> + <para> + The name of the index method to be used. Choices are + <literal>btree</literal>, <literal>hash</literal>, + <literal>gist</literal>, <literal>spgist</literal>, <literal>gin</literal>, and + <literal>brin</literal>. + The default method is <literal>btree</literal>. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">column_name</replaceable></term> + <listitem> + <para> + The name of a column of the table. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">expression</replaceable></term> + <listitem> + <para> + An expression based on one or more columns of the table. The + expression usually must be written with surrounding parentheses, + as shown in the syntax. However, the parentheses can be omitted + if the expression has the form of a function call. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">collation</replaceable></term> + <listitem> + <para> + The name of the collation to use for the index. By default, + the index uses the collation declared for the column to be + indexed or the result collation of the expression to be + indexed. Indexes with non-default collations can be useful for + queries that involve expressions using non-default collations. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">opclass</replaceable></term> + <listitem> + <para> + The name of an operator class. See below for details. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">opclass_parameter</replaceable></term> + <listitem> + <para> + The name of an operator class parameter. See below for details. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>ASC</literal></term> + <listitem> + <para> + Specifies ascending sort order (which is the default). + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>DESC</literal></term> + <listitem> + <para> + Specifies descending sort order. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>NULLS FIRST</literal></term> + <listitem> + <para> + Specifies that nulls sort before non-nulls. This is the default + when <literal>DESC</literal> is specified. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><literal>NULLS LAST</literal></term> + <listitem> + <para> + Specifies that nulls sort after non-nulls. This is the default + when <literal>DESC</literal> is not specified. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">storage_parameter</replaceable></term> + <listitem> + <para> + The name of an index-method-specific storage parameter. See + <xref linkend="sql-createindex-storage-parameters"/> below + for details. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">tablespace_name</replaceable></term> + <listitem> + <para> + The tablespace in which to create the index. If not specified, + <xref linkend="guc-default-tablespace"/> is consulted, or + <xref linkend="guc-temp-tablespaces"/> for indexes on temporary + tables. + </para> + </listitem> + </varlistentry> + + <varlistentry> + <term><replaceable class="parameter">predicate</replaceable></term> + <listitem> + <para> + The constraint expression for a partial index. + </para> + </listitem> + </varlistentry> + + </variablelist> + + <refsect2 id="sql-createindex-storage-parameters" xreflabel="Index Storage Parameters"> + <title>Index Storage Parameters</title> + + <para> + The optional <literal>WITH</literal> clause specifies <firstterm>storage + parameters</firstterm> for the index. Each index method has its own set of allowed + storage parameters. The B-tree, hash, GiST and SP-GiST index methods all + accept this parameter: + </para> + + <variablelist> + <varlistentry id="index-reloption-fillfactor" xreflabel="fillfactor"> + <term><literal>fillfactor</literal> (<type>integer</type>) + <indexterm> + <primary><varname>fillfactor</varname> storage parameter</primary> + </indexterm> + </term> + <listitem> + <para> + The fillfactor for an index is a percentage that determines how full + the index method will try to pack index pages. For B-trees, leaf pages + are filled to this percentage during initial index builds, and also + when extending the index at the right (adding new largest key values). + If pages + subsequently become completely full, they will be split, leading to + fragmentation of the on-disk index structure. B-trees use a default + fillfactor of 90, but any integer value from 10 to 100 can be selected. + </para> + <para> + B-tree indexes on tables where many inserts and/or updates are + anticipated can benefit from lower fillfactor settings at + <command>CREATE INDEX</command> time (following bulk loading into the + table). Values in the range of 50 - 90 can usefully <quote>smooth + out</quote> the <emphasis>rate</emphasis> of page splits during the + early life of the B-tree index (lowering fillfactor like this may even + lower the absolute number of page splits, though this effect is highly + workload dependent). The B-tree bottom-up index deletion technique + described in <xref linkend="btree-deletion"/> is dependent on having + some <quote>extra</quote> space on pages to store <quote>extra</quote> + tuple versions, and so can be affected by fillfactor (though the effect + is usually not significant). + </para> + <para> + In other specific cases it might be useful to increase fillfactor to + 100 at <command>CREATE INDEX</command> time as a way of maximizing + space utilization. You should only consider this when you are + completely sure that the table is static (i.e. that it will never be + affected by either inserts or updates). A fillfactor setting of 100 + otherwise risks <emphasis>harming</emphasis> performance: even a few + updates or inserts will cause a sudden flood of page splits. + </para> + <para> + The other index methods use fillfactor in different but roughly + analogous ways; the default fillfactor varies between methods. + </para> + </listitem> + </varlistentry> + </variablelist> + + <para> + B-tree indexes additionally accept this parameter: + </para> + + <variablelist> + <varlistentry id="index-reloption-deduplicate-items" xreflabel="deduplicate_items"> + <term><literal>deduplicate_items</literal> (<type>boolean</type>) + <indexterm> + <primary><varname>deduplicate_items</varname> storage parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Controls usage of the B-tree deduplication technique described + in <xref linkend="btree-deduplication"/>. Set to + <literal>ON</literal> or <literal>OFF</literal> to enable or + disable the optimization. (Alternative spellings of + <literal>ON</literal> and <literal>OFF</literal> are allowed as + described in <xref linkend="config-setting"/>.) The default is + <literal>ON</literal>. + </para> + + <note> + <para> + Turning <literal>deduplicate_items</literal> off via + <command>ALTER INDEX</command> prevents future insertions from + triggering deduplication, but does not in itself make existing + posting list tuples use the standard tuple representation. + </para> + </note> + </listitem> + </varlistentry> + </variablelist> + + <para> + GiST indexes additionally accept this parameter: + </para> + + <variablelist> + <varlistentry id="index-reloption-buffering" xreflabel="buffering"> + <term><literal>buffering</literal> (<type>enum</type>) + <indexterm> + <primary><varname>buffering</varname> storage parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Determines whether the buffered build technique described in + <xref linkend="gist-buffering-build"/> is used to build the index. With + <literal>OFF</literal> buffering is disabled, with <literal>ON</literal> + it is enabled, and with <literal>AUTO</literal> it is initially disabled, + but is turned on on-the-fly once the index size reaches + <xref linkend="guc-effective-cache-size"/>. The default + is <literal>AUTO</literal>. + Note that if sorted build is possible, it will be used instead of + buffered build unless <literal>buffering=ON</literal> is specified. + </para> + </listitem> + </varlistentry> + </variablelist> + + <para> + GIN indexes accept different parameters: + </para> + + <variablelist> + <varlistentry id="index-reloption-fastupdate" xreflabel="fastupdate"> + <term><literal>fastupdate</literal> (<type>boolean</type>) + <indexterm> + <primary><varname>fastupdate</varname> storage parameter</primary> + </indexterm> + </term> + <listitem> + <para> + This setting controls usage of the fast update technique described in + <xref linkend="gin-fast-update"/>. It is a Boolean parameter: + <literal>ON</literal> enables fast update, <literal>OFF</literal> disables it. + The default is <literal>ON</literal>. + </para> + + <note> + <para> + Turning <literal>fastupdate</literal> off via <command>ALTER INDEX</command> prevents + future insertions from going into the list of pending index entries, + but does not in itself flush previous entries. You might want to + <command>VACUUM</command> the table or call <function>gin_clean_pending_list</function> + function afterward to ensure the pending list is emptied. + </para> + </note> + </listitem> + </varlistentry> + </variablelist> + + <variablelist> + <varlistentry id="index-reloption-gin-pending-list-limit" xreflabel="gin_pending_list_limit"> + <term><literal>gin_pending_list_limit</literal> (<type>integer</type>) + <indexterm> + <primary><varname>gin_pending_list_limit</varname></primary> + <secondary>storage parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Custom <xref linkend="guc-gin-pending-list-limit"/> parameter. + This value is specified in kilobytes. + </para> + </listitem> + </varlistentry> + </variablelist> + + <para> + <acronym>BRIN</acronym> indexes accept different parameters: + </para> + + <variablelist> + <varlistentry id="index-reloption-pages-per-range" xreflabel="pages_per_range"> + <term><literal>pages_per_range</literal> (<type>integer</type>) + <indexterm> + <primary><varname>pages_per_range</varname> storage parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Defines the number of table blocks that make up one block range for + each entry of a <acronym>BRIN</acronym> index (see <xref linkend="brin-intro"/> + for more details). The default is <literal>128</literal>. + </para> + </listitem> + </varlistentry> + + <varlistentry id="index-reloption-autosummarize" xreflabel="autosummarize"> + <term><literal>autosummarize</literal> (<type>boolean</type>) + <indexterm> + <primary><varname>autosummarize</varname> storage parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Defines whether a summarization run is queued for the previous page + range whenever an insertion is detected on the next one. + See <xref linkend="brin-operation"/> for more details. + The default is <literal>off</literal>. + </para> + </listitem> + </varlistentry> + </variablelist> + </refsect2> + + <refsect2 id="sql-createindex-concurrently" xreflabel="Building Indexes Concurrently"> + <title>Building Indexes Concurrently</title> + + <indexterm zone="sql-createindex-concurrently"> + <primary>index</primary> + <secondary>building concurrently</secondary> + </indexterm> + + <para> + Creating an index can interfere with regular operation of a database. + Normally <productname>PostgreSQL</productname> locks the table to be indexed against + writes and performs the entire index build with a single scan of the + table. Other transactions can still read the table, but if they try to + insert, update, or delete rows in the table they will block until the + index build is finished. This could have a severe effect if the system is + a live production database. Very large tables can take many hours to be + indexed, and even for smaller tables, an index build can lock out writers + for periods that are unacceptably long for a production system. + </para> + + <para> + <productname>PostgreSQL</productname> supports building indexes without locking + out writes. This method is invoked by specifying the + <literal>CONCURRENTLY</literal> option of <command>CREATE INDEX</command>. + When this option is used, + <productname>PostgreSQL</productname> must perform two scans of the table, and in + addition it must wait for all existing transactions that could potentially + modify or use the index to terminate. Thus + this method requires more total work than a standard index build and takes + significantly longer to complete. However, since it allows normal + operations to continue while the index is built, this method is useful for + adding new indexes in a production environment. Of course, the extra CPU + and I/O load imposed by the index creation might slow other operations. + </para> + + <para> + In a concurrent index build, the index is actually entered as an + <quote>invalid</quote> index into + the system catalogs in one transaction, then two table scans occur in + two more transactions. Before each table scan, the index build must + wait for existing transactions that have modified the table to terminate. + After the second scan, the index build must wait for any transactions + that have a snapshot (see <xref linkend="mvcc"/>) predating the second + scan to terminate, including transactions used by any phase of concurrent + index builds on other tables, if the indexes involved are partial or have + columns that are not simple column references. + Then finally the index can be marked <quote>valid</quote> and ready for use, + and the <command>CREATE INDEX</command> command terminates. + Even then, however, the index may not be immediately usable for queries: + in the worst case, it cannot be used as long as transactions exist that + predate the start of the index build. + </para> + + <para> + If a problem arises while scanning the table, such as a deadlock or a + uniqueness violation in a unique index, the <command>CREATE INDEX</command> + command will fail but leave behind an <quote>invalid</quote> index. This index + will be ignored for querying purposes because it might be incomplete; + however it will still consume update overhead. The <application>psql</application> + <command>\d</command> command will report such an index as <literal>INVALID</literal>: + +<programlisting> +postgres=# \d tab + Table "public.tab" + Column | Type | Collation | Nullable | Default +--------+---------+-----------+----------+--------- + col | integer | | | +Indexes: + "idx" btree (col) INVALID +</programlisting> + + The recommended recovery + method in such cases is to drop the index and try again to perform + <command>CREATE INDEX CONCURRENTLY</command>. (Another possibility is + to rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>). + </para> + + <para> + Another caveat when building a unique index concurrently is that the + uniqueness constraint is already being enforced against other transactions + when the second table scan begins. This means that constraint violations + could be reported in other queries prior to the index becoming available + for use, or even in cases where the index build eventually fails. Also, + if a failure does occur in the second scan, the <quote>invalid</quote> index + continues to enforce its uniqueness constraint afterwards. + </para> + + <para> + Concurrent builds of expression indexes and partial indexes are supported. + Errors occurring in the evaluation of these expressions could cause + behavior similar to that described above for unique constraint violations. + </para> + + <para> + Regular index builds permit other regular index builds on the + same table to occur simultaneously, but only one concurrent index build + can occur on a table at a time. In either case, schema modification of the + table is not allowed while the index is being built. Another difference is + that a regular <command>CREATE INDEX</command> command can be performed + within a transaction block, but <command>CREATE INDEX CONCURRENTLY</command> + cannot. + </para> + + <para> + Concurrent builds for indexes on partitioned tables are currently not + supported. However, you may concurrently build the index on each + partition individually and then finally create the partitioned index + non-concurrently in order to reduce the time where writes to the + partitioned table will be locked out. In this case, building the + partitioned index is a metadata only operation. + </para> + + </refsect2> + </refsect1> + + <refsect1> + <title>Notes</title> + + <para> + See <xref linkend="indexes"/> for information about when indexes can + be used, when they are not used, and in which particular situations + they can be useful. + </para> + + <para> + Currently, only the B-tree, GiST, GIN, and BRIN index methods support + multiple-key-column indexes. Whether there can be multiple key + columns is independent of whether <literal>INCLUDE</literal> columns + can be added to the index. Indexes can have up to 32 columns, + including <literal>INCLUDE</literal> columns. + (This limit can be altered when building + <productname>PostgreSQL</productname>.) Only B-tree currently + supports unique indexes. + </para> + + <para> + An <firstterm>operator class</firstterm> with optional parameters + can be specified for each column of an index. + The operator class identifies the operators to be + used by the index for that column. For example, a B-tree index on + four-byte integers would use the <literal>int4_ops</literal> class; + this operator class includes comparison functions for four-byte + integers. In practice the default operator class for the column's data + type is usually sufficient. The main point of having operator classes + is that for some data types, there could be more than one meaningful + ordering. For example, we might want to sort a complex-number data + type either by absolute value or by real part. We could do this by + defining two operator classes for the data type and then selecting + the proper class when creating an index. More information about + operator classes is in <xref linkend="indexes-opclass"/> and in <xref + linkend="xindex"/>. + </para> + + <para> + When <literal>CREATE INDEX</literal> is invoked on a partitioned + table, the default behavior is to recurse to all partitions to ensure + they all have matching indexes. + Each partition is first checked to determine whether an equivalent + index already exists, and if so, that index will become attached as a + partition index to the index being created, which will become its + parent index. + If no matching index exists, a new index will be created and + automatically attached; the name of the new index in each partition + will be determined as if no index name had been specified in the + command. + If the <literal>ONLY</literal> option is specified, no recursion + is done, and the index is marked invalid. + (<command>ALTER INDEX ... ATTACH PARTITION</command> marks the index + valid, once all partitions acquire matching indexes.) Note, however, + that any partition that is created in the future using + <command>CREATE TABLE ... PARTITION OF</command> will automatically + have a matching index, regardless of whether <literal>ONLY</literal> is + specified. + </para> + + <para> + For index methods that support ordered scans (currently, only B-tree), + the optional clauses <literal>ASC</literal>, <literal>DESC</literal>, <literal>NULLS + FIRST</literal>, and/or <literal>NULLS LAST</literal> can be specified to modify + the sort ordering of the index. Since an ordered index can be + scanned either forward or backward, it is not normally useful to create a + single-column <literal>DESC</literal> index — that sort ordering is already + available with a regular index. The value of these options is that + multicolumn indexes can be created that match the sort ordering requested + by a mixed-ordering query, such as <literal>SELECT ... ORDER BY x ASC, y + DESC</literal>. The <literal>NULLS</literal> options are useful if you need to support + <quote>nulls sort low</quote> behavior, rather than the default <quote>nulls + sort high</quote>, in queries that depend on indexes to avoid sorting steps. + </para> + + <para> + The system regularly collects statistics on all of a table's + columns. Newly-created non-expression indexes can immediately + use these statistics to determine an index's usefulness. + For new expression indexes, it is necessary to run <link + linkend="sql-analyze"><command>ANALYZE</command></link> or wait for + the <link linkend="autovacuum">autovacuum daemon</link> to analyze + the table to generate statistics for these indexes. + </para> + + <para> + For most index methods, the speed of creating an index is + dependent on the setting of <xref linkend="guc-maintenance-work-mem"/>. + Larger values will reduce the time needed for index creation, so long + as you don't make it larger than the amount of memory really available, + which would drive the machine into swapping. + </para> + + <para> + <productname>PostgreSQL</productname> can build indexes while + leveraging multiple CPUs in order to process the table rows faster. + This feature is known as <firstterm>parallel index + build</firstterm>. For index methods that support building indexes + in parallel (currently, only B-tree), + <varname>maintenance_work_mem</varname> specifies the maximum + amount of memory that can be used by each index build operation as + a whole, regardless of how many worker processes were started. + Generally, a cost model automatically determines how many worker + processes should be requested, if any. + </para> + + <para> + Parallel index builds may benefit from increasing + <varname>maintenance_work_mem</varname> where an equivalent serial + index build will see little or no benefit. Note that + <varname>maintenance_work_mem</varname> may influence the number of + worker processes requested, since parallel workers must have at + least a <literal>32MB</literal> share of the total + <varname>maintenance_work_mem</varname> budget. There must also be + a remaining <literal>32MB</literal> share for the leader process. + Increasing <xref linkend="guc-max-parallel-maintenance-workers"/> + may allow more workers to be used, which will reduce the time + needed for index creation, so long as the index build is not + already I/O bound. Of course, there should also be sufficient + CPU capacity that would otherwise lie idle. + </para> + + <para> + Setting a value for <literal>parallel_workers</literal> via <link + linkend="sql-altertable"><command>ALTER TABLE</command></link> directly controls how many parallel + worker processes will be requested by a <command>CREATE + INDEX</command> against the table. This bypasses the cost model + completely, and prevents <varname>maintenance_work_mem</varname> + from affecting how many parallel workers are requested. Setting + <literal>parallel_workers</literal> to 0 via <command>ALTER + TABLE</command> will disable parallel index builds on the table in + all cases. + </para> + + <tip> + <para> + You might want to reset <literal>parallel_workers</literal> after + setting it as part of tuning an index build. This avoids + inadvertent changes to query plans, since + <literal>parallel_workers</literal> affects + <emphasis>all</emphasis> parallel table scans. + </para> + </tip> + + <para> + While <command>CREATE INDEX</command> with the + <literal>CONCURRENTLY</literal> option supports parallel builds + without special restrictions, only the first table scan is actually + performed in parallel. + </para> + + <para> + Use <link linkend="sql-dropindex"><command>DROP INDEX</command></link> + to remove an index. + </para> + + <para> + Like any long-running transaction, <command>CREATE INDEX</command> on a + table can affect which tuples can be removed by concurrent + <command>VACUUM</command> on any other table. + </para> + + <para> + Prior releases of <productname>PostgreSQL</productname> also had an + R-tree index method. This method has been removed because + it had no significant advantages over the GiST method. + If <literal>USING rtree</literal> is specified, <command>CREATE INDEX</command> + will interpret it as <literal>USING gist</literal>, to simplify conversion + of old databases to GiST. + </para> + + <para> + Each backend running <command>CREATE INDEX</command> will report its + progress in the <structname>pg_stat_progress_create_index</structname> + view. See <xref linkend="create-index-progress-reporting"/> for details. + </para> + </refsect1> + + <refsect1> + <title>Examples</title> + + <para> + To create a unique B-tree index on the column <literal>title</literal> in + the table <literal>films</literal>: +<programlisting> +CREATE UNIQUE INDEX title_idx ON films (title); +</programlisting> + </para> + + <para> + To create a unique B-tree index on the column <literal>title</literal> + with included columns <literal>director</literal> + and <literal>rating</literal> in the table <literal>films</literal>: +<programlisting> +CREATE UNIQUE INDEX title_idx ON films (title) INCLUDE (director, rating); +</programlisting> + </para> + + <para> + To create a B-Tree index with deduplication disabled: +<programlisting> +CREATE INDEX title_idx ON films (title) WITH (deduplicate_items = off); +</programlisting> + </para> + + <para> + To create an index on the expression <literal>lower(title)</literal>, + allowing efficient case-insensitive searches: +<programlisting> +CREATE INDEX ON films ((lower(title))); +</programlisting> + (In this example we have chosen to omit the index name, so the system + will choose a name, typically <literal>films_lower_idx</literal>.) + </para> + + <para> + To create an index with non-default collation: +<programlisting> +CREATE INDEX title_idx_german ON films (title COLLATE "de_DE"); +</programlisting> + </para> + + <para> + To create an index with non-default sort ordering of nulls: +<programlisting> +CREATE INDEX title_idx_nulls_low ON films (title NULLS FIRST); +</programlisting> + </para> + + <para> + To create an index with non-default fill factor: +<programlisting> +CREATE UNIQUE INDEX title_idx ON films (title) WITH (fillfactor = 70); +</programlisting> + </para> + + <para> + To create a <acronym>GIN</acronym> index with fast updates disabled: +<programlisting> +CREATE INDEX gin_idx ON documents_table USING GIN (locations) WITH (fastupdate = off); +</programlisting> + </para> + + <para> + To create an index on the column <literal>code</literal> in the table + <literal>films</literal> and have the index reside in the tablespace + <literal>indexspace</literal>: +<programlisting> +CREATE INDEX code_idx ON films (code) TABLESPACE indexspace; +</programlisting> + </para> + + <para> + To create a GiST index on a point attribute so that we + can efficiently use box operators on the result of the + conversion function: +<programlisting> +CREATE INDEX pointloc + ON points USING gist (box(location,location)); +SELECT * FROM points + WHERE box(location,location) && '(0,0),(1,1)'::box; +</programlisting> + </para> + + <para> + To create an index without locking out writes to the table: +<programlisting> +CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity); +</programlisting></para> + + </refsect1> + + <refsect1> + <title>Compatibility</title> + + <para> + <command>CREATE INDEX</command> is a + <productname>PostgreSQL</productname> language extension. There + are no provisions for indexes in the SQL standard. + </para> + </refsect1> + + <refsect1> + <title>See Also</title> + + <simplelist type="inline"> + <member><xref linkend="sql-alterindex"/></member> + <member><xref linkend="sql-dropindex"/></member> + <member><xref linkend="sql-reindex"/></member> + <member><xref linkend="create-index-progress-reporting"/></member> + </simplelist> + </refsect1> +</refentry> |