summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/ref/create_index.sgml
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src/sgml/ref/create_index.sgml')
-rw-r--r--doc/src/sgml/ref/create_index.sgml999
1 files changed, 999 insertions, 0 deletions
diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
new file mode 100644
index 0000000..40986aa
--- /dev/null
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -0,0 +1,999 @@
+<!--
+doc/src/sgml/ref/create_index.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="sql-createindex">
+ <indexterm zone="sql-createindex">
+ <primary>CREATE INDEX</primary>
+ </indexterm>
+
+ <refmeta>
+ <refentrytitle>CREATE INDEX</refentrytitle>
+ <manvolnum>7</manvolnum>
+ <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+ <refname>CREATE INDEX</refname>
+ <refpurpose>define a new index</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class="parameter">name</replaceable> ] ON [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ USING <replaceable class="parameter">method</replaceable> ]
+ ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> [ ( <replaceable class="parameter">opclass_parameter</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] ) ] ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
+ [ INCLUDE ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ NULLS [ NOT ] DISTINCT ]
+ [ WITH ( <replaceable class="parameter">storage_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ]
+ [ TABLESPACE <replaceable class="parameter">tablespace_name</replaceable> ]
+ [ WHERE <replaceable class="parameter">predicate</replaceable> ]
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+ <title>Description</title>
+
+ <para>
+ <command>CREATE INDEX</command> constructs an index on the specified column(s)
+ of the specified relation, which can be a table or a materialized view.
+ Indexes are primarily used to enhance database performance (though
+ inappropriate use can result in slower performance).
+ </para>
+
+ <para>
+ The key field(s) for the index are specified as column names,
+ or alternatively as expressions written in parentheses.
+ Multiple fields can be specified if the index method supports
+ multicolumn indexes.
+ </para>
+
+ <para>
+ An index field can be an expression computed from the values of
+ one or more columns of the table row. This feature can be used
+ to obtain fast access to data based on some transformation of
+ the basic data. For example, an index computed on
+ <literal>upper(col)</literal> would allow the clause
+ <literal>WHERE upper(col) = 'JIM'</literal> to use an index.
+ </para>
+
+ <para>
+ <productname>PostgreSQL</productname> provides the index methods
+ B-tree, hash, GiST, SP-GiST, GIN, and BRIN. Users can also define their own
+ index methods, but that is fairly complicated.
+ </para>
+
+ <para>
+ When the <literal>WHERE</literal> clause is present, a
+ <firstterm>partial index</firstterm> is created.
+ A partial index is an index that contains entries for only a portion of
+ a table, usually a portion that is more useful for indexing than the
+ rest of the table. For example, if you have a table that contains both
+ billed and unbilled orders where the unbilled orders take up a small
+ fraction of the total table and yet that is an often used section, you
+ can improve performance by creating an index on just that portion.
+ Another possible application is to use <literal>WHERE</literal> with
+ <literal>UNIQUE</literal> to enforce uniqueness over a subset of a
+ table. See <xref linkend="indexes-partial"/> for more discussion.
+ </para>
+
+ <para>
+ The expression used in the <literal>WHERE</literal> clause can refer
+ only to columns of the underlying table, but it can use all columns,
+ not just the ones being indexed. Presently, subqueries and
+ aggregate expressions are also forbidden in <literal>WHERE</literal>.
+ The same restrictions apply to index fields that are expressions.
+ </para>
+
+ <para>
+ All functions and operators used in an index definition must be
+ <quote>immutable</quote>, that is, their results must depend only on
+ their arguments and never on any outside influence (such as
+ the contents of another table or the current time). This restriction
+ ensures that the behavior of the index is well-defined. To use a
+ user-defined function in an index expression or <literal>WHERE</literal>
+ clause, remember to mark the function immutable when you create it.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Parameters</title>
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>UNIQUE</literal></term>
+ <listitem>
+ <para>
+ Causes the system to check for
+ duplicate values in the table when the index is created (if data
+ already exist) and each time data is added. Attempts to
+ insert or update data which would result in duplicate entries
+ will generate an error.
+ </para>
+
+ <para>
+ Additional restrictions apply when unique indexes are applied to
+ partitioned tables; see <xref linkend="sql-createtable" />.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CONCURRENTLY</literal></term>
+ <listitem>
+ <para>
+ When this option is used, <productname>PostgreSQL</productname> will build the
+ index without taking any locks that prevent concurrent inserts,
+ updates, or deletes on the table; whereas a standard index build
+ locks out writes (but not reads) on the table until it's done.
+ There are several caveats to be aware of when using this option
+ &mdash; see <xref linkend="sql-createindex-concurrently"/> below.
+ </para>
+ <para>
+ For temporary tables, <command>CREATE INDEX</command> is always
+ non-concurrent, as no other session can access them, and
+ non-concurrent index creation is cheaper.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>IF NOT EXISTS</literal></term>
+ <listitem>
+ <para>
+ Do not throw an error if a relation with the same name already exists.
+ A notice is issued in this case. Note that there is no guarantee that
+ the existing index is anything like the one that would have been created.
+ Index name is required when <literal>IF NOT EXISTS</literal> is specified.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>INCLUDE</literal></term>
+ <listitem>
+ <para>
+ The optional <literal>INCLUDE</literal> clause specifies a
+ list of columns which will be included in the index
+ as <firstterm>non-key</firstterm> columns. A non-key column cannot
+ be used in an index scan search qualification, and it is disregarded
+ for purposes of any uniqueness or exclusion constraint enforced by
+ the index. However, an index-only scan can return the contents of
+ non-key columns without having to visit the index's table, since
+ they are available directly from the index entry. Thus, addition of
+ non-key columns allows index-only scans to be used for queries that
+ otherwise could not use them.
+ </para>
+
+ <para>
+ It's wise to be conservative about adding non-key columns to an
+ index, especially wide columns. If an index tuple exceeds the
+ maximum size allowed for the index type, data insertion will fail.
+ In any case, non-key columns duplicate data from the index's table
+ and bloat the size of the index, thus potentially slowing searches.
+ Furthermore, B-tree deduplication is never used with indexes
+ that have a non-key column.
+ </para>
+
+ <para>
+ Columns listed in the <literal>INCLUDE</literal> clause don't need
+ appropriate operator classes; the clause can include
+ columns whose data types don't have operator classes defined for
+ a given access method.
+ </para>
+
+ <para>
+ Expressions are not supported as included columns since they cannot be
+ used in index-only scans.
+ </para>
+
+ <para>
+ Currently, the B-tree, GiST and SP-GiST index access methods support
+ this feature. In these indexes, the values of columns listed
+ in the <literal>INCLUDE</literal> clause are included in leaf tuples
+ which correspond to heap tuples, but are not included in upper-level
+ index entries used for tree navigation.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">name</replaceable></term>
+ <listitem>
+ <para>
+ The name of the index to be created. No schema name can be included
+ here; the index is always created in the same schema as its parent
+ table. The name of the index must be distinct from the name of any
+ other relation (table, sequence, index, view, materialized view, or
+ foreign table) in that schema.
+ If the name is omitted, <productname>PostgreSQL</productname> chooses a
+ suitable name based on the parent table's name and the indexed column
+ name(s).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>ONLY</literal></term>
+ <listitem>
+ <para>
+ Indicates not to recurse creating indexes on partitions, if the
+ table is partitioned. The default is to recurse.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">table_name</replaceable></term>
+ <listitem>
+ <para>
+ The name (possibly schema-qualified) of the table to be indexed.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">method</replaceable></term>
+ <listitem>
+ <para>
+ The name of the index method to be used. Choices are
+ <literal>btree</literal>, <literal>hash</literal>,
+ <literal>gist</literal>, <literal>spgist</literal>, <literal>gin</literal>,
+ <literal>brin</literal>, or user-installed access methods like
+ <link linkend="bloom">bloom</link>.
+ The default method is <literal>btree</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">column_name</replaceable></term>
+ <listitem>
+ <para>
+ The name of a column of the table.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">expression</replaceable></term>
+ <listitem>
+ <para>
+ An expression based on one or more columns of the table. The
+ expression usually must be written with surrounding parentheses,
+ as shown in the syntax. However, the parentheses can be omitted
+ if the expression has the form of a function call.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">collation</replaceable></term>
+ <listitem>
+ <para>
+ The name of the collation to use for the index. By default,
+ the index uses the collation declared for the column to be
+ indexed or the result collation of the expression to be
+ indexed. Indexes with non-default collations can be useful for
+ queries that involve expressions using non-default collations.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">opclass</replaceable></term>
+ <listitem>
+ <para>
+ The name of an operator class. See below for details.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">opclass_parameter</replaceable></term>
+ <listitem>
+ <para>
+ The name of an operator class parameter. See below for details.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>ASC</literal></term>
+ <listitem>
+ <para>
+ Specifies ascending sort order (which is the default).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>DESC</literal></term>
+ <listitem>
+ <para>
+ Specifies descending sort order.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>NULLS FIRST</literal></term>
+ <listitem>
+ <para>
+ Specifies that nulls sort before non-nulls. This is the default
+ when <literal>DESC</literal> is specified.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>NULLS LAST</literal></term>
+ <listitem>
+ <para>
+ Specifies that nulls sort after non-nulls. This is the default
+ when <literal>DESC</literal> is not specified.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>NULLS DISTINCT</literal></term>
+ <term><literal>NULLS NOT DISTINCT</literal></term>
+ <listitem>
+ <para>
+ Specifies whether for a unique index, null values should be considered
+ distinct (not equal). The default is that they are distinct, so that
+ a unique index could contain multiple null values in a column.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">storage_parameter</replaceable></term>
+ <listitem>
+ <para>
+ The name of an index-method-specific storage parameter. See
+ <xref linkend="sql-createindex-storage-parameters"/> below
+ for details.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">tablespace_name</replaceable></term>
+ <listitem>
+ <para>
+ The tablespace in which to create the index. If not specified,
+ <xref linkend="guc-default-tablespace"/> is consulted, or
+ <xref linkend="guc-temp-tablespaces"/> for indexes on temporary
+ tables.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="parameter">predicate</replaceable></term>
+ <listitem>
+ <para>
+ The constraint expression for a partial index.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <refsect2 id="sql-createindex-storage-parameters" xreflabel="Index Storage Parameters">
+ <title>Index Storage Parameters</title>
+
+ <para>
+ The optional <literal>WITH</literal> clause specifies <firstterm>storage
+ parameters</firstterm> for the index. Each index method has its own set of allowed
+ storage parameters. The B-tree, hash, GiST and SP-GiST index methods all
+ accept this parameter:
+ </para>
+
+ <variablelist>
+ <varlistentry id="index-reloption-fillfactor" xreflabel="fillfactor">
+ <term><literal>fillfactor</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>fillfactor</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ The fillfactor for an index is a percentage that determines how full
+ the index method will try to pack index pages. For B-trees, leaf pages
+ are filled to this percentage during initial index builds, and also
+ when extending the index at the right (adding new largest key values).
+ If pages
+ subsequently become completely full, they will be split, leading to
+ fragmentation of the on-disk index structure. B-trees use a default
+ fillfactor of 90, but any integer value from 10 to 100 can be selected.
+ </para>
+ <para>
+ B-tree indexes on tables where many inserts and/or updates are
+ anticipated can benefit from lower fillfactor settings at
+ <command>CREATE INDEX</command> time (following bulk loading into the
+ table). Values in the range of 50 - 90 can usefully <quote>smooth
+ out</quote> the <emphasis>rate</emphasis> of page splits during the
+ early life of the B-tree index (lowering fillfactor like this may even
+ lower the absolute number of page splits, though this effect is highly
+ workload dependent). The B-tree bottom-up index deletion technique
+ described in <xref linkend="btree-deletion"/> is dependent on having
+ some <quote>extra</quote> space on pages to store <quote>extra</quote>
+ tuple versions, and so can be affected by fillfactor (though the effect
+ is usually not significant).
+ </para>
+ <para>
+ In other specific cases it might be useful to increase fillfactor to
+ 100 at <command>CREATE INDEX</command> time as a way of maximizing
+ space utilization. You should only consider this when you are
+ completely sure that the table is static (i.e. that it will never be
+ affected by either inserts or updates). A fillfactor setting of 100
+ otherwise risks <emphasis>harming</emphasis> performance: even a few
+ updates or inserts will cause a sudden flood of page splits.
+ </para>
+ <para>
+ The other index methods use fillfactor in different but roughly
+ analogous ways; the default fillfactor varies between methods.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ B-tree indexes additionally accept this parameter:
+ </para>
+
+ <variablelist>
+ <varlistentry id="index-reloption-deduplicate-items" xreflabel="deduplicate_items">
+ <term><literal>deduplicate_items</literal> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>deduplicate_items</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Controls usage of the B-tree deduplication technique described
+ in <xref linkend="btree-deduplication"/>. Set to
+ <literal>ON</literal> or <literal>OFF</literal> to enable or
+ disable the optimization. (Alternative spellings of
+ <literal>ON</literal> and <literal>OFF</literal> are allowed as
+ described in <xref linkend="config-setting"/>.) The default is
+ <literal>ON</literal>.
+ </para>
+
+ <note>
+ <para>
+ Turning <literal>deduplicate_items</literal> off via
+ <command>ALTER INDEX</command> prevents future insertions from
+ triggering deduplication, but does not in itself make existing
+ posting list tuples use the standard tuple representation.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ GiST indexes additionally accept this parameter:
+ </para>
+
+ <variablelist>
+ <varlistentry id="index-reloption-buffering" xreflabel="buffering">
+ <term><literal>buffering</literal> (<type>enum</type>)
+ <indexterm>
+ <primary><varname>buffering</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Determines whether the buffered build technique described in
+ <xref linkend="gist-buffering-build"/> is used to build the index. With
+ <literal>OFF</literal> buffering is disabled, with <literal>ON</literal>
+ it is enabled, and with <literal>AUTO</literal> it is initially disabled,
+ but is turned on on-the-fly once the index size reaches
+ <xref linkend="guc-effective-cache-size"/>. The default
+ is <literal>AUTO</literal>.
+ Note that if sorted build is possible, it will be used instead of
+ buffered build unless <literal>buffering=ON</literal> is specified.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ GIN indexes accept different parameters:
+ </para>
+
+ <variablelist>
+ <varlistentry id="index-reloption-fastupdate" xreflabel="fastupdate">
+ <term><literal>fastupdate</literal> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>fastupdate</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ This setting controls usage of the fast update technique described in
+ <xref linkend="gin-fast-update"/>. It is a Boolean parameter:
+ <literal>ON</literal> enables fast update, <literal>OFF</literal> disables it.
+ The default is <literal>ON</literal>.
+ </para>
+
+ <note>
+ <para>
+ Turning <literal>fastupdate</literal> off via <command>ALTER INDEX</command> prevents
+ future insertions from going into the list of pending index entries,
+ but does not in itself flush previous entries. You might want to
+ <command>VACUUM</command> the table or call <function>gin_clean_pending_list</function>
+ function afterward to ensure the pending list is emptied.
+ </para>
+ </note>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <variablelist>
+ <varlistentry id="index-reloption-gin-pending-list-limit" xreflabel="gin_pending_list_limit">
+ <term><literal>gin_pending_list_limit</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>gin_pending_list_limit</varname></primary>
+ <secondary>storage parameter</secondary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Custom <xref linkend="guc-gin-pending-list-limit"/> parameter.
+ This value is specified in kilobytes.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ <acronym>BRIN</acronym> indexes accept different parameters:
+ </para>
+
+ <variablelist>
+ <varlistentry id="index-reloption-pages-per-range" xreflabel="pages_per_range">
+ <term><literal>pages_per_range</literal> (<type>integer</type>)
+ <indexterm>
+ <primary><varname>pages_per_range</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Defines the number of table blocks that make up one block range for
+ each entry of a <acronym>BRIN</acronym> index (see <xref linkend="brin-intro"/>
+ for more details). The default is <literal>128</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="index-reloption-autosummarize" xreflabel="autosummarize">
+ <term><literal>autosummarize</literal> (<type>boolean</type>)
+ <indexterm>
+ <primary><varname>autosummarize</varname> storage parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Defines whether a summarization run is queued for the previous page
+ range whenever an insertion is detected on the next one.
+ See <xref linkend="brin-operation"/> for more details.
+ The default is <literal>off</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </refsect2>
+
+ <refsect2 id="sql-createindex-concurrently" xreflabel="Building Indexes Concurrently">
+ <title>Building Indexes Concurrently</title>
+
+ <indexterm zone="sql-createindex-concurrently">
+ <primary>index</primary>
+ <secondary>building concurrently</secondary>
+ </indexterm>
+
+ <para>
+ Creating an index can interfere with regular operation of a database.
+ Normally <productname>PostgreSQL</productname> locks the table to be indexed against
+ writes and performs the entire index build with a single scan of the
+ table. Other transactions can still read the table, but if they try to
+ insert, update, or delete rows in the table they will block until the
+ index build is finished. This could have a severe effect if the system is
+ a live production database. Very large tables can take many hours to be
+ indexed, and even for smaller tables, an index build can lock out writers
+ for periods that are unacceptably long for a production system.
+ </para>
+
+ <para>
+ <productname>PostgreSQL</productname> supports building indexes without locking
+ out writes. This method is invoked by specifying the
+ <literal>CONCURRENTLY</literal> option of <command>CREATE INDEX</command>.
+ When this option is used,
+ <productname>PostgreSQL</productname> must perform two scans of the table, and in
+ addition it must wait for all existing transactions that could potentially
+ modify or use the index to terminate. Thus
+ this method requires more total work than a standard index build and takes
+ significantly longer to complete. However, since it allows normal
+ operations to continue while the index is built, this method is useful for
+ adding new indexes in a production environment. Of course, the extra CPU
+ and I/O load imposed by the index creation might slow other operations.
+ </para>
+
+ <para>
+ In a concurrent index build, the index is actually entered as an
+ <quote>invalid</quote> index into
+ the system catalogs in one transaction, then two table scans occur in
+ two more transactions. Before each table scan, the index build must
+ wait for existing transactions that have modified the table to terminate.
+ After the second scan, the index build must wait for any transactions
+ that have a snapshot (see <xref linkend="mvcc"/>) predating the second
+ scan to terminate, including transactions used by any phase of concurrent
+ index builds on other tables, if the indexes involved are partial or have
+ columns that are not simple column references.
+ Then finally the index can be marked <quote>valid</quote> and ready for use,
+ and the <command>CREATE INDEX</command> command terminates.
+ Even then, however, the index may not be immediately usable for queries:
+ in the worst case, it cannot be used as long as transactions exist that
+ predate the start of the index build.
+ </para>
+
+ <para>
+ If a problem arises while scanning the table, such as a deadlock or a
+ uniqueness violation in a unique index, the <command>CREATE INDEX</command>
+ command will fail but leave behind an <quote>invalid</quote> index. This index
+ will be ignored for querying purposes because it might be incomplete;
+ however it will still consume update overhead. The <application>psql</application>
+ <command>\d</command> command will report such an index as <literal>INVALID</literal>:
+
+<programlisting>
+postgres=# \d tab
+ Table "public.tab"
+ Column | Type | Collation | Nullable | Default
+--------+---------+-----------+----------+---------
+ col | integer | | |
+Indexes:
+ "idx" btree (col) INVALID
+</programlisting>
+
+ The recommended recovery
+ method in such cases is to drop the index and try again to perform
+ <command>CREATE INDEX CONCURRENTLY</command>. (Another possibility is
+ to rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>).
+ </para>
+
+ <para>
+ Another caveat when building a unique index concurrently is that the
+ uniqueness constraint is already being enforced against other transactions
+ when the second table scan begins. This means that constraint violations
+ could be reported in other queries prior to the index becoming available
+ for use, or even in cases where the index build eventually fails. Also,
+ if a failure does occur in the second scan, the <quote>invalid</quote> index
+ continues to enforce its uniqueness constraint afterwards.
+ </para>
+
+ <para>
+ Concurrent builds of expression indexes and partial indexes are supported.
+ Errors occurring in the evaluation of these expressions could cause
+ behavior similar to that described above for unique constraint violations.
+ </para>
+
+ <para>
+ Regular index builds permit other regular index builds on the
+ same table to occur simultaneously, but only one concurrent index build
+ can occur on a table at a time. In either case, schema modification of the
+ table is not allowed while the index is being built. Another difference is
+ that a regular <command>CREATE INDEX</command> command can be performed
+ within a transaction block, but <command>CREATE INDEX CONCURRENTLY</command>
+ cannot.
+ </para>
+
+ <para>
+ Concurrent builds for indexes on partitioned tables are currently not
+ supported. However, you may concurrently build the index on each
+ partition individually and then finally create the partitioned index
+ non-concurrently in order to reduce the time where writes to the
+ partitioned table will be locked out. In this case, building the
+ partitioned index is a metadata only operation.
+ </para>
+
+ </refsect2>
+ </refsect1>
+
+ <refsect1>
+ <title>Notes</title>
+
+ <para>
+ See <xref linkend="indexes"/> for information about when indexes can
+ be used, when they are not used, and in which particular situations
+ they can be useful.
+ </para>
+
+ <para>
+ Currently, only the B-tree, GiST, GIN, and BRIN index methods support
+ multiple-key-column indexes. Whether there can be multiple key
+ columns is independent of whether <literal>INCLUDE</literal> columns
+ can be added to the index. Indexes can have up to 32 columns,
+ including <literal>INCLUDE</literal> columns.
+ (This limit can be altered when building
+ <productname>PostgreSQL</productname>.) Only B-tree currently
+ supports unique indexes.
+ </para>
+
+ <para>
+ An <firstterm>operator class</firstterm> with optional parameters
+ can be specified for each column of an index.
+ The operator class identifies the operators to be
+ used by the index for that column. For example, a B-tree index on
+ four-byte integers would use the <literal>int4_ops</literal> class;
+ this operator class includes comparison functions for four-byte
+ integers. In practice the default operator class for the column's data
+ type is usually sufficient. The main point of having operator classes
+ is that for some data types, there could be more than one meaningful
+ ordering. For example, we might want to sort a complex-number data
+ type either by absolute value or by real part. We could do this by
+ defining two operator classes for the data type and then selecting
+ the proper class when creating an index. More information about
+ operator classes is in <xref linkend="indexes-opclass"/> and in <xref
+ linkend="xindex"/>.
+ </para>
+
+ <para>
+ When <literal>CREATE INDEX</literal> is invoked on a partitioned
+ table, the default behavior is to recurse to all partitions to ensure
+ they all have matching indexes.
+ Each partition is first checked to determine whether an equivalent
+ index already exists, and if so, that index will become attached as a
+ partition index to the index being created, which will become its
+ parent index.
+ If no matching index exists, a new index will be created and
+ automatically attached; the name of the new index in each partition
+ will be determined as if no index name had been specified in the
+ command.
+ If the <literal>ONLY</literal> option is specified, no recursion
+ is done, and the index is marked invalid.
+ (<command>ALTER INDEX ... ATTACH PARTITION</command> marks the index
+ valid, once all partitions acquire matching indexes.) Note, however,
+ that any partition that is created in the future using
+ <command>CREATE TABLE ... PARTITION OF</command> will automatically
+ have a matching index, regardless of whether <literal>ONLY</literal> is
+ specified.
+ </para>
+
+ <para>
+ For index methods that support ordered scans (currently, only B-tree),
+ the optional clauses <literal>ASC</literal>, <literal>DESC</literal>, <literal>NULLS
+ FIRST</literal>, and/or <literal>NULLS LAST</literal> can be specified to modify
+ the sort ordering of the index. Since an ordered index can be
+ scanned either forward or backward, it is not normally useful to create a
+ single-column <literal>DESC</literal> index &mdash; that sort ordering is already
+ available with a regular index. The value of these options is that
+ multicolumn indexes can be created that match the sort ordering requested
+ by a mixed-ordering query, such as <literal>SELECT ... ORDER BY x ASC, y
+ DESC</literal>. The <literal>NULLS</literal> options are useful if you need to support
+ <quote>nulls sort low</quote> behavior, rather than the default <quote>nulls
+ sort high</quote>, in queries that depend on indexes to avoid sorting steps.
+ </para>
+
+ <para>
+ The system regularly collects statistics on all of a table's
+ columns. Newly-created non-expression indexes can immediately
+ use these statistics to determine an index's usefulness.
+ For new expression indexes, it is necessary to run <link
+ linkend="sql-analyze"><command>ANALYZE</command></link> or wait for
+ the <link linkend="autovacuum">autovacuum daemon</link> to analyze
+ the table to generate statistics for these indexes.
+ </para>
+
+ <para>
+ For most index methods, the speed of creating an index is
+ dependent on the setting of <xref linkend="guc-maintenance-work-mem"/>.
+ Larger values will reduce the time needed for index creation, so long
+ as you don't make it larger than the amount of memory really available,
+ which would drive the machine into swapping.
+ </para>
+
+ <para>
+ <productname>PostgreSQL</productname> can build indexes while
+ leveraging multiple CPUs in order to process the table rows faster.
+ This feature is known as <firstterm>parallel index
+ build</firstterm>. For index methods that support building indexes
+ in parallel (currently, only B-tree),
+ <varname>maintenance_work_mem</varname> specifies the maximum
+ amount of memory that can be used by each index build operation as
+ a whole, regardless of how many worker processes were started.
+ Generally, a cost model automatically determines how many worker
+ processes should be requested, if any.
+ </para>
+
+ <para>
+ Parallel index builds may benefit from increasing
+ <varname>maintenance_work_mem</varname> where an equivalent serial
+ index build will see little or no benefit. Note that
+ <varname>maintenance_work_mem</varname> may influence the number of
+ worker processes requested, since parallel workers must have at
+ least a <literal>32MB</literal> share of the total
+ <varname>maintenance_work_mem</varname> budget. There must also be
+ a remaining <literal>32MB</literal> share for the leader process.
+ Increasing <xref linkend="guc-max-parallel-maintenance-workers"/>
+ may allow more workers to be used, which will reduce the time
+ needed for index creation, so long as the index build is not
+ already I/O bound. Of course, there should also be sufficient
+ CPU capacity that would otherwise lie idle.
+ </para>
+
+ <para>
+ Setting a value for <literal>parallel_workers</literal> via <link
+ linkend="sql-altertable"><command>ALTER TABLE</command></link> directly controls how many parallel
+ worker processes will be requested by a <command>CREATE
+ INDEX</command> against the table. This bypasses the cost model
+ completely, and prevents <varname>maintenance_work_mem</varname>
+ from affecting how many parallel workers are requested. Setting
+ <literal>parallel_workers</literal> to 0 via <command>ALTER
+ TABLE</command> will disable parallel index builds on the table in
+ all cases.
+ </para>
+
+ <tip>
+ <para>
+ You might want to reset <literal>parallel_workers</literal> after
+ setting it as part of tuning an index build. This avoids
+ inadvertent changes to query plans, since
+ <literal>parallel_workers</literal> affects
+ <emphasis>all</emphasis> parallel table scans.
+ </para>
+ </tip>
+
+ <para>
+ While <command>CREATE INDEX</command> with the
+ <literal>CONCURRENTLY</literal> option supports parallel builds
+ without special restrictions, only the first table scan is actually
+ performed in parallel.
+ </para>
+
+ <para>
+ Use <link linkend="sql-dropindex"><command>DROP INDEX</command></link>
+ to remove an index.
+ </para>
+
+ <para>
+ Like any long-running transaction, <command>CREATE INDEX</command> on a
+ table can affect which tuples can be removed by concurrent
+ <command>VACUUM</command> on any other table.
+ </para>
+
+ <para>
+ Prior releases of <productname>PostgreSQL</productname> also had an
+ R-tree index method. This method has been removed because
+ it had no significant advantages over the GiST method.
+ If <literal>USING rtree</literal> is specified, <command>CREATE INDEX</command>
+ will interpret it as <literal>USING gist</literal>, to simplify conversion
+ of old databases to GiST.
+ </para>
+
+ <para>
+ Each backend running <command>CREATE INDEX</command> will report its
+ progress in the <structname>pg_stat_progress_create_index</structname>
+ view. See <xref linkend="create-index-progress-reporting"/> for details.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>Examples</title>
+
+ <para>
+ To create a unique B-tree index on the column <literal>title</literal> in
+ the table <literal>films</literal>:
+<programlisting>
+CREATE UNIQUE INDEX title_idx ON films (title);
+</programlisting>
+ </para>
+
+ <para>
+ To create a unique B-tree index on the column <literal>title</literal>
+ with included columns <literal>director</literal>
+ and <literal>rating</literal> in the table <literal>films</literal>:
+<programlisting>
+CREATE UNIQUE INDEX title_idx ON films (title) INCLUDE (director, rating);
+</programlisting>
+ </para>
+
+ <para>
+ To create a B-Tree index with deduplication disabled:
+<programlisting>
+CREATE INDEX title_idx ON films (title) WITH (deduplicate_items = off);
+</programlisting>
+ </para>
+
+ <para>
+ To create an index on the expression <literal>lower(title)</literal>,
+ allowing efficient case-insensitive searches:
+<programlisting>
+CREATE INDEX ON films ((lower(title)));
+</programlisting>
+ (In this example we have chosen to omit the index name, so the system
+ will choose a name, typically <literal>films_lower_idx</literal>.)
+ </para>
+
+ <para>
+ To create an index with non-default collation:
+<programlisting>
+CREATE INDEX title_idx_german ON films (title COLLATE "de_DE");
+</programlisting>
+ </para>
+
+ <para>
+ To create an index with non-default sort ordering of nulls:
+<programlisting>
+CREATE INDEX title_idx_nulls_low ON films (title NULLS FIRST);
+</programlisting>
+ </para>
+
+ <para>
+ To create an index with non-default fill factor:
+<programlisting>
+CREATE UNIQUE INDEX title_idx ON films (title) WITH (fillfactor = 70);
+</programlisting>
+ </para>
+
+ <para>
+ To create a <acronym>GIN</acronym> index with fast updates disabled:
+<programlisting>
+CREATE INDEX gin_idx ON documents_table USING GIN (locations) WITH (fastupdate = off);
+</programlisting>
+ </para>
+
+ <para>
+ To create an index on the column <literal>code</literal> in the table
+ <literal>films</literal> and have the index reside in the tablespace
+ <literal>indexspace</literal>:
+<programlisting>
+CREATE INDEX code_idx ON films (code) TABLESPACE indexspace;
+</programlisting>
+ </para>
+
+ <para>
+ To create a GiST index on a point attribute so that we
+ can efficiently use box operators on the result of the
+ conversion function:
+<programlisting>
+CREATE INDEX pointloc
+ ON points USING gist (box(location,location));
+SELECT * FROM points
+ WHERE box(location,location) &amp;&amp; '(0,0),(1,1)'::box;
+</programlisting>
+ </para>
+
+ <para>
+ To create an index without locking out writes to the table:
+<programlisting>
+CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
+</programlisting></para>
+
+ </refsect1>
+
+ <refsect1>
+ <title>Compatibility</title>
+
+ <para>
+ <command>CREATE INDEX</command> is a
+ <productname>PostgreSQL</productname> language extension. There
+ are no provisions for indexes in the SQL standard.
+ </para>
+ </refsect1>
+
+ <refsect1>
+ <title>See Also</title>
+
+ <simplelist type="inline">
+ <member><xref linkend="sql-alterindex"/></member>
+ <member><xref linkend="sql-dropindex"/></member>
+ <member><xref linkend="sql-reindex"/></member>
+ <member><xref linkend="create-index-progress-reporting"/></member>
+ </simplelist>
+ </refsect1>
+</refentry>