summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/ddl.sgml
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src/sgml/ddl.sgml')
-rw-r--r--doc/src/sgml/ddl.sgml5245
1 files changed, 5245 insertions, 0 deletions
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
new file mode 100644
index 0000000..cfe3dcb
--- /dev/null
+++ b/doc/src/sgml/ddl.sgml
@@ -0,0 +1,5245 @@
+<!-- doc/src/sgml/ddl.sgml -->
+
+<chapter id="ddl">
+ <title>Data Definition</title>
+
+ <para>
+ This chapter covers how one creates the database structures that
+ will hold one's data. In a relational database, the raw data is
+ stored in tables, so the majority of this chapter is devoted to
+ explaining how tables are created and modified and what features are
+ available to control what data is stored in the tables.
+ Subsequently, we discuss how tables can be organized into
+ schemas, and how privileges can be assigned to tables. Finally,
+ we will briefly look at other features that affect the data storage,
+ such as inheritance, table partitioning, views, functions, and
+ triggers.
+ </para>
+
+ <sect1 id="ddl-basics">
+ <title>Table Basics</title>
+
+ <indexterm zone="ddl-basics">
+ <primary>table</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>row</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>column</primary>
+ </indexterm>
+
+ <para>
+ A table in a relational database is much like a table on paper: It
+ consists of rows and columns. The number and order of the columns
+ is fixed, and each column has a name. The number of rows is
+ variable &mdash; it reflects how much data is stored at a given moment.
+ SQL does not make any guarantees about the order of the rows in a
+ table. When a table is read, the rows will appear in an unspecified order,
+ unless sorting is explicitly requested. This is covered in <xref
+ linkend="queries"/>. Furthermore, SQL does not assign unique
+ identifiers to rows, so it is possible to have several completely
+ identical rows in a table. This is a consequence of the
+ mathematical model that underlies SQL but is usually not desirable.
+ Later in this chapter we will see how to deal with this issue.
+ </para>
+
+ <para>
+ Each column has a data type. The data type constrains the set of
+ possible values that can be assigned to a column and assigns
+ semantics to the data stored in the column so that it can be used
+ for computations. For instance, a column declared to be of a
+ numerical type will not accept arbitrary text strings, and the data
+ stored in such a column can be used for mathematical computations.
+ By contrast, a column declared to be of a character string type
+ will accept almost any kind of data but it does not lend itself to
+ mathematical calculations, although other operations such as string
+ concatenation are available.
+ </para>
+
+ <para>
+ <productname>PostgreSQL</productname> includes a sizable set of
+ built-in data types that fit many applications. Users can also
+ define their own data types. Most built-in data types have obvious
+ names and semantics, so we defer a detailed explanation to <xref
+ linkend="datatype"/>. Some of the frequently used data types are
+ <type>integer</type> for whole numbers, <type>numeric</type> for
+ possibly fractional numbers, <type>text</type> for character
+ strings, <type>date</type> for dates, <type>time</type> for
+ time-of-day values, and <type>timestamp</type> for values
+ containing both date and time.
+ </para>
+
+ <indexterm>
+ <primary>table</primary>
+ <secondary>creating</secondary>
+ </indexterm>
+
+ <para>
+ To create a table, you use the aptly named <xref
+ linkend="sql-createtable"/> command.
+ In this command you specify at least a name for the new table, the
+ names of the columns and the data type of each column. For
+ example:
+<programlisting>
+CREATE TABLE my_first_table (
+ first_column text,
+ second_column integer
+);
+</programlisting>
+ This creates a table named <literal>my_first_table</literal> with
+ two columns. The first column is named
+ <literal>first_column</literal> and has a data type of
+ <type>text</type>; the second column has the name
+ <literal>second_column</literal> and the type <type>integer</type>.
+ The table and column names follow the identifier syntax explained
+ in <xref linkend="sql-syntax-identifiers"/>. The type names are
+ usually also identifiers, but there are some exceptions. Note that the
+ column list is comma-separated and surrounded by parentheses.
+ </para>
+
+ <para>
+ Of course, the previous example was heavily contrived. Normally,
+ you would give names to your tables and columns that convey what
+ kind of data they store. So let's look at a more realistic
+ example:
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric
+);
+</programlisting>
+ (The <type>numeric</type> type can store fractional components, as
+ would be typical of monetary amounts.)
+ </para>
+
+ <tip>
+ <para>
+ When you create many interrelated tables it is wise to choose a
+ consistent naming pattern for the tables and columns. For
+ instance, there is a choice of using singular or plural nouns for
+ table names, both of which are favored by some theorist or other.
+ </para>
+ </tip>
+
+ <para>
+ There is a limit on how many columns a table can contain.
+ Depending on the column types, it is between 250 and 1600.
+ However, defining a table with anywhere near this many columns is
+ highly unusual and often a questionable design.
+ </para>
+
+ <indexterm>
+ <primary>table</primary>
+ <secondary>removing</secondary>
+ </indexterm>
+
+ <para>
+ If you no longer need a table, you can remove it using the <xref
+ linkend="sql-droptable"/> command.
+ For example:
+<programlisting>
+DROP TABLE my_first_table;
+DROP TABLE products;
+</programlisting>
+ Attempting to drop a table that does not exist is an error.
+ Nevertheless, it is common in SQL script files to unconditionally
+ try to drop each table before creating it, ignoring any error
+ messages, so that the script works whether or not the table exists.
+ (If you like, you can use the <literal>DROP TABLE IF EXISTS</literal> variant
+ to avoid the error messages, but this is not standard SQL.)
+ </para>
+
+ <para>
+ If you need to modify a table that already exists, see <xref
+ linkend="ddl-alter"/> later in this chapter.
+ </para>
+
+ <para>
+ With the tools discussed so far you can create fully functional
+ tables. The remainder of this chapter is concerned with adding
+ features to the table definition to ensure data integrity,
+ security, or convenience. If you are eager to fill your tables with
+ data now you can skip ahead to <xref linkend="dml"/> and read the
+ rest of this chapter later.
+ </para>
+ </sect1>
+
+ <sect1 id="ddl-default">
+ <title>Default Values</title>
+
+ <indexterm zone="ddl-default">
+ <primary>default value</primary>
+ </indexterm>
+
+ <para>
+ A column can be assigned a default value. When a new row is
+ created and no values are specified for some of the columns, those
+ columns will be filled with their respective default values. A
+ data manipulation command can also request explicitly that a column
+ be set to its default value, without having to know what that value is.
+ (Details about data manipulation commands are in <xref linkend="dml"/>.)
+ </para>
+
+ <para>
+ <indexterm><primary>null value</primary><secondary>default value</secondary></indexterm>
+ If no default value is declared explicitly, the default value is the
+ null value. This usually makes sense because a null value can
+ be considered to represent unknown data.
+ </para>
+
+ <para>
+ In a table definition, default values are listed after the column
+ data type. For example:
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric <emphasis>DEFAULT 9.99</emphasis>
+);
+</programlisting>
+ </para>
+
+ <para>
+ The default value can be an expression, which will be
+ evaluated whenever the default value is inserted
+ (<emphasis>not</emphasis> when the table is created). A common example
+ is for a <type>timestamp</type> column to have a default of <literal>CURRENT_TIMESTAMP</literal>,
+ so that it gets set to the time of row insertion. Another common
+ example is generating a <quote>serial number</quote> for each row.
+ In <productname>PostgreSQL</productname> this is typically done by
+ something like:
+<programlisting>
+CREATE TABLE products (
+ product_no integer <emphasis>DEFAULT nextval('products_product_no_seq')</emphasis>,
+ ...
+);
+</programlisting>
+ where the <literal>nextval()</literal> function supplies successive values
+ from a <firstterm>sequence object</firstterm> (see <xref
+ linkend="functions-sequence"/>). This arrangement is sufficiently common
+ that there's a special shorthand for it:
+<programlisting>
+CREATE TABLE products (
+ product_no <emphasis>SERIAL</emphasis>,
+ ...
+);
+</programlisting>
+ The <literal>SERIAL</literal> shorthand is discussed further in <xref
+ linkend="datatype-serial"/>.
+ </para>
+ </sect1>
+
+ <sect1 id="ddl-generated-columns">
+ <title>Generated Columns</title>
+
+ <indexterm zone="ddl-generated-columns">
+ <primary>generated column</primary>
+ </indexterm>
+
+ <para>
+ A generated column is a special column that is always computed from other
+ columns. Thus, it is for columns what a view is for tables. There are two
+ kinds of generated columns: stored and virtual. A stored generated column
+ is computed when it is written (inserted or updated) and occupies storage
+ as if it were a normal column. A virtual generated column occupies no
+ storage and is computed when it is read. Thus, a virtual generated column
+ is similar to a view and a stored generated column is similar to a
+ materialized view (except that it is always updated automatically).
+ PostgreSQL currently implements only stored generated columns.
+ </para>
+
+ <para>
+ To create a generated column, use the <literal>GENERATED ALWAYS
+ AS</literal> clause in <command>CREATE TABLE</command>, for example:
+<programlisting>
+CREATE TABLE people (
+ ...,
+ height_cm numeric,
+ height_in numeric <emphasis>GENERATED ALWAYS AS (height_cm / 2.54) STORED</emphasis>
+);
+</programlisting>
+ The keyword <literal>STORED</literal> must be specified to choose the
+ stored kind of generated column. See <xref linkend="sql-createtable"/> for
+ more details.
+ </para>
+
+ <para>
+ A generated column cannot be written to directly. In
+ <command>INSERT</command> or <command>UPDATE</command> commands, a value
+ cannot be specified for a generated column, but the keyword
+ <literal>DEFAULT</literal> may be specified.
+ </para>
+
+ <para>
+ Consider the differences between a column with a default and a generated
+ column. The column default is evaluated once when the row is first
+ inserted if no other value was provided; a generated column is updated
+ whenever the row changes and cannot be overridden. A column default may
+ not refer to other columns of the table; a generation expression would
+ normally do so. A column default can use volatile functions, for example
+ <literal>random()</literal> or functions referring to the current time;
+ this is not allowed for generated columns.
+ </para>
+
+ <para>
+ Several restrictions apply to the definition of generated columns and
+ tables involving generated columns:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ The generation expression can only use immutable functions and cannot
+ use subqueries or reference anything other than the current row in any
+ way.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ A generation expression cannot reference another generated column.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ A generation expression cannot reference a system column, except
+ <varname>tableoid</varname>.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ A generated column cannot have a column default or an identity definition.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ A generated column cannot be part of a partition key.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Foreign tables can have generated columns. See <xref
+ linkend="sql-createforeigntable"/> for details.
+ </para>
+ </listitem>
+ <listitem>
+ <para>For inheritance:</para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ If a parent column is a generated column, a child column must also be
+ a generated column using the same expression. In the definition of
+ the child column, leave off the <literal>GENERATED</literal> clause,
+ as it will be copied from the parent.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ In case of multiple inheritance, if one parent column is a generated
+ column, then all parent columns must be generated columns and with the
+ same expression.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ If a parent column is not a generated column, a child column may be
+ defined to be a generated column or not.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Additional considerations apply to the use of generated columns.
+ <itemizedlist>
+ <listitem>
+ <para>
+ Generated columns maintain access privileges separately from their
+ underlying base columns. So, it is possible to arrange it so that a
+ particular role can read from a generated column but not from the
+ underlying base columns.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Generated columns are, conceptually, updated after
+ <literal>BEFORE</literal> triggers have run. Therefore, changes made to
+ base columns in a <literal>BEFORE</literal> trigger will be reflected in
+ generated columns. But conversely, it is not allowed to access
+ generated columns in <literal>BEFORE</literal> triggers.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect1>
+
+ <sect1 id="ddl-constraints">
+ <title>Constraints</title>
+
+ <indexterm zone="ddl-constraints">
+ <primary>constraint</primary>
+ </indexterm>
+
+ <para>
+ Data types are a way to limit the kind of data that can be stored
+ in a table. For many applications, however, the constraint they
+ provide is too coarse. For example, a column containing a product
+ price should probably only accept positive values. But there is no
+ standard data type that accepts only positive numbers. Another issue is
+ that you might want to constrain column data with respect to other
+ columns or rows. For example, in a table containing product
+ information, there should be only one row for each product number.
+ </para>
+
+ <para>
+ To that end, SQL allows you to define constraints on columns and
+ tables. Constraints give you as much control over the data in your
+ tables as you wish. If a user attempts to store data in a column
+ that would violate a constraint, an error is raised. This applies
+ even if the value came from the default value definition.
+ </para>
+
+ <sect2 id="ddl-constraints-check-constraints">
+ <title>Check Constraints</title>
+
+ <indexterm>
+ <primary>check constraint</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>constraint</primary>
+ <secondary>check</secondary>
+ </indexterm>
+
+ <para>
+ A check constraint is the most generic constraint type. It allows
+ you to specify that the value in a certain column must satisfy a
+ Boolean (truth-value) expression. For instance, to require positive
+ product prices, you could use:
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric <emphasis>CHECK (price &gt; 0)</emphasis>
+);
+</programlisting>
+ </para>
+
+ <para>
+ As you see, the constraint definition comes after the data type,
+ just like default value definitions. Default values and
+ constraints can be listed in any order. A check constraint
+ consists of the key word <literal>CHECK</literal> followed by an
+ expression in parentheses. The check constraint expression should
+ involve the column thus constrained, otherwise the constraint
+ would not make too much sense.
+ </para>
+
+ <indexterm>
+ <primary>constraint</primary>
+ <secondary>name</secondary>
+ </indexterm>
+
+ <para>
+ You can also give the constraint a separate name. This clarifies
+ error messages and allows you to refer to the constraint when you
+ need to change it. The syntax is:
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric <emphasis>CONSTRAINT positive_price</emphasis> CHECK (price &gt; 0)
+);
+</programlisting>
+ So, to specify a named constraint, use the key word
+ <literal>CONSTRAINT</literal> followed by an identifier followed
+ by the constraint definition. (If you don't specify a constraint
+ name in this way, the system chooses a name for you.)
+ </para>
+
+ <para>
+ A check constraint can also refer to several columns. Say you
+ store a regular price and a discounted price, and you want to
+ ensure that the discounted price is lower than the regular price:
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric CHECK (price &gt; 0),
+ discounted_price numeric CHECK (discounted_price &gt; 0),
+ <emphasis>CHECK (price &gt; discounted_price)</emphasis>
+);
+</programlisting>
+ </para>
+
+ <para>
+ The first two constraints should look familiar. The third one
+ uses a new syntax. It is not attached to a particular column,
+ instead it appears as a separate item in the comma-separated
+ column list. Column definitions and these constraint
+ definitions can be listed in mixed order.
+ </para>
+
+ <para>
+ We say that the first two constraints are column constraints, whereas the
+ third one is a table constraint because it is written separately
+ from any one column definition. Column constraints can also be
+ written as table constraints, while the reverse is not necessarily
+ possible, since a column constraint is supposed to refer to only the
+ column it is attached to. (<productname>PostgreSQL</productname> doesn't
+ enforce that rule, but you should follow it if you want your table
+ definitions to work with other database systems.) The above example could
+ also be written as:
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric,
+ CHECK (price &gt; 0),
+ discounted_price numeric,
+ CHECK (discounted_price &gt; 0),
+ CHECK (price &gt; discounted_price)
+);
+</programlisting>
+ or even:
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric CHECK (price &gt; 0),
+ discounted_price numeric,
+ CHECK (discounted_price &gt; 0 AND price &gt; discounted_price)
+);
+</programlisting>
+ It's a matter of taste.
+ </para>
+
+ <para>
+ Names can be assigned to table constraints in the same way as
+ column constraints:
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric,
+ CHECK (price &gt; 0),
+ discounted_price numeric,
+ CHECK (discounted_price &gt; 0),
+ <emphasis>CONSTRAINT valid_discount</emphasis> CHECK (price &gt; discounted_price)
+);
+</programlisting>
+ </para>
+
+ <indexterm>
+ <primary>null value</primary>
+ <secondary sortas="check constraints">with check constraints</secondary>
+ </indexterm>
+
+ <para>
+ It should be noted that a check constraint is satisfied if the
+ check expression evaluates to true or the null value. Since most
+ expressions will evaluate to the null value if any operand is null,
+ they will not prevent null values in the constrained columns. To
+ ensure that a column does not contain null values, the not-null
+ constraint described in the next section can be used.
+ </para>
+
+ <note>
+ <para>
+ <productname>PostgreSQL</productname> does not support
+ <literal>CHECK</literal> constraints that reference table data other than
+ the new or updated row being checked. While a <literal>CHECK</literal>
+ constraint that violates this rule may appear to work in simple
+ tests, it cannot guarantee that the database will not reach a state
+ in which the constraint condition is false (due to subsequent changes
+ of the other row(s) involved). This would cause a database dump and
+ restore to fail. The restore could fail even when the complete
+ database state is consistent with the constraint, due to rows not
+ being loaded in an order that will satisfy the constraint. If
+ possible, use <literal>UNIQUE</literal>, <literal>EXCLUDE</literal>,
+ or <literal>FOREIGN KEY</literal> constraints to express
+ cross-row and cross-table restrictions.
+ </para>
+
+ <para>
+ If what you desire is a one-time check against other rows at row
+ insertion, rather than a continuously-maintained consistency
+ guarantee, a custom <link linkend="triggers">trigger</link> can be used
+ to implement that. (This approach avoids the dump/restore problem because
+ <application>pg_dump</application> does not reinstall triggers until after
+ restoring data, so that the check will not be enforced during a
+ dump/restore.)
+ </para>
+ </note>
+
+ <note>
+ <para>
+ <productname>PostgreSQL</productname> assumes that
+ <literal>CHECK</literal> constraints' conditions are immutable, that
+ is, they will always give the same result for the same input row.
+ This assumption is what justifies examining <literal>CHECK</literal>
+ constraints only when rows are inserted or updated, and not at other
+ times. (The warning above about not referencing other table data is
+ really a special case of this restriction.)
+ </para>
+
+ <para>
+ An example of a common way to break this assumption is to reference a
+ user-defined function in a <literal>CHECK</literal> expression, and
+ then change the behavior of that
+ function. <productname>PostgreSQL</productname> does not disallow
+ that, but it will not notice if there are rows in the table that now
+ violate the <literal>CHECK</literal> constraint. That would cause a
+ subsequent database dump and restore to fail.
+ The recommended way to handle such a change is to drop the constraint
+ (using <command>ALTER TABLE</command>), adjust the function definition,
+ and re-add the constraint, thereby rechecking it against all table rows.
+ </para>
+ </note>
+ </sect2>
+
+ <sect2>
+ <title>Not-Null Constraints</title>
+
+ <indexterm>
+ <primary>not-null constraint</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>constraint</primary>
+ <secondary>NOT NULL</secondary>
+ </indexterm>
+
+ <para>
+ A not-null constraint simply specifies that a column must not
+ assume the null value. A syntax example:
+<programlisting>
+CREATE TABLE products (
+ product_no integer <emphasis>NOT NULL</emphasis>,
+ name text <emphasis>NOT NULL</emphasis>,
+ price numeric
+);
+</programlisting>
+ </para>
+
+ <para>
+ A not-null constraint is always written as a column constraint. A
+ not-null constraint is functionally equivalent to creating a check
+ constraint <literal>CHECK (<replaceable>column_name</replaceable>
+ IS NOT NULL)</literal>, but in
+ <productname>PostgreSQL</productname> creating an explicit
+ not-null constraint is more efficient. The drawback is that you
+ cannot give explicit names to not-null constraints created this
+ way.
+ </para>
+
+ <para>
+ Of course, a column can have more than one constraint. Just write
+ the constraints one after another:
+<programlisting>
+CREATE TABLE products (
+ product_no integer NOT NULL,
+ name text NOT NULL,
+ price numeric NOT NULL CHECK (price &gt; 0)
+);
+</programlisting>
+ The order doesn't matter. It does not necessarily determine in which
+ order the constraints are checked.
+ </para>
+
+ <para>
+ The <literal>NOT NULL</literal> constraint has an inverse: the
+ <literal>NULL</literal> constraint. This does not mean that the
+ column must be null, which would surely be useless. Instead, this
+ simply selects the default behavior that the column might be null.
+ The <literal>NULL</literal> constraint is not present in the SQL
+ standard and should not be used in portable applications. (It was
+ only added to <productname>PostgreSQL</productname> to be
+ compatible with some other database systems.) Some users, however,
+ like it because it makes it easy to toggle the constraint in a
+ script file. For example, you could start with:
+<programlisting>
+CREATE TABLE products (
+ product_no integer NULL,
+ name text NULL,
+ price numeric NULL
+);
+</programlisting>
+ and then insert the <literal>NOT</literal> key word where desired.
+ </para>
+
+ <tip>
+ <para>
+ In most database designs the majority of columns should be marked
+ not null.
+ </para>
+ </tip>
+ </sect2>
+
+ <sect2 id="ddl-constraints-unique-constraints">
+ <title>Unique Constraints</title>
+
+ <indexterm>
+ <primary>unique constraint</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>constraint</primary>
+ <secondary>unique</secondary>
+ </indexterm>
+
+ <para>
+ Unique constraints ensure that the data contained in a column, or a
+ group of columns, is unique among all the rows in the
+ table. The syntax is:
+<programlisting>
+CREATE TABLE products (
+ product_no integer <emphasis>UNIQUE</emphasis>,
+ name text,
+ price numeric
+);
+</programlisting>
+ when written as a column constraint, and:
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric,
+ <emphasis>UNIQUE (product_no)</emphasis>
+);
+</programlisting>
+ when written as a table constraint.
+ </para>
+
+ <para>
+ To define a unique constraint for a group of columns, write it as a
+ table constraint with the column names separated by commas:
+<programlisting>
+CREATE TABLE example (
+ a integer,
+ b integer,
+ c integer,
+ <emphasis>UNIQUE (a, c)</emphasis>
+);
+</programlisting>
+ This specifies that the combination of values in the indicated columns
+ is unique across the whole table, though any one of the columns
+ need not be (and ordinarily isn't) unique.
+ </para>
+
+ <para>
+ You can assign your own name for a unique constraint, in the usual way:
+<programlisting>
+CREATE TABLE products (
+ product_no integer <emphasis>CONSTRAINT must_be_different</emphasis> UNIQUE,
+ name text,
+ price numeric
+);
+</programlisting>
+ </para>
+
+ <para>
+ Adding a unique constraint will automatically create a unique B-tree
+ index on the column or group of columns listed in the constraint.
+ A uniqueness restriction covering only some rows cannot be written as
+ a unique constraint, but it is possible to enforce such a restriction by
+ creating a unique <link linkend="indexes-partial">partial index</link>.
+ </para>
+
+ <indexterm>
+ <primary>null value</primary>
+ <secondary sortas="unique constraints">with unique constraints</secondary>
+ </indexterm>
+
+ <para>
+ In general, a unique constraint is violated if there is more than
+ one row in the table where the values of all of the
+ columns included in the constraint are equal.
+ By default, two null values are not considered equal in this
+ comparison. That means even in the presence of a
+ unique constraint it is possible to store duplicate
+ rows that contain a null value in at least one of the constrained
+ columns. This behavior can be changed by adding the clause <literal>NULLS
+ NOT DISTINCT</literal>, like
+<programlisting>
+CREATE TABLE products (
+ product_no integer UNIQUE <emphasis>NULLS NOT DISTINCT</emphasis>,
+ name text,
+ price numeric
+);
+</programlisting>
+ or
+<programlisting>
+CREATE TABLE products (
+ product_no integer,
+ name text,
+ price numeric,
+ UNIQUE <emphasis>NULLS NOT DISTINCT</emphasis> (product_no)
+);
+</programlisting>
+ The default behavior can be specified explicitly using <literal>NULLS
+ DISTINCT</literal>. The default null treatment in unique constraints is
+ implementation-defined according to the SQL standard, and other
+ implementations have a different behavior. So be careful when developing
+ applications that are intended to be portable.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-constraints-primary-keys">
+ <title>Primary Keys</title>
+
+ <indexterm>
+ <primary>primary key</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>constraint</primary>
+ <secondary>primary key</secondary>
+ </indexterm>
+
+ <para>
+ A primary key constraint indicates that a column, or group of columns,
+ can be used as a unique identifier for rows in the table. This
+ requires that the values be both unique and not null. So, the following
+ two table definitions accept the same data:
+<programlisting>
+CREATE TABLE products (
+ product_no integer UNIQUE NOT NULL,
+ name text,
+ price numeric
+);
+</programlisting>
+
+<programlisting>
+CREATE TABLE products (
+ product_no integer <emphasis>PRIMARY KEY</emphasis>,
+ name text,
+ price numeric
+);
+</programlisting>
+ </para>
+
+ <para>
+ Primary keys can span more than one column; the syntax
+ is similar to unique constraints:
+<programlisting>
+CREATE TABLE example (
+ a integer,
+ b integer,
+ c integer,
+ <emphasis>PRIMARY KEY (a, c)</emphasis>
+);
+</programlisting>
+ </para>
+
+ <para>
+ Adding a primary key will automatically create a unique B-tree index
+ on the column or group of columns listed in the primary key, and will
+ force the column(s) to be marked <literal>NOT NULL</literal>.
+ </para>
+
+ <para>
+ A table can have at most one primary key. (There can be any number
+ of unique and not-null constraints, which are functionally almost the
+ same thing, but only one can be identified as the primary key.)
+ Relational database theory
+ dictates that every table must have a primary key. This rule is
+ not enforced by <productname>PostgreSQL</productname>, but it is
+ usually best to follow it.
+ </para>
+
+ <para>
+ Primary keys are useful both for
+ documentation purposes and for client applications. For example,
+ a GUI application that allows modifying row values probably needs
+ to know the primary key of a table to be able to identify rows
+ uniquely. There are also various ways in which the database system
+ makes use of a primary key if one has been declared; for example,
+ the primary key defines the default target column(s) for foreign keys
+ referencing its table.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-constraints-fk">
+ <title>Foreign Keys</title>
+
+ <indexterm>
+ <primary>foreign key</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>constraint</primary>
+ <secondary>foreign key</secondary>
+ </indexterm>
+
+ <indexterm>
+ <primary>referential integrity</primary>
+ </indexterm>
+
+ <para>
+ A foreign key constraint specifies that the values in a column (or
+ a group of columns) must match the values appearing in some row
+ of another table.
+ We say this maintains the <firstterm>referential
+ integrity</firstterm> between two related tables.
+ </para>
+
+ <para>
+ Say you have the product table that we have used several times already:
+<programlisting>
+CREATE TABLE products (
+ product_no integer PRIMARY KEY,
+ name text,
+ price numeric
+);
+</programlisting>
+ Let's also assume you have a table storing orders of those
+ products. We want to ensure that the orders table only contains
+ orders of products that actually exist. So we define a foreign
+ key constraint in the orders table that references the products
+ table:
+<programlisting>
+CREATE TABLE orders (
+ order_id integer PRIMARY KEY,
+ product_no integer <emphasis>REFERENCES products (product_no)</emphasis>,
+ quantity integer
+);
+</programlisting>
+ Now it is impossible to create orders with non-NULL
+ <structfield>product_no</structfield> entries that do not appear in the
+ products table.
+ </para>
+
+ <para>
+ We say that in this situation the orders table is the
+ <firstterm>referencing</firstterm> table and the products table is
+ the <firstterm>referenced</firstterm> table. Similarly, there are
+ referencing and referenced columns.
+ </para>
+
+ <para>
+ You can also shorten the above command to:
+<programlisting>
+CREATE TABLE orders (
+ order_id integer PRIMARY KEY,
+ product_no integer <emphasis>REFERENCES products</emphasis>,
+ quantity integer
+);
+</programlisting>
+ because in absence of a column list the primary key of the
+ referenced table is used as the referenced column(s).
+ </para>
+
+ <para>
+ You can assign your own name for a foreign key constraint,
+ in the usual way.
+ </para>
+
+ <para>
+ A foreign key can also constrain and reference a group of columns.
+ As usual, it then needs to be written in table constraint form.
+ Here is a contrived syntax example:
+<programlisting>
+CREATE TABLE t1 (
+ a integer PRIMARY KEY,
+ b integer,
+ c integer,
+ <emphasis>FOREIGN KEY (b, c) REFERENCES other_table (c1, c2)</emphasis>
+);
+</programlisting>
+ Of course, the number and type of the constrained columns need to
+ match the number and type of the referenced columns.
+ </para>
+
+ <indexterm>
+ <primary>foreign key</primary>
+ <secondary>self-referential</secondary>
+ </indexterm>
+
+ <para>
+ Sometimes it is useful for the <quote>other table</quote> of a
+ foreign key constraint to be the same table; this is called
+ a <firstterm>self-referential</firstterm> foreign key. For
+ example, if you want rows of a table to represent nodes of a tree
+ structure, you could write
+<programlisting>
+CREATE TABLE tree (
+ node_id integer PRIMARY KEY,
+ parent_id integer REFERENCES tree,
+ name text,
+ ...
+);
+</programlisting>
+ A top-level node would have NULL <structfield>parent_id</structfield>,
+ while non-NULL <structfield>parent_id</structfield> entries would be
+ constrained to reference valid rows of the table.
+ </para>
+
+ <para>
+ A table can have more than one foreign key constraint. This is
+ used to implement many-to-many relationships between tables. Say
+ you have tables about products and orders, but now you want to
+ allow one order to contain possibly many products (which the
+ structure above did not allow). You could use this table structure:
+<programlisting>
+CREATE TABLE products (
+ product_no integer PRIMARY KEY,
+ name text,
+ price numeric
+);
+
+CREATE TABLE orders (
+ order_id integer PRIMARY KEY,
+ shipping_address text,
+ ...
+);
+
+CREATE TABLE order_items (
+ product_no integer REFERENCES products,
+ order_id integer REFERENCES orders,
+ quantity integer,
+ PRIMARY KEY (product_no, order_id)
+);
+</programlisting>
+ Notice that the primary key overlaps with the foreign keys in
+ the last table.
+ </para>
+
+ <indexterm>
+ <primary>CASCADE</primary>
+ <secondary>foreign key action</secondary>
+ </indexterm>
+
+ <indexterm>
+ <primary>RESTRICT</primary>
+ <secondary>foreign key action</secondary>
+ </indexterm>
+
+ <para>
+ We know that the foreign keys disallow creation of orders that
+ do not relate to any products. But what if a product is removed
+ after an order is created that references it? SQL allows you to
+ handle that as well. Intuitively, we have a few options:
+ <itemizedlist spacing="compact">
+ <listitem><para>Disallow deleting a referenced product</para></listitem>
+ <listitem><para>Delete the orders as well</para></listitem>
+ <listitem><para>Something else?</para></listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ To illustrate this, let's implement the following policy on the
+ many-to-many relationship example above: when someone wants to
+ remove a product that is still referenced by an order (via
+ <literal>order_items</literal>), we disallow it. If someone
+ removes an order, the order items are removed as well:
+<programlisting>
+CREATE TABLE products (
+ product_no integer PRIMARY KEY,
+ name text,
+ price numeric
+);
+
+CREATE TABLE orders (
+ order_id integer PRIMARY KEY,
+ shipping_address text,
+ ...
+);
+
+CREATE TABLE order_items (
+ product_no integer REFERENCES products <emphasis>ON DELETE RESTRICT</emphasis>,
+ order_id integer REFERENCES orders <emphasis>ON DELETE CASCADE</emphasis>,
+ quantity integer,
+ PRIMARY KEY (product_no, order_id)
+);
+</programlisting>
+ </para>
+
+ <para>
+ Restricting and cascading deletes are the two most common options.
+ <literal>RESTRICT</literal> prevents deletion of a
+ referenced row. <literal>NO ACTION</literal> means that if any
+ referencing rows still exist when the constraint is checked, an error
+ is raised; this is the default behavior if you do not specify anything.
+ (The essential difference between these two choices is that
+ <literal>NO ACTION</literal> allows the check to be deferred until
+ later in the transaction, whereas <literal>RESTRICT</literal> does not.)
+ <literal>CASCADE</literal> specifies that when a referenced row is deleted,
+ row(s) referencing it should be automatically deleted as well.
+ There are two other options:
+ <literal>SET NULL</literal> and <literal>SET DEFAULT</literal>.
+ These cause the referencing column(s) in the referencing row(s)
+ to be set to nulls or their default
+ values, respectively, when the referenced row is deleted.
+ Note that these do not excuse you from observing any constraints.
+ For example, if an action specifies <literal>SET DEFAULT</literal>
+ but the default value would not satisfy the foreign key constraint, the
+ operation will fail.
+ </para>
+
+ <para>
+ The appropriate choice of <literal>ON DELETE</literal> action depends on
+ what kinds of objects the related tables represent. When the referencing
+ table represents something that is a component of what is represented by
+ the referenced table and cannot exist independently, then
+ <literal>CASCADE</literal> could be appropriate. If the two tables
+ represent independent objects, then <literal>RESTRICT</literal> or
+ <literal>NO ACTION</literal> is more appropriate; an application that
+ actually wants to delete both objects would then have to be explicit about
+ this and run two delete commands. In the above example, order items are
+ part of an order, and it is convenient if they are deleted automatically
+ if an order is deleted. But products and orders are different things, and
+ so making a deletion of a product automatically cause the deletion of some
+ order items could be considered problematic. The actions <literal>SET
+ NULL</literal> or <literal>SET DEFAULT</literal> can be appropriate if a
+ foreign-key relationship represents optional information. For example, if
+ the products table contained a reference to a product manager, and the
+ product manager entry gets deleted, then setting the product's product
+ manager to null or a default might be useful.
+ </para>
+
+ <para>
+ The actions <literal>SET NULL</literal> and <literal>SET DEFAULT</literal>
+ can take a column list to specify which columns to set. Normally, all
+ columns of the foreign-key constraint are set; setting only a subset is
+ useful in some special cases. Consider the following example:
+<programlisting>
+CREATE TABLE tenants (
+ tenant_id integer PRIMARY KEY
+);
+
+CREATE TABLE users (
+ tenant_id integer REFERENCES tenants ON DELETE CASCADE,
+ user_id integer NOT NULL,
+ PRIMARY KEY (tenant_id, user_id)
+);
+
+CREATE TABLE posts (
+ tenant_id integer REFERENCES tenants ON DELETE CASCADE,
+ post_id integer NOT NULL,
+ author_id integer,
+ PRIMARY KEY (tenant_id, post_id),
+ FOREIGN KEY (tenant_id, author_id) REFERENCES users ON DELETE SET NULL <emphasis>(author_id)</emphasis>
+);
+</programlisting>
+ Without the specification of the column, the foreign key would also set
+ the column <literal>tenant_id</literal> to null, but that column is still
+ required as part of the primary key.
+ </para>
+
+ <para>
+ Analogous to <literal>ON DELETE</literal> there is also
+ <literal>ON UPDATE</literal> which is invoked when a referenced
+ column is changed (updated). The possible actions are the same,
+ except that column lists cannot be specified for <literal>SET
+ NULL</literal> and <literal>SET DEFAULT</literal>.
+ In this case, <literal>CASCADE</literal> means that the updated values of the
+ referenced column(s) should be copied into the referencing row(s).
+ </para>
+
+ <para>
+ Normally, a referencing row need not satisfy the foreign key constraint
+ if any of its referencing columns are null. If <literal>MATCH FULL</literal>
+ is added to the foreign key declaration, a referencing row escapes
+ satisfying the constraint only if all its referencing columns are null
+ (so a mix of null and non-null values is guaranteed to fail a
+ <literal>MATCH FULL</literal> constraint). If you don't want referencing rows
+ to be able to avoid satisfying the foreign key constraint, declare the
+ referencing column(s) as <literal>NOT NULL</literal>.
+ </para>
+
+ <para>
+ A foreign key must reference columns that either are a primary key or
+ form a unique constraint. This means that the referenced columns always
+ have an index (the one underlying the primary key or unique constraint);
+ so checks on whether a referencing row has a match will be efficient.
+ Since a <command>DELETE</command> of a row from the referenced table
+ or an <command>UPDATE</command> of a referenced column will require
+ a scan of the referencing table for rows matching the old value, it
+ is often a good idea to index the referencing columns too. Because this
+ is not always needed, and there are many choices available on how
+ to index, declaration of a foreign key constraint does not
+ automatically create an index on the referencing columns.
+ </para>
+
+ <para>
+ More information about updating and deleting data is in <xref
+ linkend="dml"/>. Also see the description of foreign key constraint
+ syntax in the reference documentation for
+ <xref linkend="sql-createtable"/>.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-constraints-exclusion">
+ <title>Exclusion Constraints</title>
+
+ <indexterm>
+ <primary>exclusion constraint</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>constraint</primary>
+ <secondary>exclusion</secondary>
+ </indexterm>
+
+ <para>
+ Exclusion constraints ensure that if any two rows are compared on
+ the specified columns or expressions using the specified operators,
+ at least one of these operator comparisons will return false or null.
+ The syntax is:
+<programlisting>
+CREATE TABLE circles (
+ c circle,
+ EXCLUDE USING gist (c WITH &amp;&amp;)
+);
+</programlisting>
+ </para>
+
+ <para>
+ See also <link linkend="sql-createtable-exclude"><command>CREATE
+ TABLE ... CONSTRAINT ... EXCLUDE</command></link> for details.
+ </para>
+
+ <para>
+ Adding an exclusion constraint will automatically create an index
+ of the type specified in the constraint declaration.
+ </para>
+ </sect2>
+ </sect1>
+
+ <sect1 id="ddl-system-columns">
+ <title>System Columns</title>
+
+ <para>
+ Every table has several <firstterm>system columns</firstterm> that are
+ implicitly defined by the system. Therefore, these names cannot be
+ used as names of user-defined columns. (Note that these
+ restrictions are separate from whether the name is a key word or
+ not; quoting a name will not allow you to escape these
+ restrictions.) You do not really need to be concerned about these
+ columns; just know they exist.
+ </para>
+
+ <indexterm>
+ <primary>column</primary>
+ <secondary>system column</secondary>
+ </indexterm>
+
+ <variablelist>
+ <varlistentry>
+ <term><structfield>tableoid</structfield></term>
+ <listitem>
+ <indexterm>
+ <primary>tableoid</primary>
+ </indexterm>
+
+ <para>
+ The OID of the table containing this row. This column is
+ particularly handy for queries that select from partitioned
+ tables (see <xref linkend="ddl-partitioning"/>) or inheritance
+ hierarchies (see <xref linkend="ddl-inherit"/>), since without it,
+ it's difficult to tell which individual table a row came from. The
+ <structfield>tableoid</structfield> can be joined against the
+ <structfield>oid</structfield> column of
+ <structname>pg_class</structname> to obtain the table name.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>xmin</structfield></term>
+ <listitem>
+ <indexterm>
+ <primary>xmin</primary>
+ </indexterm>
+
+ <para>
+ The identity (transaction ID) of the inserting transaction for
+ this row version. (A row version is an individual state of a
+ row; each update of a row creates a new row version for the same
+ logical row.)
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>cmin</structfield></term>
+ <listitem>
+ <indexterm>
+ <primary>cmin</primary>
+ </indexterm>
+
+ <para>
+ The command identifier (starting at zero) within the inserting
+ transaction.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>xmax</structfield></term>
+ <listitem>
+ <indexterm>
+ <primary>xmax</primary>
+ </indexterm>
+
+ <para>
+ The identity (transaction ID) of the deleting transaction, or
+ zero for an undeleted row version. It is possible for this column to
+ be nonzero in a visible row version. That usually indicates that the
+ deleting transaction hasn't committed yet, or that an attempted
+ deletion was rolled back.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>cmax</structfield></term>
+ <listitem>
+ <indexterm>
+ <primary>cmax</primary>
+ </indexterm>
+
+ <para>
+ The command identifier within the deleting transaction, or zero.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><structfield>ctid</structfield></term>
+ <listitem>
+ <indexterm>
+ <primary>ctid</primary>
+ </indexterm>
+
+ <para>
+ The physical location of the row version within its table. Note that
+ although the <structfield>ctid</structfield> can be used to
+ locate the row version very quickly, a row's
+ <structfield>ctid</structfield> will change if it is
+ updated or moved by <command>VACUUM FULL</command>. Therefore
+ <structfield>ctid</structfield> is useless as a long-term row
+ identifier. A primary key should be used to identify logical rows.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>
+ Transaction identifiers are also 32-bit quantities. In a
+ long-lived database it is possible for transaction IDs to wrap
+ around. This is not a fatal problem given appropriate maintenance
+ procedures; see <xref linkend="maintenance"/> for details. It is
+ unwise, however, to depend on the uniqueness of transaction IDs
+ over the long term (more than one billion transactions).
+ </para>
+
+ <para>
+ Command identifiers are also 32-bit quantities. This creates a hard limit
+ of 2<superscript>32</superscript> (4 billion) <acronym>SQL</acronym> commands
+ within a single transaction. In practice this limit is not a
+ problem &mdash; note that the limit is on the number of
+ <acronym>SQL</acronym> commands, not the number of rows processed.
+ Also, only commands that actually modify the database contents will
+ consume a command identifier.
+ </para>
+ </sect1>
+
+ <sect1 id="ddl-alter">
+ <title>Modifying Tables</title>
+
+ <indexterm zone="ddl-alter">
+ <primary>table</primary>
+ <secondary>modifying</secondary>
+ </indexterm>
+
+ <para>
+ When you create a table and you realize that you made a mistake, or
+ the requirements of the application change, you can drop the
+ table and create it again. But this is not a convenient option if
+ the table is already filled with data, or if the table is
+ referenced by other database objects (for instance a foreign key
+ constraint). Therefore <productname>PostgreSQL</productname>
+ provides a family of commands to make modifications to existing
+ tables. Note that this is conceptually distinct from altering
+ the data contained in the table: here we are interested in altering
+ the definition, or structure, of the table.
+ </para>
+
+ <para>
+ You can:
+ <itemizedlist spacing="compact">
+ <listitem>
+ <para>Add columns</para>
+ </listitem>
+ <listitem>
+ <para>Remove columns</para>
+ </listitem>
+ <listitem>
+ <para>Add constraints</para>
+ </listitem>
+ <listitem>
+ <para>Remove constraints</para>
+ </listitem>
+ <listitem>
+ <para>Change default values</para>
+ </listitem>
+ <listitem>
+ <para>Change column data types</para>
+ </listitem>
+ <listitem>
+ <para>Rename columns</para>
+ </listitem>
+ <listitem>
+ <para>Rename tables</para>
+ </listitem>
+ </itemizedlist>
+
+ All these actions are performed using the
+ <xref linkend="sql-altertable"/>
+ command, whose reference page contains details beyond those given
+ here.
+ </para>
+
+ <sect2 id="ddl-alter-adding-a-column">
+ <title>Adding a Column</title>
+
+ <indexterm>
+ <primary>column</primary>
+ <secondary>adding</secondary>
+ </indexterm>
+
+ <para>
+ To add a column, use a command like:
+<programlisting>
+ALTER TABLE products ADD COLUMN description text;
+</programlisting>
+ The new column is initially filled with whatever default
+ value is given (null if you don't specify a <literal>DEFAULT</literal> clause).
+ </para>
+
+ <tip>
+ <para>
+ From <productname>PostgreSQL</productname> 11, adding a column with
+ a constant default value no longer means that each row of the table
+ needs to be updated when the <command>ALTER TABLE</command> statement
+ is executed. Instead, the default value will be returned the next time
+ the row is accessed, and applied when the table is rewritten, making
+ the <command>ALTER TABLE</command> very fast even on large tables.
+ </para>
+
+ <para>
+ However, if the default value is volatile (e.g.,
+ <function>clock_timestamp()</function>)
+ each row will need to be updated with the value calculated at the time
+ <command>ALTER TABLE</command> is executed. To avoid a potentially
+ lengthy update operation, particularly if you intend to fill the column
+ with mostly nondefault values anyway, it may be preferable to add the
+ column with no default, insert the correct values using
+ <command>UPDATE</command>, and then add any desired default as described
+ below.
+ </para>
+ </tip>
+
+ <para>
+ You can also define constraints on the column at the same time,
+ using the usual syntax:
+<programlisting>
+ALTER TABLE products ADD COLUMN description text CHECK (description &lt;&gt; '');
+</programlisting>
+ In fact all the options that can be applied to a column description
+ in <command>CREATE TABLE</command> can be used here. Keep in mind however
+ that the default value must satisfy the given constraints, or the
+ <literal>ADD</literal> will fail. Alternatively, you can add
+ constraints later (see below) after you've filled in the new column
+ correctly.
+ </para>
+
+ </sect2>
+
+ <sect2 id="ddl-alter-removing-a-column">
+ <title>Removing a Column</title>
+
+ <indexterm>
+ <primary>column</primary>
+ <secondary>removing</secondary>
+ </indexterm>
+
+ <para>
+ To remove a column, use a command like:
+<programlisting>
+ALTER TABLE products DROP COLUMN description;
+</programlisting>
+ Whatever data was in the column disappears. Table constraints involving
+ the column are dropped, too. However, if the column is referenced by a
+ foreign key constraint of another table,
+ <productname>PostgreSQL</productname> will not silently drop that
+ constraint. You can authorize dropping everything that depends on
+ the column by adding <literal>CASCADE</literal>:
+<programlisting>
+ALTER TABLE products DROP COLUMN description CASCADE;
+</programlisting>
+ See <xref linkend="ddl-depend"/> for a description of the general
+ mechanism behind this.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-alter-adding-a-constraint">
+ <title>Adding a Constraint</title>
+
+ <indexterm>
+ <primary>constraint</primary>
+ <secondary>adding</secondary>
+ </indexterm>
+
+ <para>
+ To add a constraint, the table constraint syntax is used. For example:
+<programlisting>
+ALTER TABLE products ADD CHECK (name &lt;&gt; '');
+ALTER TABLE products ADD CONSTRAINT some_name UNIQUE (product_no);
+ALTER TABLE products ADD FOREIGN KEY (product_group_id) REFERENCES product_groups;
+</programlisting>
+ To add a not-null constraint, which cannot be written as a table
+ constraint, use this syntax:
+<programlisting>
+ALTER TABLE products ALTER COLUMN product_no SET NOT NULL;
+</programlisting>
+ </para>
+
+ <para>
+ The constraint will be checked immediately, so the table data must
+ satisfy the constraint before it can be added.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-alter-removing-a-constraint">
+ <title>Removing a Constraint</title>
+
+ <indexterm>
+ <primary>constraint</primary>
+ <secondary>removing</secondary>
+ </indexterm>
+
+ <para>
+ To remove a constraint you need to know its name. If you gave it
+ a name then that's easy. Otherwise the system assigned a
+ generated name, which you need to find out. The
+ <application>psql</application> command <literal>\d
+ <replaceable>tablename</replaceable></literal> can be helpful
+ here; other interfaces might also provide a way to inspect table
+ details. Then the command is:
+<programlisting>
+ALTER TABLE products DROP CONSTRAINT some_name;
+</programlisting>
+ (If you are dealing with a generated constraint name like <literal>$2</literal>,
+ don't forget that you'll need to double-quote it to make it a valid
+ identifier.)
+ </para>
+
+ <para>
+ As with dropping a column, you need to add <literal>CASCADE</literal> if you
+ want to drop a constraint that something else depends on. An example
+ is that a foreign key constraint depends on a unique or primary key
+ constraint on the referenced column(s).
+ </para>
+
+ <para>
+ This works the same for all constraint types except not-null
+ constraints. To drop a not null constraint use:
+<programlisting>
+ALTER TABLE products ALTER COLUMN product_no DROP NOT NULL;
+</programlisting>
+ (Recall that not-null constraints do not have names.)
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Changing a Column's Default Value</title>
+
+ <indexterm>
+ <primary>default value</primary>
+ <secondary>changing</secondary>
+ </indexterm>
+
+ <para>
+ To set a new default for a column, use a command like:
+<programlisting>
+ALTER TABLE products ALTER COLUMN price SET DEFAULT 7.77;
+</programlisting>
+ Note that this doesn't affect any existing rows in the table, it
+ just changes the default for future <command>INSERT</command> commands.
+ </para>
+
+ <para>
+ To remove any default value, use:
+<programlisting>
+ALTER TABLE products ALTER COLUMN price DROP DEFAULT;
+</programlisting>
+ This is effectively the same as setting the default to null.
+ As a consequence, it is not an error
+ to drop a default where one hadn't been defined, because the
+ default is implicitly the null value.
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Changing a Column's Data Type</title>
+
+ <indexterm>
+ <primary>column data type</primary>
+ <secondary>changing</secondary>
+ </indexterm>
+
+ <para>
+ To convert a column to a different data type, use a command like:
+<programlisting>
+ALTER TABLE products ALTER COLUMN price TYPE numeric(10,2);
+</programlisting>
+ This will succeed only if each existing entry in the column can be
+ converted to the new type by an implicit cast. If a more complex
+ conversion is needed, you can add a <literal>USING</literal> clause that
+ specifies how to compute the new values from the old.
+ </para>
+
+ <para>
+ <productname>PostgreSQL</productname> will attempt to convert the column's
+ default value (if any) to the new type, as well as any constraints
+ that involve the column. But these conversions might fail, or might
+ produce surprising results. It's often best to drop any constraints
+ on the column before altering its type, and then add back suitably
+ modified constraints afterwards.
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Renaming a Column</title>
+
+ <indexterm>
+ <primary>column</primary>
+ <secondary>renaming</secondary>
+ </indexterm>
+
+ <para>
+ To rename a column:
+<programlisting>
+ALTER TABLE products RENAME COLUMN product_no TO product_number;
+</programlisting>
+ </para>
+ </sect2>
+
+ <sect2>
+ <title>Renaming a Table</title>
+
+ <indexterm>
+ <primary>table</primary>
+ <secondary>renaming</secondary>
+ </indexterm>
+
+ <para>
+ To rename a table:
+<programlisting>
+ALTER TABLE products RENAME TO items;
+</programlisting>
+ </para>
+ </sect2>
+ </sect1>
+
+ <sect1 id="ddl-priv">
+ <title>Privileges</title>
+
+ <indexterm zone="ddl-priv">
+ <primary>privilege</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>permission</primary>
+ <see>privilege</see>
+ </indexterm>
+
+ <indexterm zone="ddl-priv">
+ <primary>owner</primary>
+ </indexterm>
+
+ <indexterm zone="ddl-priv">
+ <primary>GRANT</primary>
+ </indexterm>
+
+ <indexterm zone="ddl-priv">
+ <primary>REVOKE</primary>
+ </indexterm>
+
+ <indexterm zone="ddl-priv">
+ <primary>ACL</primary>
+ </indexterm>
+
+ <para>
+ When an object is created, it is assigned an owner. The
+ owner is normally the role that executed the creation statement.
+ For most kinds of objects, the initial state is that only the owner
+ (or a superuser) can do anything with the object. To allow
+ other roles to use it, <firstterm>privileges</firstterm> must be
+ granted.
+ </para>
+
+ <para>
+ There are different kinds of privileges: <literal>SELECT</literal>,
+ <literal>INSERT</literal>, <literal>UPDATE</literal>, <literal>DELETE</literal>,
+ <literal>TRUNCATE</literal>, <literal>REFERENCES</literal>, <literal>TRIGGER</literal>,
+ <literal>CREATE</literal>, <literal>CONNECT</literal>, <literal>TEMPORARY</literal>,
+ <literal>EXECUTE</literal>, <literal>USAGE</literal>, <literal>SET</literal>
+ and <literal>ALTER SYSTEM</literal>.
+ The privileges applicable to a particular
+ object vary depending on the object's type (table, function, etc.).
+ More detail about the meanings of these privileges appears below.
+ The following sections and chapters will also show you how
+ these privileges are used.
+ </para>
+
+ <para>
+ The right to modify or destroy an object is inherent in being the
+ object's owner, and cannot be granted or revoked in itself.
+ (However, like all privileges, that right can be inherited by
+ members of the owning role; see <xref linkend="role-membership"/>.)
+ </para>
+
+ <para>
+ An object can be assigned to a new owner with an <command>ALTER</command>
+ command of the appropriate kind for the object, for example
+<programlisting>
+ALTER TABLE <replaceable>table_name</replaceable> OWNER TO <replaceable>new_owner</replaceable>;
+</programlisting>
+ Superusers can always do this; ordinary roles can only do it if they are
+ both the current owner of the object (or a member of the owning role) and
+ a member of the new owning role.
+ </para>
+
+ <para>
+ To assign privileges, the <xref linkend="sql-grant"/> command is
+ used. For example, if <literal>joe</literal> is an existing role, and
+ <literal>accounts</literal> is an existing table, the privilege to
+ update the table can be granted with:
+<programlisting>
+GRANT UPDATE ON accounts TO joe;
+</programlisting>
+ Writing <literal>ALL</literal> in place of a specific privilege grants all
+ privileges that are relevant for the object type.
+ </para>
+
+ <para>
+ The special <quote>role</quote> name <literal>PUBLIC</literal> can
+ be used to grant a privilege to every role on the system. Also,
+ <quote>group</quote> roles can be set up to help manage privileges when
+ there are many users of a database &mdash; for details see
+ <xref linkend="user-manag"/>.
+ </para>
+
+ <para>
+ To revoke a previously-granted privilege, use the fittingly named
+ <xref linkend="sql-revoke"/> command:
+<programlisting>
+REVOKE ALL ON accounts FROM PUBLIC;
+</programlisting>
+ </para>
+
+ <para>
+ Ordinarily, only the object's owner (or a superuser) can grant or
+ revoke privileges on an object. However, it is possible to grant a
+ privilege <quote>with grant option</quote>, which gives the recipient
+ the right to grant it in turn to others. If the grant option is
+ subsequently revoked then all who received the privilege from that
+ recipient (directly or through a chain of grants) will lose the
+ privilege. For details see the <xref linkend="sql-grant"/> and
+ <xref linkend="sql-revoke"/> reference pages.
+ </para>
+
+ <para>
+ An object's owner can choose to revoke their own ordinary privileges,
+ for example to make a table read-only for themselves as well as others.
+ But owners are always treated as holding all grant options, so they
+ can always re-grant their own privileges.
+ </para>
+
+ <para>
+ The available privileges are:
+
+ <variablelist>
+ <varlistentry>
+ <term><literal>SELECT</literal></term>
+ <listitem>
+ <para>
+ Allows <command>SELECT</command> from
+ any column, or specific column(s), of a table, view, materialized
+ view, or other table-like object.
+ Also allows use of <command>COPY TO</command>.
+ This privilege is also needed to reference existing column values in
+ <command>UPDATE</command>, <command>DELETE</command>,
+ or <command>MERGE</command>.
+ For sequences, this privilege also allows use of the
+ <function>currval</function> function.
+ For large objects, this privilege allows the object to be read.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>INSERT</literal></term>
+ <listitem>
+ <para>
+ Allows <command>INSERT</command> of a new row into a table, view,
+ etc. Can be granted on specific column(s), in which case
+ only those columns may be assigned to in the <command>INSERT</command>
+ command (other columns will therefore receive default values).
+ Also allows use of <command>COPY FROM</command>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>UPDATE</literal></term>
+ <listitem>
+ <para>
+ Allows <command>UPDATE</command> of any
+ column, or specific column(s), of a table, view, etc.
+ (In practice, any nontrivial <command>UPDATE</command> command will
+ require <literal>SELECT</literal> privilege as well, since it must
+ reference table columns to determine which rows to update, and/or to
+ compute new values for columns.)
+ <literal>SELECT ... FOR UPDATE</literal>
+ and <literal>SELECT ... FOR SHARE</literal>
+ also require this privilege on at least one column, in addition to the
+ <literal>SELECT</literal> privilege. For sequences, this
+ privilege allows use of the <function>nextval</function> and
+ <function>setval</function> functions.
+ For large objects, this privilege allows writing or truncating the
+ object.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>DELETE</literal></term>
+ <listitem>
+ <para>
+ Allows <command>DELETE</command> of a row from a table, view, etc.
+ (In practice, any nontrivial <command>DELETE</command> command will
+ require <literal>SELECT</literal> privilege as well, since it must
+ reference table columns to determine which rows to delete.)
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>TRUNCATE</literal></term>
+ <listitem>
+ <para>
+ Allows <command>TRUNCATE</command> on a table.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>REFERENCES</literal></term>
+ <listitem>
+ <para>
+ Allows creation of a foreign key constraint referencing a
+ table, or specific column(s) of a table.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>TRIGGER</literal></term>
+ <listitem>
+ <para>
+ Allows creation of a trigger on a table, view, etc.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CREATE</literal></term>
+ <listitem>
+ <para>
+ For databases, allows new schemas and publications to be created within
+ the database, and allows trusted extensions to be installed within
+ the database.
+ </para>
+ <para>
+ For schemas, allows new objects to be created within the schema.
+ To rename an existing object, you must own the
+ object <emphasis>and</emphasis> have this privilege for the containing
+ schema.
+ </para>
+ <para>
+ For tablespaces, allows tables, indexes, and temporary files to be
+ created within the tablespace, and allows databases to be created that
+ have the tablespace as their default tablespace.
+ </para>
+ <para>
+ Note that revoking this privilege will not alter the existence or
+ location of existing objects.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>CONNECT</literal></term>
+ <listitem>
+ <para>
+ Allows the grantee to connect to the database. This
+ privilege is checked at connection startup (in addition to checking
+ any restrictions imposed by <filename>pg_hba.conf</filename>).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>TEMPORARY</literal></term>
+ <listitem>
+ <para>
+ Allows temporary tables to be created while using the database.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>EXECUTE</literal></term>
+ <listitem>
+ <para>
+ Allows calling a function or procedure, including use of
+ any operators that are implemented on top of the function. This is the
+ only type of privilege that is applicable to functions and procedures.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>USAGE</literal></term>
+ <listitem>
+ <para>
+ For procedural languages, allows use of the language for
+ the creation of functions in that language. This is the only type
+ of privilege that is applicable to procedural languages.
+ </para>
+ <para>
+ For schemas, allows access to objects contained in the
+ schema (assuming that the objects' own privilege requirements are
+ also met). Essentially this allows the grantee to <quote>look up</quote>
+ objects within the schema. Without this permission, it is still
+ possible to see the object names, e.g., by querying system catalogs.
+ Also, after revoking this permission, existing sessions might have
+ statements that have previously performed this lookup, so this is not
+ a completely secure way to prevent object access.
+ </para>
+ <para>
+ For sequences, allows use of the
+ <function>currval</function> and <function>nextval</function> functions.
+ </para>
+ <para>
+ For types and domains, allows use of the type or domain in the
+ creation of tables, functions, and other schema objects. (Note that
+ this privilege does not control all <quote>usage</quote> of the
+ type, such as values of the type appearing in queries. It only
+ prevents objects from being created that depend on the type. The
+ main purpose of this privilege is controlling which users can create
+ dependencies on a type, which could prevent the owner from changing
+ the type later.)
+ </para>
+ <para>
+ For foreign-data wrappers, allows creation of new servers using the
+ foreign-data wrapper.
+ </para>
+ <para>
+ For foreign servers, allows creation of foreign tables using the
+ server. Grantees may also create, alter, or drop their own user
+ mappings associated with that server.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>SET</literal></term>
+ <listitem>
+ <para>
+ Allows a server configuration parameter to be set to a new value
+ within the current session. (While this privilege can be granted
+ on any parameter, it is meaningless except for parameters that would
+ normally require superuser privilege to set.)
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>ALTER SYSTEM</literal></term>
+ <listitem>
+ <para>
+ Allows a server configuration parameter to be configured to a new
+ value using the <xref linkend="sql-altersystem"/> command.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ The privileges required by other commands are listed on the
+ reference page of the respective command.
+ </para>
+
+ <para>
+ PostgreSQL grants privileges on some types of objects to
+ <literal>PUBLIC</literal> by default when the objects are created.
+ No privileges are granted to <literal>PUBLIC</literal> by default on
+ tables,
+ table columns,
+ sequences,
+ foreign data wrappers,
+ foreign servers,
+ large objects,
+ schemas,
+ tablespaces,
+ or configuration parameters.
+ For other types of objects, the default privileges
+ granted to <literal>PUBLIC</literal> are as follows:
+ <literal>CONNECT</literal> and <literal>TEMPORARY</literal> (create
+ temporary tables) privileges for databases;
+ <literal>EXECUTE</literal> privilege for functions and procedures; and
+ <literal>USAGE</literal> privilege for languages and data types
+ (including domains).
+ The object owner can, of course, <command>REVOKE</command>
+ both default and expressly granted privileges. (For maximum
+ security, issue the <command>REVOKE</command> in the same transaction that
+ creates the object; then there is no window in which another user
+ can use the object.)
+ Also, these default privilege settings can be overridden using the
+ <xref linkend="sql-alterdefaultprivileges"/> command.
+ </para>
+
+ <para>
+ <xref linkend="privilege-abbrevs-table"/> shows the one-letter
+ abbreviations that are used for these privilege types in
+ <firstterm>ACL</firstterm> (Access Control List) values.
+ You will see these letters in the output of the <xref linkend="app-psql"/>
+ commands listed below, or when looking at ACL columns of system catalogs.
+ </para>
+
+ <table id="privilege-abbrevs-table">
+ <title>ACL Privilege Abbreviations</title>
+ <tgroup cols="3">
+ <colspec colname="col1" colwidth="1*"/>
+ <colspec colname="col2" colwidth="1*"/>
+ <colspec colname="col3" colwidth="2*"/>
+ <thead>
+ <row>
+ <entry>Privilege</entry>
+ <entry>Abbreviation</entry>
+ <entry>Applicable Object Types</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><literal>SELECT</literal></entry>
+ <entry><literal>r</literal> (<quote>read</quote>)</entry>
+ <entry>
+ <literal>LARGE OBJECT</literal>,
+ <literal>SEQUENCE</literal>,
+ <literal>TABLE</literal> (and table-like objects),
+ table column
+ </entry>
+ </row>
+ <row>
+ <entry><literal>INSERT</literal></entry>
+ <entry><literal>a</literal> (<quote>append</quote>)</entry>
+ <entry><literal>TABLE</literal>, table column</entry>
+ </row>
+ <row>
+ <entry><literal>UPDATE</literal></entry>
+ <entry><literal>w</literal> (<quote>write</quote>)</entry>
+ <entry>
+ <literal>LARGE OBJECT</literal>,
+ <literal>SEQUENCE</literal>,
+ <literal>TABLE</literal>,
+ table column
+ </entry>
+ </row>
+ <row>
+ <entry><literal>DELETE</literal></entry>
+ <entry><literal>d</literal></entry>
+ <entry><literal>TABLE</literal></entry>
+ </row>
+ <row>
+ <entry><literal>TRUNCATE</literal></entry>
+ <entry><literal>D</literal></entry>
+ <entry><literal>TABLE</literal></entry>
+ </row>
+ <row>
+ <entry><literal>REFERENCES</literal></entry>
+ <entry><literal>x</literal></entry>
+ <entry><literal>TABLE</literal>, table column</entry>
+ </row>
+ <row>
+ <entry><literal>TRIGGER</literal></entry>
+ <entry><literal>t</literal></entry>
+ <entry><literal>TABLE</literal></entry>
+ </row>
+ <row>
+ <entry><literal>CREATE</literal></entry>
+ <entry><literal>C</literal></entry>
+ <entry>
+ <literal>DATABASE</literal>,
+ <literal>SCHEMA</literal>,
+ <literal>TABLESPACE</literal>
+ </entry>
+ </row>
+ <row>
+ <entry><literal>CONNECT</literal></entry>
+ <entry><literal>c</literal></entry>
+ <entry><literal>DATABASE</literal></entry>
+ </row>
+ <row>
+ <entry><literal>TEMPORARY</literal></entry>
+ <entry><literal>T</literal></entry>
+ <entry><literal>DATABASE</literal></entry>
+ </row>
+ <row>
+ <entry><literal>EXECUTE</literal></entry>
+ <entry><literal>X</literal></entry>
+ <entry><literal>FUNCTION</literal>, <literal>PROCEDURE</literal></entry>
+ </row>
+ <row>
+ <entry><literal>USAGE</literal></entry>
+ <entry><literal>U</literal></entry>
+ <entry>
+ <literal>DOMAIN</literal>,
+ <literal>FOREIGN DATA WRAPPER</literal>,
+ <literal>FOREIGN SERVER</literal>,
+ <literal>LANGUAGE</literal>,
+ <literal>SCHEMA</literal>,
+ <literal>SEQUENCE</literal>,
+ <literal>TYPE</literal>
+ </entry>
+ </row>
+ <row>
+ <entry><literal>SET</literal></entry>
+ <entry><literal>s</literal></entry>
+ <entry><literal>PARAMETER</literal></entry>
+ </row>
+ <row>
+ <entry><literal>ALTER SYSTEM</literal></entry>
+ <entry><literal>A</literal></entry>
+ <entry><literal>PARAMETER</literal></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>
+ <xref linkend="privileges-summary-table"/> summarizes the privileges
+ available for each type of SQL object, using the abbreviations shown
+ above.
+ It also shows the <application>psql</application> command
+ that can be used to examine privilege settings for each object type.
+ </para>
+
+ <table id="privileges-summary-table">
+ <title>Summary of Access Privileges</title>
+ <tgroup cols="4">
+ <colspec colname="col1" colwidth="2*"/>
+ <colspec colname="col2" colwidth="1*"/>
+ <colspec colname="col3" colwidth="1*"/>
+ <colspec colname="col4" colwidth="1*"/>
+ <thead>
+ <row>
+ <entry>Object Type</entry>
+ <entry>All Privileges</entry>
+ <entry>Default <literal>PUBLIC</literal> Privileges</entry>
+ <entry><application>psql</application> Command</entry>
+ </row>
+ </thead>
+ <tbody>
+ <row>
+ <entry><literal>DATABASE</literal></entry>
+ <entry><literal>CTc</literal></entry>
+ <entry><literal>Tc</literal></entry>
+ <entry><literal>\l</literal></entry>
+ </row>
+ <row>
+ <entry><literal>DOMAIN</literal></entry>
+ <entry><literal>U</literal></entry>
+ <entry><literal>U</literal></entry>
+ <entry><literal>\dD+</literal></entry>
+ </row>
+ <row>
+ <entry><literal>FUNCTION</literal> or <literal>PROCEDURE</literal></entry>
+ <entry><literal>X</literal></entry>
+ <entry><literal>X</literal></entry>
+ <entry><literal>\df+</literal></entry>
+ </row>
+ <row>
+ <entry><literal>FOREIGN DATA WRAPPER</literal></entry>
+ <entry><literal>U</literal></entry>
+ <entry>none</entry>
+ <entry><literal>\dew+</literal></entry>
+ </row>
+ <row>
+ <entry><literal>FOREIGN SERVER</literal></entry>
+ <entry><literal>U</literal></entry>
+ <entry>none</entry>
+ <entry><literal>\des+</literal></entry>
+ </row>
+ <row>
+ <entry><literal>LANGUAGE</literal></entry>
+ <entry><literal>U</literal></entry>
+ <entry><literal>U</literal></entry>
+ <entry><literal>\dL+</literal></entry>
+ </row>
+ <row>
+ <entry><literal>LARGE OBJECT</literal></entry>
+ <entry><literal>rw</literal></entry>
+ <entry>none</entry>
+ <entry><literal>\dl+</literal></entry>
+ </row>
+ <row>
+ <entry><literal>PARAMETER</literal></entry>
+ <entry><literal>sA</literal></entry>
+ <entry>none</entry>
+ <entry><literal>\dconfig+</literal></entry>
+ </row>
+ <row>
+ <entry><literal>SCHEMA</literal></entry>
+ <entry><literal>UC</literal></entry>
+ <entry>none</entry>
+ <entry><literal>\dn+</literal></entry>
+ </row>
+ <row>
+ <entry><literal>SEQUENCE</literal></entry>
+ <entry><literal>rwU</literal></entry>
+ <entry>none</entry>
+ <entry><literal>\dp</literal></entry>
+ </row>
+ <row>
+ <entry><literal>TABLE</literal> (and table-like objects)</entry>
+ <entry><literal>arwdDxt</literal></entry>
+ <entry>none</entry>
+ <entry><literal>\dp</literal></entry>
+ </row>
+ <row>
+ <entry>Table column</entry>
+ <entry><literal>arwx</literal></entry>
+ <entry>none</entry>
+ <entry><literal>\dp</literal></entry>
+ </row>
+ <row>
+ <entry><literal>TABLESPACE</literal></entry>
+ <entry><literal>C</literal></entry>
+ <entry>none</entry>
+ <entry><literal>\db+</literal></entry>
+ </row>
+ <row>
+ <entry><literal>TYPE</literal></entry>
+ <entry><literal>U</literal></entry>
+ <entry><literal>U</literal></entry>
+ <entry><literal>\dT+</literal></entry>
+ </row>
+ </tbody>
+ </tgroup>
+ </table>
+
+ <para>
+ <indexterm>
+ <primary><type>aclitem</type></primary>
+ </indexterm>
+ The privileges that have been granted for a particular object are
+ displayed as a list of <type>aclitem</type> entries, where each
+ <type>aclitem</type> describes the permissions of one grantee that
+ have been granted by a particular grantor. For example,
+ <literal>calvin=r*w/hobbes</literal> specifies that the role
+ <literal>calvin</literal> has the privilege
+ <literal>SELECT</literal> (<literal>r</literal>) with grant option
+ (<literal>*</literal>) as well as the non-grantable
+ privilege <literal>UPDATE</literal> (<literal>w</literal>), both granted
+ by the role <literal>hobbes</literal>. If <literal>calvin</literal>
+ also has some privileges on the same object granted by a different
+ grantor, those would appear as a separate <type>aclitem</type> entry.
+ An empty grantee field in an <type>aclitem</type> stands
+ for <literal>PUBLIC</literal>.
+ </para>
+
+ <para>
+ As an example, suppose that user <literal>miriam</literal> creates
+ table <literal>mytable</literal> and does:
+<programlisting>
+GRANT SELECT ON mytable TO PUBLIC;
+GRANT SELECT, UPDATE, INSERT ON mytable TO admin;
+GRANT SELECT (col1), UPDATE (col1) ON mytable TO miriam_rw;
+</programlisting>
+ Then <application>psql</application>'s <literal>\dp</literal> command
+ would show:
+<programlisting>
+=&gt; \dp mytable
+ Access privileges
+ Schema | Name | Type | Access privileges | Column privileges | Policies
+--------+---------+-------+-----------------------+-----------------------+----------
+ public | mytable | table | miriam=arwdDxt/miriam+| col1: +|
+ | | | =r/miriam +| miriam_rw=rw/miriam |
+ | | | admin=arw/miriam | |
+(1 row)
+</programlisting>
+ </para>
+
+ <para>
+ If the <quote>Access privileges</quote> column is empty for a given
+ object, it means the object has default privileges (that is, its
+ privileges entry in the relevant system catalog is null). Default
+ privileges always include all privileges for the owner, and can include
+ some privileges for <literal>PUBLIC</literal> depending on the object
+ type, as explained above. The first <command>GRANT</command>
+ or <command>REVOKE</command> on an object will instantiate the default
+ privileges (producing, for
+ example, <literal>miriam=arwdDxt/miriam</literal>) and then modify them
+ per the specified request. Similarly, entries are shown in <quote>Column
+ privileges</quote> only for columns with nondefault privileges.
+ (Note: for this purpose, <quote>default privileges</quote> always means
+ the built-in default privileges for the object's type. An object whose
+ privileges have been affected by an <command>ALTER DEFAULT
+ PRIVILEGES</command> command will always be shown with an explicit
+ privilege entry that includes the effects of
+ the <command>ALTER</command>.)
+ </para>
+
+ <para>
+ Notice that the owner's implicit grant options are not marked in the
+ access privileges display. A <literal>*</literal> will appear only when
+ grant options have been explicitly granted to someone.
+ </para>
+ </sect1>
+
+ <sect1 id="ddl-rowsecurity">
+ <title>Row Security Policies</title>
+
+ <indexterm zone="ddl-rowsecurity">
+ <primary>row-level security</primary>
+ </indexterm>
+
+ <indexterm zone="ddl-rowsecurity">
+ <primary>policy</primary>
+ </indexterm>
+
+ <para>
+ In addition to the SQL-standard <link linkend="ddl-priv">privilege
+ system</link> available through <xref linkend="sql-grant"/>,
+ tables can have <firstterm>row security policies</firstterm> that restrict,
+ on a per-user basis, which rows can be returned by normal queries
+ or inserted, updated, or deleted by data modification commands.
+ This feature is also known as <firstterm>Row-Level Security</firstterm>.
+ By default, tables do not have any policies, so that if a user has
+ access privileges to a table according to the SQL privilege system,
+ all rows within it are equally available for querying or updating.
+ </para>
+
+ <para>
+ When row security is enabled on a table (with
+ <link linkend="sql-altertable">ALTER TABLE ... ENABLE ROW LEVEL
+ SECURITY</link>), all normal access to the table for selecting rows or
+ modifying rows must be allowed by a row security policy. (However, the
+ table's owner is typically not subject to row security policies.) If no
+ policy exists for the table, a default-deny policy is used, meaning that
+ no rows are visible or can be modified. Operations that apply to the
+ whole table, such as <command>TRUNCATE</command> and <literal>REFERENCES</literal>,
+ are not subject to row security.
+ </para>
+
+ <para>
+ Row security policies can be specific to commands, or to roles, or to
+ both. A policy can be specified to apply to <literal>ALL</literal>
+ commands, or to <literal>SELECT</literal>, <literal>INSERT</literal>, <literal>UPDATE</literal>,
+ or <literal>DELETE</literal>. Multiple roles can be assigned to a given
+ policy, and normal role membership and inheritance rules apply.
+ </para>
+
+ <para>
+ To specify which rows are visible or modifiable according to a policy,
+ an expression is required that returns a Boolean result. This
+ expression will be evaluated for each row prior to any conditions or
+ functions coming from the user's query. (The only exceptions to this
+ rule are <literal>leakproof</literal> functions, which are guaranteed to
+ not leak information; the optimizer may choose to apply such functions
+ ahead of the row-security check.) Rows for which the expression does
+ not return <literal>true</literal> will not be processed. Separate expressions
+ may be specified to provide independent control over the rows which are
+ visible and the rows which are allowed to be modified. Policy
+ expressions are run as part of the query and with the privileges of the
+ user running the query, although security-definer functions can be used
+ to access data not available to the calling user.
+ </para>
+
+ <para>
+ Superusers and roles with the <literal>BYPASSRLS</literal> attribute always
+ bypass the row security system when accessing a table. Table owners
+ normally bypass row security as well, though a table owner can choose to
+ be subject to row security with <link linkend="sql-altertable">ALTER
+ TABLE ... FORCE ROW LEVEL SECURITY</link>.
+ </para>
+
+ <para>
+ Enabling and disabling row security, as well as adding policies to a
+ table, is always the privilege of the table owner only.
+ </para>
+
+ <para>
+ Policies are created using the <xref linkend="sql-createpolicy"/>
+ command, altered using the <xref linkend="sql-alterpolicy"/> command,
+ and dropped using the <xref linkend="sql-droppolicy"/> command. To
+ enable and disable row security for a given table, use the
+ <xref linkend="sql-altertable"/> command.
+ </para>
+
+ <para>
+ Each policy has a name and multiple policies can be defined for a
+ table. As policies are table-specific, each policy for a table must
+ have a unique name. Different tables may have policies with the
+ same name.
+ </para>
+
+ <para>
+ When multiple policies apply to a given query, they are combined using
+ either <literal>OR</literal> (for permissive policies, which are the
+ default) or using <literal>AND</literal> (for restrictive policies).
+ This is similar to the rule that a given role has the privileges
+ of all roles that they are a member of. Permissive vs. restrictive
+ policies are discussed further below.
+ </para>
+
+ <para>
+ As a simple example, here is how to create a policy on
+ the <literal>account</literal> relation to allow only members of
+ the <literal>managers</literal> role to access rows, and only rows of their
+ accounts:
+ </para>
+
+<programlisting>
+CREATE TABLE accounts (manager text, company text, contact_email text);
+
+ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;
+
+CREATE POLICY account_managers ON accounts TO managers
+ USING (manager = current_user);
+</programlisting>
+
+ <para>
+ The policy above implicitly provides a <literal>WITH CHECK</literal>
+ clause identical to its <literal>USING</literal> clause, so that the
+ constraint applies both to rows selected by a command (so a manager
+ cannot <command>SELECT</command>, <command>UPDATE</command>,
+ or <command>DELETE</command> existing rows belonging to a different
+ manager) and to rows modified by a command (so rows belonging to a
+ different manager cannot be created via <command>INSERT</command>
+ or <command>UPDATE</command>).
+ </para>
+
+ <para>
+ If no role is specified, or the special user name
+ <literal>PUBLIC</literal> is used, then the policy applies to all
+ users on the system. To allow all users to access only their own row in
+ a <literal>users</literal> table, a simple policy can be used:
+ </para>
+
+<programlisting>
+CREATE POLICY user_policy ON users
+ USING (user_name = current_user);
+</programlisting>
+
+ <para>
+ This works similarly to the previous example.
+ </para>
+
+ <para>
+ To use a different policy for rows that are being added to the table
+ compared to those rows that are visible, multiple policies can be
+ combined. This pair of policies would allow all users to view all rows
+ in the <literal>users</literal> table, but only modify their own:
+ </para>
+
+<programlisting>
+CREATE POLICY user_sel_policy ON users
+ FOR SELECT
+ USING (true);
+CREATE POLICY user_mod_policy ON users
+ USING (user_name = current_user);
+</programlisting>
+
+ <para>
+ In a <command>SELECT</command> command, these two policies are combined
+ using <literal>OR</literal>, with the net effect being that all rows
+ can be selected. In other command types, only the second policy applies,
+ so that the effects are the same as before.
+ </para>
+
+ <para>
+ Row security can also be disabled with the <command>ALTER TABLE</command>
+ command. Disabling row security does not remove any policies that are
+ defined on the table; they are simply ignored. Then all rows in the
+ table are visible and modifiable, subject to the standard SQL privileges
+ system.
+ </para>
+
+ <para>
+ Below is a larger example of how this feature can be used in production
+ environments. The table <literal>passwd</literal> emulates a Unix password
+ file:
+ </para>
+
+<programlisting>
+-- Simple passwd-file based example
+CREATE TABLE passwd (
+ user_name text UNIQUE NOT NULL,
+ pwhash text,
+ uid int PRIMARY KEY,
+ gid int NOT NULL,
+ real_name text NOT NULL,
+ home_phone text,
+ extra_info text,
+ home_dir text NOT NULL,
+ shell text NOT NULL
+);
+
+CREATE ROLE admin; -- Administrator
+CREATE ROLE bob; -- Normal user
+CREATE ROLE alice; -- Normal user
+
+-- Populate the table
+INSERT INTO passwd VALUES
+ ('admin','xxx',0,0,'Admin','111-222-3333',null,'/root','/bin/dash');
+INSERT INTO passwd VALUES
+ ('bob','xxx',1,1,'Bob','123-456-7890',null,'/home/bob','/bin/zsh');
+INSERT INTO passwd VALUES
+ ('alice','xxx',2,1,'Alice','098-765-4321',null,'/home/alice','/bin/zsh');
+
+-- Be sure to enable row-level security on the table
+ALTER TABLE passwd ENABLE ROW LEVEL SECURITY;
+
+-- Create policies
+-- Administrator can see all rows and add any rows
+CREATE POLICY admin_all ON passwd TO admin USING (true) WITH CHECK (true);
+-- Normal users can view all rows
+CREATE POLICY all_view ON passwd FOR SELECT USING (true);
+-- Normal users can update their own records, but
+-- limit which shells a normal user is allowed to set
+CREATE POLICY user_mod ON passwd FOR UPDATE
+ USING (current_user = user_name)
+ WITH CHECK (
+ current_user = user_name AND
+ shell IN ('/bin/bash','/bin/sh','/bin/dash','/bin/zsh','/bin/tcsh')
+ );
+
+-- Allow admin all normal rights
+GRANT SELECT, INSERT, UPDATE, DELETE ON passwd TO admin;
+-- Users only get select access on public columns
+GRANT SELECT
+ (user_name, uid, gid, real_name, home_phone, extra_info, home_dir, shell)
+ ON passwd TO public;
+-- Allow users to update certain columns
+GRANT UPDATE
+ (pwhash, real_name, home_phone, extra_info, shell)
+ ON passwd TO public;
+</programlisting>
+
+ <para>
+ As with any security settings, it's important to test and ensure that
+ the system is behaving as expected. Using the example above, this
+ demonstrates that the permission system is working properly.
+ </para>
+
+<programlisting>
+-- admin can view all rows and fields
+postgres=&gt; set role admin;
+SET
+postgres=&gt; table passwd;
+ user_name | pwhash | uid | gid | real_name | home_phone | extra_info | home_dir | shell
+-----------+--------+-----+-----+-----------+--------------+------------+-------------+-----------
+ admin | xxx | 0 | 0 | Admin | 111-222-3333 | | /root | /bin/dash
+ bob | xxx | 1 | 1 | Bob | 123-456-7890 | | /home/bob | /bin/zsh
+ alice | xxx | 2 | 1 | Alice | 098-765-4321 | | /home/alice | /bin/zsh
+(3 rows)
+
+-- Test what Alice is able to do
+postgres=&gt; set role alice;
+SET
+postgres=&gt; table passwd;
+ERROR: permission denied for table passwd
+postgres=&gt; select user_name,real_name,home_phone,extra_info,home_dir,shell from passwd;
+ user_name | real_name | home_phone | extra_info | home_dir | shell
+-----------+-----------+--------------+------------+-------------+-----------
+ admin | Admin | 111-222-3333 | | /root | /bin/dash
+ bob | Bob | 123-456-7890 | | /home/bob | /bin/zsh
+ alice | Alice | 098-765-4321 | | /home/alice | /bin/zsh
+(3 rows)
+
+postgres=&gt; update passwd set user_name = 'joe';
+ERROR: permission denied for table passwd
+-- Alice is allowed to change her own real_name, but no others
+postgres=&gt; update passwd set real_name = 'Alice Doe';
+UPDATE 1
+postgres=&gt; update passwd set real_name = 'John Doe' where user_name = 'admin';
+UPDATE 0
+postgres=&gt; update passwd set shell = '/bin/xx';
+ERROR: new row violates WITH CHECK OPTION for "passwd"
+postgres=&gt; delete from passwd;
+ERROR: permission denied for table passwd
+postgres=&gt; insert into passwd (user_name) values ('xxx');
+ERROR: permission denied for table passwd
+-- Alice can change her own password; RLS silently prevents updating other rows
+postgres=&gt; update passwd set pwhash = 'abc';
+UPDATE 1
+</programlisting>
+
+ <para>
+ All of the policies constructed thus far have been permissive policies,
+ meaning that when multiple policies are applied they are combined using
+ the <quote>OR</quote> Boolean operator. While permissive policies can be constructed
+ to only allow access to rows in the intended cases, it can be simpler to
+ combine permissive policies with restrictive policies (which the records
+ must pass and which are combined using the <quote>AND</quote> Boolean operator).
+ Building on the example above, we add a restrictive policy to require
+ the administrator to be connected over a local Unix socket to access the
+ records of the <literal>passwd</literal> table:
+ </para>
+
+<programlisting>
+CREATE POLICY admin_local_only ON passwd AS RESTRICTIVE TO admin
+ USING (pg_catalog.inet_client_addr() IS NULL);
+</programlisting>
+
+ <para>
+ We can then see that an administrator connecting over a network will not
+ see any records, due to the restrictive policy:
+ </para>
+
+<programlisting>
+=&gt; SELECT current_user;
+ current_user
+--------------
+ admin
+(1 row)
+
+=&gt; select inet_client_addr();
+ inet_client_addr
+------------------
+ 127.0.0.1
+(1 row)
+
+=&gt; TABLE passwd;
+ user_name | pwhash | uid | gid | real_name | home_phone | extra_info | home_dir | shell
+-----------+--------+-----+-----+-----------+------------+------------+----------+-------
+(0 rows)
+
+=&gt; UPDATE passwd set pwhash = NULL;
+UPDATE 0
+</programlisting>
+
+ <para>
+ Referential integrity checks, such as unique or primary key constraints
+ and foreign key references, always bypass row security to ensure that
+ data integrity is maintained. Care must be taken when developing
+ schemas and row level policies to avoid <quote>covert channel</quote> leaks of
+ information through such referential integrity checks.
+ </para>
+
+ <para>
+ In some contexts it is important to be sure that row security is
+ not being applied. For example, when taking a backup, it could be
+ disastrous if row security silently caused some rows to be omitted
+ from the backup. In such a situation, you can set the
+ <xref linkend="guc-row-security"/> configuration parameter
+ to <literal>off</literal>. This does not in itself bypass row security;
+ what it does is throw an error if any query's results would get filtered
+ by a policy. The reason for the error can then be investigated and
+ fixed.
+ </para>
+
+ <para>
+ In the examples above, the policy expressions consider only the current
+ values in the row to be accessed or updated. This is the simplest and
+ best-performing case; when possible, it's best to design row security
+ applications to work this way. If it is necessary to consult other rows
+ or other tables to make a policy decision, that can be accomplished using
+ sub-<command>SELECT</command>s, or functions that contain <command>SELECT</command>s,
+ in the policy expressions. Be aware however that such accesses can
+ create race conditions that could allow information leakage if care is
+ not taken. As an example, consider the following table design:
+ </para>
+
+<programlisting>
+-- definition of privilege groups
+CREATE TABLE groups (group_id int PRIMARY KEY,
+ group_name text NOT NULL);
+
+INSERT INTO groups VALUES
+ (1, 'low'),
+ (2, 'medium'),
+ (5, 'high');
+
+GRANT ALL ON groups TO alice; -- alice is the administrator
+GRANT SELECT ON groups TO public;
+
+-- definition of users' privilege levels
+CREATE TABLE users (user_name text PRIMARY KEY,
+ group_id int NOT NULL REFERENCES groups);
+
+INSERT INTO users VALUES
+ ('alice', 5),
+ ('bob', 2),
+ ('mallory', 2);
+
+GRANT ALL ON users TO alice;
+GRANT SELECT ON users TO public;
+
+-- table holding the information to be protected
+CREATE TABLE information (info text,
+ group_id int NOT NULL REFERENCES groups);
+
+INSERT INTO information VALUES
+ ('barely secret', 1),
+ ('slightly secret', 2),
+ ('very secret', 5);
+
+ALTER TABLE information ENABLE ROW LEVEL SECURITY;
+
+-- a row should be visible to/updatable by users whose security group_id is
+-- greater than or equal to the row's group_id
+CREATE POLICY fp_s ON information FOR SELECT
+ USING (group_id &lt;= (SELECT group_id FROM users WHERE user_name = current_user));
+CREATE POLICY fp_u ON information FOR UPDATE
+ USING (group_id &lt;= (SELECT group_id FROM users WHERE user_name = current_user));
+
+-- we rely only on RLS to protect the information table
+GRANT ALL ON information TO public;
+</programlisting>
+
+ <para>
+ Now suppose that <literal>alice</literal> wishes to change the <quote>slightly
+ secret</quote> information, but decides that <literal>mallory</literal> should not
+ be trusted with the new content of that row, so she does:
+ </para>
+
+<programlisting>
+BEGIN;
+UPDATE users SET group_id = 1 WHERE user_name = 'mallory';
+UPDATE information SET info = 'secret from mallory' WHERE group_id = 2;
+COMMIT;
+</programlisting>
+
+ <para>
+ That looks safe; there is no window wherein <literal>mallory</literal> should be
+ able to see the <quote>secret from mallory</quote> string. However, there is
+ a race condition here. If <literal>mallory</literal> is concurrently doing,
+ say,
+<programlisting>
+SELECT * FROM information WHERE group_id = 2 FOR UPDATE;
+</programlisting>
+ and her transaction is in <literal>READ COMMITTED</literal> mode, it is possible
+ for her to see <quote>secret from mallory</quote>. That happens if her
+ transaction reaches the <structname>information</structname> row just
+ after <literal>alice</literal>'s does. It blocks waiting
+ for <literal>alice</literal>'s transaction to commit, then fetches the updated
+ row contents thanks to the <literal>FOR UPDATE</literal> clause. However, it
+ does <emphasis>not</emphasis> fetch an updated row for the
+ implicit <command>SELECT</command> from <structname>users</structname>, because that
+ sub-<command>SELECT</command> did not have <literal>FOR UPDATE</literal>; instead
+ the <structname>users</structname> row is read with the snapshot taken at the start
+ of the query. Therefore, the policy expression tests the old value
+ of <literal>mallory</literal>'s privilege level and allows her to see the
+ updated row.
+ </para>
+
+ <para>
+ There are several ways around this problem. One simple answer is to use
+ <literal>SELECT ... FOR SHARE</literal> in sub-<command>SELECT</command>s in row
+ security policies. However, that requires granting <literal>UPDATE</literal>
+ privilege on the referenced table (here <structname>users</structname>) to the
+ affected users, which might be undesirable. (But another row security
+ policy could be applied to prevent them from actually exercising that
+ privilege; or the sub-<command>SELECT</command> could be embedded into a security
+ definer function.) Also, heavy concurrent use of row share locks on the
+ referenced table could pose a performance problem, especially if updates
+ of it are frequent. Another solution, practical if updates of the
+ referenced table are infrequent, is to take an
+ <literal>ACCESS EXCLUSIVE</literal> lock on the
+ referenced table when updating it, so that no concurrent transactions
+ could be examining old row values. Or one could just wait for all
+ concurrent transactions to end after committing an update of the
+ referenced table and before making changes that rely on the new security
+ situation.
+ </para>
+
+ <para>
+ For additional details see <xref linkend="sql-createpolicy"/>
+ and <xref linkend="sql-altertable"/>.
+ </para>
+
+ </sect1>
+
+ <sect1 id="ddl-schemas">
+ <title>Schemas</title>
+
+ <indexterm zone="ddl-schemas">
+ <primary>schema</primary>
+ </indexterm>
+
+ <para>
+ A <productname>PostgreSQL</productname> database cluster contains
+ one or more named databases. Roles and a few other object types are
+ shared across the entire cluster. A client connection to the server
+ can only access data in a single database, the one specified in the
+ connection request.
+ </para>
+
+ <note>
+ <para>
+ Users of a cluster do not necessarily have the privilege to access every
+ database in the cluster. Sharing of role names means that there
+ cannot be different roles named, say, <literal>joe</literal> in two databases
+ in the same cluster; but the system can be configured to allow
+ <literal>joe</literal> access to only some of the databases.
+ </para>
+ </note>
+
+ <para>
+ A database contains one or more named <firstterm>schemas</firstterm>, which
+ in turn contain tables. Schemas also contain other kinds of named
+ objects, including data types, functions, and operators. The same
+ object name can be used in different schemas without conflict; for
+ example, both <literal>schema1</literal> and <literal>myschema</literal> can
+ contain tables named <literal>mytable</literal>. Unlike databases,
+ schemas are not rigidly separated: a user can access objects in any
+ of the schemas in the database they are connected to, if they have
+ privileges to do so.
+ </para>
+
+ <para>
+ There are several reasons why one might want to use schemas:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ To allow many users to use one database without interfering with
+ each other.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ To organize database objects into logical groups to make them
+ more manageable.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Third-party applications can be put into separate schemas so
+ they do not collide with the names of other objects.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ Schemas are analogous to directories at the operating system level,
+ except that schemas cannot be nested.
+ </para>
+
+ <sect2 id="ddl-schemas-create">
+ <title>Creating a Schema</title>
+
+ <indexterm zone="ddl-schemas-create">
+ <primary>schema</primary>
+ <secondary>creating</secondary>
+ </indexterm>
+
+ <para>
+ To create a schema, use the <xref linkend="sql-createschema"/>
+ command. Give the schema a name
+ of your choice. For example:
+<programlisting>
+CREATE SCHEMA myschema;
+</programlisting>
+ </para>
+
+ <indexterm>
+ <primary>qualified name</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>name</primary>
+ <secondary>qualified</secondary>
+ </indexterm>
+
+ <para>
+ To create or access objects in a schema, write a
+ <firstterm>qualified name</firstterm> consisting of the schema name and
+ table name separated by a dot:
+<synopsis>
+<replaceable>schema</replaceable><literal>.</literal><replaceable>table</replaceable>
+</synopsis>
+ This works anywhere a table name is expected, including the table
+ modification commands and the data access commands discussed in
+ the following chapters.
+ (For brevity we will speak of tables only, but the same ideas apply
+ to other kinds of named objects, such as types and functions.)
+ </para>
+
+ <para>
+ Actually, the even more general syntax
+<synopsis>
+<replaceable>database</replaceable><literal>.</literal><replaceable>schema</replaceable><literal>.</literal><replaceable>table</replaceable>
+</synopsis>
+ can be used too, but at present this is just for pro forma
+ compliance with the SQL standard. If you write a database name,
+ it must be the same as the database you are connected to.
+ </para>
+
+ <para>
+ So to create a table in the new schema, use:
+<programlisting>
+CREATE TABLE myschema.mytable (
+ ...
+);
+</programlisting>
+ </para>
+
+ <indexterm>
+ <primary>schema</primary>
+ <secondary>removing</secondary>
+ </indexterm>
+
+ <para>
+ To drop a schema if it's empty (all objects in it have been
+ dropped), use:
+<programlisting>
+DROP SCHEMA myschema;
+</programlisting>
+ To drop a schema including all contained objects, use:
+<programlisting>
+DROP SCHEMA myschema CASCADE;
+</programlisting>
+ See <xref linkend="ddl-depend"/> for a description of the general
+ mechanism behind this.
+ </para>
+
+ <para>
+ Often you will want to create a schema owned by someone else
+ (since this is one of the ways to restrict the activities of your
+ users to well-defined namespaces). The syntax for that is:
+<programlisting>
+CREATE SCHEMA <replaceable>schema_name</replaceable> AUTHORIZATION <replaceable>user_name</replaceable>;
+</programlisting>
+ You can even omit the schema name, in which case the schema name
+ will be the same as the user name. See <xref
+ linkend="ddl-schemas-patterns"/> for how this can be useful.
+ </para>
+
+ <para>
+ Schema names beginning with <literal>pg_</literal> are reserved for
+ system purposes and cannot be created by users.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-schemas-public">
+ <title>The Public Schema</title>
+
+ <indexterm zone="ddl-schemas-public">
+ <primary>schema</primary>
+ <secondary>public</secondary>
+ </indexterm>
+
+ <para>
+ In the previous sections we created tables without specifying any
+ schema names. By default such tables (and other objects) are
+ automatically put into a schema named <quote>public</quote>. Every new
+ database contains such a schema. Thus, the following are equivalent:
+<programlisting>
+CREATE TABLE products ( ... );
+</programlisting>
+ and:
+<programlisting>
+CREATE TABLE public.products ( ... );
+</programlisting>
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-schemas-path">
+ <title>The Schema Search Path</title>
+
+ <indexterm>
+ <primary>search path</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>unqualified name</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>name</primary>
+ <secondary>unqualified</secondary>
+ </indexterm>
+
+ <para>
+ Qualified names are tedious to write, and it's often best not to
+ wire a particular schema name into applications anyway. Therefore
+ tables are often referred to by <firstterm>unqualified names</firstterm>,
+ which consist of just the table name. The system determines which table
+ is meant by following a <firstterm>search path</firstterm>, which is a list
+ of schemas to look in. The first matching table in the search path
+ is taken to be the one wanted. If there is no match in the search
+ path, an error is reported, even if matching table names exist
+ in other schemas in the database.
+ </para>
+
+ <para>
+ The ability to create like-named objects in different schemas complicates
+ writing a query that references precisely the same objects every time. It
+ also opens up the potential for users to change the behavior of other
+ users' queries, maliciously or accidentally. Due to the prevalence of
+ unqualified names in queries and their use
+ in <productname>PostgreSQL</productname> internals, adding a schema
+ to <varname>search_path</varname> effectively trusts all users having
+ <literal>CREATE</literal> privilege on that schema. When you run an
+ ordinary query, a malicious user able to create objects in a schema of
+ your search path can take control and execute arbitrary SQL functions as
+ though you executed them.
+ </para>
+
+ <indexterm>
+ <primary>schema</primary>
+ <secondary>current</secondary>
+ </indexterm>
+
+ <para>
+ The first schema named in the search path is called the current schema.
+ Aside from being the first schema searched, it is also the schema in
+ which new tables will be created if the <command>CREATE TABLE</command>
+ command does not specify a schema name.
+ </para>
+
+ <indexterm>
+ <primary><varname>search_path</varname> configuration parameter</primary>
+ </indexterm>
+
+ <para>
+ To show the current search path, use the following command:
+<programlisting>
+SHOW search_path;
+</programlisting>
+ In the default setup this returns:
+<screen>
+ search_path
+--------------
+ "$user", public
+</screen>
+ The first element specifies that a schema with the same name as
+ the current user is to be searched. If no such schema exists,
+ the entry is ignored. The second element refers to the
+ public schema that we have seen already.
+ </para>
+
+ <para>
+ The first schema in the search path that exists is the default
+ location for creating new objects. That is the reason that by
+ default objects are created in the public schema. When objects
+ are referenced in any other context without schema qualification
+ (table modification, data modification, or query commands) the
+ search path is traversed until a matching object is found.
+ Therefore, in the default configuration, any unqualified access
+ again can only refer to the public schema.
+ </para>
+
+ <para>
+ To put our new schema in the path, we use:
+<programlisting>
+SET search_path TO myschema,public;
+</programlisting>
+ (We omit the <literal>$user</literal> here because we have no
+ immediate need for it.) And then we can access the table without
+ schema qualification:
+<programlisting>
+DROP TABLE mytable;
+</programlisting>
+ Also, since <literal>myschema</literal> is the first element in
+ the path, new objects would by default be created in it.
+ </para>
+
+ <para>
+ We could also have written:
+<programlisting>
+SET search_path TO myschema;
+</programlisting>
+ Then we no longer have access to the public schema without
+ explicit qualification. There is nothing special about the public
+ schema except that it exists by default. It can be dropped, too.
+ </para>
+
+ <para>
+ See also <xref linkend="functions-info"/> for other ways to manipulate
+ the schema search path.
+ </para>
+
+ <para>
+ The search path works in the same way for data type names, function names,
+ and operator names as it does for table names. Data type and function
+ names can be qualified in exactly the same way as table names. If you
+ need to write a qualified operator name in an expression, there is a
+ special provision: you must write
+<synopsis>
+<literal>OPERATOR(</literal><replaceable>schema</replaceable><literal>.</literal><replaceable>operator</replaceable><literal>)</literal>
+</synopsis>
+ This is needed to avoid syntactic ambiguity. An example is:
+<programlisting>
+SELECT 3 OPERATOR(pg_catalog.+) 4;
+</programlisting>
+ In practice one usually relies on the search path for operators,
+ so as not to have to write anything so ugly as that.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-schemas-priv">
+ <title>Schemas and Privileges</title>
+
+ <indexterm zone="ddl-schemas-priv">
+ <primary>privilege</primary>
+ <secondary sortas="schemas">for schemas</secondary>
+ </indexterm>
+
+ <para>
+ By default, users cannot access any objects in schemas they do not
+ own. To allow that, the owner of the schema must grant the
+ <literal>USAGE</literal> privilege on the schema. By default, everyone
+ has that privilege on the schema <literal>public</literal>. To allow
+ users to make use of the objects in a schema, additional privileges might
+ need to be granted, as appropriate for the object.
+ </para>
+
+ <para>
+ A user can also be allowed to create objects in someone else's schema. To
+ allow that, the <literal>CREATE</literal> privilege on the schema needs to
+ be granted. In databases upgraded from
+ <productname>PostgreSQL</productname> 14 or earlier, everyone has that
+ privilege on the schema <literal>public</literal>.
+ Some <link linkend="ddl-schemas-patterns">usage patterns</link> call for
+ revoking that privilege:
+<programlisting>
+REVOKE CREATE ON SCHEMA public FROM PUBLIC;
+</programlisting>
+ (The first <quote>public</quote> is the schema, the second
+ <quote>public</quote> means <quote>every user</quote>. In the
+ first sense it is an identifier, in the second sense it is a
+ key word, hence the different capitalization; recall the
+ guidelines from <xref linkend="sql-syntax-identifiers"/>.)
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-schemas-catalog">
+ <title>The System Catalog Schema</title>
+
+ <indexterm zone="ddl-schemas-catalog">
+ <primary>system catalog</primary>
+ <secondary>schema</secondary>
+ </indexterm>
+
+ <para>
+ In addition to <literal>public</literal> and user-created schemas, each
+ database contains a <literal>pg_catalog</literal> schema, which contains
+ the system tables and all the built-in data types, functions, and
+ operators. <literal>pg_catalog</literal> is always effectively part of
+ the search path. If it is not named explicitly in the path then
+ it is implicitly searched <emphasis>before</emphasis> searching the path's
+ schemas. This ensures that built-in names will always be
+ findable. However, you can explicitly place
+ <literal>pg_catalog</literal> at the end of your search path if you
+ prefer to have user-defined names override built-in names.
+ </para>
+
+ <para>
+ Since system table names begin with <literal>pg_</literal>, it is best to
+ avoid such names to ensure that you won't suffer a conflict if some
+ future version defines a system table named the same as your
+ table. (With the default search path, an unqualified reference to
+ your table name would then be resolved as the system table instead.)
+ System tables will continue to follow the convention of having
+ names beginning with <literal>pg_</literal>, so that they will not
+ conflict with unqualified user-table names so long as users avoid
+ the <literal>pg_</literal> prefix.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-schemas-patterns">
+ <title>Usage Patterns</title>
+
+ <para>
+ Schemas can be used to organize your data in many ways.
+ A <firstterm>secure schema usage pattern</firstterm> prevents untrusted
+ users from changing the behavior of other users' queries. When a database
+ does not use a secure schema usage pattern, users wishing to securely
+ query that database would take protective action at the beginning of each
+ session. Specifically, they would begin each session by
+ setting <varname>search_path</varname> to the empty string or otherwise
+ removing schemas that are writable by non-superusers
+ from <varname>search_path</varname>. There are a few usage patterns
+ easily supported by the default configuration:
+ <itemizedlist>
+ <listitem>
+ <para>
+ Constrain ordinary users to user-private schemas.
+ To implement this pattern, first ensure that no schemas have
+ public <literal>CREATE</literal> privileges. Then, for every user
+ needing to create non-temporary objects, create a schema with the
+ same name as that user, for example
+ <literal>CREATE SCHEMA alice AUTHORIZATION alice</literal>.
+ (Recall that the default search path starts
+ with <literal>$user</literal>, which resolves to the user
+ name. Therefore, if each user has a separate schema, they access
+ their own schemas by default.) This pattern is a secure schema
+ usage pattern unless an untrusted user is the database owner or
+ holds the <literal>CREATEROLE</literal> privilege, in which case no
+ secure schema usage pattern exists.
+ </para>
+ <!-- A database owner can attack the database's users via "CREATE SCHEMA
+ trojan; ALTER DATABASE $mydb SET search_path = trojan, public;". A
+ CREATEROLE user can issue "GRANT $dbowner TO $me" and then use the
+ database owner attack. -->
+
+ <para>
+ In <productname>PostgreSQL</productname> 15 and later, the default
+ configuration supports this usage pattern. In prior versions, or
+ when using a database that has been upgraded from a prior version,
+ you will need to remove the public <literal>CREATE</literal>
+ privilege from the <literal>public</literal> schema (issue
+ <literal>REVOKE CREATE ON SCHEMA public FROM PUBLIC</literal>).
+ Then consider auditing the <literal>public</literal> schema for
+ objects named like objects in schema <literal>pg_catalog</literal>.
+ </para>
+ <!-- "DROP SCHEMA public" is inferior to this REVOKE, because pg_dump
+ doesn't preserve that DROP. -->
+ </listitem>
+
+ <listitem>
+ <para>
+ Remove the public schema from the default search path, by modifying
+ <link linkend="config-setting-configuration-file"><filename>postgresql.conf</filename></link>
+ or by issuing <literal>ALTER ROLE ALL SET search_path =
+ "$user"</literal>. Then, grant privileges to create in the public
+ schema. Only qualified names will choose public schema objects. While
+ qualified table references are fine, calls to functions in the public
+ schema <link linkend="typeconv-func">will be unsafe or
+ unreliable</link>. If you create functions or extensions in the public
+ schema, use the first pattern instead. Otherwise, like the first
+ pattern, this is secure unless an untrusted user is the database owner
+ or holds the <literal>CREATEROLE</literal> privilege.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Keep the default search path, and grant privileges to create in the
+ public schema. All users access the public schema implicitly. This
+ simulates the situation where schemas are not available at all, giving
+ a smooth transition from the non-schema-aware world. However, this is
+ never a secure pattern. It is acceptable only when the database has a
+ single user or a few mutually-trusting users. In databases upgraded
+ from <productname>PostgreSQL</productname> 14 or earlier, this is the
+ default.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ For any pattern, to install shared applications (tables to be used by
+ everyone, additional functions provided by third parties, etc.), put them
+ into separate schemas. Remember to grant appropriate privileges to allow
+ the other users to access them. Users can then refer to these additional
+ objects by qualifying the names with a schema name, or they can put the
+ additional schemas into their search path, as they choose.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-schemas-portability">
+ <title>Portability</title>
+
+ <para>
+ In the SQL standard, the notion of objects in the same schema
+ being owned by different users does not exist. Moreover, some
+ implementations do not allow you to create schemas that have a
+ different name than their owner. In fact, the concepts of schema
+ and user are nearly equivalent in a database system that
+ implements only the basic schema support specified in the
+ standard. Therefore, many users consider qualified names to
+ really consist of
+ <literal><replaceable>user_name</replaceable>.<replaceable>table_name</replaceable></literal>.
+ This is how <productname>PostgreSQL</productname> will effectively
+ behave if you create a per-user schema for every user.
+ </para>
+
+ <para>
+ Also, there is no concept of a <literal>public</literal> schema in the
+ SQL standard. For maximum conformance to the standard, you should
+ not use the <literal>public</literal> schema.
+ </para>
+
+ <para>
+ Of course, some SQL database systems might not implement schemas
+ at all, or provide namespace support by allowing (possibly
+ limited) cross-database access. If you need to work with those
+ systems, then maximum portability would be achieved by not using
+ schemas at all.
+ </para>
+ </sect2>
+ </sect1>
+
+ <sect1 id="ddl-inherit">
+ <title>Inheritance</title>
+
+ <indexterm>
+ <primary>inheritance</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>table</primary>
+ <secondary>inheritance</secondary>
+ </indexterm>
+
+ <para>
+ <productname>PostgreSQL</productname> implements table inheritance,
+ which can be a useful tool for database designers. (SQL:1999 and
+ later define a type inheritance feature, which differs in many
+ respects from the features described here.)
+ </para>
+
+ <para>
+ Let's start with an example: suppose we are trying to build a data
+ model for cities. Each state has many cities, but only one
+ capital. We want to be able to quickly retrieve the capital city
+ for any particular state. This can be done by creating two tables,
+ one for state capitals and one for cities that are not
+ capitals. However, what happens when we want to ask for data about
+ a city, regardless of whether it is a capital or not? The
+ inheritance feature can help to resolve this problem. We define the
+ <structname>capitals</structname> table so that it inherits from
+ <structname>cities</structname>:
+
+<programlisting>
+CREATE TABLE cities (
+ name text,
+ population float,
+ elevation int -- in feet
+);
+
+CREATE TABLE capitals (
+ state char(2)
+) INHERITS (cities);
+</programlisting>
+
+ In this case, the <structname>capitals</structname> table <firstterm>inherits</firstterm>
+ all the columns of its parent table, <structname>cities</structname>. State
+ capitals also have an extra column, <structfield>state</structfield>, that shows
+ their state.
+ </para>
+
+ <para>
+ In <productname>PostgreSQL</productname>, a table can inherit from
+ zero or more other tables, and a query can reference either all
+ rows of a table or all rows of a table plus all of its descendant tables.
+ The latter behavior is the default.
+ For example, the following query finds the names of all cities,
+ including state capitals, that are located at an elevation over
+ 500 feet:
+
+<programlisting>
+SELECT name, elevation
+ FROM cities
+ WHERE elevation &gt; 500;
+</programlisting>
+
+ Given the sample data from the <productname>PostgreSQL</productname>
+ tutorial (see <xref linkend="tutorial-sql-intro"/>), this returns:
+
+<programlisting>
+ name | elevation
+-----------+-----------
+ Las Vegas | 2174
+ Mariposa | 1953
+ Madison | 845
+</programlisting>
+ </para>
+
+ <para>
+ On the other hand, the following query finds all the cities that
+ are not state capitals and are situated at an elevation over 500 feet:
+
+<programlisting>
+SELECT name, elevation
+ FROM ONLY cities
+ WHERE elevation &gt; 500;
+
+ name | elevation
+-----------+-----------
+ Las Vegas | 2174
+ Mariposa | 1953
+</programlisting>
+ </para>
+
+ <para>
+ Here the <literal>ONLY</literal> keyword indicates that the query
+ should apply only to <structname>cities</structname>, and not any tables
+ below <structname>cities</structname> in the inheritance hierarchy. Many
+ of the commands that we have already discussed &mdash;
+ <command>SELECT</command>, <command>UPDATE</command> and
+ <command>DELETE</command> &mdash; support the
+ <literal>ONLY</literal> keyword.
+ </para>
+
+ <para>
+ You can also write the table name with a trailing <literal>*</literal>
+ to explicitly specify that descendant tables are included:
+
+<programlisting>
+SELECT name, elevation
+ FROM cities*
+ WHERE elevation &gt; 500;
+</programlisting>
+
+ Writing <literal>*</literal> is not necessary, since this behavior is always
+ the default. However, this syntax is still supported for
+ compatibility with older releases where the default could be changed.
+ </para>
+
+ <para>
+ In some cases you might wish to know which table a particular row
+ originated from. There is a system column called
+ <structfield>tableoid</structfield> in each table which can tell you the
+ originating table:
+
+<programlisting>
+SELECT c.tableoid, c.name, c.elevation
+FROM cities c
+WHERE c.elevation &gt; 500;
+</programlisting>
+
+ which returns:
+
+<programlisting>
+ tableoid | name | elevation
+----------+-----------+-----------
+ 139793 | Las Vegas | 2174
+ 139793 | Mariposa | 1953
+ 139798 | Madison | 845
+</programlisting>
+
+ (If you try to reproduce this example, you will probably get
+ different numeric OIDs.) By doing a join with
+ <structname>pg_class</structname> you can see the actual table names:
+
+<programlisting>
+SELECT p.relname, c.name, c.elevation
+FROM cities c, pg_class p
+WHERE c.elevation &gt; 500 AND c.tableoid = p.oid;
+</programlisting>
+
+ which returns:
+
+<programlisting>
+ relname | name | elevation
+----------+-----------+-----------
+ cities | Las Vegas | 2174
+ cities | Mariposa | 1953
+ capitals | Madison | 845
+</programlisting>
+ </para>
+
+ <para>
+ Another way to get the same effect is to use the <type>regclass</type>
+ alias type, which will print the table OID symbolically:
+
+<programlisting>
+SELECT c.tableoid::regclass, c.name, c.elevation
+FROM cities c
+WHERE c.elevation &gt; 500;
+</programlisting>
+ </para>
+
+ <para>
+ Inheritance does not automatically propagate data from
+ <command>INSERT</command> or <command>COPY</command> commands to
+ other tables in the inheritance hierarchy. In our example, the
+ following <command>INSERT</command> statement will fail:
+<programlisting>
+INSERT INTO cities (name, population, elevation, state)
+VALUES ('Albany', NULL, NULL, 'NY');
+</programlisting>
+ We might hope that the data would somehow be routed to the
+ <structname>capitals</structname> table, but this does not happen:
+ <command>INSERT</command> always inserts into exactly the table
+ specified. In some cases it is possible to redirect the insertion
+ using a rule (see <xref linkend="rules"/>). However that does not
+ help for the above case because the <structname>cities</structname> table
+ does not contain the column <structfield>state</structfield>, and so the
+ command will be rejected before the rule can be applied.
+ </para>
+
+ <para>
+ All check constraints and not-null constraints on a parent table are
+ automatically inherited by its children, unless explicitly specified
+ otherwise with <literal>NO INHERIT</literal> clauses. Other types of constraints
+ (unique, primary key, and foreign key constraints) are not inherited.
+ </para>
+
+ <para>
+ A table can inherit from more than one parent table, in which case it has
+ the union of the columns defined by the parent tables. Any columns
+ declared in the child table's definition are added to these. If the
+ same column name appears in multiple parent tables, or in both a parent
+ table and the child's definition, then these columns are <quote>merged</quote>
+ so that there is only one such column in the child table. To be merged,
+ columns must have the same data types, else an error is raised.
+ Inheritable check constraints and not-null constraints are merged in a
+ similar fashion. Thus, for example, a merged column will be marked
+ not-null if any one of the column definitions it came from is marked
+ not-null. Check constraints are merged if they have the same name,
+ and the merge will fail if their conditions are different.
+ </para>
+
+ <para>
+ Table inheritance is typically established when the child table is
+ created, using the <literal>INHERITS</literal> clause of the
+ <link linkend="sql-createtable"><command>CREATE TABLE</command></link>
+ statement.
+ Alternatively, a table which is already defined in a compatible way can
+ have a new parent relationship added, using the <literal>INHERIT</literal>
+ variant of <link linkend="sql-altertable"><command>ALTER TABLE</command></link>.
+ To do this the new child table must already include columns with
+ the same names and types as the columns of the parent. It must also include
+ check constraints with the same names and check expressions as those of the
+ parent. Similarly an inheritance link can be removed from a child using the
+ <literal>NO INHERIT</literal> variant of <command>ALTER TABLE</command>.
+ Dynamically adding and removing inheritance links like this can be useful
+ when the inheritance relationship is being used for table
+ partitioning (see <xref linkend="ddl-partitioning"/>).
+ </para>
+
+ <para>
+ One convenient way to create a compatible table that will later be made
+ a new child is to use the <literal>LIKE</literal> clause in <command>CREATE
+ TABLE</command>. This creates a new table with the same columns as
+ the source table. If there are any <literal>CHECK</literal>
+ constraints defined on the source table, the <literal>INCLUDING
+ CONSTRAINTS</literal> option to <literal>LIKE</literal> should be
+ specified, as the new child must have constraints matching the parent
+ to be considered compatible.
+ </para>
+
+ <para>
+ A parent table cannot be dropped while any of its children remain. Neither
+ can columns or check constraints of child tables be dropped or altered
+ if they are inherited
+ from any parent tables. If you wish to remove a table and all of its
+ descendants, one easy way is to drop the parent table with the
+ <literal>CASCADE</literal> option (see <xref linkend="ddl-depend"/>).
+ </para>
+
+ <para>
+ <command>ALTER TABLE</command> will
+ propagate any changes in column data definitions and check
+ constraints down the inheritance hierarchy. Again, dropping
+ columns that are depended on by other tables is only possible when using
+ the <literal>CASCADE</literal> option. <command>ALTER
+ TABLE</command> follows the same rules for duplicate column merging
+ and rejection that apply during <command>CREATE TABLE</command>.
+ </para>
+
+ <para>
+ Inherited queries perform access permission checks on the parent table
+ only. Thus, for example, granting <literal>UPDATE</literal> permission on
+ the <structname>cities</structname> table implies permission to update rows in
+ the <structname>capitals</structname> table as well, when they are
+ accessed through <structname>cities</structname>. This preserves the appearance
+ that the data is (also) in the parent table. But
+ the <structname>capitals</structname> table could not be updated directly
+ without an additional grant. In a similar way, the parent table's row
+ security policies (see <xref linkend="ddl-rowsecurity"/>) are applied to
+ rows coming from child tables during an inherited query. A child table's
+ policies, if any, are applied only when it is the table explicitly named
+ in the query; and in that case, any policies attached to its parent(s) are
+ ignored.
+ </para>
+
+ <para>
+ Foreign tables (see <xref linkend="ddl-foreign-data"/>) can also
+ be part of inheritance hierarchies, either as parent or child
+ tables, just as regular tables can be. If a foreign table is part
+ of an inheritance hierarchy then any operations not supported by
+ the foreign table are not supported on the whole hierarchy either.
+ </para>
+
+ <sect2 id="ddl-inherit-caveats">
+ <title>Caveats</title>
+
+ <para>
+ Note that not all SQL commands are able to work on
+ inheritance hierarchies. Commands that are used for data querying,
+ data modification, or schema modification
+ (e.g., <literal>SELECT</literal>, <literal>UPDATE</literal>, <literal>DELETE</literal>,
+ most variants of <literal>ALTER TABLE</literal>, but
+ not <literal>INSERT</literal> or <literal>ALTER TABLE ...
+ RENAME</literal>) typically default to including child tables and
+ support the <literal>ONLY</literal> notation to exclude them.
+ Commands that do database maintenance and tuning
+ (e.g., <literal>REINDEX</literal>, <literal>VACUUM</literal>)
+ typically only work on individual, physical tables and do not
+ support recursing over inheritance hierarchies. The respective
+ behavior of each individual command is documented in its reference
+ page (<xref linkend="sql-commands"/>).
+ </para>
+
+ <para>
+ A serious limitation of the inheritance feature is that indexes (including
+ unique constraints) and foreign key constraints only apply to single
+ tables, not to their inheritance children. This is true on both the
+ referencing and referenced sides of a foreign key constraint. Thus,
+ in the terms of the above example:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ If we declared <structname>cities</structname>.<structfield>name</structfield> to be
+ <literal>UNIQUE</literal> or a <literal>PRIMARY KEY</literal>, this would not stop the
+ <structname>capitals</structname> table from having rows with names duplicating
+ rows in <structname>cities</structname>. And those duplicate rows would by
+ default show up in queries from <structname>cities</structname>. In fact, by
+ default <structname>capitals</structname> would have no unique constraint at all,
+ and so could contain multiple rows with the same name.
+ You could add a unique constraint to <structname>capitals</structname>, but this
+ would not prevent duplication compared to <structname>cities</structname>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Similarly, if we were to specify that
+ <structname>cities</structname>.<structfield>name</structfield> <literal>REFERENCES</literal> some
+ other table, this constraint would not automatically propagate to
+ <structname>capitals</structname>. In this case you could work around it by
+ manually adding the same <literal>REFERENCES</literal> constraint to
+ <structname>capitals</structname>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Specifying that another table's column <literal>REFERENCES
+ cities(name)</literal> would allow the other table to contain city names, but
+ not capital names. There is no good workaround for this case.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ Some functionality not implemented for inheritance hierarchies is
+ implemented for declarative partitioning.
+ Considerable care is needed in deciding whether partitioning with legacy
+ inheritance is useful for your application.
+ </para>
+
+ </sect2>
+ </sect1>
+
+ <sect1 id="ddl-partitioning">
+ <title>Table Partitioning</title>
+
+ <indexterm>
+ <primary>partitioning</primary>
+ </indexterm>
+
+ <indexterm>
+ <primary>table</primary>
+ <secondary>partitioning</secondary>
+ </indexterm>
+
+ <indexterm>
+ <primary>partitioned table</primary>
+ </indexterm>
+
+ <para>
+ <productname>PostgreSQL</productname> supports basic table
+ partitioning. This section describes why and how to implement
+ partitioning as part of your database design.
+ </para>
+
+ <sect2 id="ddl-partitioning-overview">
+ <title>Overview</title>
+
+ <para>
+ Partitioning refers to splitting what is logically one large table into
+ smaller physical pieces. Partitioning can provide several benefits:
+ <itemizedlist>
+ <listitem>
+ <para>
+ Query performance can be improved dramatically in certain situations,
+ particularly when most of the heavily accessed rows of the table are in a
+ single partition or a small number of partitions. Partitioning
+ effectively substitutes for the upper tree levels of indexes,
+ making it more likely that the heavily-used parts of the indexes
+ fit in memory.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ When queries or updates access a large percentage of a single
+ partition, performance can be improved by using a
+ sequential scan of that partition instead of using an
+ index, which would require random-access reads scattered across the
+ whole table.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Bulk loads and deletes can be accomplished by adding or removing
+ partitions, if the usage pattern is accounted for in the
+ partitioning design. Dropping an individual partition
+ using <command>DROP TABLE</command>, or doing <command>ALTER TABLE
+ DETACH PARTITION</command>, is far faster than a bulk
+ operation. These commands also entirely avoid the
+ <command>VACUUM</command> overhead caused by a bulk <command>DELETE</command>.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Seldom-used data can be migrated to cheaper and slower storage media.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ These benefits will normally be worthwhile only when a table would
+ otherwise be very large. The exact point at which a table will
+ benefit from partitioning depends on the application, although a
+ rule of thumb is that the size of the table should exceed the physical
+ memory of the database server.
+ </para>
+
+ <para>
+ <productname>PostgreSQL</productname> offers built-in support for the
+ following forms of partitioning:
+
+ <variablelist>
+ <varlistentry>
+ <term>Range Partitioning</term>
+
+ <listitem>
+ <para>
+ The table is partitioned into <quote>ranges</quote> defined
+ by a key column or set of columns, with no overlap between
+ the ranges of values assigned to different partitions. For
+ example, one might partition by date ranges, or by ranges of
+ identifiers for particular business objects.
+ Each range's bounds are understood as being inclusive at the
+ lower end and exclusive at the upper end. For example, if one
+ partition's range is from <literal>1</literal>
+ to <literal>10</literal>, and the next one's range is
+ from <literal>10</literal> to <literal>20</literal>, then
+ value <literal>10</literal> belongs to the second partition not
+ the first.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>List Partitioning</term>
+
+ <listitem>
+ <para>
+ The table is partitioned by explicitly listing which key value(s)
+ appear in each partition.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>Hash Partitioning</term>
+
+ <listitem>
+ <para>
+ The table is partitioned by specifying a modulus and a remainder for
+ each partition. Each partition will hold the rows for which the hash
+ value of the partition key divided by the specified modulus will
+ produce the specified remainder.
+ </para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ If your application needs to use other forms of partitioning not listed
+ above, alternative methods such as inheritance and
+ <literal>UNION ALL</literal> views can be used instead. Such methods
+ offer flexibility but do not have some of the performance benefits
+ of built-in declarative partitioning.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-partitioning-declarative">
+ <title>Declarative Partitioning</title>
+
+ <para>
+ <productname>PostgreSQL</productname> allows you to declare
+ that a table is divided into partitions. The table that is divided
+ is referred to as a <firstterm>partitioned table</firstterm>. The
+ declaration includes the <firstterm>partitioning method</firstterm>
+ as described above, plus a list of columns or expressions to be used
+ as the <firstterm>partition key</firstterm>.
+ </para>
+
+ <para>
+ The partitioned table itself is a <quote>virtual</quote> table having
+ no storage of its own. Instead, the storage belongs
+ to <firstterm>partitions</firstterm>, which are otherwise-ordinary
+ tables associated with the partitioned table.
+ Each partition stores a subset of the data as defined by its
+ <firstterm>partition bounds</firstterm>.
+ All rows inserted into a partitioned table will be routed to the
+ appropriate one of the partitions based on the values of the partition
+ key column(s).
+ Updating the partition key of a row will cause it to be moved into a
+ different partition if it no longer satisfies the partition bounds
+ of its original partition.
+ </para>
+
+ <para>
+ Partitions may themselves be defined as partitioned tables, resulting
+ in <firstterm>sub-partitioning</firstterm>. Although all partitions
+ must have the same columns as their partitioned parent, partitions may
+ have their
+ own indexes, constraints and default values, distinct from those of other
+ partitions. See <xref linkend="sql-createtable"/> for more details on
+ creating partitioned tables and partitions.
+ </para>
+
+ <para>
+ It is not possible to turn a regular table into a partitioned table or
+ vice versa. However, it is possible to add an existing regular or
+ partitioned table as a partition of a partitioned table, or remove a
+ partition from a partitioned table turning it into a standalone table;
+ this can simplify and speed up many maintenance processes.
+ See <xref linkend="sql-altertable"/> to learn more about the
+ <command>ATTACH PARTITION</command> and <command>DETACH PARTITION</command>
+ sub-commands.
+ </para>
+
+ <para>
+ Partitions can also be <link linkend="ddl-foreign-data">foreign
+ tables</link>, although considerable care is needed because it is then
+ the user's responsibility that the contents of the foreign table
+ satisfy the partitioning rule. There are some other restrictions as
+ well. See <xref linkend="sql-createforeigntable"/> for more
+ information.
+ </para>
+
+ <sect3 id="ddl-partitioning-declarative-example">
+ <title>Example</title>
+
+ <para>
+ Suppose we are constructing a database for a large ice cream company.
+ The company measures peak temperatures every day as well as ice cream
+ sales in each region. Conceptually, we want a table like:
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+);
+</programlisting>
+
+ We know that most queries will access just the last week's, month's or
+ quarter's data, since the main use of this table will be to prepare
+ online reports for management. To reduce the amount of old data that
+ needs to be stored, we decide to keep only the most recent 3 years
+ worth of data. At the beginning of each month we will remove the oldest
+ month's data. In this situation we can use partitioning to help us meet
+ all of our different requirements for the measurements table.
+ </para>
+
+ <para>
+ To use declarative partitioning in this case, use the following steps:
+
+ <orderedlist spacing="compact">
+ <listitem>
+ <para>
+ Create the <structname>measurement</structname> table as a partitioned
+ table by specifying the <literal>PARTITION BY</literal> clause, which
+ includes the partitioning method (<literal>RANGE</literal> in this
+ case) and the list of column(s) to use as the partition key.
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+) PARTITION BY RANGE (logdate);
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Create partitions. Each partition's definition must specify bounds
+ that correspond to the partitioning method and partition key of the
+ parent. Note that specifying bounds such that the new partition's
+ values would overlap with those in one or more existing partitions will
+ cause an error.
+ </para>
+
+ <para>
+ Partitions thus created are in every way normal
+ <productname>PostgreSQL</productname>
+ tables (or, possibly, foreign tables). It is possible to specify a
+ tablespace and storage parameters for each partition separately.
+ </para>
+
+ <para>
+ For our example, each partition should hold one month's worth of
+ data, to match the requirement of deleting one month's data at a
+ time. So the commands might look like:
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
+
+CREATE TABLE measurement_y2006m03 PARTITION OF measurement
+ FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
+
+...
+CREATE TABLE measurement_y2007m11 PARTITION OF measurement
+ FOR VALUES FROM ('2007-11-01') TO ('2007-12-01');
+
+CREATE TABLE measurement_y2007m12 PARTITION OF measurement
+ FOR VALUES FROM ('2007-12-01') TO ('2008-01-01')
+ TABLESPACE fasttablespace;
+
+CREATE TABLE measurement_y2008m01 PARTITION OF measurement
+ FOR VALUES FROM ('2008-01-01') TO ('2008-02-01')
+ WITH (parallel_workers = 4)
+ TABLESPACE fasttablespace;
+</programlisting>
+
+ (Recall that adjacent partitions can share a bound value, since
+ range upper bounds are treated as exclusive bounds.)
+ </para>
+
+ <para>
+ If you wish to implement sub-partitioning, again specify the
+ <literal>PARTITION BY</literal> clause in the commands used to create
+ individual partitions, for example:
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 PARTITION OF measurement
+ FOR VALUES FROM ('2006-02-01') TO ('2006-03-01')
+ PARTITION BY RANGE (peaktemp);
+</programlisting>
+
+ After creating partitions of <structname>measurement_y2006m02</structname>,
+ any data inserted into <structname>measurement</structname> that is mapped to
+ <structname>measurement_y2006m02</structname> (or data that is
+ directly inserted into <structname>measurement_y2006m02</structname>,
+ which is allowed provided its partition constraint is satisfied)
+ will be further redirected to one of its
+ partitions based on the <structfield>peaktemp</structfield> column. The partition
+ key specified may overlap with the parent's partition key, although
+ care should be taken when specifying the bounds of a sub-partition
+ such that the set of data it accepts constitutes a subset of what
+ the partition's own bounds allow; the system does not try to check
+ whether that's really the case.
+ </para>
+
+ <para>
+ Inserting data into the parent table that does not map
+ to one of the existing partitions will cause an error; an appropriate
+ partition must be added manually.
+ </para>
+
+ <para>
+ It is not necessary to manually create table constraints describing
+ the partition boundary conditions for partitions. Such constraints
+ will be created automatically.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Create an index on the key column(s), as well as any other indexes you
+ might want, on the partitioned table. (The key index is not strictly
+ necessary, but in most scenarios it is helpful.)
+ This automatically creates a matching index on each partition, and
+ any partitions you create or attach later will also have such an
+ index.
+ An index or unique constraint declared on a partitioned table
+ is <quote>virtual</quote> in the same way that the partitioned table
+ is: the actual data is in child indexes on the individual partition
+ tables.
+
+<programlisting>
+CREATE INDEX ON measurement (logdate);
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Ensure that the <xref linkend="guc-enable-partition-pruning"/>
+ configuration parameter is not disabled in <filename>postgresql.conf</filename>.
+ If it is, queries will not be optimized as desired.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
+
+ <para>
+ In the above example we would be creating a new partition each month, so
+ it might be wise to write a script that generates the required DDL
+ automatically.
+ </para>
+ </sect3>
+
+ <sect3 id="ddl-partitioning-declarative-maintenance">
+ <title>Partition Maintenance</title>
+
+ <para>
+ Normally the set of partitions established when initially defining the
+ table is not intended to remain static. It is common to want to
+ remove partitions holding old data and periodically add new partitions for
+ new data. One of the most important advantages of partitioning is
+ precisely that it allows this otherwise painful task to be executed
+ nearly instantaneously by manipulating the partition structure, rather
+ than physically moving large amounts of data around.
+ </para>
+
+ <para>
+ The simplest option for removing old data is to drop the partition that
+ is no longer necessary:
+<programlisting>
+DROP TABLE measurement_y2006m02;
+</programlisting>
+ This can very quickly delete millions of records because it doesn't have
+ to individually delete every record. Note however that the above command
+ requires taking an <literal>ACCESS EXCLUSIVE</literal> lock on the parent
+ table.
+ </para>
+
+ <para>
+ Another option that is often preferable is to remove the partition from
+ the partitioned table but retain access to it as a table in its own
+ right. This has two forms:
+
+<programlisting>
+ALTER TABLE measurement DETACH PARTITION measurement_y2006m02;
+ALTER TABLE measurement DETACH PARTITION measurement_y2006m02 CONCURRENTLY;
+</programlisting>
+
+ These allow further operations to be performed on the data before
+ it is dropped. For example, this is often a useful time to back up
+ the data using <command>COPY</command>, <application>pg_dump</application>, or
+ similar tools. It might also be a useful time to aggregate data
+ into smaller formats, perform other data manipulations, or run
+ reports. The first form of the command requires an
+ <literal>ACCESS EXCLUSIVE</literal> lock on the parent table.
+ Adding the <literal>CONCURRENTLY</literal> qualifier as in the second
+ form allows the detach operation to require only
+ <literal>SHARE UPDATE EXCLUSIVE</literal> lock on the parent table, but see
+ <link linkend="sql-altertable-detach-partition"><literal>ALTER TABLE ... DETACH PARTITION</literal></link>
+ for details on the restrictions.
+ </para>
+
+ <para>
+ Similarly we can add a new partition to handle new data. We can create an
+ empty partition in the partitioned table just as the original partitions
+ were created above:
+
+<programlisting>
+CREATE TABLE measurement_y2008m02 PARTITION OF measurement
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01')
+ TABLESPACE fasttablespace;
+</programlisting>
+
+ As an alternative, it is sometimes more convenient to create the
+ new table outside the partition structure, and attach it as a
+ partition later. This allows new data to be loaded, checked, and
+ transformed prior to it appearing in the partitioned table.
+ Moreover, the <literal>ATTACH PARTITION</literal> operation requires
+ only <literal>SHARE UPDATE EXCLUSIVE</literal> lock on the
+ partitioned table, as opposed to the <literal>ACCESS
+ EXCLUSIVE</literal> lock that is required by <command>CREATE TABLE
+ ... PARTITION OF</command>, so it is more friendly to concurrent
+ operations on the partitioned table.
+ The <literal>CREATE TABLE ... LIKE</literal> option is helpful
+ to avoid tediously repeating the parent table's definition:
+
+<programlisting>
+CREATE TABLE measurement_y2008m02
+ (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS)
+ TABLESPACE fasttablespace;
+
+ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
+ CHECK ( logdate &gt;= DATE '2008-02-01' AND logdate &lt; DATE '2008-03-01' );
+
+\copy measurement_y2008m02 from 'measurement_y2008m02'
+-- possibly some other data preparation work
+
+ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
+ FOR VALUES FROM ('2008-02-01') TO ('2008-03-01' );
+</programlisting>
+ </para>
+
+ <para>
+ Before running the <command>ATTACH PARTITION</command> command, it is
+ recommended to create a <literal>CHECK</literal> constraint on the table to
+ be attached that matches the expected partition constraint, as
+ illustrated above. That way, the system will be able to skip the scan
+ which is otherwise needed to validate the implicit
+ partition constraint. Without the <literal>CHECK</literal> constraint,
+ the table will be scanned to validate the partition constraint while
+ holding an <literal>ACCESS EXCLUSIVE</literal> lock on that partition.
+ It is recommended to drop the now-redundant <literal>CHECK</literal>
+ constraint after the <command>ATTACH PARTITION</command> is complete. If
+ the table being attached is itself a partitioned table, then each of its
+ sub-partitions will be recursively locked and scanned until either a
+ suitable <literal>CHECK</literal> constraint is encountered or the leaf
+ partitions are reached.
+ </para>
+
+ <para>
+ Similarly, if the partitioned table has a <literal>DEFAULT</literal>
+ partition, it is recommended to create a <literal>CHECK</literal>
+ constraint which excludes the to-be-attached partition's constraint. If
+ this is not done then the <literal>DEFAULT</literal> partition will be
+ scanned to verify that it contains no records which should be located in
+ the partition being attached. This operation will be performed whilst
+ holding an <literal>ACCESS EXCLUSIVE</literal> lock on the <literal>
+ DEFAULT</literal> partition. If the <literal>DEFAULT</literal> partition
+ is itself a partitioned table, then each of its partitions will be
+ recursively checked in the same way as the table being attached, as
+ mentioned above.
+ </para>
+
+ <para>
+ As explained above, it is possible to create indexes on partitioned tables
+ so that they are applied automatically to the entire hierarchy.
+ This is very
+ convenient, as not only will the existing partitions become indexed, but
+ also any partitions that are created in the future will. One limitation is
+ that it's not possible to use the <literal>CONCURRENTLY</literal>
+ qualifier when creating such a partitioned index. To avoid long lock
+ times, it is possible to use <command>CREATE INDEX ON ONLY</command>
+ the partitioned table; such an index is marked invalid, and the partitions
+ do not get the index applied automatically. The indexes on partitions can
+ be created individually using <literal>CONCURRENTLY</literal>, and then
+ <firstterm>attached</firstterm> to the index on the parent using
+ <command>ALTER INDEX .. ATTACH PARTITION</command>. Once indexes for all
+ partitions are attached to the parent index, the parent index is marked
+ valid automatically. Example:
+<programlisting>
+CREATE INDEX measurement_usls_idx ON ONLY measurement (unitsales);
+
+CREATE INDEX measurement_usls_200602_idx
+ ON measurement_y2006m02 (unitsales);
+ALTER INDEX measurement_usls_idx
+ ATTACH PARTITION measurement_usls_200602_idx;
+...
+</programlisting>
+
+ This technique can be used with <literal>UNIQUE</literal> and
+ <literal>PRIMARY KEY</literal> constraints too; the indexes are created
+ implicitly when the constraint is created. Example:
+<programlisting>
+ALTER TABLE ONLY measurement ADD UNIQUE (city_id, logdate);
+
+ALTER TABLE measurement_y2006m02 ADD UNIQUE (city_id, logdate);
+ALTER INDEX measurement_city_id_logdate_key
+ ATTACH PARTITION measurement_y2006m02_city_id_logdate_key;
+...
+</programlisting>
+ </para>
+ </sect3>
+
+ <sect3 id="ddl-partitioning-declarative-limitations">
+ <title>Limitations</title>
+
+ <para>
+ The following limitations apply to partitioned tables:
+ <itemizedlist>
+ <listitem>
+ <para>
+ To create a unique or primary key constraint on a partitioned table,
+ the partition keys must not include any expressions or function calls
+ and the constraint's columns must include all of the partition key
+ columns. This limitation exists because the individual indexes making
+ up the constraint can only directly enforce uniqueness within their own
+ partitions; therefore, the partition structure itself must guarantee
+ that there are not duplicates in different partitions.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ There is no way to create an exclusion constraint spanning the
+ whole partitioned table. It is only possible to put such a
+ constraint on each leaf partition individually. Again, this
+ limitation stems from not being able to enforce cross-partition
+ restrictions.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <literal>BEFORE ROW</literal> triggers on <literal>INSERT</literal>
+ cannot change which partition is the final destination for a new row.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Mixing temporary and permanent relations in the same partition tree is
+ not allowed. Hence, if the partitioned table is permanent, so must be
+ its partitions and likewise if the partitioned table is temporary. When
+ using temporary relations, all members of the partition tree have to be
+ from the same session.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Individual partitions are linked to their partitioned table using
+ inheritance behind-the-scenes. However, it is not possible to use
+ all of the generic features of inheritance with declaratively
+ partitioned tables or their partitions, as discussed below. Notably,
+ a partition cannot have any parents other than the partitioned table
+ it is a partition of, nor can a table inherit from both a partitioned
+ table and a regular table. That means partitioned tables and their
+ partitions never share an inheritance hierarchy with regular tables.
+ </para>
+
+ <para>
+ Since a partition hierarchy consisting of the partitioned table and its
+ partitions is still an inheritance hierarchy,
+ <structfield>tableoid</structfield> and all the normal rules of
+ inheritance apply as described in <xref linkend="ddl-inherit"/>, with
+ a few exceptions:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Partitions cannot have columns that are not present in the parent. It
+ is not possible to specify columns when creating partitions with
+ <command>CREATE TABLE</command>, nor is it possible to add columns to
+ partitions after-the-fact using <command>ALTER TABLE</command>.
+ Tables may be added as a partition with <command>ALTER TABLE
+ ... ATTACH PARTITION</command> only if their columns exactly match
+ the parent.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Both <literal>CHECK</literal> and <literal>NOT NULL</literal>
+ constraints of a partitioned table are always inherited by all its
+ partitions. <literal>CHECK</literal> constraints that are marked
+ <literal>NO INHERIT</literal> are not allowed to be created on
+ partitioned tables.
+ You cannot drop a <literal>NOT NULL</literal> constraint on a
+ partition's column if the same constraint is present in the parent
+ table.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Using <literal>ONLY</literal> to add or drop a constraint on only
+ the partitioned table is supported as long as there are no
+ partitions. Once partitions exist, using <literal>ONLY</literal>
+ will result in an error. Instead, constraints on the partitions
+ themselves can be added and (if they are not present in the parent
+ table) dropped.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ As a partitioned table does not have any data itself, attempts to use
+ <command>TRUNCATE</command> <literal>ONLY</literal> on a partitioned
+ table will always return an error.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect3>
+ </sect2>
+
+ <sect2 id="ddl-partitioning-using-inheritance">
+ <title>Partitioning Using Inheritance</title>
+
+ <para>
+ While the built-in declarative partitioning is suitable for most
+ common use cases, there are some circumstances where a more flexible
+ approach may be useful. Partitioning can be implemented using table
+ inheritance, which allows for several features not supported
+ by declarative partitioning, such as:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ For declarative partitioning, partitions must have exactly the same set
+ of columns as the partitioned table, whereas with table inheritance,
+ child tables may have extra columns not present in the parent.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Table inheritance allows for multiple inheritance.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Declarative partitioning only supports range, list and hash
+ partitioning, whereas table inheritance allows data to be divided in a
+ manner of the user's choosing. (Note, however, that if constraint
+ exclusion is unable to prune child tables effectively, query performance
+ might be poor.)
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <sect3 id="ddl-partitioning-inheritance-example">
+ <title>Example</title>
+
+ <para>
+ This example builds a partitioning structure equivalent to the
+ declarative partitioning example above. Use
+ the following steps:
+
+ <orderedlist spacing="compact">
+ <listitem>
+ <para>
+ Create the <quote>root</quote> table, from which all of the
+ <quote>child</quote> tables will inherit. This table will contain no data. Do not
+ define any check constraints on this table, unless you intend them
+ to be applied equally to all child tables. There is no point in
+ defining any indexes or unique constraints on it, either. For our
+ example, the root table is the <structname>measurement</structname>
+ table as originally defined:
+
+<programlisting>
+CREATE TABLE measurement (
+ city_id int not null,
+ logdate date not null,
+ peaktemp int,
+ unitsales int
+);
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Create several <quote>child</quote> tables that each inherit from
+ the root table. Normally, these tables will not add any columns
+ to the set inherited from the root. Just as with declarative
+ partitioning, these tables are in every way normal
+ <productname>PostgreSQL</productname> tables (or foreign tables).
+ </para>
+
+ <para>
+<programlisting>
+CREATE TABLE measurement_y2006m02 () INHERITS (measurement);
+CREATE TABLE measurement_y2006m03 () INHERITS (measurement);
+...
+CREATE TABLE measurement_y2007m11 () INHERITS (measurement);
+CREATE TABLE measurement_y2007m12 () INHERITS (measurement);
+CREATE TABLE measurement_y2008m01 () INHERITS (measurement);
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Add non-overlapping table constraints to the child tables to
+ define the allowed key values in each.
+ </para>
+
+ <para>
+ Typical examples would be:
+<programlisting>
+CHECK ( x = 1 )
+CHECK ( county IN ( 'Oxfordshire', 'Buckinghamshire', 'Warwickshire' ))
+CHECK ( outletID &gt;= 100 AND outletID &lt; 200 )
+</programlisting>
+ Ensure that the constraints guarantee that there is no overlap
+ between the key values permitted in different child tables. A common
+ mistake is to set up range constraints like:
+<programlisting>
+CHECK ( outletID BETWEEN 100 AND 200 )
+CHECK ( outletID BETWEEN 200 AND 300 )
+</programlisting>
+ This is wrong since it is not clear which child table the key
+ value 200 belongs in.
+ Instead, ranges should be defined in this style:
+
+<programlisting>
+CREATE TABLE measurement_y2006m02 (
+ CHECK ( logdate &gt;= DATE '2006-02-01' AND logdate &lt; DATE '2006-03-01' )
+) INHERITS (measurement);
+
+CREATE TABLE measurement_y2006m03 (
+ CHECK ( logdate &gt;= DATE '2006-03-01' AND logdate &lt; DATE '2006-04-01' )
+) INHERITS (measurement);
+
+...
+CREATE TABLE measurement_y2007m11 (
+ CHECK ( logdate &gt;= DATE '2007-11-01' AND logdate &lt; DATE '2007-12-01' )
+) INHERITS (measurement);
+
+CREATE TABLE measurement_y2007m12 (
+ CHECK ( logdate &gt;= DATE '2007-12-01' AND logdate &lt; DATE '2008-01-01' )
+) INHERITS (measurement);
+
+CREATE TABLE measurement_y2008m01 (
+ CHECK ( logdate &gt;= DATE '2008-01-01' AND logdate &lt; DATE '2008-02-01' )
+) INHERITS (measurement);
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ For each child table, create an index on the key column(s),
+ as well as any other indexes you might want.
+<programlisting>
+CREATE INDEX measurement_y2006m02_logdate ON measurement_y2006m02 (logdate);
+CREATE INDEX measurement_y2006m03_logdate ON measurement_y2006m03 (logdate);
+CREATE INDEX measurement_y2007m11_logdate ON measurement_y2007m11 (logdate);
+CREATE INDEX measurement_y2007m12_logdate ON measurement_y2007m12 (logdate);
+CREATE INDEX measurement_y2008m01_logdate ON measurement_y2008m01 (logdate);
+</programlisting>
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ We want our application to be able to say <literal>INSERT INTO
+ measurement ...</literal> and have the data be redirected into the
+ appropriate child table. We can arrange that by attaching
+ a suitable trigger function to the root table.
+ If data will be added only to the latest child, we can
+ use a very simple trigger function:
+
+<programlisting>
+CREATE OR REPLACE FUNCTION measurement_insert_trigger()
+RETURNS TRIGGER AS $$
+BEGIN
+ INSERT INTO measurement_y2008m01 VALUES (NEW.*);
+ RETURN NULL;
+END;
+$$
+LANGUAGE plpgsql;
+</programlisting>
+ </para>
+
+ <para>
+ After creating the function, we create a trigger which
+ calls the trigger function:
+
+<programlisting>
+CREATE TRIGGER insert_measurement_trigger
+ BEFORE INSERT ON measurement
+ FOR EACH ROW EXECUTE FUNCTION measurement_insert_trigger();
+</programlisting>
+
+ We must redefine the trigger function each month so that it always
+ inserts into the current child table. The trigger definition does
+ not need to be updated, however.
+ </para>
+
+ <para>
+ We might want to insert data and have the server automatically
+ locate the child table into which the row should be added. We
+ could do this with a more complex trigger function, for example:
+
+<programlisting>
+CREATE OR REPLACE FUNCTION measurement_insert_trigger()
+RETURNS TRIGGER AS $$
+BEGIN
+ IF ( NEW.logdate &gt;= DATE '2006-02-01' AND
+ NEW.logdate &lt; DATE '2006-03-01' ) THEN
+ INSERT INTO measurement_y2006m02 VALUES (NEW.*);
+ ELSIF ( NEW.logdate &gt;= DATE '2006-03-01' AND
+ NEW.logdate &lt; DATE '2006-04-01' ) THEN
+ INSERT INTO measurement_y2006m03 VALUES (NEW.*);
+ ...
+ ELSIF ( NEW.logdate &gt;= DATE '2008-01-01' AND
+ NEW.logdate &lt; DATE '2008-02-01' ) THEN
+ INSERT INTO measurement_y2008m01 VALUES (NEW.*);
+ ELSE
+ RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!';
+ END IF;
+ RETURN NULL;
+END;
+$$
+LANGUAGE plpgsql;
+</programlisting>
+
+ The trigger definition is the same as before.
+ Note that each <literal>IF</literal> test must exactly match the
+ <literal>CHECK</literal> constraint for its child table.
+ </para>
+
+ <para>
+ While this function is more complex than the single-month case,
+ it doesn't need to be updated as often, since branches can be
+ added in advance of being needed.
+ </para>
+
+ <note>
+ <para>
+ In practice, it might be best to check the newest child first,
+ if most inserts go into that child. For simplicity, we have
+ shown the trigger's tests in the same order as in other parts
+ of this example.
+ </para>
+ </note>
+
+ <para>
+ A different approach to redirecting inserts into the appropriate
+ child table is to set up rules, instead of a trigger, on the
+ root table. For example:
+
+<programlisting>
+CREATE RULE measurement_insert_y2006m02 AS
+ON INSERT TO measurement WHERE
+ ( logdate &gt;= DATE '2006-02-01' AND logdate &lt; DATE '2006-03-01' )
+DO INSTEAD
+ INSERT INTO measurement_y2006m02 VALUES (NEW.*);
+...
+CREATE RULE measurement_insert_y2008m01 AS
+ON INSERT TO measurement WHERE
+ ( logdate &gt;= DATE '2008-01-01' AND logdate &lt; DATE '2008-02-01' )
+DO INSTEAD
+ INSERT INTO measurement_y2008m01 VALUES (NEW.*);
+</programlisting>
+
+ A rule has significantly more overhead than a trigger, but the
+ overhead is paid once per query rather than once per row, so this
+ method might be advantageous for bulk-insert situations. In most
+ cases, however, the trigger method will offer better performance.
+ </para>
+
+ <para>
+ Be aware that <command>COPY</command> ignores rules. If you want to
+ use <command>COPY</command> to insert data, you'll need to copy into the
+ correct child table rather than directly into the root. <command>COPY</command>
+ does fire triggers, so you can use it normally if you use the trigger
+ approach.
+ </para>
+
+ <para>
+ Another disadvantage of the rule approach is that there is no simple
+ way to force an error if the set of rules doesn't cover the insertion
+ date; the data will silently go into the root table instead.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Ensure that the <xref linkend="guc-constraint-exclusion"/>
+ configuration parameter is not disabled in
+ <filename>postgresql.conf</filename>; otherwise
+ child tables may be accessed unnecessarily.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
+
+ <para>
+ As we can see, a complex table hierarchy could require a
+ substantial amount of DDL. In the above example we would be creating
+ a new child table each month, so it might be wise to write a script that
+ generates the required DDL automatically.
+ </para>
+ </sect3>
+
+ <sect3 id="ddl-partitioning-inheritance-maintenance">
+ <title>Maintenance for Inheritance Partitioning</title>
+ <para>
+ To remove old data quickly, simply drop the child table that is no longer
+ necessary:
+<programlisting>
+DROP TABLE measurement_y2006m02;
+</programlisting>
+ </para>
+
+ <para>
+ To remove the child table from the inheritance hierarchy table but retain access to
+ it as a table in its own right:
+
+<programlisting>
+ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
+</programlisting>
+ </para>
+
+ <para>
+ To add a new child table to handle new data, create an empty child table
+ just as the original children were created above:
+
+<programlisting>
+CREATE TABLE measurement_y2008m02 (
+ CHECK ( logdate &gt;= DATE '2008-02-01' AND logdate &lt; DATE '2008-03-01' )
+) INHERITS (measurement);
+</programlisting>
+
+ Alternatively, one may want to create and populate the new child table
+ before adding it to the table hierarchy. This could allow data to be
+ loaded, checked, and transformed before being made visible to queries on
+ the parent table.
+
+<programlisting>
+CREATE TABLE measurement_y2008m02
+ (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
+ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
+ CHECK ( logdate &gt;= DATE '2008-02-01' AND logdate &lt; DATE '2008-03-01' );
+\copy measurement_y2008m02 from 'measurement_y2008m02'
+-- possibly some other data preparation work
+ALTER TABLE measurement_y2008m02 INHERIT measurement;
+</programlisting>
+ </para>
+ </sect3>
+
+ <sect3 id="ddl-partitioning-inheritance-caveats">
+ <title>Caveats</title>
+
+ <para>
+ The following caveats apply to partitioning implemented using
+ inheritance:
+ <itemizedlist>
+ <listitem>
+ <para>
+ There is no automatic way to verify that all of the
+ <literal>CHECK</literal> constraints are mutually
+ exclusive. It is safer to create code that generates
+ child tables and creates and/or modifies associated objects than
+ to write each by hand.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Indexes and foreign key constraints apply to single tables and not
+ to their inheritance children, hence they have some
+ <link linkend="ddl-inherit-caveats">caveats</link> to be aware of.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ The schemes shown here assume that the values of a row's key column(s)
+ never change, or at least do not change enough to require it to move to another partition.
+ An <command>UPDATE</command> that attempts
+ to do that will fail because of the <literal>CHECK</literal> constraints.
+ If you need to handle such cases, you can put suitable update triggers
+ on the child tables, but it makes management of the structure
+ much more complicated.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ If you are using manual <command>VACUUM</command> or
+ <command>ANALYZE</command> commands, don't forget that
+ you need to run them on each child table individually. A command like:
+<programlisting>
+ANALYZE measurement;
+</programlisting>
+ will only process the root table.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ <command>INSERT</command> statements with <literal>ON CONFLICT</literal>
+ clauses are unlikely to work as expected, as the <literal>ON CONFLICT</literal>
+ action is only taken in case of unique violations on the specified
+ target relation, not its child relations.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Triggers or rules will be needed to route rows to the desired
+ child table, unless the application is explicitly aware of the
+ partitioning scheme. Triggers may be complicated to write, and will
+ be much slower than the tuple routing performed internally by
+ declarative partitioning.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+ </sect3>
+ </sect2>
+
+ <sect2 id="ddl-partition-pruning">
+ <title>Partition Pruning</title>
+
+ <indexterm>
+ <primary>partition pruning</primary>
+ </indexterm>
+
+ <para>
+ <firstterm>Partition pruning</firstterm> is a query optimization technique
+ that improves performance for declaratively partitioned tables.
+ As an example:
+
+<programlisting>
+SET enable_partition_pruning = on; -- the default
+SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
+</programlisting>
+
+ Without partition pruning, the above query would scan each of the
+ partitions of the <structname>measurement</structname> table. With
+ partition pruning enabled, the planner will examine the definition
+ of each partition and prove that the partition need not
+ be scanned because it could not contain any rows meeting the query's
+ <literal>WHERE</literal> clause. When the planner can prove this, it
+ excludes (<firstterm>prunes</firstterm>) the partition from the query
+ plan.
+ </para>
+
+ <para>
+ By using the EXPLAIN command and the <xref
+ linkend="guc-enable-partition-pruning"/> configuration parameter, it's
+ possible to show the difference between a plan for which partitions have
+ been pruned and one for which they have not. A typical unoptimized
+ plan for this type of table setup is:
+<programlisting>
+SET enable_partition_pruning = off;
+EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;----------------
+ Aggregate (cost=188.76..188.77 rows=1 width=8)
+ -&gt; Append (cost=0.00..181.05 rows=3085 width=0)
+ -&gt; Seq Scan on measurement_y2006m02 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate &gt;= '2008-01-01'::date)
+ -&gt; Seq Scan on measurement_y2006m03 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate &gt;= '2008-01-01'::date)
+...
+ -&gt; Seq Scan on measurement_y2007m11 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate &gt;= '2008-01-01'::date)
+ -&gt; Seq Scan on measurement_y2007m12 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate &gt;= '2008-01-01'::date)
+ -&gt; Seq Scan on measurement_y2008m01 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate &gt;= '2008-01-01'::date)
+</programlisting>
+
+ Some or all of the partitions might use index scans instead of
+ full-table sequential scans, but the point here is that there
+ is no need to scan the older partitions at all to answer this query.
+ When we enable partition pruning, we get a significantly
+ cheaper plan that will deliver the same answer:
+<programlisting>
+SET enable_partition_pruning = on;
+EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;----------------
+ Aggregate (cost=37.75..37.76 rows=1 width=8)
+ -&gt; Seq Scan on measurement_y2008m01 (cost=0.00..33.12 rows=617 width=0)
+ Filter: (logdate &gt;= '2008-01-01'::date)
+</programlisting>
+ </para>
+
+ <para>
+ Note that partition pruning is driven only by the constraints defined
+ implicitly by the partition keys, not by the presence of indexes.
+ Therefore it isn't necessary to define indexes on the key columns.
+ Whether an index needs to be created for a given partition depends on
+ whether you expect that queries that scan the partition will
+ generally scan a large part of the partition or just a small part.
+ An index will be helpful in the latter case but not the former.
+ </para>
+
+ <para>
+ Partition pruning can be performed not only during the planning of a
+ given query, but also during its execution. This is useful as it can
+ allow more partitions to be pruned when clauses contain expressions
+ whose values are not known at query planning time, for example,
+ parameters defined in a <command>PREPARE</command> statement, using a
+ value obtained from a subquery, or using a parameterized value on the
+ inner side of a nested loop join. Partition pruning during execution
+ can be performed at any of the following times:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ During initialization of the query plan. Partition pruning can be
+ performed here for parameter values which are known during the
+ initialization phase of execution. Partitions which are pruned
+ during this stage will not show up in the query's
+ <command>EXPLAIN</command> or <command>EXPLAIN ANALYZE</command>.
+ It is possible to determine the number of partitions which were
+ removed during this phase by observing the
+ <quote>Subplans Removed</quote> property in the
+ <command>EXPLAIN</command> output.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ During actual execution of the query plan. Partition pruning may
+ also be performed here to remove partitions using values which are
+ only known during actual query execution. This includes values
+ from subqueries and values from execution-time parameters such as
+ those from parameterized nested loop joins. Since the value of
+ these parameters may change many times during the execution of the
+ query, partition pruning is performed whenever one of the
+ execution parameters being used by partition pruning changes.
+ Determining if partitions were pruned during this phase requires
+ careful inspection of the <literal>loops</literal> property in
+ the <command>EXPLAIN ANALYZE</command> output. Subplans
+ corresponding to different partitions may have different values
+ for it depending on how many times each of them was pruned during
+ execution. Some may be shown as <literal>(never executed)</literal>
+ if they were pruned every time.
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <para>
+ Partition pruning can be disabled using the
+ <xref linkend="guc-enable-partition-pruning"/> setting.
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-partitioning-constraint-exclusion">
+ <title>Partitioning and Constraint Exclusion</title>
+
+ <indexterm>
+ <primary>constraint exclusion</primary>
+ </indexterm>
+
+ <para>
+ <firstterm>Constraint exclusion</firstterm> is a query optimization
+ technique similar to partition pruning. While it is primarily used
+ for partitioning implemented using the legacy inheritance method, it can be
+ used for other purposes, including with declarative partitioning.
+ </para>
+
+ <para>
+ Constraint exclusion works in a very similar way to partition
+ pruning, except that it uses each table's <literal>CHECK</literal>
+ constraints &mdash; which gives it its name &mdash; whereas partition
+ pruning uses the table's partition bounds, which exist only in the
+ case of declarative partitioning. Another difference is that
+ constraint exclusion is only applied at plan time; there is no attempt
+ to remove partitions at execution time.
+ </para>
+
+ <para>
+ The fact that constraint exclusion uses <literal>CHECK</literal>
+ constraints, which makes it slow compared to partition pruning, can
+ sometimes be used as an advantage: because constraints can be defined
+ even on declaratively-partitioned tables, in addition to their internal
+ partition bounds, constraint exclusion may be able
+ to elide additional partitions from the query plan.
+ </para>
+
+ <para>
+ The default (and recommended) setting of
+ <xref linkend="guc-constraint-exclusion"/> is neither
+ <literal>on</literal> nor <literal>off</literal>, but an intermediate setting
+ called <literal>partition</literal>, which causes the technique to be
+ applied only to queries that are likely to be working on inheritance partitioned
+ tables. The <literal>on</literal> setting causes the planner to examine
+ <literal>CHECK</literal> constraints in all queries, even simple ones that
+ are unlikely to benefit.
+ </para>
+
+ <para>
+ The following caveats apply to constraint exclusion:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Constraint exclusion is only applied during query planning, unlike
+ partition pruning, which can also be applied during query execution.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Constraint exclusion only works when the query's <literal>WHERE</literal>
+ clause contains constants (or externally supplied parameters).
+ For example, a comparison against a non-immutable function such as
+ <function>CURRENT_TIMESTAMP</function> cannot be optimized, since the
+ planner cannot know which child table the function's value might fall
+ into at run time.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Keep the partitioning constraints simple, else the planner may not be
+ able to prove that child tables might not need to be visited. Use simple
+ equality conditions for list partitioning, or simple
+ range tests for range partitioning, as illustrated in the preceding
+ examples. A good rule of thumb is that partitioning constraints should
+ contain only comparisons of the partitioning column(s) to constants
+ using B-tree-indexable operators, because only B-tree-indexable
+ column(s) are allowed in the partition key.
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ All constraints on all children of the parent table are examined
+ during constraint exclusion, so large numbers of children are likely
+ to increase query planning time considerably. So the legacy
+ inheritance based partitioning will work well with up to perhaps a
+ hundred child tables; don't try to use many thousands of children.
+ </para>
+ </listitem>
+
+ </itemizedlist>
+ </para>
+ </sect2>
+
+ <sect2 id="ddl-partitioning-declarative-best-practices">
+ <title>Best Practices for Declarative Partitioning</title>
+
+ <para>
+ The choice of how to partition a table should be made carefully, as the
+ performance of query planning and execution can be negatively affected by
+ poor design.
+ </para>
+
+ <para>
+ One of the most critical design decisions will be the column or columns
+ by which you partition your data. Often the best choice will be to
+ partition by the column or set of columns which most commonly appear in
+ <literal>WHERE</literal> clauses of queries being executed on the
+ partitioned table. <literal>WHERE</literal> clauses that are compatible
+ with the partition bound constraints can be used to prune unneeded
+ partitions. However, you may be forced into making other decisions by
+ requirements for the <literal>PRIMARY KEY</literal> or a
+ <literal>UNIQUE</literal> constraint. Removal of unwanted data is also a
+ factor to consider when planning your partitioning strategy. An entire
+ partition can be detached fairly quickly, so it may be beneficial to
+ design the partition strategy in such a way that all data to be removed
+ at once is located in a single partition.
+ </para>
+
+ <para>
+ Choosing the target number of partitions that the table should be divided
+ into is also a critical decision to make. Not having enough partitions
+ may mean that indexes remain too large and that data locality remains poor
+ which could result in low cache hit ratios. However, dividing the table
+ into too many partitions can also cause issues. Too many partitions can
+ mean longer query planning times and higher memory consumption during both
+ query planning and execution, as further described below.
+ When choosing how to partition your table,
+ it's also important to consider what changes may occur in the future. For
+ example, if you choose to have one partition per customer and you
+ currently have a small number of large customers, consider the
+ implications if in several years you instead find yourself with a large
+ number of small customers. In this case, it may be better to choose to
+ partition by <literal>HASH</literal> and choose a reasonable number of
+ partitions rather than trying to partition by <literal>LIST</literal> and
+ hoping that the number of customers does not increase beyond what it is
+ practical to partition the data by.
+ </para>
+
+ <para>
+ Sub-partitioning can be useful to further divide partitions that are
+ expected to become larger than other partitions.
+ Another option is to use range partitioning with multiple columns in
+ the partition key.
+ Either of these can easily lead to excessive numbers of partitions,
+ so restraint is advisable.
+ </para>
+
+ <para>
+ It is important to consider the overhead of partitioning during
+ query planning and execution. The query planner is generally able to
+ handle partition hierarchies with up to a few thousand partitions fairly
+ well, provided that typical queries allow the query planner to prune all
+ but a small number of partitions. Planning times become longer and memory
+ consumption becomes higher when more partitions remain after the planner
+ performs partition pruning. Another
+ reason to be concerned about having a large number of partitions is that
+ the server's memory consumption may grow significantly over
+ time, especially if many sessions touch large numbers of partitions.
+ That's because each partition requires its metadata to be loaded into the
+ local memory of each session that touches it.
+ </para>
+
+ <para>
+ With data warehouse type workloads, it can make sense to use a larger
+ number of partitions than with an <acronym>OLTP</acronym> type workload.
+ Generally, in data warehouses, query planning time is less of a concern as
+ the majority of processing time is spent during query execution. With
+ either of these two types of workload, it is important to make the right
+ decisions early, as re-partitioning large quantities of data can be
+ painfully slow. Simulations of the intended workload are often beneficial
+ for optimizing the partitioning strategy. Never just assume that more
+ partitions are better than fewer partitions, nor vice-versa.
+ </para>
+ </sect2>
+
+ </sect1>
+
+ <sect1 id="ddl-foreign-data">
+ <title>Foreign Data</title>
+
+ <indexterm>
+ <primary>foreign data</primary>
+ </indexterm>
+ <indexterm>
+ <primary>foreign table</primary>
+ </indexterm>
+ <indexterm>
+ <primary>user mapping</primary>
+ </indexterm>
+
+ <para>
+ <productname>PostgreSQL</productname> implements portions of the SQL/MED
+ specification, allowing you to access data that resides outside
+ PostgreSQL using regular SQL queries. Such data is referred to as
+ <firstterm>foreign data</firstterm>. (Note that this usage is not to be confused
+ with foreign keys, which are a type of constraint within the database.)
+ </para>
+
+ <para>
+ Foreign data is accessed with help from a
+ <firstterm>foreign data wrapper</firstterm>. A foreign data wrapper is a
+ library that can communicate with an external data source, hiding the
+ details of connecting to the data source and obtaining data from it.
+ There are some foreign data wrappers available as <filename>contrib</filename>
+ modules; see <xref linkend="contrib"/>. Other kinds of foreign data
+ wrappers might be found as third party products. If none of the existing
+ foreign data wrappers suit your needs, you can write your own; see <xref
+ linkend="fdwhandler"/>.
+ </para>
+
+ <para>
+ To access foreign data, you need to create a <firstterm>foreign server</firstterm>
+ object, which defines how to connect to a particular external data source
+ according to the set of options used by its supporting foreign data
+ wrapper. Then you need to create one or more <firstterm>foreign
+ tables</firstterm>, which define the structure of the remote data. A
+ foreign table can be used in queries just like a normal table, but a
+ foreign table has no storage in the PostgreSQL server. Whenever it is
+ used, <productname>PostgreSQL</productname> asks the foreign data wrapper
+ to fetch data from the external source, or transmit data to the external
+ source in the case of update commands.
+ </para>
+
+ <para>
+ Accessing remote data may require authenticating to the external
+ data source. This information can be provided by a
+ <firstterm>user mapping</firstterm>, which can provide additional data
+ such as user names and passwords based
+ on the current <productname>PostgreSQL</productname> role.
+ </para>
+
+ <para>
+ For additional information, see
+ <xref linkend="sql-createforeigndatawrapper"/>,
+ <xref linkend="sql-createserver"/>,
+ <xref linkend="sql-createusermapping"/>,
+ <xref linkend="sql-createforeigntable"/>, and
+ <xref linkend="sql-importforeignschema"/>.
+ </para>
+ </sect1>
+
+ <sect1 id="ddl-others">
+ <title>Other Database Objects</title>
+
+ <para>
+ Tables are the central objects in a relational database structure,
+ because they hold your data. But they are not the only objects
+ that exist in a database. Many other kinds of objects can be
+ created to make the use and management of the data more efficient
+ or convenient. They are not discussed in this chapter, but we give
+ you a list here so that you are aware of what is possible:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ Views
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Functions, procedures, and operators
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Data types and domains
+ </para>
+ </listitem>
+
+ <listitem>
+ <para>
+ Triggers and rewrite rules
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ Detailed information on
+ these topics appears in <xref linkend="server-programming"/>.
+ </para>
+ </sect1>
+
+ <sect1 id="ddl-depend">
+ <title>Dependency Tracking</title>
+
+ <indexterm zone="ddl-depend">
+ <primary>CASCADE</primary>
+ <secondary sortas="DROP">with DROP</secondary>
+ </indexterm>
+
+ <indexterm zone="ddl-depend">
+ <primary>RESTRICT</primary>
+ <secondary sortas="DROP">with DROP</secondary>
+ </indexterm>
+
+ <para>
+ When you create complex database structures involving many tables
+ with foreign key constraints, views, triggers, functions, etc. you
+ implicitly create a net of dependencies between the objects.
+ For instance, a table with a foreign key constraint depends on the
+ table it references.
+ </para>
+
+ <para>
+ To ensure the integrity of the entire database structure,
+ <productname>PostgreSQL</productname> makes sure that you cannot
+ drop objects that other objects still depend on. For example,
+ attempting to drop the products table we considered in <xref
+ linkend="ddl-constraints-fk"/>, with the orders table depending on
+ it, would result in an error message like this:
+<screen>
+DROP TABLE products;
+
+ERROR: cannot drop table products because other objects depend on it
+DETAIL: constraint orders_product_no_fkey on table orders depends on table products
+HINT: Use DROP ... CASCADE to drop the dependent objects too.
+</screen>
+ The error message contains a useful hint: if you do not want to
+ bother deleting all the dependent objects individually, you can run:
+<screen>
+DROP TABLE products CASCADE;
+</screen>
+ and all the dependent objects will be removed, as will any objects
+ that depend on them, recursively. In this case, it doesn't remove
+ the orders table, it only removes the foreign key constraint.
+ It stops there because nothing depends on the foreign key constraint.
+ (If you want to check what <command>DROP ... CASCADE</command> will do,
+ run <command>DROP</command> without <literal>CASCADE</literal> and read the
+ <literal>DETAIL</literal> output.)
+ </para>
+
+ <para>
+ Almost all <command>DROP</command> commands in <productname>PostgreSQL</productname> support
+ specifying <literal>CASCADE</literal>. Of course, the nature of
+ the possible dependencies varies with the type of the object. You
+ can also write <literal>RESTRICT</literal> instead of
+ <literal>CASCADE</literal> to get the default behavior, which is to
+ prevent dropping objects that any other objects depend on.
+ </para>
+
+ <note>
+ <para>
+ According to the SQL standard, specifying either
+ <literal>RESTRICT</literal> or <literal>CASCADE</literal> is
+ required in a <command>DROP</command> command. No database system actually
+ enforces that rule, but whether the default behavior
+ is <literal>RESTRICT</literal> or <literal>CASCADE</literal> varies
+ across systems.
+ </para>
+ </note>
+
+ <para>
+ If a <command>DROP</command> command lists multiple
+ objects, <literal>CASCADE</literal> is only required when there are
+ dependencies outside the specified group. For example, when saying
+ <literal>DROP TABLE tab1, tab2</literal> the existence of a foreign
+ key referencing <literal>tab1</literal> from <literal>tab2</literal> would not mean
+ that <literal>CASCADE</literal> is needed to succeed.
+ </para>
+
+ <para>
+ For a user-defined function or procedure whose body is defined as a string
+ literal, <productname>PostgreSQL</productname> tracks
+ dependencies associated with the function's externally-visible properties,
+ such as its argument and result types, but <emphasis>not</emphasis> dependencies
+ that could only be known by examining the function body. As an example,
+ consider this situation:
+
+<programlisting>
+CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow',
+ 'green', 'blue', 'purple');
+
+CREATE TABLE my_colors (color rainbow, note text);
+
+CREATE FUNCTION get_color_note (rainbow) RETURNS text AS
+ 'SELECT note FROM my_colors WHERE color = $1'
+ LANGUAGE SQL;
+</programlisting>
+
+ (See <xref linkend="xfunc-sql"/> for an explanation of SQL-language
+ functions.) <productname>PostgreSQL</productname> will be aware that
+ the <function>get_color_note</function> function depends on the <type>rainbow</type>
+ type: dropping the type would force dropping the function, because its
+ argument type would no longer be defined. But <productname>PostgreSQL</productname>
+ will not consider <function>get_color_note</function> to depend on
+ the <structname>my_colors</structname> table, and so will not drop the function if
+ the table is dropped. While there are disadvantages to this approach,
+ there are also benefits. The function is still valid in some sense if the
+ table is missing, though executing it would cause an error; creating a new
+ table of the same name would allow the function to work again.
+ </para>
+
+ <para>
+ On the other hand, for a SQL-language function or procedure whose body
+ is written in SQL-standard style, the body is parsed at function
+ definition time and all dependencies recognized by the parser are
+ stored. Thus, if we write the function above as
+
+<programlisting>
+CREATE FUNCTION get_color_note (rainbow) RETURNS text
+BEGIN ATOMIC
+ SELECT note FROM my_colors WHERE color = $1;
+END;
+</programlisting>
+
+ then the function's dependency on the <structname>my_colors</structname>
+ table will be known and enforced by <command>DROP</command>.
+ </para>
+ </sect1>
+
+</chapter>