1 files changed, 355 insertions, 0 deletions
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
new file mode 100644
index 0000000..73fa3b3
--- /dev/null
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -0,0 +1,355 @@
+<!--
+doc/src/sgml/ref/analyze.sgml
+PostgreSQL documentation
+-->
+
+<refentry id="sql-analyze">
+ <indexterm zone="sql-analyze">
+  <primary>ANALYZE</primary>
+ </indexterm>
+
+ <refmeta>
+  <refentrytitle>ANALYZE</refentrytitle>
+  <manvolnum>7</manvolnum>
+  <refmiscinfo>SQL - Language Statements</refmiscinfo>
+ </refmeta>
+
+ <refnamediv>
+  <refname>ANALYZE</refname>
+  <refpurpose>collect statistics about a database</refpurpose>
+ </refnamediv>
+
+ <refsynopsisdiv>
+<synopsis>
+ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+ANALYZE [ VERBOSE ] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+
+<phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
+
+    VERBOSE [ <replaceable class="parameter">boolean</replaceable> ]
+    SKIP_LOCKED [ <replaceable class="parameter">boolean</replaceable> ]
+    BUFFER_USAGE_LIMIT <replaceable class="parameter">size</replaceable>
+
+<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
+
+    <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+</synopsis>
+ </refsynopsisdiv>
+
+ <refsect1>
+  <title>Description</title>
+
+  <para>
+   <command>ANALYZE</command> collects statistics about the contents
+   of tables in the database, and stores the results in the <link
+   linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>
+   system catalog.  Subsequently, the query planner uses these
+   statistics to help determine the most efficient execution plans for
+   queries.
+  </para>
+
+  <para>
+   Without a <replaceable class="parameter">table_and_columns</replaceable>
+   list, <command>ANALYZE</command> processes every table and materialized view
+   in the current database that the current user has permission to analyze.
+   With a list, <command>ANALYZE</command> processes only those table(s).
+   It is further possible to give a list of column names for a table,
+   in which case only the statistics for those columns are collected.
+  </para>
+
+  <para>
+   When the option list is surrounded by parentheses, the options can be
+   written in any order.  The parenthesized syntax was added in
+   <productname>PostgreSQL</productname> 11;  the unparenthesized syntax
+   is deprecated.
+  </para>
+ </refsect1>
+
+ <refsect1>
+  <title>Parameters</title>
+
+  <variablelist>
+   <varlistentry>
+    <term><literal>VERBOSE</literal></term>
+    <listitem>
+     <para>
+      Enables display of progress messages.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>SKIP_LOCKED</literal></term>
+    <listitem>
+     <para>
+      Specifies that <command>ANALYZE</command> should not wait for any
+      conflicting locks to be released when beginning work on a relation:
+      if a relation cannot be locked immediately without waiting, the relation
+      is skipped.  Note that even with this option, <command>ANALYZE</command>
+      may still block when opening the relation's indexes or when acquiring
+      sample rows from partitions, table inheritance children, and some
+      types of foreign tables.  Also, while <command>ANALYZE</command>
+      ordinarily processes all partitions of specified partitioned tables,
+      this option will cause <command>ANALYZE</command> to skip all
+      partitions if there is a conflicting lock on the partitioned table.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><literal>BUFFER_USAGE_LIMIT</literal></term>
+    <listitem>
+     <para>
+      Specifies the
+      <glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm>
+      ring buffer size for <command>ANALYZE</command>.  This size is used to
+      calculate the number of shared buffers which will be reused as part of
+      this strategy.  <literal>0</literal> disables use of a
+      <literal>Buffer Access Strategy</literal>.   When this option is not
+      specified, <command>ANALYZE</command> uses the value from
+      <xref linkend="guc-vacuum-buffer-usage-limit"/>.  Higher settings can
+      allow <command>ANALYZE</command> to run more quickly, but having too
+      large a setting may cause too many other useful pages to be evicted from
+      shared buffers.  The minimum value is <literal>128 kB</literal> and the
+      maximum value is <literal>16 GB</literal>.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">boolean</replaceable></term>
+    <listitem>
+     <para>
+      Specifies whether the selected option should be turned on or off.
+      You can write <literal>TRUE</literal>, <literal>ON</literal>, or
+      <literal>1</literal> to enable the option, and <literal>FALSE</literal>,
+      <literal>OFF</literal>, or <literal>0</literal> to disable it.  The
+      <replaceable class="parameter">boolean</replaceable> value can also
+      be omitted, in which case <literal>TRUE</literal> is assumed.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">size</replaceable></term>
+    <listitem>
+     <para>
+      Specifies an amount of memory in kilobytes.  Sizes may also be specified
+      as a string containing the numerical size followed by any one of the
+      following memory units: <literal>B</literal> (bytes),
+      <literal>kB</literal> (kilobytes), <literal>MB</literal> (megabytes),
+      <literal>GB</literal> (gigabytes), or <literal>TB</literal> (terabytes).
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">table_name</replaceable></term>
+    <listitem>
+     <para>
+      The name (possibly schema-qualified) of a specific table to
+      analyze.  If omitted, all regular tables, partitioned tables, and
+      materialized views in the current database are analyzed (but not
+      foreign tables).  If the specified table is a partitioned table, both the
+      inheritance statistics of the partitioned table as a whole and
+      statistics of the individual partitions are updated.
+     </para>
+    </listitem>
+   </varlistentry>
+
+   <varlistentry>
+    <term><replaceable class="parameter">column_name</replaceable></term>
+    <listitem>
+     <para>
+      The name of a specific column to analyze. Defaults to all columns.
+     </para>
+    </listitem>
+   </varlistentry>
+  </variablelist>
+ </refsect1>
+
+ <refsect1>
+  <title>Outputs</title>
+
+   <para>
+    When <literal>VERBOSE</literal> is specified, <command>ANALYZE</command> emits
+    progress messages to indicate which table is currently being
+    processed.  Various statistics about the tables are printed as well.
+   </para>
+ </refsect1>
+
+ <refsect1>
+  <title>Notes</title>
+
+  <para>
+   To analyze a table, one must ordinarily be the table's owner or a
+   superuser.  However, database owners are allowed to
+   analyze all tables in their databases, except shared catalogs.
+   (The restriction for shared catalogs means that a true database-wide
+   <command>ANALYZE</command> can only be performed by a superuser.)
+   <command>ANALYZE</command> will skip over any tables that the calling user
+   does not have permission to analyze.
+  </para>
+
+  <para>
+   Foreign tables are analyzed only when explicitly selected.  Not all
+   foreign data wrappers support <command>ANALYZE</command>.  If the table's
+   wrapper does not support <command>ANALYZE</command>, the command prints a
+   warning and does nothing.
+  </para>
+
+  <para>
+   In the default <productname>PostgreSQL</productname> configuration,
+   the autovacuum daemon (see <xref linkend="autovacuum"/>)
+   takes care of automatic analyzing of tables when they are first loaded
+   with data, and as they change throughout regular operation.
+   When autovacuum is disabled,
+   it is a good idea to run <command>ANALYZE</command> periodically, or
+   just after making major changes in the contents of a table.  Accurate
+   statistics will help the planner to choose the most appropriate query
+   plan, and thereby improve the speed of query processing.  A common
+   strategy for read-mostly databases is to run <link linkend="sql-vacuum"><command>VACUUM</command></link>
+   and <command>ANALYZE</command> once a day during a low-usage time of day.
+   (This will not be sufficient if there is heavy update activity.)
+  </para>
+
+  <para>
+   <command>ANALYZE</command>
+   requires only a read lock on the target table, so it can run in
+   parallel with other activity on the table.
+  </para>
+
+  <para>
+   The statistics collected by <command>ANALYZE</command> usually
+   include a list of some of the most common values in each column and
+   a histogram showing the approximate data distribution in each
+   column.  One or both of these can be omitted if
+   <command>ANALYZE</command> deems them uninteresting (for example,
+   in a unique-key column, there are no common values) or if the
+   column data type does not support the appropriate operators.  There
+   is more information about the statistics in <xref
+   linkend="maintenance"/>.
+  </para>
+
+  <para>
+   For large tables, <command>ANALYZE</command> takes a random sample
+   of the table contents, rather than examining every row.  This
+   allows even very large tables to be analyzed in a small amount of
+   time.  Note, however, that the statistics are only approximate, and
+   will change slightly each time <command>ANALYZE</command> is run,
+   even if the actual table contents did not change.  This might result
+   in small changes in the planner's estimated costs shown by
+   <link linkend="sql-explain"><command>EXPLAIN</command></link>.
+   In rare situations, this non-determinism will cause the planner's
+   choices of query plans to change after <command>ANALYZE</command> is run.
+   To avoid this, raise the amount of statistics collected by
+   <command>ANALYZE</command>, as described below.
+  </para>
+
+  <para>
+   The extent of analysis can be controlled by adjusting the
+   <xref linkend="guc-default-statistics-target"/> configuration variable, or
+   on a column-by-column basis by setting the per-column statistics
+   target with <link linkend="sql-altertable"><command>ALTER TABLE ... ALTER COLUMN ... SET
+   STATISTICS</command></link>.
+   The target value sets the
+   maximum number of entries in the most-common-value list and the
+   maximum number of bins in the histogram.  The default target value
+   is 100, but this can be adjusted up or down to trade off accuracy of
+   planner estimates against the time taken for
+   <command>ANALYZE</command> and the amount of space occupied in
+   <literal>pg_statistic</literal>.  In particular, setting the
+   statistics target to zero disables collection of statistics for
+   that column.  It might be useful to do that for columns that are
+   never used as part of the <literal>WHERE</literal>, <literal>GROUP BY</literal>,
+   or <literal>ORDER BY</literal> clauses of queries, since the planner will
+   have no use for statistics on such columns.
+  </para>
+
+  <para>
+   The largest statistics target among the columns being analyzed determines
+   the number of table rows sampled to prepare the statistics.  Increasing
+   the target causes a proportional increase in the time and space needed
+   to do <command>ANALYZE</command>.
+  </para>
+
+  <para>
+   One of the values estimated by <command>ANALYZE</command> is the number of
+   distinct values that appear in each column.  Because only a subset of the
+   rows are examined, this estimate can sometimes be quite inaccurate, even
+   with the largest possible statistics target.  If this inaccuracy leads to
+   bad query plans, a more accurate value can be determined manually and then
+   installed with
+   <link linkend="sql-altertable"><command>ALTER TABLE ... ALTER COLUMN ... SET (n_distinct = ...)</command></link>.
+  </para>
+
+  <para>
+    If the table being analyzed has inheritance children,
+    <command>ANALYZE</command> gathers two sets of statistics: one on the rows
+    of the parent table only, and a second including rows of both the parent
+    table and all of its children.  This second set of statistics is needed when
+    planning queries that process the inheritance tree as a whole.  The child
+    tables themselves are not individually analyzed in this case.
+    The autovacuum daemon, however, will only consider inserts or
+    updates on the parent table itself when deciding whether to trigger an
+    automatic analyze for that table.  If that table is rarely inserted into
+    or updated, the inheritance statistics will not be up to date unless you
+    run <command>ANALYZE</command> manually.
+  </para>
+
+  <para>
+    For partitioned tables, <command>ANALYZE</command> gathers statistics by
+    sampling rows from all partitions; in addition, it will recurse into each
+    partition and update its statistics.  Each leaf partition is analyzed only
+    once, even with multi-level partitioning.  No statistics are collected for
+    only the parent table (without data from its partitions), because with
+    partitioning it's guaranteed to be empty.
+  </para>
+
+  <para>
+    The autovacuum daemon does not process partitioned tables, nor does it
+    process inheritance parents if only the children are ever modified.
+    It is usually necessary to periodically run a manual
+    <command>ANALYZE</command> to keep the statistics of the table hierarchy
+    up to date.
+  </para>
+
+  <para>
+    If any child tables or partitions are foreign tables whose foreign
+    data wrappers do not support <command>ANALYZE</command>, those tables are
+    ignored while gathering inheritance statistics.
+  </para>
+
+  <para>
+    If the table being analyzed is completely empty, <command>ANALYZE</command>
+    will not record new statistics for that table.  Any existing statistics
+    will be retained.
+  </para>
+
+  <para>
+    Each backend running <command>ANALYZE</command> will report its progress
+    in the <structname>pg_stat_progress_analyze</structname> view. See
+    <xref linkend="analyze-progress-reporting"/> for details.
+  </para>
+ </refsect1>
+
+ <refsect1>
+  <title>Compatibility</title>
+
+  <para>
+   There is no <command>ANALYZE</command> statement in the SQL standard.
+  </para>
+ </refsect1>
+
+ <refsect1>
+  <title>See Also</title>
+
+  <simplelist type="inline">
+   <member><xref linkend="sql-vacuum"/></member>
+   <member><xref linkend="app-vacuumdb"/></member>
+   <member><xref linkend="runtime-config-resource-vacuum-cost"/></member>
+   <member><xref linkend="autovacuum"/></member>
+   <member><xref linkend="analyze-progress-reporting"/></member>
+  </simplelist>
+ </refsect1>
+</refentry>