summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/man7/ANALYZE.7
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--doc/src/sgml/man7/ANALYZE.7219
1 files changed, 219 insertions, 0 deletions
diff --git a/doc/src/sgml/man7/ANALYZE.7 b/doc/src/sgml/man7/ANALYZE.7
new file mode 100644
index 0000000..e168653
--- /dev/null
+++ b/doc/src/sgml/man7/ANALYZE.7
@@ -0,0 +1,219 @@
+'\" t
+.\" Title: ANALYZE
+.\" Author: The PostgreSQL Global Development Group
+.\" Generator: DocBook XSL Stylesheets vsnapshot <http://docbook.sf.net/>
+.\" Date: 2022
+.\" Manual: PostgreSQL 14.5 Documentation
+.\" Source: PostgreSQL 14.5
+.\" Language: English
+.\"
+.TH "ANALYZE" "7" "2022" "PostgreSQL 14.5" "PostgreSQL 14.5 Documentation"
+.\" -----------------------------------------------------------------
+.\" * Define some portability stuff
+.\" -----------------------------------------------------------------
+.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.\" http://bugs.debian.org/507673
+.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
+.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.ie \n(.g .ds Aq \(aq
+.el .ds Aq '
+.\" -----------------------------------------------------------------
+.\" * set default formatting
+.\" -----------------------------------------------------------------
+.\" disable hyphenation
+.nh
+.\" disable justification (adjust text to left margin only)
+.ad l
+.\" -----------------------------------------------------------------
+.\" * MAIN CONTENT STARTS HERE *
+.\" -----------------------------------------------------------------
+.SH "NAME"
+ANALYZE \- collect statistics about a database
+.SH "SYNOPSIS"
+.sp
+.nf
+ANALYZE [ ( \fIoption\fR [, \&.\&.\&.] ) ] [ \fItable_and_columns\fR [, \&.\&.\&.] ]
+ANALYZE [ VERBOSE ] [ \fItable_and_columns\fR [, \&.\&.\&.] ]
+
+where \fIoption\fR can be one of:
+
+ VERBOSE [ \fIboolean\fR ]
+ SKIP_LOCKED [ \fIboolean\fR ]
+
+and \fItable_and_columns\fR is:
+
+ \fItable_name\fR [ ( \fIcolumn_name\fR [, \&.\&.\&.] ) ]
+.fi
+.SH "DESCRIPTION"
+.PP
+\fBANALYZE\fR
+collects statistics about the contents of tables in the database, and stores the results in the
+pg_statistic
+system catalog\&. Subsequently, the query planner uses these statistics to help determine the most efficient execution plans for queries\&.
+.PP
+Without a
+\fItable_and_columns\fR
+list,
+\fBANALYZE\fR
+processes every table and materialized view in the current database that the current user has permission to analyze\&. With a list,
+\fBANALYZE\fR
+processes only those table(s)\&. It is further possible to give a list of column names for a table, in which case only the statistics for those columns are collected\&.
+.PP
+When the option list is surrounded by parentheses, the options can be written in any order\&. The parenthesized syntax was added in
+PostgreSQL
+11; the unparenthesized syntax is deprecated\&.
+.SH "PARAMETERS"
+.PP
+VERBOSE
+.RS 4
+Enables display of progress messages\&.
+.RE
+.PP
+SKIP_LOCKED
+.RS 4
+Specifies that
+\fBANALYZE\fR
+should not wait for any conflicting locks to be released when beginning work on a relation: if a relation cannot be locked immediately without waiting, the relation is skipped\&. Note that even with this option,
+\fBANALYZE\fR
+may still block when opening the relation\*(Aqs indexes or when acquiring sample rows from partitions, table inheritance children, and some types of foreign tables\&. Also, while
+\fBANALYZE\fR
+ordinarily processes all partitions of specified partitioned tables, this option will cause
+\fBANALYZE\fR
+to skip all partitions if there is a conflicting lock on the partitioned table\&.
+.RE
+.PP
+\fIboolean\fR
+.RS 4
+Specifies whether the selected option should be turned on or off\&. You can write
+TRUE,
+ON, or
+1
+to enable the option, and
+FALSE,
+OFF, or
+0
+to disable it\&. The
+\fIboolean\fR
+value can also be omitted, in which case
+TRUE
+is assumed\&.
+.RE
+.PP
+\fItable_name\fR
+.RS 4
+The name (possibly schema\-qualified) of a specific table to analyze\&. If omitted, all regular tables, partitioned tables, and materialized views in the current database are analyzed (but not foreign tables)\&. If the specified table is a partitioned table, both the inheritance statistics of the partitioned table as a whole and statistics of the individual partitions are updated\&.
+.RE
+.PP
+\fIcolumn_name\fR
+.RS 4
+The name of a specific column to analyze\&. Defaults to all columns\&.
+.RE
+.SH "OUTPUTS"
+.PP
+When
+VERBOSE
+is specified,
+\fBANALYZE\fR
+emits progress messages to indicate which table is currently being processed\&. Various statistics about the tables are printed as well\&.
+.SH "NOTES"
+.PP
+To analyze a table, one must ordinarily be the table\*(Aqs owner or a superuser\&. However, database owners are allowed to analyze all tables in their databases, except shared catalogs\&. (The restriction for shared catalogs means that a true database\-wide
+\fBANALYZE\fR
+can only be performed by a superuser\&.)
+\fBANALYZE\fR
+will skip over any tables that the calling user does not have permission to analyze\&.
+.PP
+Foreign tables are analyzed only when explicitly selected\&. Not all foreign data wrappers support
+\fBANALYZE\fR\&. If the table\*(Aqs wrapper does not support
+\fBANALYZE\fR, the command prints a warning and does nothing\&.
+.PP
+In the default
+PostgreSQL
+configuration, the autovacuum daemon (see
+Section\ \&25.1.6) takes care of automatic analyzing of tables when they are first loaded with data, and as they change throughout regular operation\&. When autovacuum is disabled, it is a good idea to run
+\fBANALYZE\fR
+periodically, or just after making major changes in the contents of a table\&. Accurate statistics will help the planner to choose the most appropriate query plan, and thereby improve the speed of query processing\&. A common strategy for read\-mostly databases is to run
+\fBVACUUM\fR
+and
+\fBANALYZE\fR
+once a day during a low\-usage time of day\&. (This will not be sufficient if there is heavy update activity\&.)
+.PP
+\fBANALYZE\fR
+requires only a read lock on the target table, so it can run in parallel with other activity on the table\&.
+.PP
+The statistics collected by
+\fBANALYZE\fR
+usually include a list of some of the most common values in each column and a histogram showing the approximate data distribution in each column\&. One or both of these can be omitted if
+\fBANALYZE\fR
+deems them uninteresting (for example, in a unique\-key column, there are no common values) or if the column data type does not support the appropriate operators\&. There is more information about the statistics in
+Chapter\ \&25\&.
+.PP
+For large tables,
+\fBANALYZE\fR
+takes a random sample of the table contents, rather than examining every row\&. This allows even very large tables to be analyzed in a small amount of time\&. Note, however, that the statistics are only approximate, and will change slightly each time
+\fBANALYZE\fR
+is run, even if the actual table contents did not change\&. This might result in small changes in the planner\*(Aqs estimated costs shown by
+\fBEXPLAIN\fR\&. In rare situations, this non\-determinism will cause the planner\*(Aqs choices of query plans to change after
+\fBANALYZE\fR
+is run\&. To avoid this, raise the amount of statistics collected by
+\fBANALYZE\fR, as described below\&.
+.PP
+The extent of analysis can be controlled by adjusting the
+default_statistics_target
+configuration variable, or on a column\-by\-column basis by setting the per\-column statistics target with
+\fBALTER TABLE \&.\&.\&. ALTER COLUMN \&.\&.\&. SET STATISTICS\fR\&. The target value sets the maximum number of entries in the most\-common\-value list and the maximum number of bins in the histogram\&. The default target value is 100, but this can be adjusted up or down to trade off accuracy of planner estimates against the time taken for
+\fBANALYZE\fR
+and the amount of space occupied in
+pg_statistic\&. In particular, setting the statistics target to zero disables collection of statistics for that column\&. It might be useful to do that for columns that are never used as part of the
+WHERE,
+GROUP BY, or
+ORDER BY
+clauses of queries, since the planner will have no use for statistics on such columns\&.
+.PP
+The largest statistics target among the columns being analyzed determines the number of table rows sampled to prepare the statistics\&. Increasing the target causes a proportional increase in the time and space needed to do
+\fBANALYZE\fR\&.
+.PP
+One of the values estimated by
+\fBANALYZE\fR
+is the number of distinct values that appear in each column\&. Because only a subset of the rows are examined, this estimate can sometimes be quite inaccurate, even with the largest possible statistics target\&. If this inaccuracy leads to bad query plans, a more accurate value can be determined manually and then installed with
+\fBALTER TABLE \&.\&.\&. ALTER COLUMN \&.\&.\&. SET (n_distinct = \&.\&.\&.)\fR\&.
+.PP
+If the table being analyzed has one or more children,
+\fBANALYZE\fR
+will gather statistics twice: once on the rows of the parent table only, and a second time on the rows of the parent table with all of its children\&. This second set of statistics is needed when planning queries that traverse the entire inheritance tree\&. The autovacuum daemon, however, will only consider inserts or updates on the parent table itself when deciding whether to trigger an automatic analyze for that table\&. If that table is rarely inserted into or updated, the inheritance statistics will not be up to date unless you run
+\fBANALYZE\fR
+manually\&.
+.PP
+For partitioned tables,
+\fBANALYZE\fR
+gathers statistics by sampling rows from all partitions; in addition, it will recurse into each partition and update its statistics\&. Each leaf partition is analyzed only once, even with multi\-level partitioning\&. No statistics are collected for only the parent table (without data from its partitions), because with partitioning it\*(Aqs guaranteed to be empty\&.
+.PP
+By contrast, if the table being analyzed has inheritance children,
+\fBANALYZE\fR
+gathers two sets of statistics: one on the rows of the parent table only, and a second including rows of both the parent table and all of its children\&. This second set of statistics is needed when planning queries that process the inheritance tree as a whole\&. The child tables themselves are not individually analyzed in this case\&.
+.PP
+The autovacuum daemon does not process partitioned tables, nor does it process inheritance parents if only the children are ever modified\&. It is usually necessary to periodically run a manual
+\fBANALYZE\fR
+to keep the statistics of the table hierarchy up to date\&.
+.PP
+If any child tables or partitions are foreign tables whose foreign data wrappers do not support
+\fBANALYZE\fR, those tables are ignored while gathering inheritance statistics\&.
+.PP
+If the table being analyzed is completely empty,
+\fBANALYZE\fR
+will not record new statistics for that table\&. Any existing statistics will be retained\&.
+.PP
+Each backend running
+\fBANALYZE\fR
+will report its progress in the
+pg_stat_progress_analyze
+view\&. See
+Section\ \&28.4.1
+for details\&.
+.SH "COMPATIBILITY"
+.PP
+There is no
+\fBANALYZE\fR
+statement in the SQL standard\&.
+.SH "SEE ALSO"
+\fBVACUUM\fR(7), \fBvacuumdb\fR(1), Section\ \&20.4.4, Section\ \&25.1.6, Section\ \&28.4.1