diff options
Diffstat (limited to '')
-rw-r--r-- | doc/src/sgml/man7/ANALYZE.7 | 219 |
1 files changed, 219 insertions, 0 deletions
diff --git a/doc/src/sgml/man7/ANALYZE.7 b/doc/src/sgml/man7/ANALYZE.7 new file mode 100644 index 0000000..e168653 --- /dev/null +++ b/doc/src/sgml/man7/ANALYZE.7 @@ -0,0 +1,219 @@ +'\" t +.\" Title: ANALYZE +.\" Author: The PostgreSQL Global Development Group +.\" Generator: DocBook XSL Stylesheets vsnapshot <http://docbook.sf.net/> +.\" Date: 2022 +.\" Manual: PostgreSQL 14.5 Documentation +.\" Source: PostgreSQL 14.5 +.\" Language: English +.\" +.TH "ANALYZE" "7" "2022" "PostgreSQL 14.5" "PostgreSQL 14.5 Documentation" +.\" ----------------------------------------------------------------- +.\" * Define some portability stuff +.\" ----------------------------------------------------------------- +.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +.\" http://bugs.debian.org/507673 +.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html +.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +.ie \n(.g .ds Aq \(aq +.el .ds Aq ' +.\" ----------------------------------------------------------------- +.\" * set default formatting +.\" ----------------------------------------------------------------- +.\" disable hyphenation +.nh +.\" disable justification (adjust text to left margin only) +.ad l +.\" ----------------------------------------------------------------- +.\" * MAIN CONTENT STARTS HERE * +.\" ----------------------------------------------------------------- +.SH "NAME" +ANALYZE \- collect statistics about a database +.SH "SYNOPSIS" +.sp +.nf +ANALYZE [ ( \fIoption\fR [, \&.\&.\&.] ) ] [ \fItable_and_columns\fR [, \&.\&.\&.] ] +ANALYZE [ VERBOSE ] [ \fItable_and_columns\fR [, \&.\&.\&.] ] + +where \fIoption\fR can be one of: + + VERBOSE [ \fIboolean\fR ] + SKIP_LOCKED [ \fIboolean\fR ] + +and \fItable_and_columns\fR is: + + \fItable_name\fR [ ( \fIcolumn_name\fR [, \&.\&.\&.] ) ] +.fi +.SH "DESCRIPTION" +.PP +\fBANALYZE\fR +collects statistics about the contents of tables in the database, and stores the results in the +pg_statistic +system catalog\&. Subsequently, the query planner uses these statistics to help determine the most efficient execution plans for queries\&. +.PP +Without a +\fItable_and_columns\fR +list, +\fBANALYZE\fR +processes every table and materialized view in the current database that the current user has permission to analyze\&. With a list, +\fBANALYZE\fR +processes only those table(s)\&. It is further possible to give a list of column names for a table, in which case only the statistics for those columns are collected\&. +.PP +When the option list is surrounded by parentheses, the options can be written in any order\&. The parenthesized syntax was added in +PostgreSQL +11; the unparenthesized syntax is deprecated\&. +.SH "PARAMETERS" +.PP +VERBOSE +.RS 4 +Enables display of progress messages\&. +.RE +.PP +SKIP_LOCKED +.RS 4 +Specifies that +\fBANALYZE\fR +should not wait for any conflicting locks to be released when beginning work on a relation: if a relation cannot be locked immediately without waiting, the relation is skipped\&. Note that even with this option, +\fBANALYZE\fR +may still block when opening the relation\*(Aqs indexes or when acquiring sample rows from partitions, table inheritance children, and some types of foreign tables\&. Also, while +\fBANALYZE\fR +ordinarily processes all partitions of specified partitioned tables, this option will cause +\fBANALYZE\fR +to skip all partitions if there is a conflicting lock on the partitioned table\&. +.RE +.PP +\fIboolean\fR +.RS 4 +Specifies whether the selected option should be turned on or off\&. You can write +TRUE, +ON, or +1 +to enable the option, and +FALSE, +OFF, or +0 +to disable it\&. The +\fIboolean\fR +value can also be omitted, in which case +TRUE +is assumed\&. +.RE +.PP +\fItable_name\fR +.RS 4 +The name (possibly schema\-qualified) of a specific table to analyze\&. If omitted, all regular tables, partitioned tables, and materialized views in the current database are analyzed (but not foreign tables)\&. If the specified table is a partitioned table, both the inheritance statistics of the partitioned table as a whole and statistics of the individual partitions are updated\&. +.RE +.PP +\fIcolumn_name\fR +.RS 4 +The name of a specific column to analyze\&. Defaults to all columns\&. +.RE +.SH "OUTPUTS" +.PP +When +VERBOSE +is specified, +\fBANALYZE\fR +emits progress messages to indicate which table is currently being processed\&. Various statistics about the tables are printed as well\&. +.SH "NOTES" +.PP +To analyze a table, one must ordinarily be the table\*(Aqs owner or a superuser\&. However, database owners are allowed to analyze all tables in their databases, except shared catalogs\&. (The restriction for shared catalogs means that a true database\-wide +\fBANALYZE\fR +can only be performed by a superuser\&.) +\fBANALYZE\fR +will skip over any tables that the calling user does not have permission to analyze\&. +.PP +Foreign tables are analyzed only when explicitly selected\&. Not all foreign data wrappers support +\fBANALYZE\fR\&. If the table\*(Aqs wrapper does not support +\fBANALYZE\fR, the command prints a warning and does nothing\&. +.PP +In the default +PostgreSQL +configuration, the autovacuum daemon (see +Section\ \&25.1.6) takes care of automatic analyzing of tables when they are first loaded with data, and as they change throughout regular operation\&. When autovacuum is disabled, it is a good idea to run +\fBANALYZE\fR +periodically, or just after making major changes in the contents of a table\&. Accurate statistics will help the planner to choose the most appropriate query plan, and thereby improve the speed of query processing\&. A common strategy for read\-mostly databases is to run +\fBVACUUM\fR +and +\fBANALYZE\fR +once a day during a low\-usage time of day\&. (This will not be sufficient if there is heavy update activity\&.) +.PP +\fBANALYZE\fR +requires only a read lock on the target table, so it can run in parallel with other activity on the table\&. +.PP +The statistics collected by +\fBANALYZE\fR +usually include a list of some of the most common values in each column and a histogram showing the approximate data distribution in each column\&. One or both of these can be omitted if +\fBANALYZE\fR +deems them uninteresting (for example, in a unique\-key column, there are no common values) or if the column data type does not support the appropriate operators\&. There is more information about the statistics in +Chapter\ \&25\&. +.PP +For large tables, +\fBANALYZE\fR +takes a random sample of the table contents, rather than examining every row\&. This allows even very large tables to be analyzed in a small amount of time\&. Note, however, that the statistics are only approximate, and will change slightly each time +\fBANALYZE\fR +is run, even if the actual table contents did not change\&. This might result in small changes in the planner\*(Aqs estimated costs shown by +\fBEXPLAIN\fR\&. In rare situations, this non\-determinism will cause the planner\*(Aqs choices of query plans to change after +\fBANALYZE\fR +is run\&. To avoid this, raise the amount of statistics collected by +\fBANALYZE\fR, as described below\&. +.PP +The extent of analysis can be controlled by adjusting the +default_statistics_target +configuration variable, or on a column\-by\-column basis by setting the per\-column statistics target with +\fBALTER TABLE \&.\&.\&. ALTER COLUMN \&.\&.\&. SET STATISTICS\fR\&. The target value sets the maximum number of entries in the most\-common\-value list and the maximum number of bins in the histogram\&. The default target value is 100, but this can be adjusted up or down to trade off accuracy of planner estimates against the time taken for +\fBANALYZE\fR +and the amount of space occupied in +pg_statistic\&. In particular, setting the statistics target to zero disables collection of statistics for that column\&. It might be useful to do that for columns that are never used as part of the +WHERE, +GROUP BY, or +ORDER BY +clauses of queries, since the planner will have no use for statistics on such columns\&. +.PP +The largest statistics target among the columns being analyzed determines the number of table rows sampled to prepare the statistics\&. Increasing the target causes a proportional increase in the time and space needed to do +\fBANALYZE\fR\&. +.PP +One of the values estimated by +\fBANALYZE\fR +is the number of distinct values that appear in each column\&. Because only a subset of the rows are examined, this estimate can sometimes be quite inaccurate, even with the largest possible statistics target\&. If this inaccuracy leads to bad query plans, a more accurate value can be determined manually and then installed with +\fBALTER TABLE \&.\&.\&. ALTER COLUMN \&.\&.\&. SET (n_distinct = \&.\&.\&.)\fR\&. +.PP +If the table being analyzed has one or more children, +\fBANALYZE\fR +will gather statistics twice: once on the rows of the parent table only, and a second time on the rows of the parent table with all of its children\&. This second set of statistics is needed when planning queries that traverse the entire inheritance tree\&. The autovacuum daemon, however, will only consider inserts or updates on the parent table itself when deciding whether to trigger an automatic analyze for that table\&. If that table is rarely inserted into or updated, the inheritance statistics will not be up to date unless you run +\fBANALYZE\fR +manually\&. +.PP +For partitioned tables, +\fBANALYZE\fR +gathers statistics by sampling rows from all partitions; in addition, it will recurse into each partition and update its statistics\&. Each leaf partition is analyzed only once, even with multi\-level partitioning\&. No statistics are collected for only the parent table (without data from its partitions), because with partitioning it\*(Aqs guaranteed to be empty\&. +.PP +By contrast, if the table being analyzed has inheritance children, +\fBANALYZE\fR +gathers two sets of statistics: one on the rows of the parent table only, and a second including rows of both the parent table and all of its children\&. This second set of statistics is needed when planning queries that process the inheritance tree as a whole\&. The child tables themselves are not individually analyzed in this case\&. +.PP +The autovacuum daemon does not process partitioned tables, nor does it process inheritance parents if only the children are ever modified\&. It is usually necessary to periodically run a manual +\fBANALYZE\fR +to keep the statistics of the table hierarchy up to date\&. +.PP +If any child tables or partitions are foreign tables whose foreign data wrappers do not support +\fBANALYZE\fR, those tables are ignored while gathering inheritance statistics\&. +.PP +If the table being analyzed is completely empty, +\fBANALYZE\fR +will not record new statistics for that table\&. Any existing statistics will be retained\&. +.PP +Each backend running +\fBANALYZE\fR +will report its progress in the +pg_stat_progress_analyze +view\&. See +Section\ \&28.4.1 +for details\&. +.SH "COMPATIBILITY" +.PP +There is no +\fBANALYZE\fR +statement in the SQL standard\&. +.SH "SEE ALSO" +\fBVACUUM\fR(7), \fBvacuumdb\fR(1), Section\ \&20.4.4, Section\ \&25.1.6, Section\ \&28.4.1 |