Diffstat (limited to 'doc/src/sgml/planstats.sgml')
-rw-r--r--  doc/src/sgml/planstats.sgml | 762
1 file changed, 762 insertions(+), 0 deletions(-)
diff --git a/doc/src/sgml/planstats.sgml b/doc/src/sgml/planstats.sgml
new file mode 100644
index 0000000..df85ea5
--- /dev/null
+++ b/doc/src/sgml/planstats.sgml
@@ -0,0 +1,762 @@
+<!-- doc/src/sgml/planstats.sgml -->
+
+<chapter id="planner-stats-details">
+ <title>How the Planner Uses Statistics</title>
+
+ <para>
+ This chapter builds on the material covered in <xref
+ linkend="using-explain"/> and <xref linkend="planner-stats"/> to show some
+ additional details about how the planner uses the
+ system statistics to estimate the number of rows each part of a query might
+ return. This is a significant part of the planning process,
+ providing much of the raw material for cost calculation.
+ </para>
+
+ <para>
+ The intent of this chapter is not to document the code in detail,
+ but to present an overview of how it works.
+ This will perhaps ease the learning curve for someone who subsequently
+ wishes to read the code.
+ </para>
+
+ <sect1 id="row-estimation-examples">
+ <title>Row Estimation Examples</title>
+
+ <indexterm zone="row-estimation-examples">
+ <primary>row estimation</primary>
+ <secondary>planner</secondary>
+ </indexterm>
+
+ <para>
+ The examples shown below use tables in the <productname>PostgreSQL</productname>
+ regression test database.
+ The outputs shown are taken from version 8.3.
+ The behavior of earlier (or later) versions might vary.
+ Note also that since <command>ANALYZE</command> uses random sampling
+ while producing statistics, the results will change slightly after
+ any new <command>ANALYZE</command>.
+ </para>
+
+ <para>
+ Let's start with a very simple query:
+
+<programlisting>
+EXPLAIN SELECT * FROM tenk1;
+
+ QUERY PLAN
+-------------------------------------------------------------
+ Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244)
+</programlisting>
+
+ How the planner determines the cardinality of <structname>tenk1</structname>
+ is covered in <xref linkend="planner-stats"/>, but is repeated here for
+ completeness. The number of pages and rows is looked up in
+ <structname>pg_class</structname>:
+
+<programlisting>
+SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1';
+
+ relpages | reltuples
+----------+-----------
+ 358 | 10000
+</programlisting>
+
+ These numbers are current as of the last <command>VACUUM</command> or
+ <command>ANALYZE</command> on the table. The planner then fetches the
+ actual current number of pages in the table (this is a cheap operation,
+ not requiring a table scan). If that is different from
+ <structfield>relpages</structfield> then
+ <structfield>reltuples</structfield> is scaled accordingly to
+ arrive at a current number-of-rows estimate. In the example above, the value of
+ <structfield>relpages</structfield> is up-to-date so the rows estimate is
+ the same as <structfield>reltuples</structfield>.
+ </para>
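+
+ <para>
+  The scaling arithmetic itself is simple. As a sketch (using a
+  hypothetical current page count of 360, not a value taken from the
+  example above), the planner would compute:
+
+<programlisting>
+rows = reltuples * (current_pages / relpages)
+     = 10000 * (360 / 358)          (hypothetical current_pages = 360)
+     = 10056  (rounding off)
+</programlisting>
+ </para>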
+
+ <para>
+ Let's move on to an example with a range condition in its
+ <literal>WHERE</literal> clause:
+
+<programlisting>
+EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000;
+
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;-------------
+ Bitmap Heap Scan on tenk1 (cost=24.06..394.64 rows=1007 width=244)
+ Recheck Cond: (unique1 &lt; 1000)
+ -&gt; Bitmap Index Scan on tenk1_unique1 (cost=0.00..23.80 rows=1007 width=0)
+ Index Cond: (unique1 &lt; 1000)
+</programlisting>
+
+ The planner examines the <literal>WHERE</literal> clause condition
+ and looks up the selectivity function for the operator
+ <literal>&lt;</literal> in <structname>pg_operator</structname>.
+ This is held in the column <structfield>oprrest</structfield>,
+ and the entry in this case is <function>scalarltsel</function>.
+ The <function>scalarltsel</function> function retrieves the histogram for
+ <structfield>unique1</structfield> from
+ <structname>pg_statistic</structname>. For manual queries it is more
+ convenient to look in the simpler <structname>pg_stats</structname>
+ view:
+
+<programlisting>
+SELECT histogram_bounds FROM pg_stats
+WHERE tablename='tenk1' AND attname='unique1';
+
+ histogram_bounds
+------------------------------------------------------
+ {0,993,1997,3050,4040,5036,5957,7057,8029,9016,9995}
+</programlisting>
+
+ Next the fraction of the histogram occupied by <quote>&lt; 1000</quote>
+ is worked out. This is the selectivity. The histogram divides the range
+ into equal frequency buckets, so all we have to do is locate the bucket
+ that our value is in and count <emphasis>part</emphasis> of it and
+ <emphasis>all</emphasis> of the ones before. The value 1000 is clearly in
+ the second bucket (993&ndash;1997). Assuming a linear distribution of
+ values inside each bucket, we can calculate the selectivity as:
+
+<programlisting>
+selectivity = (1 + (1000 - bucket[2].min)/(bucket[2].max - bucket[2].min))/num_buckets
+ = (1 + (1000 - 993)/(1997 - 993))/10
+ = 0.100697
+</programlisting>
+
+ that is, one whole bucket plus a linear fraction of the second, divided by
+ the number of buckets. The estimated number of rows can now be calculated as
+ the product of the selectivity and the cardinality of
+ <structname>tenk1</structname>:
+
+<programlisting>
+rows = rel_cardinality * selectivity
+ = 10000 * 0.100697
+ = 1007 (rounding off)
+</programlisting>
+ </para>
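+
+ <para>
+  Since <structfield>unique1</structfield> contains each integer from 0 to
+  9999 exactly once in the regression database, we can check this estimate
+  against the true row count:
+
+<programlisting>
+SELECT count(*) FROM tenk1 WHERE unique1 &lt; 1000;
+
+ count
+-------
+  1000
+</programlisting>
+
+  The estimate of 1007 rows is off by less than one percent.
+ </para>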
+
+ <para>
+ Next let's consider an example with an equality condition in its
+ <literal>WHERE</literal> clause:
+
+<programlisting>
+EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'CRAAAA';
+
+ QUERY PLAN
+----------------------------------------------------------
+ Seq Scan on tenk1 (cost=0.00..483.00 rows=30 width=244)
+ Filter: (stringu1 = 'CRAAAA'::name)
+</programlisting>
+
+ Again the planner examines the <literal>WHERE</literal> clause condition
+ and looks up the selectivity function for <literal>=</literal>, which is
+ <function>eqsel</function>. For equality estimation the histogram is
+ not useful; instead the list of <firstterm>most
+ common values</firstterm> (<acronym>MCV</acronym>s) is used to determine the
+ selectivity. Let's have a look at the MCVs, with some additional columns
+ that will be useful later:
+
+<programlisting>
+SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats
+WHERE tablename='tenk1' AND attname='stringu1';
+
+null_frac         | 0
+n_distinct        | 676
+most_common_vals  | {EJAAAA,BBAAAA,CRAAAA,FCAAAA,FEAAAA,GSAAAA,&zwsp;JOAAAA,MCAAAA,NAAAAA,WGAAAA}
+most_common_freqs | {0.00333333,0.003,0.003,0.003,0.003,0.003,&zwsp;0.003,0.003,0.003,0.003}
+</programlisting>
+
+ Since <literal>CRAAAA</literal> appears in the list of MCVs, the selectivity is
+ merely the corresponding entry in the list of most common frequencies
+ (<acronym>MCF</acronym>s):
+
+<programlisting>
+selectivity = mcf[3]
+ = 0.003
+</programlisting>
+
+ As before, the estimated number of rows is just the product of this with the
+ cardinality of <structname>tenk1</structname>:
+
+<programlisting>
+rows = 10000 * 0.003
+ = 30
+</programlisting>
+ </para>
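+
+ <para>
+  The same lookup can be reproduced by hand from
+  <structname>pg_stats</structname>. A sketch (the <type>anyarray</type>
+  column has to be cast through <type>text</type>, and
+  <function>array_position</function> is only available in
+  <productname>PostgreSQL</productname> 9.5 and later):
+
+<programlisting>
+SELECT most_common_freqs[
+         array_position(most_common_vals::text::text[], 'CRAAAA')]
+  AS selectivity
+FROM pg_stats
+WHERE tablename = 'tenk1' AND attname = 'stringu1';
+
+ selectivity
+-------------
+       0.003
+</programlisting>
+ </para>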
+
+ <para>
+ Now consider the same query, but with a constant that is not in the
+ <acronym>MCV</acronym> list:
+
+<programlisting>
+EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'xxx';
+
+ QUERY PLAN
+----------------------------------------------------------
+ Seq Scan on tenk1 (cost=0.00..483.00 rows=15 width=244)
+ Filter: (stringu1 = 'xxx'::name)
+</programlisting>
+
+ This is quite a different problem: how to estimate the selectivity when the
+ value is <emphasis>not</emphasis> in the <acronym>MCV</acronym> list.
+ The approach is to use the fact that the value is not in the list,
+ combined with the knowledge of the frequencies for all of the
+ <acronym>MCV</acronym>s:
+
+<programlisting>
+selectivity = (1 - sum(mcf))/(num_distinct - num_mcv)
+ = (1 - (0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003 +
+ 0.003 + 0.003 + 0.003 + 0.003))/(676 - 10)
+ = 0.0014559
+</programlisting>
+
+ That is, add up all the frequencies for the <acronym>MCV</acronym>s and
+ subtract them from one, then
+ divide by the number of <emphasis>other</emphasis> distinct values.
+ This amounts to assuming that the fraction of the column that is not any
+ of the MCVs is evenly distributed among all the other distinct values.
+ Notice that there are no null values so we don't have to worry about those
+ (otherwise we'd subtract the null fraction from the numerator as well).
+ The estimated number of rows is then calculated as usual:
+
+<programlisting>
+rows = 10000 * 0.0014559
+ = 15 (rounding off)
+</programlisting>
+ </para>
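+
+ <para>
+  This arithmetic can likewise be reproduced directly from
+  <structname>pg_stats</structname>; a sketch in SQL:
+
+<programlisting>
+SELECT (1 - (SELECT sum(f) FROM unnest(most_common_freqs) AS f))
+       / (n_distinct - array_length(most_common_freqs, 1)) AS selectivity
+FROM pg_stats
+WHERE tablename = 'tenk1' AND attname = 'stringu1';
+</programlisting>
+
+  which reproduces the 0.0014559 figure computed above.
+ </para>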
+
+ <para>
+ The previous example with <literal>unique1 &lt; 1000</literal> was an
+ oversimplification of what <function>scalarltsel</function> really does;
+ now that we have seen an example of the use of MCVs, we can fill in some
+ more detail. The example was correct as far as it went, because, since
+ <structfield>unique1</structfield> is a unique column, it has no MCVs
+ (obviously, no value is any more common than any other). For a non-unique
+ column, there will normally be both a histogram and an MCV list, and
+ <emphasis>the histogram does not include the portion of the column
+ population represented by the MCVs</emphasis>. We do things this way because
+ it allows more precise estimation. In this situation
+ <function>scalarltsel</function> directly applies the condition (e.g.,
+ <quote>&lt; 1000</quote>) to each value of the MCV list, and adds up the
+ frequencies of the MCVs for which the condition is true. This gives
+ an exact estimate of the selectivity within the portion of the table
+ that is MCVs. The histogram is then used in the same way as above
+ to estimate the selectivity in the portion of the table that is not
+ MCVs, and then the two numbers are combined to estimate the overall
+ selectivity. For example, consider
+
+<programlisting>
+EXPLAIN SELECT * FROM tenk1 WHERE stringu1 &lt; 'IAAAAA';
+
+ QUERY PLAN
+------------------------------------------------------------
+ Seq Scan on tenk1 (cost=0.00..483.00 rows=3077 width=244)
+ Filter: (stringu1 &lt; 'IAAAAA'::name)
+</programlisting>
+
+ We already saw the MCV information for <structfield>stringu1</structfield>,
+ and here is its histogram:
+
+<programlisting>
+SELECT histogram_bounds FROM pg_stats
+WHERE tablename='tenk1' AND attname='stringu1';
+
+ histogram_bounds
+-------------------------------------------------------------------&zwsp;-------------
+ {AAAAAA,CQAAAA,FRAAAA,IBAAAA,KRAAAA,NFAAAA,PSAAAA,SGAAAA,VAAAAA,&zwsp;XLAAAA,ZZAAAA}
+</programlisting>
+
+ Checking the MCV list, we find that the condition <literal>stringu1 &lt;
+ 'IAAAAA'</literal> is satisfied by the first six entries and not the last four,
+ so the selectivity within the MCV part of the population is
+
+<programlisting>
+selectivity = sum(relevant mcfs)
+ = 0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003
+ = 0.01833333
+</programlisting>
+
+ Summing all the MCFs also tells us that the total fraction of the
+ population represented by MCVs is 0.03033333, and therefore the
+ fraction represented by the histogram is 0.96966667 (again, there
+ are no nulls, else we'd have to exclude them here). We can see
+ that the value <literal>IAAAAA</literal> falls nearly at the end of the
+ third histogram bucket. Using some rather cheesy assumptions
+ about the frequency of different characters, the planner arrives
+ at the estimate 0.298387 for the portion of the histogram population
+ that is less than <literal>IAAAAA</literal>. We then combine the estimates
+ for the MCV and non-MCV populations:
+
+<programlisting>
+selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction
+ = 0.01833333 + 0.298387 * 0.96966667
+ = 0.307669
+
+rows = 10000 * 0.307669
+ = 3077 (rounding off)
+</programlisting>
+
+ In this particular example, the correction from the MCV list is fairly
+ small, because the column distribution is actually quite flat (the
+ statistics showing these particular values as being more common than
+ others are mostly due to sampling error). In a more typical case where
+ some values are significantly more common than others, this complicated
+ process gives a useful improvement in accuracy because the selectivity
+ for the most common values is found exactly.
+ </para>
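+
+ <para>
+  The MCV part of this computation can also be checked from
+  <structname>pg_stats</structname>. A sketch (the multi-argument form of
+  <function>unnest</function> used here requires
+  <productname>PostgreSQL</productname> 9.4 or later, and the cast to
+  <type>text</type> makes the comparison use text ordering, which agrees
+  with <type>name</type> ordering for these all-ASCII values):
+
+<programlisting>
+SELECT sum(freq) AS mcv_selectivity
+FROM pg_stats,
+     unnest(most_common_vals::text::text[], most_common_freqs)
+       AS m(val, freq)
+WHERE tablename = 'tenk1' AND attname = 'stringu1'
+  AND val &lt; 'IAAAAA';
+</programlisting>
+
+  which sums the frequencies of the six qualifying MCVs, giving the
+  0.01833333 computed above.
+ </para>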
+
+ <para>
+ Now let's consider a case with more than one
+ condition in the <literal>WHERE</literal> clause:
+
+<programlisting>
+EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 1000 AND stringu1 = 'xxx';
+
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;-------------
+ Bitmap Heap Scan on tenk1 (cost=23.80..396.91 rows=1 width=244)
+ Recheck Cond: (unique1 &lt; 1000)
+ Filter: (stringu1 = 'xxx'::name)
+ -&gt; Bitmap Index Scan on tenk1_unique1 (cost=0.00..23.80 rows=1007 width=0)
+ Index Cond: (unique1 &lt; 1000)
+</programlisting>
+
+ The planner assumes that the two conditions are independent, so that
+ the individual selectivities of the clauses can be multiplied together:
+
+<programlisting>
+selectivity = selectivity(unique1 &lt; 1000) * selectivity(stringu1 = 'xxx')
+ = 0.100697 * 0.0014559
+ = 0.0001466
+
+rows = 10000 * 0.0001466
+ = 1 (rounding off)
+</programlisting>
+
+ Notice that the number of rows estimated to be returned from the bitmap
+ index scan reflects only the condition used with the index; this is
+ important since it affects the cost estimate for the subsequent heap
+ fetches.
+ </para>
+
+ <para>
+ Finally we will examine a query that involves a join:
+
+<programlisting>
+EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2
+WHERE t1.unique1 &lt; 50 AND t1.unique2 = t2.unique2;
+
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;-------------------
+ Nested Loop (cost=4.64..456.23 rows=50 width=488)
+ -&gt; Bitmap Heap Scan on tenk1 t1 (cost=4.64..142.17 rows=50 width=244)
+ Recheck Cond: (unique1 &lt; 50)
+ -&gt; Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.63 rows=50 width=0)
+ Index Cond: (unique1 &lt; 50)
+ -&gt; Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.00..6.27 rows=1 width=244)
+ Index Cond: (unique2 = t1.unique2)
+</programlisting>
+
+ The restriction on <structname>tenk1</structname>,
+ <literal>unique1 &lt; 50</literal>,
+ is evaluated before the nested-loop join.
+ This is handled analogously to the previous range example. This time the
+ value 50 falls into the first bucket of the
+ <structfield>unique1</structfield> histogram:
+
+<programlisting>
+selectivity = (0 + (50 - bucket[1].min)/(bucket[1].max - bucket[1].min))/num_buckets
+ = (0 + (50 - 0)/(993 - 0))/10
+ = 0.005035
+
+rows = 10000 * 0.005035
+ = 50 (rounding off)
+</programlisting>
+
+ The restriction for the join is <literal>t2.unique2 = t1.unique2</literal>.
+ The operator is just
+ our familiar <literal>=</literal>, however the selectivity function is
+ obtained from the <structfield>oprjoin</structfield> column of
+ <structname>pg_operator</structname>, and is <function>eqjoinsel</function>.
+ <function>eqjoinsel</function> looks up the statistical information for both
+ <structname>tenk2</structname> and <structname>tenk1</structname>:
+
+<programlisting>
+SELECT tablename, null_frac, n_distinct, most_common_vals FROM pg_stats
+WHERE tablename IN ('tenk1', 'tenk2') AND attname='unique2';
+
+ tablename | null_frac | n_distinct | most_common_vals
+-----------+-----------+------------+------------------
+ tenk1     |         0 |         -1 |
+ tenk2     |         0 |         -1 |
+</programlisting>
+
+ In this case there is no <acronym>MCV</acronym> information for
+ <structfield>unique2</structfield> because all the values appear to be
+ unique (an <structfield>n_distinct</structfield> of <literal>-1</literal>
+ means the number of distinct values equals the number of rows), so we use
+ an algorithm that relies only on the number of distinct values for both
+ relations together with their null fractions:
+
+<programlisting>
+selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2)
+ = (1 - 0) * (1 - 0) / max(10000, 10000)
+ = 0.0001
+</programlisting>
+
+ That is, subtract the null fraction from one for each of the relations,
+ and divide by the maximum of the numbers of distinct values.
+ The number of rows
+ that the join is likely to emit is calculated as the cardinality of the
+ Cartesian product of the two inputs, multiplied by the
+ selectivity:
+
+<programlisting>
+rows = (outer_cardinality * inner_cardinality) * selectivity
+ = (50 * 10000) * 0.0001
+ = 50
+</programlisting>
+ </para>
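+
+ <para>
+  The pair of selectivity functions attached to an operator can be
+  inspected directly in <structname>pg_operator</structname>; for example,
+  for the integer <literal>=</literal> operator:
+
+<programlisting>
+SELECT oprname, oprrest, oprjoin FROM pg_operator
+WHERE oprname = '=' AND oprleft = 'integer'::regtype
+  AND oprright = 'integer'::regtype;
+
+ oprname | oprrest |  oprjoin
+---------+---------+-----------
+ =       | eqsel   | eqjoinsel
+</programlisting>
+ </para>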
+
+ <para>
+ Had there been MCV lists for the two columns,
+ <function>eqjoinsel</function> would have used direct comparison of the MCV
+ lists to determine the join selectivity within the part of the column
+ populations represented by the MCVs. The estimate for the remainder of the
+ populations follows the same approach shown here.
+ </para>
+
+ <para>
+ Notice that we showed <literal>inner_cardinality</literal> as 10000, that is,
+ the unmodified size of <structname>tenk2</structname>. It might appear from
+ inspection of the <command>EXPLAIN</command> output that the estimate of
+ join rows comes from 50 * 1, that is, the number of outer rows times
+ the estimated number of rows obtained by each inner index scan on
+ <structname>tenk2</structname>. But this is not the case: the join relation size
+ is estimated before any particular join plan has been considered. If
+ everything is working well then the two ways of estimating the join
+ size will produce about the same answer, but due to round-off error and
+ other factors they sometimes diverge significantly.
+ </para>
+
+ <para>
+ For those interested in further details, estimation of the size of
+ a table (before any <literal>WHERE</literal> clauses) is done in
+ <filename>src/backend/optimizer/util/plancat.c</filename>. The generic
+ logic for clause selectivities is in
+ <filename>src/backend/optimizer/path/clausesel.c</filename>. The
+ operator-specific selectivity functions are mostly found
+ in <filename>src/backend/utils/adt/selfuncs.c</filename>.
+ </para>
+ </sect1>
+
+ <sect1 id="multivariate-statistics-examples">
+ <title>Multivariate Statistics Examples</title>
+
+ <indexterm>
+ <primary>row estimation</primary>
+ <secondary>multivariate</secondary>
+ </indexterm>
+
+ <sect2 id="functional-dependencies">
+ <title>Functional Dependencies</title>
+
+ <para>
+ Multivariate correlation can be demonstrated with a very simple data set
+ &mdash; a table with two columns, both containing the same values:
+
+<programlisting>
+CREATE TABLE t (a INT, b INT);
+INSERT INTO t SELECT i % 100, i % 100 FROM generate_series(1, 10000) s(i);
+ANALYZE t;
+</programlisting>
+
+ As explained in <xref linkend="planner-stats"/>, the planner can determine
+ cardinality of <structname>t</structname> using the number of pages and
+ rows obtained from <structname>pg_class</structname>:
+
+<programlisting>
+SELECT relpages, reltuples FROM pg_class WHERE relname = 't';
+
+ relpages | reltuples
+----------+-----------
+ 45 | 10000
+</programlisting>
+
+ The data distribution is very simple; there are only 100 distinct values
+ in each column, uniformly distributed.
+ </para>
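+
+ <para>
+  We can confirm the distribution directly:
+
+<programlisting>
+SELECT count(DISTINCT a) AS a_distinct,
+       count(DISTINCT b) AS b_distinct,
+       count(DISTINCT (a, b)) AS ab_distinct
+FROM t;
+
+ a_distinct | b_distinct | ab_distinct
+------------+------------+-------------
+        100 |        100 |         100
+</programlisting>
+
+  There are 100 distinct values in each column, but also only 100 distinct
+  combinations, because the two columns are perfectly correlated.
+ </para>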
+
+ <para>
+ The following example shows the result of estimating a <literal>WHERE</literal>
+ condition on the <structfield>a</structfield> column:
+
+<programlisting>
+EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1;
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;------------
+ Seq Scan on t (cost=0.00..170.00 rows=100 width=8) (actual rows=100 loops=1)
+ Filter: (a = 1)
+ Rows Removed by Filter: 9900
+</programlisting>
+
+ The planner examines the condition and determines the selectivity
+ of this clause to be 1%. By comparing this estimate and the actual
+ number of rows, we see that the estimate is very accurate
+ (in fact exact, as the table is very small). Changing the
+ <literal>WHERE</literal> condition to use the <structfield>b</structfield> column, an
+ identical plan is generated. But observe what happens if we apply the same
+ condition on both columns, combining them with <literal>AND</literal>:
+
+<programlisting>
+EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;----------
+ Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=100 loops=1)
+ Filter: ((a = 1) AND (b = 1))
+ Rows Removed by Filter: 9900
+</programlisting>
+
+ The planner estimates the selectivity for each condition individually,
+ arriving at the same 1% estimates as above. Then it assumes that the
+ conditions are independent, and so it multiplies their selectivities,
+ producing a final selectivity estimate of just 0.01%.
+ This is a significant underestimate, as the actual number of rows
+ matching the conditions (100) is two orders of magnitude higher.
+ </para>
+
+ <para>
+ This problem can be fixed by creating a statistics object that
+ directs <command>ANALYZE</command> to calculate functional-dependency
+ multivariate statistics on the two columns:
+
+<programlisting>
+CREATE STATISTICS stts (dependencies) ON a, b FROM t;
+ANALYZE t;
+EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;------------
+ Seq Scan on t (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
+ Filter: ((a = 1) AND (b = 1))
+ Rows Removed by Filter: 9900
+</programlisting>
+ </para>
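+
+ <para>
+  The computed dependency statistics can be examined in the catalogs. As a
+  sketch (on releases where the statistics data is stored in
+  <structname>pg_statistic_ext_data</structname>, i.e.,
+  <productname>PostgreSQL</productname> 12 and later), the dependency
+  degrees for this perfectly correlated data should both come out as 1:
+
+<programlisting>
+SELECT stxname, stxkeys, stxddependencies
+FROM pg_statistic_ext JOIN pg_statistic_ext_data ON (oid = stxoid)
+WHERE stxname = 'stts';
+
+ stxname | stxkeys |             stxddependencies
+---------+---------+------------------------------------------
+ stts    | 1 2     | {"1 =&gt; 2": 1.000000, "2 =&gt; 1": 1.000000}
+</programlisting>
+ </para>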
+ </sect2>
+
+ <sect2 id="multivariate-ndistinct-counts">
+ <title>Multivariate N-Distinct Counts</title>
+
+ <para>
+ A similar problem occurs with estimation of the cardinality of sets of
+ multiple columns, such as the number of groups that would be generated by
+ a <command>GROUP BY</command> clause. When <command>GROUP BY</command>
+ lists a single column, the n-distinct estimate (which is visible as the
+ estimated number of rows returned by the HashAggregate node) is very
+ accurate:
+<programlisting>
+EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a;
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;----------------------
+ HashAggregate (cost=195.00..196.00 rows=100 width=12) (actual rows=100 loops=1)
+ Group Key: a
+ -&gt; Seq Scan on t (cost=0.00..145.00 rows=10000 width=4) (actual rows=10000 loops=1)
+</programlisting>
+ But without multivariate statistics, the estimate for the number of
+ groups in a query with two columns in <command>GROUP BY</command>, as
+ in the following example, is off by an order of magnitude:
+<programlisting>
+EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;-------------------------
+ HashAggregate (cost=220.00..230.00 rows=1000 width=16) (actual rows=100 loops=1)
+ Group Key: a, b
+ -&gt; Seq Scan on t (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1)
+</programlisting>
+ By redefining the statistics object to include n-distinct counts for the
+ two columns, the estimate is much improved:
+<programlisting>
+DROP STATISTICS stts;
+CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t;
+ANALYZE t;
+EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;-------------------------
+ HashAggregate (cost=220.00..221.00 rows=100 width=16) (actual rows=100 loops=1)
+ Group Key: a, b
+ -&gt; Seq Scan on t (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1)
+</programlisting>
+ </para>
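+
+ <para>
+  The stored n-distinct coefficients can be inspected the same way; a
+  sketch (again assuming <productname>PostgreSQL</productname> 12 or
+  later):
+
+<programlisting>
+SELECT stxname, stxkeys, stxdndistinct
+FROM pg_statistic_ext JOIN pg_statistic_ext_data ON (oid = stxoid)
+WHERE stxname = 'stts';
+
+ stxname | stxkeys | stxdndistinct
+---------+---------+---------------
+ stts    | 1 2     | {"1, 2": 100}
+</programlisting>
+ </para>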
+
+ </sect2>
+
+ <sect2 id="mcv-lists">
+ <title>MCV Lists</title>
+
+ <para>
+ As explained in <xref linkend="functional-dependencies"/>, functional
+ dependencies are a very cheap and efficient type of statistics, but their
+ main limitation is their global nature (only tracking dependencies at
+ the column level, not between individual column values).
+ </para>
+
+ <para>
+ This section introduces a multivariate variant of <acronym>MCV</acronym>
+ (most-common values) lists, a straightforward extension of the per-column
+ statistics described in <xref linkend="row-estimation-examples"/>. These
+ statistics address the limitation by storing individual values, but they
+ are naturally more expensive, both in terms of building the statistics in
+ <command>ANALYZE</command> and in terms of storage and planning time.
+ </para>
+
+ <para>
+ Let's look at the query from <xref linkend="functional-dependencies"/>
+ again, but this time with an <acronym>MCV</acronym> list created on the
+ same set of columns (be sure to drop the functional-dependency statistics
+ first, so that the planner uses the newly created ones):
+
+<programlisting>
+DROP STATISTICS stts;
+CREATE STATISTICS stts2 (mcv) ON a, b FROM t;
+ANALYZE t;
+EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;------------
+ Seq Scan on t (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1)
+ Filter: ((a = 1) AND (b = 1))
+ Rows Removed by Filter: 9900
+</programlisting>
+
+ The estimate is as accurate as with the functional dependencies, mostly
+ thanks to the table being fairly small and having a simple distribution
+ with a low number of distinct values. Before looking at the second query,
+ which was not handled by functional dependencies particularly well,
+ let's inspect the <acronym>MCV</acronym> list a bit.
+ </para>
+
+ <para>
+ Inspecting the <acronym>MCV</acronym> list is possible using the
+ <function>pg_mcv_list_items</function> set-returning function:
+
+<programlisting>
+SELECT m.* FROM pg_statistic_ext JOIN pg_statistic_ext_data ON (oid = stxoid),
+                pg_mcv_list_items(stxdmcv) m WHERE stxname = 'stts2';
+
+ index |  values  | nulls | frequency | base_frequency
+-------+----------+-------+-----------+----------------
+     0 | {0, 0}   | {f,f} |      0.01 |         0.0001
+     1 | {1, 1}   | {f,f} |      0.01 |         0.0001
+   ...
+    49 | {49, 49} | {f,f} |      0.01 |         0.0001
+    50 | {50, 50} | {f,f} |      0.01 |         0.0001
+   ...
+    97 | {97, 97} | {f,f} |      0.01 |         0.0001
+    98 | {98, 98} | {f,f} |      0.01 |         0.0001
+    99 | {99, 99} | {f,f} |      0.01 |         0.0001
+(100 rows)
+</programlisting>
+
+ This confirms there are 100 distinct combinations in the two columns, and
+ all of them are about equally likely (1% frequency for each one). The
+ base frequency is the frequency computed from per-column statistics, as if
+ there were no multi-column statistics. Had there been any null values in
+ either of the columns, this would be identified in the
+ <structfield>nulls</structfield> column.
+ </para>
+
+ <para>
+ When estimating the selectivity, the planner applies all the conditions
+ on items in the <acronym>MCV</acronym> list, and then sums the frequencies
+ of the matching ones. See <function>mcv_clauselist_selectivity</function>
+ in <filename>src/backend/statistics/mcv.c</filename> for details.
+ </para>
+
+ <para>
+ Compared to functional dependencies, <acronym>MCV</acronym> lists have two
+ major advantages. Firstly, the list stores actual values, making it possible
+ to decide which combinations of values are compatible:
+
+<programlisting>
+EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 10;
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;--------
+ Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
+ Filter: ((a = 1) AND (b = 10))
+ Rows Removed by Filter: 10000
+</programlisting>
+
+ Secondly, <acronym>MCV</acronym> lists handle a wider range of clause types,
+ not just equality clauses like functional dependencies. For example,
+ consider the following range query for the same table:
+
+<programlisting>
+EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a &lt;= 49 AND b &gt; 49;
+ QUERY PLAN
+-------------------------------------------------------------------&zwsp;--------
+ Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1)
+ Filter: ((a &lt;= 49) AND (b &gt; 49))
+ Rows Removed by Filter: 10000
+</programlisting>
+
+ </para>
+
+ </sect2>
+
+ </sect1>
+
+ <sect1 id="planner-stats-security">
+ <title>Planner Statistics and Security</title>
+
+ <para>
+ Access to the table <structname>pg_statistic</structname> is restricted to
+ superusers, so that ordinary users cannot learn about the contents of the
+ tables of other users from it. Some selectivity estimation functions will
+ use a user-provided operator (either the operator appearing in the query or
+ a related operator) to analyze the stored statistics. For example, in order
+ to determine whether a stored most common value is applicable, the
+ selectivity estimator will have to run the appropriate <literal>=</literal>
+ operator to compare the constant in the query to the stored value.
+ Thus the data in <structname>pg_statistic</structname> is potentially
+ passed to user-defined operators. An appropriately crafted operator can
+ intentionally leak the passed operands (for example, by logging them
+ or writing them to a different table), or accidentally leak them by showing
+ their values in error messages, in either case possibly exposing data from
+ <structname>pg_statistic</structname> to a user who should not be able to
+ see it.
+ </para>
+
+ <para>
+ In order to prevent this, the following applies to all built-in selectivity
+ estimation functions. When planning a query, in order to be able to use
+ stored statistics, the current user must either
+ have <literal>SELECT</literal> privilege on the table or the involved
+ columns, or the operator used must be <literal>LEAKPROOF</literal> (more
+ accurately, the function that the operator is based on). If not, then the
+ selectivity estimator will behave as if no statistics are available, and
+ the planner will proceed with default or fall-back assumptions.
+ </para>
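+
+ <para>
+  Whether the function underlying an operator is marked leakproof can be
+  checked in the catalogs; for example, for the integer
+  <literal>=</literal> operator:
+
+<programlisting>
+SELECT o.oprname, p.proname, p.proleakproof
+FROM pg_operator o JOIN pg_proc p ON p.oid = o.oprcode
+WHERE o.oprname = '=' AND o.oprleft = 'integer'::regtype
+  AND o.oprright = 'integer'::regtype;
+
+ oprname | proname | proleakproof
+---------+---------+--------------
+ =       | int4eq  | t
+</programlisting>
+ </para>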
+
+ <para>
+ If a user does not have the required privilege on the table or columns,
+ then in many cases the query will ultimately receive a permission-denied
+ error, in which case this mechanism is invisible in practice. But if the
+ user is reading from a security-barrier view, then the planner might wish
+ to check the statistics of an underlying table that is otherwise
+ inaccessible to the user. In that case, the operator should be leak-proof
+ or the statistics will not be used. There is no direct feedback about
+ that, except that the plan might be suboptimal. If one suspects that this
+ is the case, one could try running the query as a more privileged user,
+ to see if a different plan results.
+ </para>
+
+ <para>
+ This restriction applies only to cases where the planner would need to
+ execute a user-defined operator on one or more values
+ from <structname>pg_statistic</structname>. Thus the planner is permitted
+ to use generic statistical information, such as the fraction of null values
+ or the number of distinct values in a column, regardless of access
+ privileges.
+ </para>
+
+ <para>
+ Selectivity estimation functions contained in third-party extensions that
+ potentially operate on statistics with user-defined operators should follow
+ the same security rules. Consult the
+ <productname>PostgreSQL</productname> source code for guidance.
+ </para>
+ </sect1>
+</chapter>