diff options
Diffstat (limited to 'doc/src/sgml/planstats.sgml')
-rw-r--r-- | doc/src/sgml/planstats.sgml | 762 |
1 files changed, 762 insertions, 0 deletions
diff --git a/doc/src/sgml/planstats.sgml b/doc/src/sgml/planstats.sgml new file mode 100644 index 0000000..df85ea5 --- /dev/null +++ b/doc/src/sgml/planstats.sgml @@ -0,0 +1,762 @@ +<!-- doc/src/sgml/planstats.sgml --> + +<chapter id="planner-stats-details"> + <title>How the Planner Uses Statistics</title> + + <para> + This chapter builds on the material covered in <xref + linkend="using-explain"/> and <xref linkend="planner-stats"/> to show some + additional details about how the planner uses the + system statistics to estimate the number of rows each part of a query might + return. This is a significant part of the planning process, + providing much of the raw material for cost calculation. + </para> + + <para> + The intent of this chapter is not to document the code in detail, + but to present an overview of how it works. + This will perhaps ease the learning curve for someone who subsequently + wishes to read the code. + </para> + + <sect1 id="row-estimation-examples"> + <title>Row Estimation Examples</title> + + <indexterm zone="row-estimation-examples"> + <primary>row estimation</primary> + <secondary>planner</secondary> + </indexterm> + + <para> + The examples shown below use tables in the <productname>PostgreSQL</productname> + regression test database. + The outputs shown are taken from version 8.3. + The behavior of earlier (or later) versions might vary. + Note also that since <command>ANALYZE</command> uses random sampling + while producing statistics, the results will change slightly after + any new <command>ANALYZE</command>. + </para> + + <para> + Let's start with a very simple query: + +<programlisting> +EXPLAIN SELECT * FROM tenk1; + + QUERY PLAN +------------------------------------------------------------- + Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=244) +</programlisting> + + How the planner determines the cardinality of <structname>tenk1</structname> + is covered in <xref linkend="planner-stats"/>, but is repeated here for + completeness. The number of pages and rows is looked up in + <structname>pg_class</structname>: + +<programlisting> +SELECT relpages, reltuples FROM pg_class WHERE relname = 'tenk1'; + + relpages | reltuples +----------+----------- + 358 | 10000 +</programlisting> + + These numbers are current as of the last <command>VACUUM</command> or + <command>ANALYZE</command> on the table. The planner then fetches the + actual current number of pages in the table (this is a cheap operation, + not requiring a table scan). If that is different from + <structfield>relpages</structfield> then + <structfield>reltuples</structfield> is scaled accordingly to + arrive at a current number-of-rows estimate. In the example above, the value of + <structfield>relpages</structfield> is up-to-date so the rows estimate is + the same as <structfield>reltuples</structfield>. + </para> + + <para> + Let's move on to an example with a range condition in its + <literal>WHERE</literal> clause: + +<programlisting> +EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 1000; + + QUERY PLAN +-------------------------------------------------------------------&zwsp;------------- + Bitmap Heap Scan on tenk1 (cost=24.06..394.64 rows=1007 width=244) + Recheck Cond: (unique1 < 1000) + -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..23.80 rows=1007 width=0) + Index Cond: (unique1 < 1000) +</programlisting> + + The planner examines the <literal>WHERE</literal> clause condition + and looks up the selectivity function for the operator + <literal><</literal> in <structname>pg_operator</structname>. + This is held in the column <structfield>oprrest</structfield>, + and the entry in this case is <function>scalarltsel</function>. + The <function>scalarltsel</function> function retrieves the histogram for + <structfield>unique1</structfield> from + <structname>pg_statistic</structname>. For manual queries it is more + convenient to look in the simpler <structname>pg_stats</structname> + view: + +<programlisting> +SELECT histogram_bounds FROM pg_stats +WHERE tablename='tenk1' AND attname='unique1'; + + histogram_bounds +------------------------------------------------------ + {0,993,1997,3050,4040,5036,5957,7057,8029,9016,9995} +</programlisting> + + Next the fraction of the histogram occupied by <quote>< 1000</quote> + is worked out. This is the selectivity. The histogram divides the range + into equal frequency buckets, so all we have to do is locate the bucket + that our value is in and count <emphasis>part</emphasis> of it and + <emphasis>all</emphasis> of the ones before. The value 1000 is clearly in + the second bucket (993–1997). Assuming a linear distribution of + values inside each bucket, we can calculate the selectivity as: + +<programlisting> +selectivity = (1 + (1000 - bucket[2].min)/(bucket[2].max - bucket[2].min))/num_buckets + = (1 + (1000 - 993)/(1997 - 993))/10 + = 0.100697 +</programlisting> + + that is, one whole bucket plus a linear fraction of the second, divided by + the number of buckets. The estimated number of rows can now be calculated as + the product of the selectivity and the cardinality of + <structname>tenk1</structname>: + +<programlisting> +rows = rel_cardinality * selectivity + = 10000 * 0.100697 + = 1007 (rounding off) +</programlisting> + </para> + + <para> + Next let's consider an example with an equality condition in its + <literal>WHERE</literal> clause: + +<programlisting> +EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'CRAAAA'; + + QUERY PLAN +---------------------------------------------------------- + Seq Scan on tenk1 (cost=0.00..483.00 rows=30 width=244) + Filter: (stringu1 = 'CRAAAA'::name) +</programlisting> + + Again the planner examines the <literal>WHERE</literal> clause condition + and looks up the selectivity function for <literal>=</literal>, which is + <function>eqsel</function>. For equality estimation the histogram is + not useful; instead the list of <firstterm>most + common values</firstterm> (<acronym>MCV</acronym>s) is used to determine the + selectivity. Let's have a look at the MCVs, with some additional columns + that will be useful later: + +<programlisting> +SELECT null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats +WHERE tablename='tenk1' AND attname='stringu1'; + +null_frac | 0 +n_distinct | 676 +most_common_vals | {EJAAAA,BBAAAA,CRAAAA,FCAAAA,FEAAAA,GSAAAA,&zwsp;JOAAAA,MCAAAA,NAAAAA,WGAAAA} +most_common_freqs | {0.00333333,0.003,0.003,0.003,0.003,0.003,&zwsp;0.003,0.003,0.003,0.003} + +</programlisting> + + Since <literal>CRAAAA</literal> appears in the list of MCVs, the selectivity is + merely the corresponding entry in the list of most common frequencies + (<acronym>MCF</acronym>s): + +<programlisting> +selectivity = mcf[3] + = 0.003 +</programlisting> + + As before, the estimated number of rows is just the product of this with the + cardinality of <structname>tenk1</structname>: + +<programlisting> +rows = 10000 * 0.003 + = 30 +</programlisting> + </para> + + <para> + Now consider the same query, but with a constant that is not in the + <acronym>MCV</acronym> list: + +<programlisting> +EXPLAIN SELECT * FROM tenk1 WHERE stringu1 = 'xxx'; + + QUERY PLAN +---------------------------------------------------------- + Seq Scan on tenk1 (cost=0.00..483.00 rows=15 width=244) + Filter: (stringu1 = 'xxx'::name) +</programlisting> + + This is quite a different problem: how to estimate the selectivity when the + value is <emphasis>not</emphasis> in the <acronym>MCV</acronym> list. + The approach is to use the fact that the value is not in the list, + combined with the knowledge of the frequencies for all of the + <acronym>MCV</acronym>s: + +<programlisting> +selectivity = (1 - sum(mvf))/(num_distinct - num_mcv) + = (1 - (0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003 + + 0.003 + 0.003 + 0.003 + 0.003))/(676 - 10) + = 0.0014559 +</programlisting> + + That is, add up all the frequencies for the <acronym>MCV</acronym>s and + subtract them from one, then + divide by the number of <emphasis>other</emphasis> distinct values. + This amounts to assuming that the fraction of the column that is not any + of the MCVs is evenly distributed among all the other distinct values. + Notice that there are no null values so we don't have to worry about those + (otherwise we'd subtract the null fraction from the numerator as well). + The estimated number of rows is then calculated as usual: + +<programlisting> +rows = 10000 * 0.0014559 + = 15 (rounding off) +</programlisting> + </para> + + <para> + The previous example with <literal>unique1 < 1000</literal> was an + oversimplification of what <function>scalarltsel</function> really does; + now that we have seen an example of the use of MCVs, we can fill in some + more detail. The example was correct as far as it went, because since + <structfield>unique1</structfield> is a unique column it has no MCVs (obviously, no + value is any more common than any other value). For a non-unique + column, there will normally be both a histogram and an MCV list, and + <emphasis>the histogram does not include the portion of the column + population represented by the MCVs</emphasis>. We do things this way because + it allows more precise estimation. In this situation + <function>scalarltsel</function> directly applies the condition (e.g., + <quote>< 1000</quote>) to each value of the MCV list, and adds up the + frequencies of the MCVs for which the condition is true. This gives + an exact estimate of the selectivity within the portion of the table + that is MCVs. The histogram is then used in the same way as above + to estimate the selectivity in the portion of the table that is not + MCVs, and then the two numbers are combined to estimate the overall + selectivity. For example, consider + +<programlisting> +EXPLAIN SELECT * FROM tenk1 WHERE stringu1 < 'IAAAAA'; + + QUERY PLAN +------------------------------------------------------------ + Seq Scan on tenk1 (cost=0.00..483.00 rows=3077 width=244) + Filter: (stringu1 < 'IAAAAA'::name) +</programlisting> + + We already saw the MCV information for <structfield>stringu1</structfield>, + and here is its histogram: + +<programlisting> +SELECT histogram_bounds FROM pg_stats +WHERE tablename='tenk1' AND attname='stringu1'; + + histogram_bounds +-------------------------------------------------------------------&zwsp;------------- + {AAAAAA,CQAAAA,FRAAAA,IBAAAA,KRAAAA,NFAAAA,PSAAAA,SGAAAA,VAAAAA,&zwsp;XLAAAA,ZZAAAA} +</programlisting> + + Checking the MCV list, we find that the condition <literal>stringu1 < + 'IAAAAA'</literal> is satisfied by the first six entries and not the last four, + so the selectivity within the MCV part of the population is + +<programlisting> +selectivity = sum(relevant mvfs) + = 0.00333333 + 0.003 + 0.003 + 0.003 + 0.003 + 0.003 + = 0.01833333 +</programlisting> + + Summing all the MCFs also tells us that the total fraction of the + population represented by MCVs is 0.03033333, and therefore the + fraction represented by the histogram is 0.96966667 (again, there + are no nulls, else we'd have to exclude them here). We can see + that the value <literal>IAAAAA</literal> falls nearly at the end of the + third histogram bucket. Using some rather cheesy assumptions + about the frequency of different characters, the planner arrives + at the estimate 0.298387 for the portion of the histogram population + that is less than <literal>IAAAAA</literal>. We then combine the estimates + for the MCV and non-MCV populations: + +<programlisting> +selectivity = mcv_selectivity + histogram_selectivity * histogram_fraction + = 0.01833333 + 0.298387 * 0.96966667 + = 0.307669 + +rows = 10000 * 0.307669 + = 3077 (rounding off) +</programlisting> + + In this particular example, the correction from the MCV list is fairly + small, because the column distribution is actually quite flat (the + statistics showing these particular values as being more common than + others are mostly due to sampling error). In a more typical case where + some values are significantly more common than others, this complicated + process gives a useful improvement in accuracy because the selectivity + for the most common values is found exactly. + </para> + + <para> + Now let's consider a case with more than one + condition in the <literal>WHERE</literal> clause: + +<programlisting> +EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 1000 AND stringu1 = 'xxx'; + + QUERY PLAN +-------------------------------------------------------------------&zwsp;------------- + Bitmap Heap Scan on tenk1 (cost=23.80..396.91 rows=1 width=244) + Recheck Cond: (unique1 < 1000) + Filter: (stringu1 = 'xxx'::name) + -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..23.80 rows=1007 width=0) + Index Cond: (unique1 < 1000) +</programlisting> + + The planner assumes that the two conditions are independent, so that + the individual selectivities of the clauses can be multiplied together: + +<programlisting> +selectivity = selectivity(unique1 < 1000) * selectivity(stringu1 = 'xxx') + = 0.100697 * 0.0014559 + = 0.0001466 + +rows = 10000 * 0.0001466 + = 1 (rounding off) +</programlisting> + + Notice that the number of rows estimated to be returned from the bitmap + index scan reflects only the condition used with the index; this is + important since it affects the cost estimate for the subsequent heap + fetches. + </para> + + <para> + Finally we will examine a query that involves a join: + +<programlisting> +EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 +WHERE t1.unique1 < 50 AND t1.unique2 = t2.unique2; + + QUERY PLAN +-------------------------------------------------------------------&zwsp;------------------- + Nested Loop (cost=4.64..456.23 rows=50 width=488) + -> Bitmap Heap Scan on tenk1 t1 (cost=4.64..142.17 rows=50 width=244) + Recheck Cond: (unique1 < 50) + -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.63 rows=50 width=0) + Index Cond: (unique1 < 50) + -> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.00..6.27 rows=1 width=244) + Index Cond: (unique2 = t1.unique2) +</programlisting> + + The restriction on <structname>tenk1</structname>, + <literal>unique1 < 50</literal>, + is evaluated before the nested-loop join. + This is handled analogously to the previous range example. This time the + value 50 falls into the first bucket of the + <structfield>unique1</structfield> histogram: + +<programlisting> +selectivity = (0 + (50 - bucket[1].min)/(bucket[1].max - bucket[1].min))/num_buckets + = (0 + (50 - 0)/(993 - 0))/10 + = 0.005035 + +rows = 10000 * 0.005035 + = 50 (rounding off) +</programlisting> + + The restriction for the join is <literal>t2.unique2 = t1.unique2</literal>. + The operator is just + our familiar <literal>=</literal>, however the selectivity function is + obtained from the <structfield>oprjoin</structfield> column of + <structname>pg_operator</structname>, and is <function>eqjoinsel</function>. + <function>eqjoinsel</function> looks up the statistical information for both + <structname>tenk2</structname> and <structname>tenk1</structname>: + +<programlisting> +SELECT tablename, null_frac,n_distinct, most_common_vals FROM pg_stats +WHERE tablename IN ('tenk1', 'tenk2') AND attname='unique2'; + +tablename | null_frac | n_distinct | most_common_vals +-----------+-----------+------------+------------------ + tenk1 | 0 | -1 | + tenk2 | 0 | -1 | +</programlisting> + + In this case there is no <acronym>MCV</acronym> information for + <structfield>unique2</structfield> because all the values appear to be + unique, so we use an algorithm that relies only on the number of + distinct values for both relations together with their null fractions: + +<programlisting> +selectivity = (1 - null_frac1) * (1 - null_frac2) * min(1/num_distinct1, 1/num_distinct2) + = (1 - 0) * (1 - 0) / max(10000, 10000) + = 0.0001 +</programlisting> + + This is, subtract the null fraction from one for each of the relations, + and divide by the maximum of the numbers of distinct values. + The number of rows + that the join is likely to emit is calculated as the cardinality of the + Cartesian product of the two inputs, multiplied by the + selectivity: + +<programlisting> +rows = (outer_cardinality * inner_cardinality) * selectivity + = (50 * 10000) * 0.0001 + = 50 +</programlisting> + </para> + + <para> + Had there been MCV lists for the two columns, + <function>eqjoinsel</function> would have used direct comparison of the MCV + lists to determine the join selectivity within the part of the column + populations represented by the MCVs. The estimate for the remainder of the + populations follows the same approach shown here. + </para> + + <para> + Notice that we showed <literal>inner_cardinality</literal> as 10000, that is, + the unmodified size of <structname>tenk2</structname>. It might appear from + inspection of the <command>EXPLAIN</command> output that the estimate of + join rows comes from 50 * 1, that is, the number of outer rows times + the estimated number of rows obtained by each inner index scan on + <structname>tenk2</structname>. But this is not the case: the join relation size + is estimated before any particular join plan has been considered. If + everything is working well then the two ways of estimating the join + size will produce about the same answer, but due to round-off error and + other factors they sometimes diverge significantly. + </para> + + <para> + For those interested in further details, estimation of the size of + a table (before any <literal>WHERE</literal> clauses) is done in + <filename>src/backend/optimizer/util/plancat.c</filename>. The generic + logic for clause selectivities is in + <filename>src/backend/optimizer/path/clausesel.c</filename>. The + operator-specific selectivity functions are mostly found + in <filename>src/backend/utils/adt/selfuncs.c</filename>. + </para> + </sect1> + + <sect1 id="multivariate-statistics-examples"> + <title>Multivariate Statistics Examples</title> + + <indexterm> + <primary>row estimation</primary> + <secondary>multivariate</secondary> + </indexterm> + + <sect2 id="functional-dependencies"> + <title>Functional Dependencies</title> + + <para> + Multivariate correlation can be demonstrated with a very simple data set + — a table with two columns, both containing the same values: + +<programlisting> +CREATE TABLE t (a INT, b INT); +INSERT INTO t SELECT i % 100, i % 100 FROM generate_series(1, 10000) s(i); +ANALYZE t; +</programlisting> + + As explained in <xref linkend="planner-stats"/>, the planner can determine + cardinality of <structname>t</structname> using the number of pages and + rows obtained from <structname>pg_class</structname>: + +<programlisting> +SELECT relpages, reltuples FROM pg_class WHERE relname = 't'; + + relpages | reltuples +----------+----------- + 45 | 10000 +</programlisting> + + The data distribution is very simple; there are only 100 distinct values + in each column, uniformly distributed. + </para> + + <para> + The following example shows the result of estimating a <literal>WHERE</literal> + condition on the <structfield>a</structfield> column: + +<programlisting> +EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1; + QUERY PLAN +-------------------------------------------------------------------&zwsp;------------ + Seq Scan on t (cost=0.00..170.00 rows=100 width=8) (actual rows=100 loops=1) + Filter: (a = 1) + Rows Removed by Filter: 9900 +</programlisting> + + The planner examines the condition and determines the selectivity + of this clause to be 1%. By comparing this estimate and the actual + number of rows, we see that the estimate is very accurate + (in fact exact, as the table is very small). Changing the + <literal>WHERE</literal> condition to use the <structfield>b</structfield> column, an + identical plan is generated. But observe what happens if we apply the same + condition on both columns, combining them with <literal>AND</literal>: + +<programlisting> +EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1; + QUERY PLAN +-------------------------------------------------------------------&zwsp;---------- + Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=100 loops=1) + Filter: ((a = 1) AND (b = 1)) + Rows Removed by Filter: 9900 +</programlisting> + + The planner estimates the selectivity for each condition individually, + arriving at the same 1% estimates as above. Then it assumes that the + conditions are independent, and so it multiplies their selectivities, + producing a final selectivity estimate of just 0.01%. + This is a significant underestimate, as the actual number of rows + matching the conditions (100) is two orders of magnitude higher. + </para> + + <para> + This problem can be fixed by creating a statistics object that + directs <command>ANALYZE</command> to calculate functional-dependency + multivariate statistics on the two columns: + +<programlisting> +CREATE STATISTICS stts (dependencies) ON a, b FROM t; +ANALYZE t; +EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1; + QUERY PLAN +-------------------------------------------------------------------&zwsp;------------ + Seq Scan on t (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1) + Filter: ((a = 1) AND (b = 1)) + Rows Removed by Filter: 9900 +</programlisting> + </para> + </sect2> + + <sect2 id="multivariate-ndistinct-counts"> + <title>Multivariate N-Distinct Counts</title> + + <para> + A similar problem occurs with estimation of the cardinality of sets of + multiple columns, such as the number of groups that would be generated by + a <command>GROUP BY</command> clause. When <command>GROUP BY</command> + lists a single column, the n-distinct estimate (which is visible as the + estimated number of rows returned by the HashAggregate node) is very + accurate: +<programlisting> +EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a; + QUERY PLAN +-------------------------------------------------------------------&zwsp;---------------------- + HashAggregate (cost=195.00..196.00 rows=100 width=12) (actual rows=100 loops=1) + Group Key: a + -> Seq Scan on t (cost=0.00..145.00 rows=10000 width=4) (actual rows=10000 loops=1) +</programlisting> + But without multivariate statistics, the estimate for the number of + groups in a query with two columns in <command>GROUP BY</command>, as + in the following example, is off by an order of magnitude: +<programlisting> +EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b; + QUERY PLAN +-------------------------------------------------------------------&zwsp;------------------------- + HashAggregate (cost=220.00..230.00 rows=1000 width=16) (actual rows=100 loops=1) + Group Key: a, b + -> Seq Scan on t (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1) +</programlisting> + By redefining the statistics object to include n-distinct counts for the + two columns, the estimate is much improved: +<programlisting> +DROP STATISTICS stts; +CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t; +ANALYZE t; +EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b; + QUERY PLAN +-------------------------------------------------------------------&zwsp;------------------------- + HashAggregate (cost=220.00..221.00 rows=100 width=16) (actual rows=100 loops=1) + Group Key: a, b + -> Seq Scan on t (cost=0.00..145.00 rows=10000 width=8) (actual rows=10000 loops=1) +</programlisting> + </para> + + </sect2> + + <sect2 id="mcv-lists"> + <title>MCV Lists</title> + + <para> + As explained in <xref linkend="functional-dependencies"/>, functional + dependencies are very cheap and efficient type of statistics, but their + main limitation is their global nature (only tracking dependencies at + the column level, not between individual column values). + </para> + + <para> + This section introduces multivariate variant of <acronym>MCV</acronym> + (most-common values) lists, a straightforward extension of the per-column + statistics described in <xref linkend="row-estimation-examples"/>. These + statistics address the limitation by storing individual values, but it is + naturally more expensive, both in terms of building the statistics in + <command>ANALYZE</command>, storage and planning time. + </para> + + <para> + Let's look at the query from <xref linkend="functional-dependencies"/> + again, but this time with a <acronym>MCV</acronym> list created on the + same set of columns (be sure to drop the functional dependencies, to + make sure the planner uses the newly created statistics). + +<programlisting> +DROP STATISTICS stts; +CREATE STATISTICS stts2 (mcv) ON a, b FROM t; +ANALYZE t; +EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1; + QUERY PLAN +-------------------------------------------------------------------&zwsp;------------ + Seq Scan on t (cost=0.00..195.00 rows=100 width=8) (actual rows=100 loops=1) + Filter: ((a = 1) AND (b = 1)) + Rows Removed by Filter: 9900 +</programlisting> + + The estimate is as accurate as with the functional dependencies, mostly + thanks to the table being fairly small and having a simple distribution + with a low number of distinct values. Before looking at the second query, + which was not handled by functional dependencies particularly well, + let's inspect the <acronym>MCV</acronym> list a bit. + </para> + + <para> + Inspecting the <acronym>MCV</acronym> list is possible using + <function>pg_mcv_list_items</function> set-returning function. + +<programlisting> +SELECT m.* FROM pg_statistic_ext join pg_statistic_ext_data on (oid = stxoid), + pg_mcv_list_items(stxdmcv) m WHERE stxname = 'stts2'; + index | values | nulls | frequency | base_frequency +-------+----------+-------+-----------+---------------- + 0 | {0, 0} | {f,f} | 0.01 | 0.0001 + 1 | {1, 1} | {f,f} | 0.01 | 0.0001 + ... + 49 | {49, 49} | {f,f} | 0.01 | 0.0001 + 50 | {50, 50} | {f,f} | 0.01 | 0.0001 + ... + 97 | {97, 97} | {f,f} | 0.01 | 0.0001 + 98 | {98, 98} | {f,f} | 0.01 | 0.0001 + 99 | {99, 99} | {f,f} | 0.01 | 0.0001 +(100 rows) +</programlisting> + + This confirms there are 100 distinct combinations in the two columns, and + all of them are about equally likely (1% frequency for each one). The + base frequency is the frequency computed from per-column statistics, as if + there were no multi-column statistics. Had there been any null values in + either of the columns, this would be identified in the + <structfield>nulls</structfield> column. + </para> + + <para> + When estimating the selectivity, the planner applies all the conditions + on items in the <acronym>MCV</acronym> list, and then sums the frequencies + of the matching ones. See <function>mcv_clauselist_selectivity</function> + in <filename>src/backend/statistics/mcv.c</filename> for details. + </para> + + <para> + Compared to functional dependencies, <acronym>MCV</acronym> lists have two + major advantages. Firstly, the list stores actual values, making it possible + to decide which combinations are compatible. + +<programlisting> +EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 10; + QUERY PLAN +-------------------------------------------------------------------&zwsp;-------- + Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1) + Filter: ((a = 1) AND (b = 10)) + Rows Removed by Filter: 10000 +</programlisting> + + Secondly, <acronym>MCV</acronym> lists handle a wider range of clause types, + not just equality clauses like functional dependencies. For example, + consider the following range query for the same table: + +<programlisting> +EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a <= 49 AND b > 49; + QUERY PLAN +-------------------------------------------------------------------&zwsp;-------- + Seq Scan on t (cost=0.00..195.00 rows=1 width=8) (actual rows=0 loops=1) + Filter: ((a <= 49) AND (b > 49)) + Rows Removed by Filter: 10000 +</programlisting> + + </para> + + </sect2> + + </sect1> + + <sect1 id="planner-stats-security"> + <title>Planner Statistics and Security</title> + + <para> + Access to the table <structname>pg_statistic</structname> is restricted to + superusers, so that ordinary users cannot learn about the contents of the + tables of other users from it. Some selectivity estimation functions will + use a user-provided operator (either the operator appearing in the query or + a related operator) to analyze the stored statistics. For example, in order + to determine whether a stored most common value is applicable, the + selectivity estimator will have to run the appropriate <literal>=</literal> + operator to compare the constant in the query to the stored value. + Thus the data in <structname>pg_statistic</structname> is potentially + passed to user-defined operators. An appropriately crafted operator can + intentionally leak the passed operands (for example, by logging them + or writing them to a different table), or accidentally leak them by showing + their values in error messages, in either case possibly exposing data from + <structname>pg_statistic</structname> to a user who should not be able to + see it. + </para> + + <para> + In order to prevent this, the following applies to all built-in selectivity + estimation functions. When planning a query, in order to be able to use + stored statistics, the current user must either + have <literal>SELECT</literal> privilege on the table or the involved + columns, or the operator used must be <literal>LEAKPROOF</literal> (more + accurately, the function that the operator is based on). If not, then the + selectivity estimator will behave as if no statistics are available, and + the planner will proceed with default or fall-back assumptions. + </para> + + <para> + If a user does not have the required privilege on the table or columns, + then in many cases the query will ultimately receive a permission-denied + error, in which case this mechanism is invisible in practice. But if the + user is reading from a security-barrier view, then the planner might wish + to check the statistics of an underlying table that is otherwise + inaccessible to the user. In that case, the operator should be leak-proof + or the statistics will not be used. There is no direct feedback about + that, except that the plan might be suboptimal. If one suspects that this + is the case, one could try running the query as a more privileged user, + to see if a different plan results. + </para> + + <para> + This restriction applies only to cases where the planner would need to + execute a user-defined operator on one or more values + from <structname>pg_statistic</structname>. Thus the planner is permitted + to use generic statistical information, such as the fraction of null values + or the number of distinct values in a column, regardless of access + privileges. + </para> + + <para> + Selectivity estimation functions contained in third-party extensions that + potentially operate on statistics with user-defined operators should follow + the same security rules. Consult the PostgreSQL source code for guidance. + </para> + </sect1> +</chapter> |