diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-16 19:46:48 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-16 19:46:48 +0000 |
commit | 311bcfc6b3acdd6fd152798c7f287ddf74fa2a98 (patch) | |
tree | 0ec307299b1dada3701e42f4ca6eda57d708261e /doc/src/sgml/html/index-cost-estimation.html | |
parent | Initial commit. (diff) | |
download | postgresql-15-311bcfc6b3acdd6fd152798c7f287ddf74fa2a98.tar.xz postgresql-15-311bcfc6b3acdd6fd152798c7f287ddf74fa2a98.zip |
Adding upstream version 15.4.upstream/15.4upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/src/sgml/html/index-cost-estimation.html')
-rw-r--r-- | doc/src/sgml/html/index-cost-estimation.html | 142 |
1 files changed, 142 insertions, 0 deletions
diff --git a/doc/src/sgml/html/index-cost-estimation.html b/doc/src/sgml/html/index-cost-estimation.html new file mode 100644 index 0000000..6589fd2 --- /dev/null +++ b/doc/src/sgml/html/index-cost-estimation.html @@ -0,0 +1,142 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>64.6. Index Cost Estimation Functions</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="index-unique-checks.html" title="64.5. Index Uniqueness Checks" /><link rel="next" href="generic-wal.html" title="Chapter 65. Generic WAL Records" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">64.6. Index Cost Estimation Functions</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="index-unique-checks.html" title="64.5. Index Uniqueness Checks">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="indexam.html" title="Chapter 64. Index Access Method Interface Definition">Up</a></td><th width="60%" align="center">Chapter 64. Index Access Method Interface Definition</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="generic-wal.html" title="Chapter 65. Generic WAL Records">Next</a></td></tr></table><hr /></div><div class="sect1" id="INDEX-COST-ESTIMATION"><div class="titlepage"><div><div><h2 class="title" style="clear: both">64.6. Index Cost Estimation Functions</h2></div></div></div><p> + The <code class="function">amcostestimate</code> function is given information describing + a possible index scan, including lists of WHERE and ORDER BY clauses that + have been determined to be usable with the index. It must return estimates + of the cost of accessing the index and the selectivity of the WHERE + clauses (that is, the fraction of parent-table rows that will be + retrieved during the index scan). For simple cases, nearly all the + work of the cost estimator can be done by calling standard routines + in the optimizer; the point of having an <code class="function">amcostestimate</code> function is + to allow index access methods to provide index-type-specific knowledge, + in case it is possible to improve on the standard estimates. + </p><p> + Each <code class="function">amcostestimate</code> function must have the signature: + +</p><pre class="programlisting"> +void +amcostestimate (PlannerInfo *root, + IndexPath *path, + double loop_count, + Cost *indexStartupCost, + Cost *indexTotalCost, + Selectivity *indexSelectivity, + double *indexCorrelation, + double *indexPages); +</pre><p> + + The first three parameters are inputs: + + </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><em class="parameter"><code>root</code></em></span></dt><dd><p> + The planner's information about the query being processed. + </p></dd><dt><span class="term"><em class="parameter"><code>path</code></em></span></dt><dd><p> + The index access path being considered. All fields except cost and + selectivity values are valid. + </p></dd><dt><span class="term"><em class="parameter"><code>loop_count</code></em></span></dt><dd><p> + The number of repetitions of the index scan that should be factored + into the cost estimates. This will typically be greater than one when + considering a parameterized scan for use in the inside of a nestloop + join. Note that the cost estimates should still be for just one scan; + a larger <em class="parameter"><code>loop_count</code></em> means that it may be appropriate + to allow for some caching effects across multiple scans. + </p></dd></dl></div><p> + </p><p> + The last five parameters are pass-by-reference outputs: + + </p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><em class="parameter"><code>*indexStartupCost</code></em></span></dt><dd><p> + Set to cost of index start-up processing + </p></dd><dt><span class="term"><em class="parameter"><code>*indexTotalCost</code></em></span></dt><dd><p> + Set to total cost of index processing + </p></dd><dt><span class="term"><em class="parameter"><code>*indexSelectivity</code></em></span></dt><dd><p> + Set to index selectivity + </p></dd><dt><span class="term"><em class="parameter"><code>*indexCorrelation</code></em></span></dt><dd><p> + Set to correlation coefficient between index scan order and + underlying table's order + </p></dd><dt><span class="term"><em class="parameter"><code>*indexPages</code></em></span></dt><dd><p> + Set to number of index leaf pages + </p></dd></dl></div><p> + </p><p> + Note that cost estimate functions must be written in C, not in SQL or + any available procedural language, because they must access internal + data structures of the planner/optimizer. + </p><p> + The index access costs should be computed using the parameters used by + <code class="filename">src/backend/optimizer/path/costsize.c</code>: a sequential + disk block fetch has cost <code class="varname">seq_page_cost</code>, a nonsequential fetch + has cost <code class="varname">random_page_cost</code>, and the cost of processing one index + row should usually be taken as <code class="varname">cpu_index_tuple_cost</code>. In + addition, an appropriate multiple of <code class="varname">cpu_operator_cost</code> should + be charged for any comparison operators invoked during index processing + (especially evaluation of the indexquals themselves). + </p><p> + The access costs should include all disk and CPU costs associated with + scanning the index itself, but <span class="emphasis"><em>not</em></span> the costs of retrieving or + processing the parent-table rows that are identified by the index. + </p><p> + The <span class="quote">“<span class="quote">start-up cost</span>”</span> is the part of the total scan cost that + must be expended before we can begin to fetch the first row. For most + indexes this can be taken as zero, but an index type with a high start-up + cost might want to set it nonzero. + </p><p> + The <em class="parameter"><code>indexSelectivity</code></em> should be set to the estimated fraction of the parent + table rows that will be retrieved during the index scan. In the case + of a lossy query, this will typically be higher than the fraction of + rows that actually pass the given qual conditions. + </p><p> + The <em class="parameter"><code>indexCorrelation</code></em> should be set to the correlation (ranging between + -1.0 and 1.0) between the index order and the table order. This is used + to adjust the estimate for the cost of fetching rows from the parent + table. + </p><p> + The <em class="parameter"><code>indexPages</code></em> should be set to the number of leaf pages. + This is used to estimate the number of workers for parallel index scan. + </p><p> + When <em class="parameter"><code>loop_count</code></em> is greater than one, the returned numbers + should be averages expected for any one scan of the index. + </p><div class="procedure" id="id-1.10.15.12.13"><p class="title"><strong>Cost Estimation</strong></p><p> + A typical cost estimator will proceed as follows: + </p><ol class="procedure" type="1"><li class="step"><p> + Estimate and return the fraction of parent-table rows that will be visited + based on the given qual conditions. In the absence of any index-type-specific + knowledge, use the standard optimizer function <code class="function">clauselist_selectivity()</code>: + +</p><pre class="programlisting"> +*indexSelectivity = clauselist_selectivity(root, path->indexquals, + path->indexinfo->rel->relid, + JOIN_INNER, NULL); +</pre><p> + </p></li><li class="step"><p> + Estimate the number of index rows that will be visited during the + scan. For many index types this is the same as <em class="parameter"><code>indexSelectivity</code></em> times + the number of rows in the index, but it might be more. (Note that the + index's size in pages and rows is available from the + <code class="literal">path->indexinfo</code> struct.) + </p></li><li class="step"><p> + Estimate the number of index pages that will be retrieved during the scan. + This might be just <em class="parameter"><code>indexSelectivity</code></em> times the index's size in pages. + </p></li><li class="step"><p> + Compute the index access cost. A generic estimator might do this: + +</p><pre class="programlisting"> +/* + * Our generic assumption is that the index pages will be read + * sequentially, so they cost seq_page_cost each, not random_page_cost. + * Also, we charge for evaluation of the indexquals at each index row. + * All the costs are assumed to be paid incrementally during the scan. + */ +cost_qual_eval(&index_qual_cost, path->indexquals, root); +*indexStartupCost = index_qual_cost.startup; +*indexTotalCost = seq_page_cost * numIndexPages + + (cpu_index_tuple_cost + index_qual_cost.per_tuple) * numIndexTuples; +</pre><p> + + However, the above does not account for amortization of index reads + across repeated index scans. + </p></li><li class="step"><p> + Estimate the index correlation. For a simple ordered index on a single + field, this can be retrieved from pg_statistic. If the correlation + is not known, the conservative estimate is zero (no correlation). + </p></li></ol></div><p> + Examples of cost estimator functions can be found in + <code class="filename">src/backend/utils/adt/selfuncs.c</code>. + </p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="index-unique-checks.html" title="64.5. Index Uniqueness Checks">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="indexam.html" title="Chapter 64. Index Access Method Interface Definition">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="generic-wal.html" title="Chapter 65. Generic WAL Records">Next</a></td></tr><tr><td width="40%" align="left" valign="top">64.5. Index Uniqueness Checks </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="40%" align="right" valign="top"> Chapter 65. Generic WAL Records</td></tr></table></div></body></html>
\ No newline at end of file |