1 files changed, 163 insertions, 0 deletions
diff --git a/doc/src/sgml/html/tablesample-support-functions.html b/doc/src/sgml/html/tablesample-support-functions.html
new file mode 100644
index 0000000..4e979ed
--- /dev/null
+++ b/doc/src/sgml/html/tablesample-support-functions.html
@@ -0,0 +1,163 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>60.1. Sampling Method Support Functions</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="tablesample-method.html" title="Chapter 60. Writing a Table Sampling Method" /><link rel="next" href="custom-scan.html" title="Chapter 61. Writing a Custom Scan Provider" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">60.1. Sampling Method Support Functions</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="tablesample-method.html" title="Chapter 60. Writing a Table Sampling Method">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="tablesample-method.html" title="Chapter 60. Writing a Table Sampling Method">Up</a></td><th width="60%" align="center">Chapter 60. Writing a Table Sampling Method</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 16.2 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="custom-scan.html" title="Chapter 61. Writing a Custom Scan Provider">Next</a></td></tr></table><hr /></div><div class="sect1" id="TABLESAMPLE-SUPPORT-FUNCTIONS"><div class="titlepage"><div><div><h2 class="title" style="clear: both">60.1. Sampling Method Support Functions <a href="#TABLESAMPLE-SUPPORT-FUNCTIONS" class="id_link">#</a></h2></div></div></div><p>
+   The TSM handler function returns a palloc'd <code class="type">TsmRoutine</code> struct
+   containing pointers to the support functions described below.  Most of
+   the functions are required, but some are optional, and those pointers can
+   be NULL.
+  </p><p>
+</p><pre class="programlisting">
+void
+SampleScanGetSampleSize (PlannerInfo *root,
+                         RelOptInfo *baserel,
+                         List *paramexprs,
+                         BlockNumber *pages,
+                         double *tuples);
+</pre><p>
+
+   This function is called during planning.  It must estimate the number of
+   relation pages that will be read during a sample scan, and the number of
+   tuples that will be selected by the scan.  (For example, these might be
+   determined by estimating the sampling fraction, and then multiplying
+   the <code class="literal">baserel-&gt;pages</code> and <code class="literal">baserel-&gt;tuples</code>
+   numbers by that, being sure to round the results to integral values.)
+   The <code class="literal">paramexprs</code> list holds the expression(s) that are
+   parameters to the <code class="literal">TABLESAMPLE</code> clause.  It is recommended to
+   use <code class="function">estimate_expression_value()</code> to try to reduce these
+   expressions to constants, if their values are needed for estimation
+   purposes; but the function must provide size estimates even if they cannot
+   be reduced, and it should not fail even if the values appear invalid
+   (remember that they're only estimates of what the run-time values will be).
+   The <code class="literal">pages</code> and <code class="literal">tuples</code> parameters are outputs.
+  </p><p>
+</p><pre class="programlisting">
+void
+InitSampleScan (SampleScanState *node,
+                int eflags);
+</pre><p>
+
+   Initialize for execution of a SampleScan plan node.
+   This is called during executor startup.
+   It should perform any initialization needed before processing can start.
+   The <code class="structname">SampleScanState</code> node has already been created, but
+   its <code class="structfield">tsm_state</code> field is NULL.
+   The <code class="function">InitSampleScan</code> function can palloc whatever internal
+   state data is needed by the sampling method, and store a pointer to
+   it in <code class="literal">node-&gt;tsm_state</code>.
+   Information about the table to scan is accessible through other fields
+   of the <code class="structname">SampleScanState</code> node (but note that the
+   <code class="literal">node-&gt;ss.ss_currentScanDesc</code> scan descriptor is not set
+   up yet).
+   <code class="literal">eflags</code> contains flag bits describing the executor's
+   operating mode for this plan node.
+  </p><p>
+   When <code class="literal">(eflags &amp; EXEC_FLAG_EXPLAIN_ONLY)</code> is true,
+   the scan will not actually be performed, so this function should only do
+   the minimum required to make the node state valid for <code class="command">EXPLAIN</code>
+   and <code class="function">EndSampleScan</code>.
+  </p><p>
+   This function can be omitted (set the pointer to NULL), in which case
+   <code class="function">BeginSampleScan</code> must perform all initialization needed
+   by the sampling method.
+  </p><p>
+</p><pre class="programlisting">
+void
+BeginSampleScan (SampleScanState *node,
+                 Datum *params,
+                 int nparams,
+                 uint32 seed);
+</pre><p>
+
+   Begin execution of a sampling scan.
+   This is called just before the first attempt to fetch a tuple, and
+   may be called again if the scan needs to be restarted.
+   Information about the table to scan is accessible through fields
+   of the <code class="structname">SampleScanState</code> node (but note that the
+   <code class="literal">node-&gt;ss.ss_currentScanDesc</code> scan descriptor is not set
+   up yet).
+   The <code class="literal">params</code> array, of length <code class="literal">nparams</code>, contains the
+   values of the parameters supplied in the <code class="literal">TABLESAMPLE</code> clause.
+   These will have the number and types specified in the sampling
+   method's <code class="literal">parameterTypes</code> list, and have been checked
+   to not be null.
+   <code class="literal">seed</code> contains a seed to use for any random numbers generated
+   within the sampling method; it is either a hash derived from the
+   <code class="literal">REPEATABLE</code> value if one was given, or the result
+   of <code class="literal">random()</code> if not.
+  </p><p>
+   This function may adjust the fields <code class="literal">node-&gt;use_bulkread</code>
+   and <code class="literal">node-&gt;use_pagemode</code>.
+   If <code class="literal">node-&gt;use_bulkread</code> is <code class="literal">true</code>, which it is by
+   default, the scan will use a buffer access strategy that encourages
+   recycling buffers after use.  It might be reasonable to set this
+   to <code class="literal">false</code> if the scan will visit only a small fraction of the
+   table's pages.
+   If <code class="literal">node-&gt;use_pagemode</code> is <code class="literal">true</code>, which it is by
+   default, the scan will perform visibility checking in a single pass for
+   all tuples on each visited page.  It might be reasonable to set this
+   to <code class="literal">false</code> if the scan will select only a small fraction of the
+   tuples on each visited page.  That will result in fewer tuple visibility
+   checks being performed, though each one will be more expensive because it
+   will require more locking.
+  </p><p>
+   If the sampling method is
+   marked <code class="literal">repeatable_across_scans</code>, it must be able to
+   select the same set of tuples during a rescan as it did originally, that is
+   a fresh call of <code class="function">BeginSampleScan</code> must lead to selecting the
+   same tuples as before (if the <code class="literal">TABLESAMPLE</code> parameters
+   and seed don't change).
+  </p><p>
+</p><pre class="programlisting">
+BlockNumber
+NextSampleBlock (SampleScanState *node, BlockNumber nblocks);
+</pre><p>
+
+   Returns the block number of the next page to be scanned, or
+   <code class="literal">InvalidBlockNumber</code> if no pages remain to be scanned.
+  </p><p>
+   This function can be omitted (set the pointer to NULL), in which case
+   the core code will perform a sequential scan of the entire relation.
+   Such a scan can use synchronized scanning, so that the sampling method
+   cannot assume that the relation pages are visited in the same order on
+   each scan.
+  </p><p>
+</p><pre class="programlisting">
+OffsetNumber
+NextSampleTuple (SampleScanState *node,
+                 BlockNumber blockno,
+                 OffsetNumber maxoffset);
+</pre><p>
+
+   Returns the offset number of the next tuple to be sampled on the
+   specified page, or <code class="literal">InvalidOffsetNumber</code> if no tuples remain to
+   be sampled.  <code class="literal">maxoffset</code> is the largest offset number in use
+   on the page.
+  </p><div class="note"><h3 class="title">Note</h3><p>
+    <code class="function">NextSampleTuple</code> is not explicitly told which of the offset
+    numbers in the range <code class="literal">1 .. maxoffset</code> actually contain valid
+    tuples.  This is not normally a problem since the core code ignores
+    requests to sample missing or invisible tuples; that should not result in
+    any bias in the sample.  However, if necessary, the function can use
+    <code class="literal">node-&gt;donetuples</code> to examine how many of the tuples
+    it returned were valid and visible.
+   </p></div><div class="note"><h3 class="title">Note</h3><p>
+    <code class="function">NextSampleTuple</code> must <span class="emphasis"><em>not</em></span> assume
+    that <code class="literal">blockno</code> is the same page number returned by the most
+    recent <code class="function">NextSampleBlock</code> call.  It was returned by some
+    previous <code class="function">NextSampleBlock</code> call, but the core code is allowed
+    to call <code class="function">NextSampleBlock</code> in advance of actually scanning
+    pages, so as to support prefetching.  It is OK to assume that once
+    sampling of a given page begins, successive <code class="function">NextSampleTuple</code>
+    calls all refer to the same page until <code class="literal">InvalidOffsetNumber</code> is
+    returned.
+   </p></div><p>
+</p><pre class="programlisting">
+void
+EndSampleScan (SampleScanState *node);
+</pre><p>
+
+   End the scan and release resources.  It is normally not important
+   to release palloc'd memory, but any externally-visible resources
+   should be cleaned up.
+   This function can be omitted (set the pointer to NULL) in the common
+   case where no such resources exist.
+  </p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="tablesample-method.html" title="Chapter 60. Writing a Table Sampling Method">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="tablesample-method.html" title="Chapter 60. Writing a Table Sampling Method">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="custom-scan.html" title="Chapter 61. Writing a Custom Scan Provider">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 60. Writing a Table Sampling Method </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 16.2 Documentation">Home</a></td><td width="40%" align="right" valign="top"> Chapter 61. Writing a Custom Scan Provider</td></tr></table></div></body></html>
+\ No newline at end of file