Adding upstream version 15.4.

Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-16 19:46:48 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-04-16 19:46:48 +0000
commit: 311bcfc6b3acdd6fd152798c7f287ddf74fa2a98 (patch)
tree: 0ec307299b1dada3701e42f4ca6eda57d708261e /doc/src/sgml/html/parallel-plans.html
parent: Initial commit. (diff)
1 files changed, 155 insertions, 0 deletions
diff --git a/doc/src/sgml/html/parallel-plans.html b/doc/src/sgml/html/parallel-plans.html
new file mode 100644
index 0000000..f0e6911
--- /dev/null
+++ b/doc/src/sgml/html/parallel-plans.html
@@ -0,0 +1,155 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>15.3. Parallel Plans</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="when-can-parallel-query-be-used.html" title="15.2. When Can Parallel Query Be Used?" /><link rel="next" href="parallel-safety.html" title="15.4. Parallel Safety" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">15.3. Parallel Plans</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="when-can-parallel-query-be-used.html" title="15.2. When Can Parallel Query Be Used?">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="parallel-query.html" title="Chapter 15. Parallel Query">Up</a></td><th width="60%" align="center">Chapter 15. Parallel Query</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="parallel-safety.html" title="15.4. Parallel Safety">Next</a></td></tr></table><hr /></div><div class="sect1" id="PARALLEL-PLANS"><div class="titlepage"><div><div><h2 class="title" style="clear: both">15.3. Parallel Plans</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="parallel-plans.html#PARALLEL-SCANS">15.3.1. Parallel Scans</a></span></dt><dt><span class="sect2"><a href="parallel-plans.html#PARALLEL-JOINS">15.3.2. Parallel Joins</a></span></dt><dt><span class="sect2"><a href="parallel-plans.html#PARALLEL-AGGREGATION">15.3.3. 
Parallel Aggregation</a></span></dt><dt><span class="sect2"><a href="parallel-plans.html#PARALLEL-APPEND">15.3.4. Parallel Append</a></span></dt><dt><span class="sect2"><a href="parallel-plans.html#PARALLEL-PLAN-TIPS">15.3.5. Parallel Plan Tips</a></span></dt></dl></div><p>
+    Because each worker executes the parallel portion of the plan to
+    completion, it is not possible to simply take an ordinary query plan
+    and run it using multiple workers.  Each worker would produce a full
+    copy of the output result set, so the query would not run any faster
+    than normal but would produce incorrect results.  Instead, the parallel
+    portion of the plan must be what is known internally to the query
+    optimizer as a <em class="firstterm">partial plan</em>; that is, it must be constructed
+    so that each process that executes the plan will generate only a
+    subset of the output rows in such a way that each required output row
+    is guaranteed to be generated by exactly one of the cooperating processes.
+    Generally, this means that the scan on the driving table of the query
+    must be a parallel-aware scan.
+  </p><div class="sect2" id="PARALLEL-SCANS"><div class="titlepage"><div><div><h3 class="title">15.3.1. Parallel Scans</h3></div></div></div><p>
+    The following types of parallel-aware table scans are currently supported.
+
+  </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+        In a <span class="emphasis"><em>parallel sequential scan</em></span>, the table's blocks will
+        be divided into ranges and shared among the cooperating processes.  Each
+        worker process will complete the scanning of its given range of blocks before
+        requesting an additional range of blocks.
+      </p></li><li class="listitem"><p>
+        In a <span class="emphasis"><em>parallel bitmap heap scan</em></span>, one process is chosen
+        as the leader.  That process performs a scan of one or more indexes
+        and builds a bitmap indicating which table blocks need to be visited.
+        These blocks are then divided among the cooperating processes as in
+        a parallel sequential scan.  In other words, the heap scan is performed
+        in parallel, but the underlying index scan is not.
+      </p></li><li class="listitem"><p>
+        In a <span class="emphasis"><em>parallel index scan</em></span> or <span class="emphasis"><em>parallel index-only
+        scan</em></span>, the cooperating processes take turns reading data from the
+        index.  Currently, parallel index scans are supported only for
+        btree indexes.  Each process will claim a single index block and will
+        scan and return all tuples referenced by that block; other processes can
+        at the same time be returning tuples from a different index block.
+        The results of a parallel btree scan are returned in sorted order
+        within each worker process.
+      </p></li></ul></div><p>
+
+    Other scan types, such as scans of non-btree indexes, may support
+    parallel scans in the future.
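+  </p><p>
+    As a minimal sketch of how to observe a parallel-aware scan, the
+    following statements assume a hypothetical, sufficiently large table
+    named <code class="literal">measurements</code> with a text column
+    <code class="literal">note</code>; neither name comes from
+    <span class="productname">PostgreSQL</span> itself.  Whether the planner
+    actually chooses a parallel scan depends on table size, cost settings,
+    and the number of available workers.
+  </p><pre class="programlisting">
+-- Assumption: measurements(note text) is a large, analyzed user table.
+SET max_parallel_workers_per_gather = 2;
+
+-- A pattern with a leading wildcard cannot use a btree index, so a
+-- parallel plan here would typically show a Gather node above a
+-- Parallel Seq Scan on measurements.
+EXPLAIN (COSTS OFF)
+SELECT * FROM measurements WHERE note LIKE '%error%';
+</pre><p>
+    For a predicate that can use a btree index, such as an equality
+    comparison on an indexed column, the same approach may instead reveal a
+    <code class="literal">Parallel Index Scan</code> or
+    <code class="literal">Parallel Bitmap Heap Scan</code> node.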
+  </p></div><div class="sect2" id="PARALLEL-JOINS"><div class="titlepage"><div><div><h3 class="title">15.3.2. Parallel Joins</h3></div></div></div><p>
+    Just as in a non-parallel plan, the driving table may be joined to one or
+    more other tables using a nested loop, hash join, or merge join.  The
+    inner side of the join may be any kind of non-parallel plan that is
+    otherwise supported by the planner provided that it is safe to run within
+    a parallel worker.  Depending on the join type, the inner side may also be
+    a parallel plan.
+  </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+        In a <span class="emphasis"><em>nested loop join</em></span>, the inner side is always
+        non-parallel.  Although it is executed in full, this is efficient if
+        the inner side is an index scan, because the outer tuples and thus
+        the loops that look up values in the index are divided over the
+        cooperating processes.
+      </p></li><li class="listitem"><p>
+        In a <span class="emphasis"><em>merge join</em></span>, the inner side is always
+        a non-parallel plan and therefore executed in full.  This may be
+        inefficient, especially if a sort must be performed, because the work
+        and resulting data are duplicated in every cooperating process.
+      </p></li><li class="listitem"><p>
+        In a <span class="emphasis"><em>hash join</em></span> (without the "parallel" prefix),
+        the inner side is executed in full by every cooperating process
+        to build identical copies of the hash table.  This may be inefficient
+        if the hash table is large or the plan is expensive.  In a
+        <span class="emphasis"><em>parallel hash join</em></span>, the inner side is a
+        <span class="emphasis"><em>parallel hash</em></span> that divides the work of building
+        a shared hash table over the cooperating processes.
+      </p></li></ul></div></div><div class="sect2" id="PARALLEL-AGGREGATION"><div class="titlepage"><div><div><h3 class="title">15.3.3. Parallel Aggregation</h3></div></div></div><p>
+    <span class="productname">PostgreSQL</span> supports parallel aggregation by aggregating in
+    two stages.  First, each process participating in the parallel portion of
+    the query performs an aggregation step, producing a partial result for
+    each group of which that process is aware.  This is reflected in the plan
+    as a <code class="literal">Partial Aggregate</code> node.  Second, the partial results are
+    transferred to the leader via <code class="literal">Gather</code> or <code class="literal">Gather
+    Merge</code>.  Finally, the leader re-aggregates the results across all
+    workers in order to produce the final result.  This is reflected in the
+    plan as a <code class="literal">Finalize Aggregate</code> node.
+  </p><p>
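+    The two-stage plan shape described above can be sketched as follows,
+    assuming a hypothetical large table named
+    <code class="literal">measurements</code>; the exact plan depends on
+    statistics and settings and may differ.
+  </p><pre class="programlisting">
+-- Assumption: measurements is a large, analyzed user table.
+EXPLAIN (COSTS OFF) SELECT count(*) FROM measurements;
+
+-- A parallel plan for this query would typically have this shape:
+--   Finalize Aggregate
+--     -&gt;  Gather
+--           -&gt;  Partial Aggregate
+--                 -&gt;  Parallel Seq Scan on measurements
+</pre><p>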
+    Because the <code class="literal">Finalize Aggregate</code> node runs on the leader
+    process, queries that produce a relatively large number of groups in
+    comparison to the number of input rows will appear less favorable to the
+    query planner. For example, in the worst-case scenario the number of
+    groups seen by the <code class="literal">Finalize Aggregate</code> node could be as many as
+    the number of input rows that were seen by all worker processes in the
+    <code class="literal">Partial Aggregate</code> stage. For such cases, there is clearly
+    going to be no performance benefit to using parallel aggregation. The
+    query planner takes this into account during the planning process and is
+    unlikely to choose parallel aggregation in this scenario.
+  </p><p>
+    Parallel aggregation is not supported in all situations.  Each aggregate
+    must be <a class="link" href="parallel-safety.html" title="15.4. Parallel Safety">safe</a> for parallelism and must
+    have a combine function.  If the aggregate has a transition state of type
+    <code class="literal">internal</code>, it must have serialization and deserialization
+    functions.  See <a class="xref" href="sql-createaggregate.html" title="CREATE AGGREGATE"><span class="refentrytitle">CREATE AGGREGATE</span></a> for more details.
+    Parallel aggregation is not supported if any aggregate function call
+    contains a <code class="literal">DISTINCT</code> or <code class="literal">ORDER BY</code> clause.
+    It is also not supported for ordered-set aggregates or when the query involves
+    <code class="literal">GROUPING SETS</code>.  It can only be used when all joins involved in
+    the query are also part of the parallel portion of the plan.
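+  </p><p>
+    To illustrate one of these restrictions, the sketch below compares two
+    aggregates over a hypothetical table named
+    <code class="literal">measurements</code> with an assumed column
+    <code class="literal">sensor_id</code>.  The plans actually chosen
+    depend on the data and settings.
+  </p><pre class="programlisting">
+-- Assumption: measurements(sensor_id integer, ...) is a large user table.
+
+-- count(*) has a combine function, so it can be split into
+-- Partial Aggregate and Finalize Aggregate stages.
+EXPLAIN (COSTS OFF) SELECT count(*) FROM measurements;
+
+-- An aggregate call containing DISTINCT cannot be aggregated in parallel,
+-- so this plan would not contain Partial Aggregate or Finalize Aggregate.
+EXPLAIN (COSTS OFF) SELECT count(DISTINCT sensor_id) FROM measurements;
+</pre><p>
+    In the second case the scan beneath the aggregate may still be
+    parallel; only the aggregation step itself is not divided among
+    processes.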
+  </p></div><div class="sect2" id="PARALLEL-APPEND"><div class="titlepage"><div><div><h3 class="title">15.3.4. Parallel Append</h3></div></div></div><p>
+    Whenever <span class="productname">PostgreSQL</span> needs to combine rows
+    from multiple sources into a single result set, it uses an
+    <code class="literal">Append</code> or <code class="literal">MergeAppend</code> plan node.
+    This commonly happens when implementing <code class="literal">UNION ALL</code> or
+    when scanning a partitioned table.  Such nodes can be used in parallel
+    plans just as they can in any other plan.  However, in a parallel plan,
+    the planner may instead use a <code class="literal">Parallel Append</code> node.
+  </p><p>
+    When an <code class="literal">Append</code> node is used in a parallel plan, each
+    process will execute the child plans in the order in which they appear,
+    so that all participating processes cooperate to execute the first child
+    plan until it is complete and then move to the second plan at around the
+    same time.  When a <code class="literal">Parallel Append</code> is used instead, the
+    executor spreads out the participating processes as evenly as
+    possible across its child plans, so that multiple child plans are executed
+    simultaneously.  This avoids contention, and also avoids paying the startup
+    cost of a child plan in those processes that never execute it.
+  </p><p>
+    Also, unlike a regular <code class="literal">Append</code> node, which can only have
+    partial children when used within a parallel plan, a <code class="literal">Parallel
+    Append</code> node can have both partial and non-partial child plans.
+    Non-partial children will be scanned by only a single process, since
+    scanning them more than once would produce duplicate results.  Plans that
+    involve appending multiple result sets can therefore achieve
+    coarse-grained parallelism even when efficient partial plans are not
+    available.  For example, consider a query against a partitioned table
+    that can only be implemented efficiently by using an index that does
+    not support parallel scans.  The planner might choose a <code class="literal">Parallel
+    Append</code> of regular <code class="literal">Index Scan</code> plans; each
+    individual index scan would have to be executed to completion by a single
+    process, but different scans could be performed at the same time by
+    different processes.
+  </p><p>
+    <a class="xref" href="runtime-config-query.html#GUC-ENABLE-PARALLEL-APPEND">enable_parallel_append</a> can be used to disable
+    this feature.
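+  </p><p>
+    As a sketch of how the setting might be used to compare plans, the
+    statements below assume two hypothetical large tables,
+    <code class="literal">events_2023</code> and
+    <code class="literal">events_2024</code>, each with an
+    <code class="literal">id</code> column.
+  </p><pre class="programlisting">
+-- Assumption: events_2023(id) and events_2024(id) are large user tables.
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM (SELECT id FROM events_2023
+      UNION ALL
+      SELECT id FROM events_2024) AS e;
+
+-- Disable Parallel Append for this session only and compare the plan.
+SET enable_parallel_append = off;
+EXPLAIN (COSTS OFF)
+SELECT count(*)
+FROM (SELECT id FROM events_2023
+      UNION ALL
+      SELECT id FROM events_2024) AS e;
+
+RESET enable_parallel_append;
+</pre><p>
+    With the setting off, the planner can still produce a parallel plan
+    that uses a regular <code class="literal">Append</code> node over
+    partial child plans.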
+  </p></div><div class="sect2" id="PARALLEL-PLAN-TIPS"><div class="titlepage"><div><div><h3 class="title">15.3.5. Parallel Plan Tips</h3></div></div></div><p>
+    If a query that you expect to run in parallel does not produce a parallel plan,
+    you can try reducing <a class="xref" href="runtime-config-query.html#GUC-PARALLEL-SETUP-COST">parallel_setup_cost</a> or
+    <a class="xref" href="runtime-config-query.html#GUC-PARALLEL-TUPLE-COST">parallel_tuple_cost</a>.  Of course, the resulting parallel plan may turn
+    out to be slower than the serial plan that the planner preferred, but
+    this will not always be the case.  If you don't get a parallel
+    plan even with very small values of these settings (e.g., after setting
+    them both to zero), there may be some reason why the query planner is
+    unable to generate a parallel plan for your query.  See
+    <a class="xref" href="when-can-parallel-query-be-used.html" title="15.2. When Can Parallel Query Be Used?">Section 15.2</a> and
+    <a class="xref" href="parallel-safety.html" title="15.4. Parallel Safety">Section 15.4</a> for information on why this may be
+    the case.
+  </p><p>
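+    The experiment described above can be sketched as follows, assuming a
+    hypothetical table named <code class="literal">measurements</code>.
+    The settings are changed for the current session only and are reset
+    afterwards.
+  </p><pre class="programlisting">
+-- Assumption: measurements is a user table large enough to be worth
+-- scanning in parallel.
+SET parallel_setup_cost = 0;
+SET parallel_tuple_cost = 0;
+
+-- If no Gather or Gather Merge node appears even now, the planner is
+-- probably unable to parallelize this query at all.
+EXPLAIN SELECT count(*) FROM measurements;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+</pre><p>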
+    When executing a parallel plan, you can use <code class="literal">EXPLAIN (ANALYZE,
+    VERBOSE)</code> to display per-worker statistics for each plan node.
+    This may be useful in determining whether the work is being evenly
+    distributed among the participating processes at each plan node and, more generally, in understanding the
+    performance characteristics of the plan.
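+  </p><p>
+    For instance, assuming a hypothetical large table named
+    <code class="literal">measurements</code>, per-worker statistics could
+    be requested as sketched below.
+  </p><pre class="programlisting">
+-- Note: ANALYZE actually executes the query, not just plans it.
+EXPLAIN (ANALYZE, VERBOSE) SELECT count(*) FROM measurements;
+</pre><p>
+    In the output, plan nodes that ran in worker processes carry
+    additional lines such as <code class="literal">Worker 0:</code> that
+    report the actual time, row count, and loop count for each worker.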
+  </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="when-can-parallel-query-be-used.html" title="15.2. When Can Parallel Query Be Used?">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="parallel-query.html" title="Chapter 15. Parallel Query">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="parallel-safety.html" title="15.4. Parallel Safety">Next</a></td></tr><tr><td width="40%" align="left" valign="top">15.2. When Can Parallel Query Be Used? </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 15.4. Parallel Safety</td></tr></table></div></body></html>
+\ No newline at end of file