summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/html/how-parallel-query-works.html
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src/sgml/html/how-parallel-query-works.html')
-rw-r--r--doc/src/sgml/html/how-parallel-query-works.html71
1 files changed, 71 insertions, 0 deletions
diff --git a/doc/src/sgml/html/how-parallel-query-works.html b/doc/src/sgml/html/how-parallel-query-works.html
new file mode 100644
index 0000000..89f5518
--- /dev/null
+++ b/doc/src/sgml/html/how-parallel-query-works.html
@@ -0,0 +1,71 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>15.1. How Parallel Query Works</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="parallel-query.html" title="Chapter 15. Parallel Query" /><link rel="next" href="when-can-parallel-query-be-used.html" title="15.2. When Can Parallel Query Be Used?" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">15.1. How Parallel Query Works</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="parallel-query.html" title="Chapter 15. Parallel Query">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="parallel-query.html" title="Chapter 15. Parallel Query">Up</a></td><th width="60%" align="center">Chapter 15. Parallel Query</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="when-can-parallel-query-be-used.html" title="15.2. When Can Parallel Query Be Used?">Next</a></td></tr></table><hr /></div><div class="sect1" id="HOW-PARALLEL-QUERY-WORKS"><div class="titlepage"><div><div><h2 class="title" style="clear: both">15.1. How Parallel Query Works</h2></div></div></div><p>
+ When the optimizer determines that parallel query is the fastest execution
+ strategy for a particular query, it will create a query plan that includes
+ a <em class="firstterm">Gather</em> or <em class="firstterm">Gather Merge</em>
+ node. Here is a simple example:
+
+</p><pre class="screen">
+EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
+ QUERY PLAN
+-------------------------------------------------------------------​------------------
+ Gather (cost=1000.00..217018.43 rows=1 width=97)
+ Workers Planned: 2
+ -&gt; Parallel Seq Scan on pgbench_accounts (cost=0.00..216018.33 rows=1 width=97)
+ Filter: (filler ~~ '%x%'::text)
+(4 rows)
+</pre><p>
+ </p><p>
+ In all cases, the <code class="literal">Gather</code> or
+ <code class="literal">Gather Merge</code> node will have exactly one
+ child plan, which is the portion of the plan that will be executed in
+ parallel. If the <code class="literal">Gather</code> or <code class="literal">Gather Merge</code> node is
+ at the very top of the plan tree, then the entire query will execute in
+ parallel. If it is somewhere else in the plan tree, then only the portion
+ of the plan below it will run in parallel. In the example above, the
+ query accesses only one table, so there is only one plan node other than
+ the <code class="literal">Gather</code> node itself; since that plan node is a child of the
+ <code class="literal">Gather</code> node, it will run in parallel.
+ </p><p>
+ <a class="link" href="using-explain.html" title="14.1. Using EXPLAIN">Using EXPLAIN</a>, you can see the number of
+ workers chosen by the planner. When the <code class="literal">Gather</code> node is reached
+ during query execution, the process that is implementing the user's
+ session will request a number of <a class="link" href="bgworker.html" title="Chapter 48. Background Worker Processes">background
+ worker processes</a> equal to the number
+ of workers chosen by the planner. The number of background workers that
+ the planner will consider using is limited to at most
+ <a class="xref" href="runtime-config-resource.html#GUC-MAX-PARALLEL-WORKERS-PER-GATHER">max_parallel_workers_per_gather</a>. The total number
+ of background workers that can exist at any one time is limited by both
+ <a class="xref" href="runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES">max_worker_processes</a> and
+ <a class="xref" href="runtime-config-resource.html#GUC-MAX-PARALLEL-WORKERS">max_parallel_workers</a>. Therefore, it is possible for a
+ parallel query to run with fewer workers than planned, or even with
+ no workers at all. The optimal plan may depend on the number of workers
+ that are available, so this can result in poor query performance. If this
+ occurrence is frequent, consider increasing
+ <code class="varname">max_worker_processes</code> and <code class="varname">max_parallel_workers</code>
+ so that more workers can be run simultaneously or alternatively reducing
+ <code class="varname">max_parallel_workers_per_gather</code> so that the planner
+ requests fewer workers.
+ </p><p>
+ Every background worker process that is successfully started for a given
+ parallel query will execute the parallel portion of the plan. The leader
+ will also execute that portion of the plan, but it has an additional
+ responsibility: it must also read all of the tuples generated by the
+ workers. When the parallel portion of the plan generates only a small
+ number of tuples, the leader will often behave very much like an additional
+ worker, speeding up query execution. Conversely, when the parallel portion
+ of the plan generates a large number of tuples, the leader may be almost
+ entirely occupied with reading the tuples generated by the workers and
+ performing any further processing steps that are required by plan nodes
+ above the level of the <code class="literal">Gather</code> node or
+ <code class="literal">Gather Merge</code> node. In such cases, the leader will
+ do very little of the work of executing the parallel portion of the plan.
+ </p><p>
+ When the node at the top of the parallel portion of the plan is
+ <code class="literal">Gather Merge</code> rather than <code class="literal">Gather</code>, it indicates that
+ each process executing the parallel portion of the plan is producing
+ tuples in sorted order, and that the leader is performing an
+ order-preserving merge. In contrast, <code class="literal">Gather</code> reads tuples
+ from the workers in whatever order is convenient, destroying any sort
+ order that may have existed.
+ </p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="parallel-query.html" title="Chapter 15. Parallel Query">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="parallel-query.html" title="Chapter 15. Parallel Query">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="when-can-parallel-query-be-used.html" title="15.2. When Can Parallel Query Be Used?">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 15. Parallel Query </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 15.2. When Can Parallel Query Be Used?</td></tr></table></div></body></html> \ No newline at end of file