summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/html/how-parallel-query-works.html
blob: 89f5518128c03253fdcea7f661b9623385f53f68 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>15.1. How Parallel Query Works</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="parallel-query.html" title="Chapter 15. Parallel Query" /><link rel="next" href="when-can-parallel-query-be-used.html" title="15.2. When Can Parallel Query Be Used?" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">15.1. How Parallel Query Works</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="parallel-query.html" title="Chapter 15. Parallel Query">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="parallel-query.html" title="Chapter 15. Parallel Query">Up</a></td><th width="60%" align="center">Chapter 15. Parallel Query</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="when-can-parallel-query-be-used.html" title="15.2. When Can Parallel Query Be Used?">Next</a></td></tr></table><hr /></div><div class="sect1" id="HOW-PARALLEL-QUERY-WORKS"><div class="titlepage"><div><div><h2 class="title" style="clear: both">15.1. How Parallel Query Works</h2></div></div></div><p>
    When the optimizer determines that parallel query is the fastest execution
    strategy for a particular query, it will create a query plan that includes
    a <em class="firstterm">Gather</em> or <em class="firstterm">Gather Merge</em>
    node.  Here is a simple example:

</p><pre class="screen">
EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%';
                                     QUERY PLAN
-------------------------------------------------------------------​------------------
 Gather  (cost=1000.00..217018.43 rows=1 width=97)
   Workers Planned: 2
   -&gt;  Parallel Seq Scan on pgbench_accounts  (cost=0.00..216018.33 rows=1 width=97)
         Filter: (filler ~~ '%x%'::text)
(4 rows)
</pre><p>
   </p><p>
    In all cases, the <code class="literal">Gather</code> or
    <code class="literal">Gather Merge</code> node will have exactly one
    child plan, which is the portion of the plan that will be executed in
    parallel.  If the <code class="literal">Gather</code> or <code class="literal">Gather Merge</code> node is
    at the very top of the plan tree, then the entire query will execute in
    parallel.  If it is somewhere else in the plan tree, then only the portion
    of the plan below it will run in parallel.  In the example above, the
    query accesses only one table, so there is only one plan node other than
    the <code class="literal">Gather</code> node itself; since that plan node is a child of the
    <code class="literal">Gather</code> node, it will run in parallel.
   </p><p>
    <a class="link" href="using-explain.html" title="14.1. Using EXPLAIN">Using EXPLAIN</a>, you can see the number of
    workers chosen by the planner.  When the <code class="literal">Gather</code> node is reached
    during query execution, the process that is implementing the user's
    session will request a number of <a class="link" href="bgworker.html" title="Chapter 48. Background Worker Processes">background
    worker processes</a> equal to the number
    of workers chosen by the planner.  The number of background workers that
    the planner will consider using is limited to at most
    <a class="xref" href="runtime-config-resource.html#GUC-MAX-PARALLEL-WORKERS-PER-GATHER">max_parallel_workers_per_gather</a>.  The total number
    of background workers that can exist at any one time is limited by both
    <a class="xref" href="runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES">max_worker_processes</a> and
    <a class="xref" href="runtime-config-resource.html#GUC-MAX-PARALLEL-WORKERS">max_parallel_workers</a>.  Therefore, it is possible for a
    parallel query to run with fewer workers than planned, or even with
    no workers at all.  The optimal plan may depend on the number of workers
    that are available, so this can result in poor query performance.  If this
    occurrence is frequent, consider increasing
    <code class="varname">max_worker_processes</code> and <code class="varname">max_parallel_workers</code>
    so that more workers can be run simultaneously or alternatively reducing
    <code class="varname">max_parallel_workers_per_gather</code> so that the planner
    requests fewer workers.
   </p><p>
    Every background worker process that is successfully started for a given
    parallel query will execute the parallel portion of the plan.  The leader
    will also execute that portion of the plan, but it has an additional
    responsibility: it must also read all of the tuples generated by the
    workers.  When the parallel portion of the plan generates only a small
    number of tuples, the leader will often behave very much like an additional
    worker, speeding up query execution.  Conversely, when the parallel portion
    of the plan generates a large number of tuples, the leader may be almost
    entirely occupied with reading the tuples generated by the workers and
    performing any further processing steps that are required by plan nodes
    above the level of the <code class="literal">Gather</code> node or
    <code class="literal">Gather Merge</code> node.  In such cases, the leader will
    do very little of the work of executing the parallel portion of the plan.
   </p><p>
    When the node at the top of the parallel portion of the plan is
    <code class="literal">Gather Merge</code> rather than <code class="literal">Gather</code>, it indicates that
    each process executing the parallel portion of the plan is producing
    tuples in sorted order, and that the leader is performing an
    order-preserving merge.  In contrast, <code class="literal">Gather</code> reads tuples
    from the workers in whatever order is convenient, destroying any sort
    order that may have existed.
   </p></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="parallel-query.html" title="Chapter 15. Parallel Query">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="parallel-query.html" title="Chapter 15. Parallel Query">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="when-can-parallel-query-be-used.html" title="15.2. When Can Parallel Query Be Used?">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 15. Parallel Query </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 15.2. When Can Parallel Query Be Used?</td></tr></table></div></body></html>