From 293913568e6a7a86fd1479e1cff8e2ecb58d6568 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sat, 13 Apr 2024 15:44:03 +0200 Subject: Adding upstream version 16.2. Signed-off-by: Daniel Baumann --- doc/src/sgml/parallel.sgml | 598 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 598 insertions(+) create mode 100644 doc/src/sgml/parallel.sgml (limited to 'doc/src/sgml/parallel.sgml') diff --git a/doc/src/sgml/parallel.sgml b/doc/src/sgml/parallel.sgml new file mode 100644 index 0000000..5acc953 --- /dev/null +++ b/doc/src/sgml/parallel.sgml @@ -0,0 +1,598 @@ + + + + Parallel Query + + + parallel query + + + + PostgreSQL can devise query plans that can leverage + multiple CPUs in order to answer queries faster. This feature is known + as parallel query. Many queries cannot benefit from parallel query, either + due to limitations of the current implementation or because there is no + imaginable query plan that is any faster than the serial query plan. + However, for queries that can benefit, the speedup from parallel query + is often very significant. Many queries can run more than twice as fast + when using parallel query, and some queries can run four times faster or + even more. Queries that touch a large amount of data but return only a + few rows to the user will typically benefit most. This chapter explains + some details of how parallel query works and in which situations it can be + used so that users who wish to make use of it can understand what to expect. + + + + How Parallel Query Works + + + When the optimizer determines that parallel query is the fastest execution + strategy for a particular query, it will create a query plan that includes + a Gather or Gather Merge + node. Here is a simple example: + + +EXPLAIN SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%'; + QUERY PLAN +-------------------------------------------------------------------&zwsp;------------------ + Gather (cost=1000.00..217018.43 rows=1 width=97) + Workers Planned: 2 + -> Parallel Seq Scan on pgbench_accounts (cost=0.00..216018.33 rows=1 width=97) + Filter: (filler ~~ '%x%'::text) +(4 rows) + + + + + In all cases, the Gather or + Gather Merge node will have exactly one + child plan, which is the portion of the plan that will be executed in + parallel. If the Gather or Gather Merge node is + at the very top of the plan tree, then the entire query will execute in + parallel. If it is somewhere else in the plan tree, then only the portion + of the plan below it will run in parallel. In the example above, the + query accesses only one table, so there is only one plan node other than + the Gather node itself; since that plan node is a child of the + Gather node, it will run in parallel. + + + + Using EXPLAIN, you can see the number of + workers chosen by the planner. When the Gather node is reached + during query execution, the process that is implementing the user's + session will request a number of background + worker processes equal to the number + of workers chosen by the planner. The number of background workers that + the planner will consider using is limited to at most + . The total number + of background workers that can exist at any one time is limited by both + and + . Therefore, it is possible for a + parallel query to run with fewer workers than planned, or even with + no workers at all. The optimal plan may depend on the number of workers + that are available, so this can result in poor query performance. If this + occurrence is frequent, consider increasing + max_worker_processes and max_parallel_workers + so that more workers can be run simultaneously or alternatively reducing + max_parallel_workers_per_gather so that the planner + requests fewer workers. + + + + Every background worker process that is successfully started for a given + parallel query will execute the parallel portion of the plan. The leader + will also execute that portion of the plan, but it has an additional + responsibility: it must also read all of the tuples generated by the + workers. When the parallel portion of the plan generates only a small + number of tuples, the leader will often behave very much like an additional + worker, speeding up query execution. Conversely, when the parallel portion + of the plan generates a large number of tuples, the leader may be almost + entirely occupied with reading the tuples generated by the workers and + performing any further processing steps that are required by plan nodes + above the level of the Gather node or + Gather Merge node. In such cases, the leader will + do very little of the work of executing the parallel portion of the plan. + + + + When the node at the top of the parallel portion of the plan is + Gather Merge rather than Gather, it indicates that + each process executing the parallel portion of the plan is producing + tuples in sorted order, and that the leader is performing an + order-preserving merge. In contrast, Gather reads tuples + from the workers in whatever order is convenient, destroying any sort + order that may have existed. + + + + + When Can Parallel Query Be Used? + + + There are several settings that can cause the query planner not to + generate a parallel query plan under any circumstances. In order for + any parallel query plans whatsoever to be generated, the following + settings must be configured as indicated. + + + + + + must be set to a + value that is greater than zero. This is a special case of the more + general principle that no more workers should be used than the number + configured via max_parallel_workers_per_gather. + + + + + + In addition, the system must not be running in single-user mode. Since + the entire database system is running as a single process in this situation, + no background workers will be available. + + + + Even when it is in general possible for parallel query plans to be + generated, the planner will not generate them for a given query + if any of the following are true: + + + + + + The query writes any data or locks any database rows. If a query + contains a data-modifying operation either at the top level or within + a CTE, no parallel plans for that query will be generated. As an + exception, the following commands, which create a new table and populate + it, can use a parallel plan for the underlying SELECT + part of the query: + + + + CREATE TABLE ... AS + + + SELECT INTO + + + CREATE MATERIALIZED VIEW + + + REFRESH MATERIALIZED VIEW + + + + + + + + The query might be suspended during execution. In any situation in + which the system thinks that partial or incremental execution might + occur, no parallel plan is generated. For example, a cursor created + using DECLARE CURSOR will never use + a parallel plan. Similarly, a PL/pgSQL loop of the form + FOR x IN query LOOP .. END LOOP will never use a + parallel plan, because the parallel query system is unable to verify + that the code in the loop is safe to execute while parallel query is + active. + + + + + + The query uses any function marked PARALLEL UNSAFE. + Most system-defined functions are PARALLEL SAFE, + but user-defined functions are marked PARALLEL + UNSAFE by default. See the discussion of + . + + + + + + The query is running inside of another query that is already parallel. + For example, if a function called by a parallel query issues an SQL + query itself, that query will never use a parallel plan. This is a + limitation of the current implementation, but it may not be desirable + to remove this limitation, since it could result in a single query + using a very large number of processes. + + + + + + Even when parallel query plan is generated for a particular query, there + are several circumstances under which it will be impossible to execute + that plan in parallel at execution time. If this occurs, the leader + will execute the portion of the plan below the Gather + node entirely by itself, almost as if the Gather node were + not present. This will happen if any of the following conditions are met: + + + + + + No background workers can be obtained because of the limitation that + the total number of background workers cannot exceed + . + + + + + + No background workers can be obtained because of the limitation that + the total number of background workers launched for purposes of + parallel query cannot exceed . + + + + + + The client sends an Execute message with a non-zero fetch count. + See the discussion of the + extended query protocol. + Since libpq currently provides no way to + send such a message, this can only occur when using a client that + does not rely on libpq. If this is a frequent + occurrence, it may be a good idea to set + to zero in + sessions where it is likely, so as to avoid generating query plans + that may be suboptimal when run serially. + + + + + + + Parallel Plans + + + Because each worker executes the parallel portion of the plan to + completion, it is not possible to simply take an ordinary query plan + and run it using multiple workers. Each worker would produce a full + copy of the output result set, so the query would not run any faster + than normal but would produce incorrect results. Instead, the parallel + portion of the plan must be what is known internally to the query + optimizer as a partial plan; that is, it must be constructed + so that each process that executes the plan will generate only a + subset of the output rows in such a way that each required output row + is guaranteed to be generated by exactly one of the cooperating processes. + Generally, this means that the scan on the driving table of the query + must be a parallel-aware scan. + + + + Parallel Scans + + + The following types of parallel-aware table scans are currently supported. + + + + + In a parallel sequential scan, the table's blocks will + be divided into ranges and shared among the cooperating processes. Each + worker process will complete the scanning of its given range of blocks before + requesting an additional range of blocks. + + + + + In a parallel bitmap heap scan, one process is chosen + as the leader. That process performs a scan of one or more indexes + and builds a bitmap indicating which table blocks need to be visited. + These blocks are then divided among the cooperating processes as in + a parallel sequential scan. In other words, the heap scan is performed + in parallel, but the underlying index scan is not. + + + + + In a parallel index scan or parallel index-only + scan, the cooperating processes take turns reading data from the + index. Currently, parallel index scans are supported only for + btree indexes. Each process will claim a single index block and will + scan and return all tuples referenced by that block; other processes can + at the same time be returning tuples from a different index block. + The results of a parallel btree scan are returned in sorted order + within each worker process. + + + + + Other scan types, such as scans of non-btree indexes, may support + parallel scans in the future. + + + + + Parallel Joins + + + Just as in a non-parallel plan, the driving table may be joined to one or + more other tables using a nested loop, hash join, or merge join. The + inner side of the join may be any kind of non-parallel plan that is + otherwise supported by the planner provided that it is safe to run within + a parallel worker. Depending on the join type, the inner side may also be + a parallel plan. + + + + + + In a nested loop join, the inner side is always + non-parallel. Although it is executed in full, this is efficient if + the inner side is an index scan, because the outer tuples and thus + the loops that look up values in the index are divided over the + cooperating processes. + + + + + In a merge join, the inner side is always + a non-parallel plan and therefore executed in full. This may be + inefficient, especially if a sort must be performed, because the work + and resulting data are duplicated in every cooperating process. + + + + + In a hash join (without the "parallel" prefix), + the inner side is executed in full by every cooperating process + to build identical copies of the hash table. This may be inefficient + if the hash table is large or the plan is expensive. In a + parallel hash join, the inner side is a + parallel hash that divides the work of building + a shared hash table over the cooperating processes. + + + + + + + Parallel Aggregation + + PostgreSQL supports parallel aggregation by aggregating in + two stages. First, each process participating in the parallel portion of + the query performs an aggregation step, producing a partial result for + each group of which that process is aware. This is reflected in the plan + as a Partial Aggregate node. Second, the partial results are + transferred to the leader via Gather or Gather + Merge. Finally, the leader re-aggregates the results across all + workers in order to produce the final result. This is reflected in the + plan as a Finalize Aggregate node. + + + + Because the Finalize Aggregate node runs on the leader + process, queries that produce a relatively large number of groups in + comparison to the number of input rows will appear less favorable to the + query planner. For example, in the worst-case scenario the number of + groups seen by the Finalize Aggregate node could be as many as + the number of input rows that were seen by all worker processes in the + Partial Aggregate stage. For such cases, there is clearly + going to be no performance benefit to using parallel aggregation. The + query planner takes this into account during the planning process and is + unlikely to choose parallel aggregate in this scenario. + + + + Parallel aggregation is not supported in all situations. Each aggregate + must be safe for parallelism and must + have a combine function. If the aggregate has a transition state of type + internal, it must have serialization and deserialization + functions. See for more details. + Parallel aggregation is not supported if any aggregate function call + contains DISTINCT or ORDER BY clause and is also + not supported for ordered set aggregates or when the query involves + GROUPING SETS. It can only be used when all joins involved in + the query are also part of the parallel portion of the plan. + + + + + + Parallel Append + + + Whenever PostgreSQL needs to combine rows + from multiple sources into a single result set, it uses an + Append or MergeAppend plan node. + This commonly happens when implementing UNION ALL or + when scanning a partitioned table. Such nodes can be used in parallel + plans just as they can in any other plan. However, in a parallel plan, + the planner may instead use a Parallel Append node. + + + + When an Append node is used in a parallel plan, each + process will execute the child plans in the order in which they appear, + so that all participating processes cooperate to execute the first child + plan until it is complete and then move to the second plan at around the + same time. When a Parallel Append is used instead, the + executor will instead spread out the participating processes as evenly as + possible across its child plans, so that multiple child plans are executed + simultaneously. This avoids contention, and also avoids paying the startup + cost of a child plan in those processes that never execute it. + + + + Also, unlike a regular Append node, which can only have + partial children when used within a parallel plan, a Parallel + Append node can have both partial and non-partial child plans. + Non-partial children will be scanned by only a single process, since + scanning them more than once would produce duplicate results. Plans that + involve appending multiple results sets can therefore achieve + coarse-grained parallelism even when efficient partial plans are not + available. For example, consider a query against a partitioned table + that can only be implemented efficiently by using an index that does + not support parallel scans. The planner might choose a Parallel + Append of regular Index Scan plans; each + individual index scan would have to be executed to completion by a single + process, but different scans could be performed at the same time by + different processes. + + + + can be used to disable + this feature. + + + + + Parallel Plan Tips + + + If a query that is expected to do so does not produce a parallel plan, + you can try reducing or + . Of course, this plan may turn + out to be slower than the serial plan that the planner preferred, but + this will not always be the case. If you don't get a parallel + plan even with very small values of these settings (e.g., after setting + them both to zero), there may be some reason why the query planner is + unable to generate a parallel plan for your query. See + and + for information on why this may be + the case. + + + + When executing a parallel plan, you can use EXPLAIN (ANALYZE, + VERBOSE) to display per-worker statistics for each plan node. + This may be useful in determining whether the work is being evenly + distributed between all plan nodes and more generally in understanding the + performance characteristics of the plan. + + + + + + + Parallel Safety + + + The planner classifies operations involved in a query as either + parallel safe, parallel restricted, + or parallel unsafe. A parallel safe operation is one that + does not conflict with the use of parallel query. A parallel restricted + operation is one that cannot be performed in a parallel worker, but that + can be performed in the leader while parallel query is in use. Therefore, + parallel restricted operations can never occur below a Gather + or Gather Merge node, but can occur elsewhere in a plan that + contains such a node. A parallel unsafe operation is one that cannot + be performed while parallel query is in use, not even in the leader. + When a query contains anything that is parallel unsafe, parallel query + is completely disabled for that query. + + + + The following operations are always parallel restricted: + + + + + + Scans of common table expressions (CTEs). + + + + + + Scans of temporary tables. + + + + + + Scans of foreign tables, unless the foreign data wrapper has + an IsForeignScanParallelSafe API that indicates otherwise. + + + + + + Plan nodes to which an InitPlan is attached. + + + + + + Plan nodes that reference a correlated SubPlan. + + + + + + Parallel Labeling for Functions and Aggregates + + + The planner cannot automatically determine whether a user-defined + function or aggregate is parallel safe, parallel restricted, or parallel + unsafe, because this would require predicting every operation that the + function could possibly perform. In general, this is equivalent to the + Halting Problem and therefore impossible. Even for simple functions + where it could conceivably be done, we do not try, since this would be expensive + and error-prone. Instead, all user-defined functions are assumed to + be parallel unsafe unless otherwise marked. When using + or + , markings can be set by specifying + PARALLEL SAFE, PARALLEL RESTRICTED, or + PARALLEL UNSAFE as appropriate. When using + , the + PARALLEL option can be specified with SAFE, + RESTRICTED, or UNSAFE as the corresponding value. + + + + Functions and aggregates must be marked PARALLEL UNSAFE if + they write to the database, access sequences, change the transaction state + even temporarily (e.g., a PL/pgSQL function that establishes an + EXCEPTION block to catch errors), or make persistent changes to + settings. Similarly, functions must be marked PARALLEL + RESTRICTED if they access temporary tables, client connection state, + cursors, prepared statements, or miscellaneous backend-local state that + the system cannot synchronize across workers. For example, + setseed and random are parallel restricted for + this last reason. + + + + In general, if a function is labeled as being safe when it is restricted or + unsafe, or if it is labeled as being restricted when it is in fact unsafe, + it may throw errors or produce wrong answers when used in a parallel query. + C-language functions could in theory exhibit totally undefined behavior if + mislabeled, since there is no way for the system to protect itself against + arbitrary C code, but in most likely cases the result will be no worse than + for any other function. If in doubt, it is probably best to label functions + as UNSAFE. + + + + If a function executed within a parallel worker acquires locks that are + not held by the leader, for example by querying a table not referenced in + the query, those locks will be released at worker exit, not end of + transaction. If you write a function that does this, and this behavior + difference is important to you, mark such functions as + PARALLEL RESTRICTED + to ensure that they execute only in the leader. + + + + Note that the query planner does not consider deferring the evaluation of + parallel-restricted functions or aggregates involved in the query in + order to obtain a superior plan. So, for example, if a WHERE + clause applied to a particular table is parallel restricted, the query + planner will not consider performing a scan of that table in the parallel + portion of a plan. In some cases, it would be + possible (and perhaps even efficient) to include the scan of that table in + the parallel portion of the query and defer the evaluation of the + WHERE clause so that it happens above the Gather + node. However, the planner does not do this. + + + + + + + -- cgit v1.2.3