summaryrefslogtreecommitdiffstats
path: root/src/backend/optimizer/plan/README
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-05-04 12:17:33 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-05-04 12:17:33 +0000
commit5e45211a64149b3c659b90ff2de6fa982a5a93ed (patch)
tree739caf8c461053357daa9f162bef34516c7bf452 /src/backend/optimizer/plan/README
parentInitial commit. (diff)
downloadpostgresql-15-5e45211a64149b3c659b90ff2de6fa982a5a93ed.tar.xz
postgresql-15-5e45211a64149b3c659b90ff2de6fa982a5a93ed.zip
Adding upstream version 15.5.upstream/15.5
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'src/backend/optimizer/plan/README')
-rw-r--r--src/backend/optimizer/plan/README158
1 files changed, 158 insertions, 0 deletions
diff --git a/src/backend/optimizer/plan/README b/src/backend/optimizer/plan/README
new file mode 100644
index 0000000..013c0f9
--- /dev/null
+++ b/src/backend/optimizer/plan/README
@@ -0,0 +1,158 @@
+src/backend/optimizer/plan/README
+
+Subselects
+==========
+
+Vadim B. Mikheev
+
+
+From owner-pgsql-hackers@hub.org Fri Feb 13 09:01:19 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+ by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id JAA11576
+ for <maillist@candle.pha.pa.us>; Fri, 13 Feb 1998 09:01:17 -0500 (EST)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.14 $) with ESMTP id IAA09761 for <maillist@candle.pha.pa.us>; Fri, 13 Feb 1998 08:41:22 -0500 (EST)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id IAA08135; Fri, 13 Feb 1998 08:40:17 -0500 (EST)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 13 Feb 1998 08:38:42 -0500 (EST)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id IAA06646 for pgsql-hackers-outgoing; Fri, 13 Feb 1998 08:38:35 -0500 (EST)
+Received: from dune.krasnet.ru (dune.krasnet.ru [193.125.44.86]) by hub.org (8.8.8/8.7.5) with ESMTP id IAA04568 for <hackers@postgreSQL.org>; Fri, 13 Feb 1998 08:37:16 -0500 (EST)
+Received: from sable.krasnoyarsk.su (dune.krasnet.ru [193.125.44.86])
+ by dune.krasnet.ru (8.8.7/8.8.7) with ESMTP id UAA13717
+ for <hackers@postgreSQL.org>; Fri, 13 Feb 1998 20:51:03 +0700 (KRS)
+ (envelope-from vadim@sable.krasnoyarsk.su)
+Message-ID: <34E44FBA.D64E7997@sable.krasnoyarsk.su>
+Date: Fri, 13 Feb 1998 20:50:50 +0700
+From: "Vadim B. Mikheev" <vadim@sable.krasnoyarsk.su>
+Organization: ITTS (Krasnoyarsk)
+X-Mailer: Mozilla 4.04 [en] (X11; I; FreeBSD 2.2.5-RELEASE i386)
+MIME-Version: 1.0
+To: PostgreSQL Developers List <hackers@postgreSQL.org>
+Subject: [HACKERS] Subselects are in CVS...
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: OR
+
+This is some implementation notes and opened issues...
+
+First, implementation uses new type of parameters - PARAM_EXEC - to deal
+with correlation Vars. When query_planner() is called, it first tries to
+replace all upper queries Var referenced in current query with Param of
+this type. Some global variables are used to keep mapping of Vars to
+Params and Params to Vars.
+
+After this, all current query' SubLinks are processed: for each SubLink
+found in query' qual union_planner() (old planner() function) will be
+called to plan corresponding subselect (union_planner() calls
+query_planner() for "simple" query and supports UNIONs). After subselect
+are planned, optimizer knows about is this correlated, un-correlated or
+_undirect_ correlated (references some grand-parent Vars but no parent
+ones: uncorrelated from the parent' point of view) query.
+
+For uncorrelated and undirect correlated subqueries of EXPRession or
+EXISTS type SubLinks will be replaced with "normal" clauses from
+SubLink->Oper list (I changed this list to be list of EXPR nodes,
+not just Oper ones). Right sides of these nodes are replaced with
+PARAM_EXEC parameters. This is second use of new parameter type.
+At run-time these parameters get value from result of subquery
+evaluation (i.e. - from target list of subquery). Execution plan of
+subquery itself becomes init plan of parent query. InitPlan knows
+what parameters are to get values from subquery' results and will be
+executed "on-demand" (for query select * from table where x > 0 and
+y > (select max(a) from table_a) subquery will not be executed at all
+if there are no tuples with x > 0 _and_ y is not used in index scan).
+
+SubLinks for subqueries of all other types are transformed into
+new type of Expr node - SUBPLAN_EXPR. Expr->args are just correlation
+variables from _parent_ query. Expr->oper is new SubPlan node.
+
+This node is used for InitPlan too. It keeps subquery range table,
+indices of Params which are to get value from _parent_ query Vars
+(i.e. - from Expr->args), indices of Params into which subquery'
+results are to be substituted (this is for InitPlans), SubLink
+and subquery' execution plan.
+
+Plan node was changed to know about dependencies on Params from
+parent queries and InitPlans, to keep list of changed Params
+(from the above) and so be re-scanned if this list is not NULL.
+Also, added list of InitPlans (actually, all of them for current
+query are in topmost plan node now) and other SubPlans (from
+plan->qual) - to initialize them and let them know about changed
+Params (from the list of their "interests").
+
+After all SubLinks are processed, query_planner() calls qual'
+canonificator and does "normal" work. By using Params optimizer
+is mostly unchanged.
+
+Well, Executor. To get subplans re-evaluated without ExecutorStart()
+and ExecutorEnd() (without opening and closing relations and indices
+and without many palloc() and pfree() - this is what SQL-funcs does
+on each call) ExecReScan() now supports most of Plan types...
+
+Explanation of EXPLAIN.
+
+vac=> explain select * from tmp where x >= (select max(x2) from test2
+where y2 = y and exists (select * from tempx where tx = x));
+NOTICE: QUERY PLAN:
+
+Seq Scan on tmp (cost=40.03 size=101 width=8)
+ SubPlan
+ ^^^^^^^ subquery is in Seq Scan' qual, its plan is below
+ -> Aggregate (cost=2.05 size=0 width=0)
+ InitPlan
+ ^^^^^^^^ EXISTS subsubquery is InitPlan of subquery
+ -> Seq Scan on tempx (cost=4.33 size=1 width=4)
+ -> Result (cost=2.05 size=0 width=0)
+ ^^^^^^ EXISTS subsubquery was transformed into Param
+ and so we have Result node here
+ -> Index Scan on test2 (cost=2.05 size=1 width=4)
+
+
+Opened issues.
+
+1. No read permissions checking (easy, just not done yet).
+2. readfuncs.c can't read subplan-s (easy, not critical, because of
+ we currently nowhere use ascii representation of execution plans).
+3. ExecReScan() doesn't support all plan types. At least support for
+ MergeJoin has to be implemented.
+4. Memory leaks in ExecReScan().
+5. I need in advice: if subquery introduced with NOT IN doesn't return
+ any tuples then qualification is failed, yes ?
+6. Regression tests !!!!!!!!!!!!!!!!!!!!
+ (Could we use data/queries from MySQL' crash.me ?
+ Copyright-ed ? Could they give us rights ?)
+7. Performance.
+ - Should be good when subquery is transformed into InitPlan.
+ - Something should be done for uncorrelated subqueries introduced
+ with ANY/ALL - keep thinking. Currently, subplan will be re-scanned
+ for each parent tuple - very slow...
+
+Results of some test. TMP is table with x,y (int4-s), x in 0-9,
+y = 100 - x, 1000 tuples (10 duplicates of each tuple). TEST2 is table
+with x2, y2 (int4-s), x2 in 1-99, y2 = 100 -x2, 10000 tuples (100 dups).
+
+ Trying
+
+select * from tmp where x >= (select max(x2) from test2 where y2 = y);
+
+ and
+
+begin;
+select y as ty, max(x2) as mx into table tsub from test2, tmp
+where y2 = y group by ty;
+vacuum tsub;
+select x, y from tmp, tsub where x >= mx and y = ty;
+drop table tsub;
+end;
+
+ Without index on test2(y2):
+
+SubSelect -> 320 sec
+Using temp table -> 32 sec
+
+ Having index
+
+SubSelect -> 17 sec (2M of memory)
+Using temp table -> 32 sec (12M of memory: -S 8192)
+
+Vadim