summaryrefslogtreecommitdiffstats
path: root/www/queryplanner-ng.html
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-06-03 05:16:48 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-06-03 05:16:48 +0000
commit3640b21e686fef7e3f25dc775112c7d4be43f197 (patch)
treee7fec2ad45891adeada1227d655062cbd201dd5a /www/queryplanner-ng.html
parentReleasing progress-linux version 3.45.3-1~progress7.99u1. (diff)
downloadsqlite3-3640b21e686fef7e3f25dc775112c7d4be43f197.tar.xz
sqlite3-3640b21e686fef7e3f25dc775112c7d4be43f197.zip
Merging upstream version 3.46.0.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'www/queryplanner-ng.html')
-rw-r--r--www/queryplanner-ng.html196
1 files changed, 105 insertions, 91 deletions
diff --git a/www/queryplanner-ng.html b/www/queryplanner-ng.html
index b9d7593..69eee6c 100644
--- a/www/queryplanner-ng.html
+++ b/www/queryplanner-ng.html
@@ -177,7 +177,7 @@ of the problems inherent to query planning, and outlines how the NGQP
solves those problems.</p>
<p>The NGQP is almost always better than the legacy query planner.
-However, there may exist legacy applications that unknowingly depend on
+However, there may exist legacy applications that unknowingly depend on
undefined and/or suboptimal behavior in the legacy query planner, and
upgrading to the NGQP on those legacy applications could cause performance
regressions. This risk is considered and a checklist is provided
@@ -203,12 +203,12 @@ this multitude of possibilities.</p>
(This is true of all SQL database engines, not just SQLite.)
The query planner frees the programmer from the chore of selecting
a particular query plan, and thereby allows the programmer to
-focus more mental energy on higher-level application issues and on
+focus more mental energy on higher-level application issues and on
providing more value to the end user. For simple queries where the choice
of query plan is obvious, this is convenient but not hugely important.
But as applications and schemas and queries grow more complex, a
clever query planner can greatly speed and simplify the work of application
-development.
+development.
There is amazing power in being about to tell
the database engine what content is desired, and then let the database
engine figure out the best way to retrieve that content.</p>
@@ -218,14 +218,14 @@ The query planner must work with incomplete information.
It cannot determine how long any particular plan will take
without actually running that plan. So when comparing two
or more plans to figure out which is "best", the query planner has to make
-some guesses and assumptions and those guesses and assumptions will
+some guesses and assumptions and those guesses and assumptions will
sometimes be wrong. A good query planner is one that will
find the correct solution often enough that application
programmers rarely need to get involved.</p>
<h2 id="_query_planning_in_sqlite"><span>2.1. </span> Query Planning In SQLite</h2>
-<p>SQLite computes joins using nested loops,
+<p>SQLite computes joins using nested loops,
one loop for each table
in the join. (Additional loops might be inserted for IN
and OR operators in the WHERE clause. SQLite considers those too,
@@ -250,7 +250,7 @@ SQLite will always pick the same query plan for any
given SQL statement as long as:
</p><ol type="a">
-<li>the database schema does not change in significant ways such as
+<li>the database schema does not change in significant ways such as
adding or dropping indexes,</li>
<li>the ANALYZE command is not rerun, </li>
<li>the same version of SQLite is used.</li>
@@ -263,22 +263,22 @@ invoking <a href="c3ref/db_config.html">sqlite3_db_config</a>(db,<a href="c3ref/
</p><p>The QPSG means that if all of your queries run efficiently
during testing, and if your application does not change the schema,
then SQLite will not suddenly decide to start using a different
-query plan, possibly causing a performance problem after your application
+query plan, possibly causing a performance problem after your application
is released to users. If your application works in the lab, it
will continue working the same way after deployment.</p>
-<p>Enterprise-class client/server SQL database engines do not normally
+<p>Client/server SQL database engines do not normally
make this guarantee.
In client/server SQL database engines, the server keeps track of
-statistics on the sizes of tables and on the quality of indexes
+statistics on the sizes of tables and on the quality of indexes
and the query planner uses those statistics to help select the best plans.
-As content is added, deleted, or changed in the database, the statistics
-will evolve and may cause the query planner to begin using a different
-query plan for some particular query. Usually the new plan will be better
+As content is added, deleted, or changed in the database, the statistics
+will evolve and may cause the query planner to begin using a different
+query plan for some particular query. Usually the new plan will be better
for the evolving structure of the data. But sometimes the new query plan will
cause a performance reduction. With a client/server database engine, there
-is typically a Database Administrator (DBA) on hand to deal with these
-rare problems as they come up. But DBAs are not available to fix problems
+is typically a Database Administrator (DBA) on hand to deal with these
+rare problems as they come up. But DBAs are not available to fix problems
in an embedded database like SQLite, and hence SQLite is careful to
ensure that plans do not change unexpectedly after deployment.</p>
@@ -287,22 +287,28 @@ changes in query plans.
The same version of SQLite will
always pick the same query plan, but if you relink your application to use
a different version of SQLite, then query plans might change. In rare
-cases, an SQLite version change might lead to a performance regression.
+cases, an SQLite version change might lead to a performance regression.
This is one reason
-you should consider statically linking your applications against SQLite
-rather than use a system-wide SQLite shared library which might
+you should consider statically linking your applications against SQLite
+rather than use a system-wide SQLite shared library which might
change without your knowledge or control.</p>
+<p>See also:
+</p><ul>
+<li> <a href="lang_analyze.html#req">Recommended usage patterns for ANALYZE</a>
+</li><li> <a href="pragma.html#pragma_optimize">PRAGMA optimize</a>
+</li></ul>
+
<h1 id="_a_difficult_case"><span>3. </span> A Difficult Case</h1>
<p>
"TPC-H Q8" is a test query from the
<a href="http://www.tpc.org/tpch/">Transaction Processing Performance
-Council</a>. The query planners in SQLite versions 3.7.17 and earlier
-do not choose good plans for TPC-H Q8. And it has been determined that
+Council</a>. The query planners in SQLite versions 3.7.17 and earlier
+do not choose good plans for TPC-H Q8. And it has been determined that
no amount
of tweaking of the legacy query planner will fix that. In order to find
-a good solution to the TPC-H Q8 query, and to continue improving the
+a good solution to the TPC-H Q8 query, and to continue improving the
quality of SQLite's query planner, it became necessary to redesign the
query planner. This section tries to explain why this redesign was
necessary and how the NGQP is different and addresses the TPC-H Q8 problem.
@@ -427,7 +433,7 @@ by the following diagram:
In the diagram, each of the 8 tables in the FROM clause of the query is
identified by a large circle with the label of the FROM-clause term:
N2, S, L, P, O, C, N1 and R.
-The arcs in the graph represent the estimated cost of computing each term
+The arcs in the graph represent the estimated cost of computing each term
assuming that the origin of the arc is in an outer loop. For example, the
cost of running the S loop as an inner loop to L is 2.30 whereas the
cost of running the S loop as an outer loop to L is 9.17.</p>
@@ -447,7 +453,7 @@ one of the other terms is in an outer loop, whichever gives the best
result. One can think of the *-costs as a short-hand notation indicating
multiple arcs, one from each of the other nodes in the
graph. The graph is therefore "complete", meaning that there are arcs
-(some explicit and some implied) in both directions between every pair of
+(some explicit and some implied) in both directions between every pair of
nodes in the graph.</p>
<p>The problem of finding the best query plan is equivalent to finding
@@ -475,7 +481,7 @@ shown in the graph.
SQLite computes several different estimated costs for each loop that apply at
different times. For example, there is a "setup" cost that is incurred
just once when the query starts. The setup cost is the cost of computing
-an <a href="optoverview.html#autoindex">automatic index</a> for a table that does not already
+an <a href="optoverview.html#autoindex">query-time index</a> for a table that does not already
have an index. Then there
is the cost of running each step of the loop. Finally, there is an estimate
of the number rows generated by the loop, which is information needed in
@@ -491,9 +497,9 @@ since there is no way for an arc to originate at two or more nodes at
once.</p>
<p>If the query contains an ORDER BY clause or a GROUP BY clause or if
-the query uses the DISTINCT keyword then it is advantageous to select a
-path through the graph that causes rows to naturally appear in sorted order,
-so that no separate sorting step is required. Automatic elimination of
+the query uses the DISTINCT keyword then it is advantageous to select a
+path through the graph that causes rows to naturally appear in sorted order,
+so that no separate sorting step is required. Automatic elimination of
ORDER BY clauses
can make a large performance difference, so this is another factor
that needs to be considered in a complete implementation.</p>
@@ -510,7 +516,7 @@ is neglected in the remainder of this article.</p>
<p>Prior to <a href="releaselog/3_8_0.html">version 3.8.0</a> (2013-08-26), SQLite always used
the "Nearest Neighbor" or "NN" heuristic when searching for the best query plan.
The NN heuristic makes a single traversal of the graph, always choosing
-the lowest-cost arc as the next step.
+the lowest-cost arc as the next step.
The NN heuristic works surprisingly well in most cases.
And NN is fast, so that SQLite is able to quickly find good plans
for even large 64-way joins. In contrast, other SQL database engines that
@@ -518,21 +524,21 @@ do more extensive searching tend to bog down when the
number of tables in a join goes above 10 or 15.</p>
<p>Unfortunately, the query plan computed by NN for TPC-H Q8 is not optimal.
-The plan computed using NN is R-N1-N2-S-C-O-L-P with a cost of 36.92.
+The plan computed using NN is R-N1-N2-S-C-O-L-P with a cost of 36.92.
The notation
in the previous sentence means that the R table is run in the outer loop,
N1 is in the next inner loop, N2 is in the third loop, and so forth down
to P which is in the inner-most loop. The shortest path through the
graph (as found via exhaustive search) is P-L-O-C-N1-R-S-N2
-with a cost of 27.38. The difference might not seem like much, but
+with a cost of 27.38. The difference might not seem like much, but
remember that
the costs are logarithmic, so the shortest path is nearly 750 times
faster than that path found using the NN heuristic.</p>
<p>One solution to this problem is to change SQLite to do an exhaustive
-search for the best path. But an exhaustive search requires time
+search for the best path. But an exhaustive search requires time
proportional to
-K! (where K is the number of tables in the join) and so when you get
+K! (where K is the number of tables in the join) and so when you get
beyond a 10-way join, the time
to run <a href="c3ref/prepare.html">sqlite3_prepare()</a> becomes very large.</p>
@@ -553,7 +559,7 @@ the four shortest paths to visit any single node in the graph:
P (cost: 7.71) <br>
</blockquote>
-<p>The second step finds the four shortest paths to visit two nodes
+<p>The second step finds the four shortest paths to visit two nodes
beginning with one of the four paths from the previous step. In the
case where two or more paths are equivalent (they have the same set of
visited nodes, though possibly in a different order) only the
@@ -576,21 +582,21 @@ the four shortest three-node paths:</p>
R-N2-S (cost: 15.08) <br>
</blockquote>
-<p>And so forth. There are 8 nodes in the TPC-H Q8 query,
+<p>And so forth. There are 8 nodes in the TPC-H Q8 query,
so this process repeats a total of 8
times. In the general case of a K-way join, the storage requirement
is O(N) and the computation time is O(K*N), which is significantly faster
than the O(2<small><sup>K</sup></small>) exact solution.</p>
<p>But what value to choose for N? One might try N=K. This makes the
-algorithm O(K<small><sup>2</sup></small>)
+algorithm O(K<small><sup>2</sup></small>)
which is actually still quite efficient, since the
-maximum value of K is 64 and K rarely exceeds 10.
+maximum value of K is 64 and K rarely exceeds 10.
But that is not enough for the TPC-H Q8
-problem. With N=8 on TPC-H Q8 the N3 algorithm finds
-the solution R-N1-C-O-L-S-N2-P with a cost of 29.78.
+problem. With N=8 on TPC-H Q8 the N3 algorithm finds
+the solution R-N1-C-O-L-S-N2-P with a cost of 29.78.
That is a big improvement over NN, but it is still
-not optimal. N3 finds the optimal solution for TPC-H Q8
+not optimal. N3 finds the optimal solution for TPC-H Q8
when N is 10 or greater.</p>
<p>The initial implementation of NGQP chooses N=1 for simple queries, N=5
@@ -601,30 +607,32 @@ formula for selecting N might change in subsequent releases.</p>
<h1 id="_hazards_of_upgrading_to_ngqp"><span>4. </span> Hazards Of Upgrading To NGQP</h1>
-<p><i>Update on 2018-11-24: This section was important
-when the NGQP was new. But five years have elapsed, the NGQP has been
-deployed successfully to billions of devices, and everyone has upgraded.
-The upgrade hazard has vanished.
+<p><i>&rarr; Update: <b>This section is obsolete and retained for
+historical reference only.</b> This section was important
+when the NGQP was new. But a decade has elapsed, the NGQP has been
+deployed successfully to billions of devices, and everyone has upgraded,
+and no performance regressions were ever reported back to the SQLite
+developers. The upgrade hazard has vanished.
This section is retained for historical reference only.
-Modern reads can skip ahead to the <a href="queryplanner-ng.html#howtofix">query planner checklist</a>.</i>
+Modern readers can <b>skip ahead to the <a href="queryplanner-ng.html#howtofix">query planner checklist</a></b>.&larr;</i>
</p><p>For most applications, upgrading from the legacy query planner to the NGQP
requires little thought or effort.
-Simply replace the older SQLite version with the newer version of SQLite
-and recompile and the application will run faster.
+Simply replace the older SQLite version with the newer version of SQLite
+and recompile and the application will run faster.
There are no API changes nor modifications
to compilation procedures.</p>
<p>But as with any query planner change, upgrading to the NGQP does carry
a small risk of introducing performance regressions. The problem here is
not that the NGQP is incorrect or buggy or inferior to the legacy query
-planner. Given reliable information about the selectivity of indexes,
-the NGQP should always pick a plan that is as good or better than before.
+planner. Given reliable information about the selectivity of indexes,
+the NGQP should always pick a plan that is as good as before, or better.
The problem is that some applications may be using low-quality and
low-selectivity indexes without having run <a href="lang_analyze.html">ANALYZE</a>. The older query
-planners look at many fewer possible implementations for each query and
-so they may have stumbled over a good plan by stupid luck. The NGQP, on
-the other hand, looks at many more query plan possibilities, and it may
+planners look at many fewer possible implementations for each query and
+so they may have stumbled over a good plan by stupid luck. The NGQP, on
+the other hand, looks at many more query plan possibilities, and it may
choose a different query plan that
works better in theory, assuming good indexes, but which gives a performance
regression in practice, because of the shape of the data.</p>
@@ -635,7 +643,7 @@ regression in practice, because of the shape of the data.</p>
<li><p>The NGQP will always find an equal or better query plan, compared to
prior query planners, as long as it
has access to accurate <a href="lang_analyze.html">ANALYZE</a> data in the <a href="fileformat2.html#stat1tab">SQLITE_STAT1</a> file.</p>
-</li><li><p>The NGQP will always find a good query plan
+</li><li><p>The NGQP will always find a good query plan
as long as the schema does not contain indexes that have more than
about 10 or 20 rows with the same value in the left-most column of the
index.</p>
@@ -654,7 +662,7 @@ control system used to track all of the SQLite source code.
A Fossil repository is an SQLite database file.
(Readers are invited to ponder this recursion as an independent exercise.)
Fossil is both the version-control system for SQLite and a test
-platform for SQLite. Whenever enhancements are made to SQLite,
+platform for SQLite. Whenever enhancements are made to SQLite,
Fossil is one of the first applications to test and evaluate those
enhancements. So Fossil was an early adopter of the NGQP.</p>
@@ -706,7 +714,7 @@ SELECT
</pre></blockquote>
<p>This query is not
-especially complicated, but even so it replaces hundreds or
+especially complicated, but even so it replaces hundreds or
perhaps thousands of lines of procedural code.
The gist of the query is this: Scan down the EVENT table looking
for the most recent 200 check-ins that satisfy any one of three conditions:
@@ -781,7 +789,7 @@ better mathematically. This is because, in the absence of other information,
the NGQP must assume that the indexes PLINK_I1 and TAGXREF_I1 are of
equal quality and are equally selective. Algorithm-2 uses one field
of the TAGXREF_I1 index and both fields of the PLINK_I1 index whereas
-algorithm-1 only uses the first field of each index. Since
+algorithm-1 only uses the first field of each index. Since
algorithm-2 uses more index material, the NGQP is correct
to judge it to be the better algorithm. The scores are close and
algorithm-2 just barely squeaks ahead of algorithm-1. But
@@ -797,7 +805,7 @@ this application.
The problem is that the indexes are not of equal quality.
A check-in is likely to only have one child. So the first
field of PLINK_I1 will usually narrow down the search to just a single
-row. But there are thousands and thousands check-ins tagged with "trunk",
+row. But there are thousands and thousands check-ins tagged with "trunk",
so the first field of TAGXREF_I1 will be
of little help in narrowing down the search.
</p>
@@ -806,7 +814,7 @@ of little help in narrowing down the search.
The NGQP has no way of knowing that TAGXREF_I1 is almost useless in this
query, unless <a href="lang_analyze.html">ANALYZE</a> has been run on the database. The <a href="lang_analyze.html">ANALYZE</a> command
gathers statistics on the quality of the various indexes and stores those
-statistics in <a href="fileformat2.html#stat1tab">SQLITE_STAT1</a> table.
+statistics in <a href="fileformat2.html#stat1tab">SQLITE_STAT1</a> table.
Having access to this statistical information,
the NGQP easily chooses algorithm-1 as the best algorithm, by a wide
margin.</p>
@@ -865,10 +873,10 @@ problem look like this:</p>
</center>
<p>
-In the "without ANALYZE" case on the left, the NN algorithm chooses
+In the "without ANALYZE" case on the left, the NN algorithm chooses
loop P (PLINK) as the outer loop because 4.9 is less than 5.2, resulting
in path P-T which is algorithm-1. NN only looks at the single best choice
-at each step so it completely misses the fact that
+at each step so it completely misses the fact that
5.2+4.4 makes a slightly cheaper plan than 4.9+4.8. But the N3 algorithm
keeps track of the 5 best paths for a 2-way join, so it ends up
selecting path T-P because of its slightly lower overall cost.
@@ -877,13 +885,13 @@ Path T-P is algorithm-2.
<p>
Note that with ANALYZE the cost estimates are
-better aligned with reality and algorithm-1 is
+better aligned with reality and algorithm-1 is
selected by both NN and N3.
</p>
-<p>(Side note: The costs estimates in the two most recent graphs
+<p>(Side note: The costs estimates in the two most recent graphs
were computed by the NGQP using a base-2 logarithm and slightly different
-cost assumptions compared to the legacy query planner.
+cost assumptions compared to the legacy query planner.
Hence, the cost estimates in
these latter two graphs are not directly comparable to the cost estimates
in the TPC-H Q8 graph.)</p>
@@ -893,7 +901,7 @@ in the TPC-H Q8 graph.)</p>
<p>Running <a href="lang_analyze.html">ANALYZE</a> on the repository database immediately fixed the
performance problem. However, we want Fossil to be robust and to always
work quickly regardless of whether or not its repository has been analyzed.
-For this reason, the query was modified to use the CROSS JOIN operator
+For this reason, the query was modified to use the CROSS JOIN operator
instead of the plain JOIN operator.
SQLite will not reorder the tables of a CROSS JOIN.
This is a long-standing feature of SQLite that is specifically designed
@@ -936,11 +944,11 @@ that hand-tuning of queries is no longer required. We have not had
to tweak a query in Fossil in ages.
</p><p>Therefore, the current recommendation for avoiding problems such
-as this one is to simply run "PRAGMA optimize" (possibly preceded by
-"<a href="pragma.html#pragma_analysis_limit">PRAGMA analysis_limit=200</a>") just prior to closing each database
-connection. The CROSS JOIN hack is still available, but if you keep
-the query planner statistics in the <a href="fileformat2.html#stat1tab">sqlite_stat1</a> table up-to-date,
-it usually won't be necessary.
+as this one is to simply run "PRAGMA optimize" just prior to closing
+each database connection. Or, if your application is long-running and
+never closes any database connections, then run "PRAGMA optimize" once
+per day or so. Also consider running "PRAGMA optimize" after any
+schema change.
<a name="howtofix"></a>
@@ -968,7 +976,7 @@ boolean or "enum" columns as the left-most columns of your indexes.</p>
<p>The Fossil performance problem described in the previous section of
this document arose because there were over
-ten-thousand entries in the TAGXREF table with the same value for the
+ten-thousand entries in the TAGXREF table with the same value for the
left-most column (the TAGID column) of the TAGXREF_I1 index.</p>
</li><li><p><b>If you must use a low-quality index, be sure to run <a href="lang_analyze.html">ANALYZE</a>.</b>
@@ -977,9 +985,9 @@ query planner knows that the indexes are of low quality. And the way
the query planner knows this is by the content of the <a href="fileformat2.html#stat1tab">SQLITE_STAT1</a> table,
which is computed by the ANALYZE command.</p>
-<p>Of course, ANALYZE only works effectively if you have a significant
-amount of content in your database in the first place. When creating a
-new database that you expect to accumulate a lot of data, you can run
+<p>Of course, ANALYZE only works effectively if you have a significant
+amount of content in your database in the first place. When creating a
+new database that you expect to accumulate a lot of data, you can run
the command "ANALYZE sqlite_schema" to create the SQLITE_STAT1 table,
then prepopulate the <a href="fileformat2.html#stat1tab">sqlite_stat1</a> table (using ordinary INSERT statements)
with content that describes a typical
@@ -993,7 +1001,23 @@ needed to keep the <a href="fileformat2.html#stat1tab">sqlite_stat1</a> table cu
Add logic that lets you know quickly and easily which queries are taking
too much time. Then work on just those specific queries.</p>
-</li><li><p><b>Use <a href="lang_corefunc.html#unlikely">unlikely()</a> and <a href="lang_corefunc.html#likelihood">likelihood()</a> SQL functions.</b>
+<p>
+</p><hr>
+<i>Update 2024: The query planner has been improved so much over
+the years that you should never need to use any of the
+hacks described below. The capabilities described below are still
+available, for backwards compatibility. But you shouldn't use
+them. If you do find a case where you are getting a suboptimal query
+plan, please report it to the SQLite developers on the
+<a href="https://sqlite.org/forum">SQLite Forum</a> so that they can try to fix
+the problem. In other words:
+<p>
+</p><center><b>Stop Reading Here!</b></center>
+<p>To help encourage you to stop reading, the remainder of this checklist
+is now grayed out.
+</p><hr>
+
+</i></li><li><p style="color:grey;">Use <a href="lang_corefunc.html#unlikely">unlikely()</a> and <a href="lang_corefunc.html#likelihood">likelihood()</a> SQL functions.
SQLite normally assumes that terms in the WHERE clause that cannot be used
by indexes have a strong probability of being true. If this assumption
is incorrect, it could lead to a suboptimal query plan. The
@@ -1002,23 +1026,13 @@ hints to the query planner about WHERE clause terms that are probably
not true, and thus aid the query planner in selecting the best possible
plan.
-</p></li><li><p><b>Use the <a href="optoverview.html#crossjoin">CROSS JOIN</a> syntax to enforce a particular
+</p></li><li><p style="color:grey;">Use the <a href="optoverview.html#crossjoin">CROSS JOIN</a> syntax to enforce a particular
loop nesting order on queries that might use low-quality indexes in an
-unanalyzed database.</b>
-SQLite <a href="lang_select.html#crossjoin">treats the CROSS JOIN operator specially</a>, forcing the table to
+unanalyzed database.
+SQLite <a href="lang_select.html#crossjoin">treats the CROSS JOIN operator specially</a>, forcing the table to
the left to be an outer loop relative to the table on the right.</p>
-<p>Avoid this step if possible, as it defeats one of the huge advantages
-of the whole SQL language concept, specifically that the application
-programmer does not need to get involved with query planning. If you
-do use CROSS JOIN, wait until late in your development cycle to do so,
-and comment the use of CROSS JOIN carefully so that you can take it out
-later if possible. Avoid using CROSS JOIN early in the development
-cycle as doing so is a premature optimization, which is well known to
-be <a href="http://c2.com/cgi/wiki?PrematureOptimization">the root of
-all evil</a>.</p>
-
-</li><li><p><b>Use unary "+" operators to disqualify WHERE clause terms.</b>
+</li><li><p style="color:grey;">Use unary "+" operators to disqualify WHERE clause terms.
If the query planner insists on selecting a poor-quality index for a particular
query when a much higher-quality index is available, then
<a href="optoverview.html#uplus">careful use of unary "+" operators</a> in the WHERE clause
@@ -1026,12 +1040,12 @@ can force the query planner away from the poor-quality index.
Avoid using this trick if at all possible, and especially avoid it
early in the application development cycle. Beware that
adding a unary "+" operator to an equality expression might change
-the result of that expression if
+the result of that expression if
<a href="datatype3.html#affinity">type affinity</a> is involved.</p>
-</li><li><p><b>Use the <a href="lang_indexedby.html">INDEXED BY</a> syntax to enforce the selection of
-particular indexes on problem queries.</b>
-As with the previous two bullets, avoid this step if possible, and
+</li><li><p style="color:grey;">Use the <a href="lang_indexedby.html">INDEXED BY</a> syntax to enforce the selection of
+particular indexes on problem queries.
+As with the previous two bullets, avoid this step if possible, and
especially avoid doing this early in development as it is clearly a
premature optimization.</p>
</li></ol>
@@ -1039,7 +1053,7 @@ premature optimization.</p>
<h1 id="_summary"><span>6. </span> Summary</h1>
<p>The query planner in SQLite normally does a terrific job of selecting
-fast algorithms for running your SQL statements. This is true of the
+fast algorithms for running your SQL statements. This is true of the
legacy query planner and even more true of the new NGQP. There may be
an occasional situation where, due to incomplete information, the query
planner selects a suboptimal plan. This will happen less often with the