<!-- doc/src/sgml/jit.sgml -->

<chapter id="jit">
 <title>Just-in-Time Compilation (<acronym>JIT</acronym>)</title>

 <indexterm zone="jit">
  <primary><acronym>JIT</acronym></primary>
 </indexterm>

 <indexterm>
  <primary>Just-In-Time compilation</primary>
  <see><acronym>JIT</acronym></see>
 </indexterm>

 <para>
  This chapter explains what just-in-time compilation is, and how it can be
  configured in <productname>PostgreSQL</productname>.
 </para>

 <sect1 id="jit-reason">
  <title>What Is <acronym>JIT</acronym> Compilation?</title>

  <para>
   Just-in-Time (<acronym>JIT</acronym>) compilation is the process of turning
   some form of interpreted program evaluation into a native program, and
   doing so at run time.
   For example, instead of using general-purpose code that can evaluate
   arbitrary SQL expressions to evaluate a particular SQL predicate
   like <literal>WHERE a.col = 3</literal>, it is possible to generate a
   function that is specific to that expression and can be natively executed
   by the CPU, yielding a speedup.
  </para>
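
  <para>
   As a rough illustration of that difference, the following C sketch
   contrasts a general-purpose evaluator with a function specialized for the
   predicate <literal>col = 3</literal>.  All type and function names here
   (the <literal>Sketch</literal>/<literal>sketch_</literal> prefix) are
   hypothetical and far simpler than <productname>PostgreSQL</productname>'s
   actual expression machinery, and the specialized function is written by
   hand, whereas a <acronym>JIT</acronym> compiler would generate equivalent
   native code at run time:
<programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical, simplified expression tree; not PostgreSQL's real structures. */
typedef enum { SKETCH_COLUMN, SKETCH_CONST, SKETCH_EQ } SketchNodeKind;

typedef struct SketchNode
{
    SketchNodeKind kind;
    int         value;              /* column index or constant value */
    const struct SketchNode *left;  /* operands, used by SKETCH_EQ only */
    const struct SketchNode *right;
} SketchNode;

/* General-purpose path: walks the tree and dispatches on the node kind. */
int
sketch_eval_generic(const SketchNode *node, const int *row)
{
    switch (node->kind)
    {
        case SKETCH_COLUMN:
            return row[node->value];
        case SKETCH_CONST:
            return node->value;
        case SKETCH_EQ:
            return sketch_eval_generic(node->left, row) ==
                sketch_eval_generic(node->right, row);
    }
    return 0;
}

/* Specialized path for "col = 3" with col at index 0: no walk, no dispatch. */
int
sketch_specialized_col0_eq_3(const int *row)
{
    return row[0] == 3;
}

/* Usage: build the tree for "col = 3" and evaluate it for one row. */
int
sketch_generic_col0_eq_3(int col_value)
{
    const SketchNode col = {SKETCH_COLUMN, 0, NULL, NULL};
    const SketchNode lit = {SKETCH_CONST, 3, NULL, NULL};
    const SketchNode eq = {SKETCH_EQ, 0, &col, &lit};
    const int   row[1] = {col_value};

    return sketch_eval_generic(&eq, row);
}
```
]]></programlisting>
   Both paths return the same result for every row; the specialized one
   simply avoids the per-row tree walk and dispatch.
  </para>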

  <para>
   <productname>PostgreSQL</productname> has built-in support to perform
   <acronym>JIT</acronym> compilation using <ulink
   url="https://llvm.org/"><productname>LLVM</productname></ulink> when
   <productname>PostgreSQL</productname> is built with
   <link linkend="configure-with-llvm"><literal>--with-llvm</literal></link>.
  </para>

  <para>
   See <filename>src/backend/jit/README</filename> for further details.
  </para>

  <sect2 id="jit-accelerated-operations">
   <title><acronym>JIT</acronym>-Accelerated Operations</title>
   <para>
    Currently <productname>PostgreSQL</productname>'s <acronym>JIT</acronym>
    implementation has support for accelerating expression evaluation and
    tuple deforming.  Several other operations could be accelerated in the
    future.
   </para>
   <para>
    Expression evaluation is used to evaluate <literal>WHERE</literal>
    clauses, target lists, aggregates and projections. It can be accelerated
    by generating code specific to each case.
   </para>
   <para>
    Tuple deforming is the process of transforming an on-disk tuple (see <xref
    linkend="storage-tuple-layout"/>) into its in-memory representation.
    It can be accelerated by creating a function specific to the table layout
    and the number of columns to be extracted.
   </para>
  </sect2>

  <sect2 id="jit-inlining">
   <title>Inlining</title>
   <para>
    <productname>PostgreSQL</productname> is very extensible and allows new
    data types, functions, operators and other database objects to be defined;
    see <xref linkend="extend"/>. In fact the built-in objects are implemented
    using nearly the same mechanisms.  This extensibility implies some
    overhead, for example due to function calls (see <xref linkend="xfunc"/>).
    To reduce that overhead, <acronym>JIT</acronym> compilation can inline the
    bodies of small functions into the expressions using them. That allows a
    significant percentage of the overhead to be optimized away.
   </para>
  </sect2>

  <sect2 id="jit-optimization">
   <title>Optimization</title>
   <para>
    <productname>LLVM</productname> has support for optimizing generated
    code. Some of the optimizations are cheap enough to be performed whenever
    <acronym>JIT</acronym> is used, while others are only beneficial for
    longer-running queries.
    See <ulink url="https://llvm.org/docs/Passes.html#transform-passes"/> for
    more details about optimizations.
   </para>
  </sect2>

 </sect1>

 <sect1 id="jit-decision">
  <title>When to <acronym>JIT</acronym>?</title>

  <para>
   <acronym>JIT</acronym> compilation is beneficial primarily for long-running
   CPU-bound queries. Frequently these will be analytical queries.  For short
   queries the added overhead of performing <acronym>JIT</acronym> compilation
   will often be higher than the time it can save.
  </para>

  <para>
   Whether <acronym>JIT</acronym> compilation is used for a query is decided
   based on the query's total estimated cost (see
   <xref linkend="planner-stats-details"/> and
   <xref linkend="runtime-config-query-constants"/>).
   The estimated cost of the query will be compared with the setting of <xref
   linkend="guc-jit-above-cost"/>. If the cost is higher,
   <acronym>JIT</acronym> compilation will be performed.
   Two further decisions are then needed.
   Firstly, if the estimated cost is more
   than the setting of <xref linkend="guc-jit-inline-above-cost"/>, short
   functions and operators used in the query will be inlined.
   Secondly, if the estimated cost is more than the setting of <xref
   linkend="guc-jit-optimize-above-cost"/>, expensive optimizations are
   applied to improve the generated code.
   Each of these options increases the <acronym>JIT</acronym> compilation
   overhead, but can reduce query execution time considerably.
  </para>
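
  <para>
   The three comparisons described above can be sketched in C as follows.
   This is a simplified model, not the planner's actual code: the
   <literal>SketchJitSettings</literal> struct, the
   <literal>SKETCH_JIT_*</literal> flag names and the function name are
   hypothetical, and treating a negative threshold as <quote>disabled</quote>
   is an assumption of the sketch:
<programlisting><![CDATA[
```c
#include <stdbool.h>

/* Hypothetical flag names used only in this sketch. */
enum
{
    SKETCH_JIT_NONE = 0,
    SKETCH_JIT_PERFORM = 1 << 0,    /* compile at all */
    SKETCH_JIT_INLINE = 1 << 1,     /* inline short functions and operators */
    SKETCH_JIT_OPTIMIZE = 1 << 2    /* apply expensive optimizations */
};

/* Hypothetical snapshot of the relevant settings at plan time. */
typedef struct SketchJitSettings
{
    bool        jit;
    double      jit_above_cost;
    double      jit_inline_above_cost;
    double      jit_optimize_above_cost;
} SketchJitSettings;

/* Assumption: a negative threshold disables the corresponding step. */
static bool
sketch_exceeds(double total_cost, double threshold)
{
    return threshold >= 0 && total_cost > threshold;
}

int
sketch_jit_flags(const SketchJitSettings *s, double total_cost)
{
    int         flags = SKETCH_JIT_NONE;

    /* Without JIT itself, the two further decisions are never reached. */
    if (!s->jit || !sketch_exceeds(total_cost, s->jit_above_cost))
        return flags;
    flags |= SKETCH_JIT_PERFORM;

    if (sketch_exceeds(total_cost, s->jit_inline_above_cost))
        flags |= SKETCH_JIT_INLINE;
    if (sketch_exceeds(total_cost, s->jit_optimize_above_cost))
        flags |= SKETCH_JIT_OPTIMIZE;

    return flags;
}
```
]]></programlisting>
   For instance, a plan with an estimated cost of 16.29 and
   <literal>jit_above_cost</literal> set to 10 yields only
   <literal>SKETCH_JIT_PERFORM</literal> as long as the other two thresholds
   are above 16.29.
  </para>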

  <para>
   These cost-based decisions will be made at plan time, not execution
   time. This means that when prepared statements are in use, and a generic
   plan is used (see <xref linkend="sql-prepare"/>), the values of the
   configuration parameters in effect at prepare time control the decisions,
   not the settings at execution time.
  </para>

  <note>
   <para>
    If <xref linkend="guc-jit"/> is set to <literal>off</literal>, or if no
    <acronym>JIT</acronym> implementation is available (for example because
    the server was compiled without <literal>--with-llvm</literal>),
    <acronym>JIT</acronym> will not be performed, even if it would be
    beneficial based on the above criteria.  Setting <xref linkend="guc-jit"/>
    to <literal>off</literal> has effects at both plan and execution time.
   </para>
  </note>

  <para>
   <xref linkend="sql-explain"/> can be used to see whether
   <acronym>JIT</acronym> is used or not.  As an example, here is a query that
   is not using <acronym>JIT</acronym>:
<screen>
=# EXPLAIN ANALYZE SELECT SUM(relpages) FROM pg_class;
                                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;------------------------------------------
 Aggregate  (cost=16.27..16.29 rows=1 width=8) (actual time=0.303..0.303 rows=1 loops=1)
   ->  Seq Scan on pg_class  (cost=0.00..15.42 rows=342 width=4) (actual time=0.017..0.111 rows=356 loops=1)
 Planning Time: 0.116 ms
 Execution Time: 0.365 ms
(4 rows)
</screen>
   Given the cost of the plan, it is entirely reasonable that no
   <acronym>JIT</acronym> was used; the cost of <acronym>JIT</acronym> would
   have exceeded the potential savings. Lowering the cost threshold
   leads to <acronym>JIT</acronym> use:
<screen>
=# SET jit_above_cost = 10;
SET
=# EXPLAIN ANALYZE SELECT SUM(relpages) FROM pg_class;
                                                 QUERY PLAN
-------------------------------------------------------------------&zwsp;------------------------------------------
 Aggregate  (cost=16.27..16.29 rows=1 width=8) (actual time=6.049..6.049 rows=1 loops=1)
   ->  Seq Scan on pg_class  (cost=0.00..15.42 rows=342 width=4) (actual time=0.019..0.052 rows=356 loops=1)
 Planning Time: 0.133 ms
 JIT:
   Functions: 3
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 1.259 ms, Inlining 0.000 ms, Optimization 0.797 ms, Emission 5.048 ms, Total 7.104 ms
 Execution Time: 7.416 ms
</screen>
   As shown here, <acronym>JIT</acronym> was used, but inlining and
   expensive optimization were not. If <xref
   linkend="guc-jit-inline-above-cost"/> or <xref
   linkend="guc-jit-optimize-above-cost"/> were also lowered,
   that would change.
  </para>
 </sect1>

 <sect1 id="jit-configuration" xreflabel="JIT Configuration">
  <title>Configuration</title>

  <para>
   The configuration variable
   <xref linkend="guc-jit"/> determines whether <acronym>JIT</acronym>
   compilation is enabled or disabled.
   If it is enabled, the configuration variables
   <xref linkend="guc-jit-above-cost"/>, <xref
   linkend="guc-jit-inline-above-cost"/>, and <xref
   linkend="guc-jit-optimize-above-cost"/> determine
   whether <acronym>JIT</acronym> compilation is performed for a query,
   and how much effort is spent doing so.
  </para>

  <para>
   <xref linkend="guc-jit-provider"/> determines which <acronym>JIT</acronym>
   implementation is used. It rarely needs to be changed. See <xref
   linkend="jit-pluggable"/>.
  </para>

  <para>
   For development and debugging purposes a few additional configuration
   parameters exist, as described in
   <xref linkend="runtime-config-developer"/>.
  </para>
 </sect1>

 <sect1 id="jit-extensibility">
  <title>Extensibility</title>

  <sect2 id="jit-extensibility-bitcode">
   <title>Inlining Support for Extensions</title>
   <para>
    <productname>PostgreSQL</productname>'s <acronym>JIT</acronym>
    implementation can inline the bodies of functions
    of types <literal>C</literal> and <literal>internal</literal>, as well as
    operators based on such functions.  To do so for functions in extensions,
    the definitions of those functions need to be made available.
    When using <link linkend="extend-pgxs">PGXS</link> to build an extension
    against a server that has been compiled with LLVM JIT support, the
    relevant files will be built and installed automatically.
   </para>

   <para>
    The relevant files have to be installed into
    <filename>$pkglibdir/bitcode/$extension/</filename> and a summary of them
    into <filename>$pkglibdir/bitcode/$extension.index.bc</filename>, where
    <literal>$pkglibdir</literal> is the directory returned by
    <literal>pg_config --pkglibdir</literal> and <literal>$extension</literal>
    is the base name of the extension's shared library.

    <note>
     <para>
      For functions built into <productname>PostgreSQL</productname> itself,
      the bitcode is installed into
      <literal>$pkglibdir/bitcode/postgres</literal>.
     </para>
    </note>
   </para>
  </sect2>

  <sect2 id="jit-pluggable">
   <title>Pluggable <acronym>JIT</acronym> Providers</title>

   <para>
    <productname>PostgreSQL</productname> provides a <acronym>JIT</acronym>
    implementation based on <productname>LLVM</productname>.  The interface to
    the <acronym>JIT</acronym> provider is pluggable and the provider can be
    changed without recompiling (although currently, the build process only
    provides inlining support data for <productname>LLVM</productname>).
    The active provider is chosen via the setting
    <xref linkend="guc-jit-provider"/>.
   </para>

   <sect3>
    <title><acronym>JIT</acronym> Provider Interface</title>
    <para>
     A <acronym>JIT</acronym> provider is loaded by dynamically loading the
     named shared library. The normal library search path is used to locate
     the library. To provide the required <acronym>JIT</acronym> provider
     callbacks and to indicate that the library is actually a
     <acronym>JIT</acronym> provider, it needs to provide a C function named
     <function>_PG_jit_provider_init</function>. This function is passed a
     struct that needs to be filled with the callback function pointers for
     individual actions:
<programlisting>
struct JitProviderCallbacks
{
    JitProviderResetAfterErrorCB reset_after_error;
    JitProviderReleaseContextCB release_context;
    JitProviderCompileExprCB compile_expr;
};

extern void _PG_jit_provider_init(JitProviderCallbacks *cb);
</programlisting>
    </para>
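
    <para>
     A minimal provider could fill in the struct as sketched below.  To keep
     the example compilable on its own, the types are declared locally as
     stand-ins; a real provider would instead take them from
     <filename>jit/jit.h</filename>.  The callback signatures shown are
     assumptions and should be checked against that header for the server
     version in use, and the <literal>sketch_</literal> functions are
     hypothetical.  Returning <literal>false</literal> from
     <structfield>compile_expr</structfield> is meant to signal that the
     expression was not compiled:
<programlisting><![CDATA[
```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in declarations (assumed signatures); normally from jit/jit.h. */
typedef struct ExprState ExprState;
typedef struct JitContext JitContext;

typedef void (*JitProviderResetAfterErrorCB) (void);
typedef void (*JitProviderReleaseContextCB) (JitContext *context);
typedef bool (*JitProviderCompileExprCB) (ExprState *state);

typedef struct JitProviderCallbacks
{
    JitProviderResetAfterErrorCB reset_after_error;
    JitProviderReleaseContextCB release_context;
    JitProviderCompileExprCB compile_expr;
} JitProviderCallbacks;

/* Hypothetical no-op callbacks: this provider never compiles anything. */
static void
sketch_reset_after_error(void)
{
    /* nothing to clean up, since no state is kept */
}

static void
sketch_release_context(JitContext *context)
{
    (void) context;             /* no per-context resources to free */
}

static bool
sketch_compile_expr(ExprState *state)
{
    (void) state;
    return false;               /* report "not compiled" */
}

/* Entry point looked up by name when the provider library is loaded. */
void
_PG_jit_provider_init(JitProviderCallbacks *cb)
{
    cb->reset_after_error = sketch_reset_after_error;
    cb->release_context = sketch_release_context;
    cb->compile_expr = sketch_compile_expr;
}
```
]]></programlisting>
     Every callback is set here, so that none of the function pointers in the
     struct is left uninitialized.
    </para>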
   </sect3>
  </sect2>
 </sect1>

</chapter>