docs/advanced/pool.rst


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332

.. currentmodule:: psycopg_pool

.. _connection-pools:

Connection pools
================

A `connection pool`__ is an object managing a set of connections and allowing
their use in functions needing one. Because the time to establish a new
connection can be relatively long, keeping connections open can reduce latency.

.. __: https://en.wikipedia.org/wiki/Connection_pool

This page explains a few basic concepts of Psycopg connection pool's
behaviour. Please refer to the `ConnectionPool` object API for details about
the pool operations.

.. note:: The connection pool objects are distributed in a package separate
   from the main `psycopg` package: use ``pip install "psycopg[pool]"`` or ``pip
   install psycopg_pool`` to make the `psycopg_pool` package available. See
   :ref:`pool-installation`.


Pool life cycle
---------------

A simple way to use the pool is to create a single instance of it, as a
global object, and to use this object in the rest of the program, allowing
other functions, modules, threads to use it::

    # module db.py in your program
    from psycopg_pool import ConnectionPool

    pool = ConnectionPool(conninfo, **kwargs)
    # the pool starts connecting immediately.

    # in another module
    from .db import pool

    def my_function():
        with pool.connection() as conn:
            conn.execute(...)

Ideally you may want to call `~ConnectionPool.close()` when the use of the
pool is finished. Failing to call `!close()` at the end of the program is not
terribly bad: probably it will just result in some warnings printed on stderr.
However, if you think that it's sloppy, you could use the `atexit` module to
have `!close()` called at the end of the program.

If you want to avoid starting to connect to the database at import time, and
want to wait for the application to be ready, you can create the pool using
`!open=False`, and call the `~ConnectionPool.open()` and
`~ConnectionPool.close()` methods when the conditions are right. Certain
frameworks provide callbacks triggered when the program is started and stopped
(for instance `FastAPI startup/shutdown events`__): they are perfect to
initiate and terminate the pool operations::

    pool = ConnectionPool(conninfo, open=False, **kwargs)

    @app.on_event("startup")
    def open_pool():
        pool.open()

    @app.on_event("shutdown")
    def close_pool():
        pool.close()

.. __: https://fastapi.tiangolo.com/advanced/events/#events-startup-shutdown

Creating a single pool as a global variable is not the mandatory use: your
program can create more than one pool, which might be useful to connect to
more than one database, or to provide different types of connections, for
instance to provide separate read/write and read-only connections. The pool
also acts as a context manager and is open and closed, if necessary, on
entering and exiting the context block::

    from psycopg_pool import ConnectionPool

    with ConnectionPool(conninfo, **kwargs) as pool:
        run_app(pool)

    # the pool is now closed

When the pool is open, the pool's background workers start creating the
requested `!min_size` connections, while the constructor (or the `!open()`
method) returns immediately. This allows the program some leeway to start
before the target database is up and running.  However, if your application is
misconfigured, or the network is down, it means that the program will be able
to start, but the threads requesting a connection will fail with a
`PoolTimeout` only after the timeout on `~ConnectionPool.connection()` is
expired. If this behaviour is not desirable (and you prefer your program to
crash hard and fast, if the surrounding conditions are not right, because
something else will respawn it) you should call the `~ConnectionPool.wait()`
method after creating the pool, or call `!open(wait=True)`: these methods will
block until the pool is full, or will raise a `PoolTimeout` exception if the
pool isn't ready within the allocated time.


Connections life cycle
----------------------

The pool background workers create connections according to the parameters
`!conninfo`, `!kwargs`, and `!connection_class` passed to `ConnectionPool`
constructor, invoking something like :samp:`{connection_class}({conninfo},
**{kwargs})`. Once a connection is created it is also passed to the
`!configure()` callback, if provided, after which it is put in the pool (or
passed to a client requesting it, if someone is already knocking at the door).

If a connection expires (it passes `!max_lifetime`), or is returned to the pool
in broken state, or is found closed by `~ConnectionPool.check()`), then the
pool will dispose of it and will start a new connection attempt in the
background.


Using connections from the pool
-------------------------------

The pool can be used to request connections from multiple threads or
concurrent tasks - it is hardly useful otherwise! If more connections than the
ones available in the pool are requested, the requesting threads are queued
and are served a connection as soon as one is available, either because
another client has finished using it or because the pool is allowed to grow
(when `!max_size` > `!min_size`) and a new connection is ready.

The main way to use the pool is to obtain a connection using the
`~ConnectionPool.connection()` context, which returns a `~psycopg.Connection`
or subclass::

    with my_pool.connection() as conn:
        conn.execute("what you want")

The `!connection()` context behaves like the `~psycopg.Connection` object
context: at the end of the block, if there is a transaction open, it will be
committed, or rolled back if the context is exited with as exception.

At the end of the block the connection is returned to the pool and shouldn't
be used anymore by the code which obtained it. If a `!reset()` function is
specified in the pool constructor, it is called on the connection before
returning it to the pool. Note that the `!reset()` function is called in a
worker thread, so that the thread which used the connection can keep its
execution without being slowed down by it.


Pool connection and sizing
--------------------------

A pool can have a fixed size (specifying no `!max_size` or `!max_size` =
`!min_size`) or a dynamic size (when `!max_size` > `!min_size`). In both
cases, as soon as the pool is created, it will try to acquire `!min_size`
connections in the background.

If an attempt to create a connection fails, a new attempt will be made soon
after, using an exponential backoff to increase the time between attempts,
until a maximum of `!reconnect_timeout` is reached. When that happens, the pool
will call the `!reconnect_failed()` function, if provided to the pool, and just
start a new connection attempt. You can use this function either to send
alerts or to interrupt the program and allow the rest of your infrastructure
to restart it.

If more than `!min_size` connections are requested concurrently, new ones are
created, up to `!max_size`. Note that the connections are always created by the
background workers, not by the thread asking for the connection: if a client
requests a new connection, and a previous client terminates its job before the
new connection is ready, the waiting client will be served the existing
connection. This is especially useful in scenarios where the time to establish
a connection dominates the time for which the connection is used (see `this
analysis`__, for instance).

.. __: https://github.com/brettwooldridge/HikariCP/blob/dev/documents/
       Welcome-To-The-Jungle.md

If a pool grows above `!min_size`, but its usage decreases afterwards, a number
of connections are eventually closed: one every time a connection is unused
after the `!max_idle` time specified in the pool constructor.


What's the right size for the pool?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Big question. Who knows. However, probably not as large as you imagine. Please
take a look at `this analysis`__ for some ideas.

.. __: https://github.com/brettwooldridge/HikariCP/wiki/About-Pool-Sizing

Something useful you can do is probably to use the
`~ConnectionPool.get_stats()` method and monitor the behaviour of your program
to tune the configuration parameters. The size of the pool can also be changed
at runtime using the `~ConnectionPool.resize()` method.


.. _null-pool:

Null connection pools
---------------------

.. versionadded:: 3.1

Sometimes you may want leave the choice of using or not using a connection
pool as a configuration parameter of your application. For instance, you might
want to use a pool if you are deploying a "large instance" of your application
and can dedicate it a handful of connections; conversely you might not want to
use it if you deploy the application in several instances, behind a load
balancer, and/or using an external connection pool process such as PgBouncer.

Switching between using or not using a pool requires some code change, because
the `ConnectionPool` API is different from the normal `~psycopg.connect()`
function and because the pool can perform additional connection configuration
(in the `!configure` parameter) that, if the pool is removed, should be
performed in some different code path of your application.

The `!psycopg_pool` 3.1 package introduces the `NullConnectionPool` class.
This class has the same interface, and largely the same behaviour, of the
`!ConnectionPool`, but doesn't create any connection beforehand. When a
connection is returned, unless there are other clients already waiting, it
is closed immediately and not kept in the pool state.

A null pool is not only a configuration convenience, but can also be used to
regulate the access to the server by a client program. If `!max_size` is set to
a value greater than 0, the pool will make sure that no more than `!max_size`
connections are created at any given time. If more clients ask for further
connections, they will be queued and served a connection as soon as a previous
client has finished using it, like for the basic pool. Other mechanisms to
throttle client requests (such as `!timeout` or `!max_waiting`) are respected
too.

.. note::

    Queued clients will be handed an already established connection, as soon
    as a previous client has finished using it (and after the pool has
    returned it to idle state and called `!reset()` on it, if necessary).

Because normally (i.e. unless queued) every client will be served a new
connection, the time to obtain the connection is paid by the waiting client;
background workers are not normally involved in obtaining new connections.


Connection quality
------------------

The state of the connection is verified when a connection is returned to the
pool: if a connection is broken during its usage it will be discarded on
return and a new connection will be created.

.. warning::

    The health of the connection is not checked when the pool gives it to a
    client.

Why not? Because doing so would require an extra network roundtrip: we want to
save you from its latency. Before getting too angry about it, just think that
the connection can be lost any moment while your program is using it. As your
program should already be able to cope with a loss of a connection during its
process, it should be able to tolerate to be served a broken connection:
unpleasant but not the end of the world.

.. warning::

    The health of the connection is not checked when the connection is in the
    pool.

Does the pool keep a watchful eye on the quality of the connections inside it?
No, it doesn't. Why not? Because you will do it for us! Your program is only
a big ruse to make sure the connections are still alive...

Not (entirely) trolling: if you are using a connection pool, we assume that
you are using and returning connections at a good pace. If the pool had to
check for the quality of a broken connection before your program notices it,
it should be polling each connection even faster than your program uses them.
Your database server wouldn't be amused...

Can you do something better than that? Of course you can, there is always a
better way than polling. You can use the same recipe of :ref:`disconnections`,
reserving a connection and using a thread to monitor for any activity
happening on it. If any activity is detected, you can call the pool
`~ConnectionPool.check()` method, which will run a quick check on each
connection in the pool, removing the ones found in broken state, and using the
background workers to replace them with fresh ones.

If you set up a similar check in your program, in case the database connection
is temporarily lost, we cannot do anything for the threads which had taken
already a connection from the pool, but no other thread should be served a
broken connection, because `!check()` would empty the pool and refill it with
working connections, as soon as they are available.

Faster than you can say poll. Or pool.


.. _pool-stats:

Pool stats
----------

The pool can return information about its usage using the methods
`~ConnectionPool.get_stats()` or `~ConnectionPool.pop_stats()`. Both methods
return the same values, but the latter reset the counters after its use. The
values can be sent to a monitoring system such as Graphite_ or Prometheus_.

.. _Graphite: https://graphiteapp.org/
.. _Prometheus: https://prometheus.io/

The following values should be provided, but please don't consider them as a
rigid interface: it is possible that they might change in the future. Keys
whose value is 0 may not be returned.


======================= =====================================================
Metric                  Meaning
======================= =====================================================
 ``pool_min``           Current value for `~ConnectionPool.min_size`
 ``pool_max``           Current value for `~ConnectionPool.max_size`
 ``pool_size``          Number of connections currently managed by the pool
                        (in the pool, given to clients, being prepared)
 ``pool_available``     Number of connections currently idle in the pool
 ``requests_waiting``   Number of requests currently waiting in a queue to
                        receive a connection
 ``usage_ms``           Total usage time of the connections outside the pool
 ``requests_num``       Number of connections requested to the pool
 ``requests_queued``    Number of requests queued because a connection wasn't
                        immediately available in the pool
 ``requests_wait_ms``   Total time in the queue for the clients waiting
 ``requests_errors``    Number of connection requests resulting in an error
                        (timeouts, queue full...)
 ``returns_bad``        Number of connections returned to the pool in a bad
                        state
 ``connections_num``    Number of connection attempts made by the pool to the
                        server
 ``connections_ms``     Total time spent to establish connections with the
                        server
 ``connections_errors`` Number of failed connection attempts
 ``connections_lost``   Number of connections lost identified by
                        `~ConnectionPool.check()`
======================= =====================================================