<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>19.4. Managing Kernel Resources</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="server-start.html" title="19.3. Starting the Database Server" /><link rel="next" href="server-shutdown.html" title="19.5. Shutting Down the Server" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">19.4. Managing Kernel Resources</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="server-start.html" title="19.3. Starting the Database Server">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="runtime.html" title="Chapter 19. Server Setup and Operation">Up</a></td><th width="60%" align="center">Chapter 19. Server Setup and Operation</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 16.3 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="server-shutdown.html" title="19.5. Shutting Down the Server">Next</a></td></tr></table><hr /></div><div class="sect1" id="KERNEL-RESOURCES"><div class="titlepage"><div><div><h2 class="title" style="clear: both">19.4. Managing Kernel Resources <a href="#KERNEL-RESOURCES" class="id_link">#</a></h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="kernel-resources.html#SYSVIPC">19.4.1. Shared Memory and Semaphores</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#SYSTEMD-REMOVEIPC">19.4.2. 
systemd RemoveIPC</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#KERNEL-RESOURCES-LIMITS">19.4.3. Resource Limits</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#LINUX-MEMORY-OVERCOMMIT">19.4.4. Linux Memory Overcommit</a></span></dt><dt><span class="sect2"><a href="kernel-resources.html#LINUX-HUGE-PAGES">19.4.5. Linux Huge Pages</a></span></dt></dl></div><p>
<span class="productname">PostgreSQL</span> can sometimes exhaust various operating system
resource limits, especially when multiple copies of the server are running
on the same system, or in very large installations. This section explains
the kernel resources used by <span class="productname">PostgreSQL</span> and the steps you
can take to resolve problems related to kernel resource consumption.
</p><div class="sect2" id="SYSVIPC"><div class="titlepage"><div><div><h3 class="title">19.4.1. Shared Memory and Semaphores <a href="#SYSVIPC" class="id_link">#</a></h3></div></div></div><a id="id-1.6.6.7.3.2" class="indexterm"></a><a id="id-1.6.6.7.3.3" class="indexterm"></a><p>
<span class="productname">PostgreSQL</span> requires the operating system to provide
inter-process communication (<acronym class="acronym">IPC</acronym>) features, specifically
shared memory and semaphores. Unix-derived systems typically provide
<span class="quote">“<span class="quote"><span class="systemitem">System V</span></span>”</span> <acronym class="acronym">IPC</acronym>,
<span class="quote">“<span class="quote"><span class="systemitem">POSIX</span></span>”</span> <acronym class="acronym">IPC</acronym>, or both.
<span class="systemitem">Windows</span> has its own implementation of
these features and is not discussed here.
</p><p>
By default, <span class="productname">PostgreSQL</span> allocates
a very small amount of System V shared memory, as well as a much larger
amount of anonymous <code class="function">mmap</code> shared memory.
Alternatively, a single large System V shared memory region can be used
(see <a class="xref" href="runtime-config-resource.html#GUC-SHARED-MEMORY-TYPE">shared_memory_type</a>).
In addition, a significant number of semaphores, which can be either
System V or POSIX style, are created at server startup. Currently,
POSIX semaphores are used on Linux and FreeBSD systems while other
platforms use System V semaphores.
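</p><p>
The <acronym class="acronym">IPC</acronym> objects a running server has actually
allocated can be inspected with the standard
<code class="command">ipcs</code> utility, for example:
</p><pre class="screen">
<code class="prompt">$</code> <strong class="userinput"><code>ipcs -m</code></strong>
<code class="prompt">$</code> <strong class="userinput"><code>ipcs -s</code></strong>
</pre><p>
The first command lists System V shared memory segments and the second
lists semaphore sets; the exact output columns vary by platform, and
POSIX-style objects do not appear in <code class="command">ipcs</code> output
on all platforms.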
</p><p>
System V <acronym class="acronym">IPC</acronym> features are typically constrained by
system-wide allocation limits.
When <span class="productname">PostgreSQL</span> exceeds one of these limits,
the server will refuse to start and
should leave an instructive error message describing the problem
and what to do about it. (See also <a class="xref" href="server-start.html#SERVER-START-FAILURES" title="19.3.1. Server Start-up Failures">Section 19.3.1</a>.) The relevant kernel
parameters are named consistently across different systems; <a class="xref" href="kernel-resources.html#SYSVIPC-PARAMETERS" title="Table 19.1. System V IPC Parameters">Table 19.1</a> gives an overview. The methods to set
them, however, vary. Suggestions for some platforms are given below.
</p><div class="table" id="SYSVIPC-PARAMETERS"><p class="title"><strong>Table 19.1. <span class="systemitem">System V</span> <acronym class="acronym">IPC</acronym> Parameters</strong></p><div class="table-contents"><table class="table" summary="System V IPC Parameters" border="1"><colgroup><col class="col1" /><col class="col2" /><col class="col3" /></colgroup><thead><tr><th>Name</th><th>Description</th><th>Values needed to run one <span class="productname">PostgreSQL</span> instance</th></tr></thead><tbody><tr><td><code class="varname">SHMMAX</code></td><td>Maximum size of shared memory segment (bytes)</td><td>at least 1kB, but the default is usually much higher</td></tr><tr><td><code class="varname">SHMMIN</code></td><td>Minimum size of shared memory segment (bytes)</td><td>1</td></tr><tr><td><code class="varname">SHMALL</code></td><td>Total amount of shared memory available (bytes or pages)</td><td>same as <code class="varname">SHMMAX</code> if bytes,
or <code class="literal">ceil(SHMMAX/PAGE_SIZE)</code> if pages,
plus room for other applications</td></tr><tr><td><code class="varname">SHMSEG</code></td><td>Maximum number of shared memory segments per process</td><td>only 1 segment is needed, but the default is much higher</td></tr><tr><td><code class="varname">SHMMNI</code></td><td>Maximum number of shared memory segments system-wide</td><td>like <code class="varname">SHMSEG</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMNI</code></td><td>Maximum number of semaphore identifiers (i.e., sets)</td><td>at least <code class="literal">ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMNS</code></td><td>Maximum number of semaphores system-wide</td><td><code class="literal">ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16) * 17</code> plus room for other applications</td></tr><tr><td><code class="varname">SEMMSL</code></td><td>Maximum number of semaphores per set</td><td>at least 17</td></tr><tr><td><code class="varname">SEMMAP</code></td><td>Number of entries in semaphore map</td><td>see text</td></tr><tr><td><code class="varname">SEMVMX</code></td><td>Maximum value of semaphore</td><td>at least 1000 (The default is often 32767; do not change unless necessary)</td></tr></tbody></table></div></div><br class="table-break" /><p>
<span class="productname">PostgreSQL</span> requires a few bytes of System V shared memory
(typically 48 bytes on 64-bit platforms) for each copy of the server.
On most modern operating systems, this amount can easily be allocated.
However, if you are running many copies of the server or you explicitly
configure the server to use large amounts of System V shared memory (see
<a class="xref" href="runtime-config-resource.html#GUC-SHARED-MEMORY-TYPE">shared_memory_type</a> and <a class="xref" href="runtime-config-resource.html#GUC-DYNAMIC-SHARED-MEMORY-TYPE">dynamic_shared_memory_type</a>), it may be necessary to
increase <code class="varname">SHMALL</code>, which is the total amount of System V shared
memory system-wide. Note that <code class="varname">SHMALL</code> is measured in pages
rather than bytes on many systems.
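</p><p>
As a worked example, suppose 16 GB of System V shared memory must be
accommodated on a system with a hypothetical (but common) page size of
4 kB; the actual page size can be checked with
<code class="command">getconf PAGE_SIZE</code>. The required
<code class="varname">SHMALL</code> is then 17179869184 / 4096 pages:
</p><pre class="screen">
<code class="prompt">$</code> <strong class="userinput"><code>echo $(( 17179869184 / 4096 ))</code></strong>
4194304
</pre><p>
This matches the <code class="literal">kernel.shmall</code> value used in the
Linux-specific example later in this section.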
</p><p>
Less likely to cause problems is the minimum size for shared
memory segments (<code class="varname">SHMMIN</code>), which should be at most
approximately 32 bytes for <span class="productname">PostgreSQL</span> (it is
usually just 1). The maximum number of segments system-wide
(<code class="varname">SHMMNI</code>) or per-process (<code class="varname">SHMSEG</code>) is unlikely
to cause a problem unless your system has these parameters set to zero.
</p><p>
When using System V semaphores,
<span class="productname">PostgreSQL</span> uses one semaphore per allowed connection
(<a class="xref" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS">max_connections</a>), allowed autovacuum worker process
(<a class="xref" href="runtime-config-autovacuum.html#GUC-AUTOVACUUM-MAX-WORKERS">autovacuum_max_workers</a>) and allowed background
process (<a class="xref" href="runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES">max_worker_processes</a>), in sets of 16.
Each such set will
also contain a 17th semaphore which contains a <span class="quote">“<span class="quote">magic
number</span>”</span>, to detect collision with semaphore sets used by
other applications. The maximum number of semaphores in the system
is set by <code class="varname">SEMMNS</code>, which consequently must be at least
as high as <code class="varname">max_connections</code> plus
<code class="varname">autovacuum_max_workers</code> plus <code class="varname">max_wal_senders</code>,
plus <code class="varname">max_worker_processes</code>, plus one extra for each 16
allowed connections plus workers (see the formula in <a class="xref" href="kernel-resources.html#SYSVIPC-PARAMETERS" title="Table 19.1. System V IPC Parameters">Table 19.1</a>). The parameter <code class="varname">SEMMNI</code>
determines the limit on the number of semaphore sets that can
exist on the system at one time. Hence this parameter must be at
least <code class="literal">ceil((max_connections + autovacuum_max_workers + max_wal_senders + max_worker_processes + 5) / 16)</code>.
Lowering the number of allowed connections is a temporary workaround
for such failures, which usually come from the function
<code class="function">semget</code> with the confusingly worded error
<span class="quote">“<span class="quote">No space left on device</span>”</span>.
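</p><p>
As a worked example, assume the hypothetical settings
<code class="varname">max_connections</code> = 100,
<code class="varname">autovacuum_max_workers</code> = 3,
<code class="varname">max_wal_senders</code> = 10, and
<code class="varname">max_worker_processes</code> = 8. The formula gives
<code class="literal">ceil((100 + 3 + 10 + 8 + 5) / 16) = ceil(126 / 16) = 8</code>
semaphore sets, so <code class="varname">SEMMNI</code> must be at least 8 and
<code class="varname">SEMMNS</code> at least 8 * 17 = 136, plus room for other
applications. The same arithmetic, using integer division rounded up,
in shell syntax:
</p><pre class="screen">
<code class="prompt">$</code> <strong class="userinput"><code>total=$(( 100 + 3 + 10 + 8 + 5 )); sets=$(( (total + 15) / 16 ))</code></strong>
<code class="prompt">$</code> <strong class="userinput"><code>echo "SEMMNI &gt;= $sets, SEMMNS &gt;= $(( sets * 17 ))"</code></strong>
SEMMNI &gt;= 8, SEMMNS &gt;= 136
</pre><p>
These are minimums; real systems should leave headroom for other
software that uses System V semaphores.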
</p><p>
In some cases it might also be necessary to increase
<code class="varname">SEMMAP</code> to be at least on the order of
<code class="varname">SEMMNS</code>. If the system has this parameter
(many do not), it defines the size of the semaphore
resource map, in which each contiguous block of available semaphores
needs an entry. When a semaphore set is freed it is either added to
an existing entry that is adjacent to the freed block or it is
registered under a new map entry. If the map is full, the freed
semaphores get lost (until reboot). Fragmentation of the semaphore
space could over time lead to fewer available semaphores than there
should be.
</p><p>
Various other settings related to <span class="quote">“<span class="quote">semaphore undo</span>”</span>, such as
<code class="varname">SEMMNU</code> and <code class="varname">SEMUME</code>, do not affect
<span class="productname">PostgreSQL</span>.
</p><p>
When using POSIX semaphores, the number of semaphores needed is the
same as for System V, that is one semaphore per allowed connection
(<a class="xref" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS">max_connections</a>), allowed autovacuum worker process
(<a class="xref" href="runtime-config-autovacuum.html#GUC-AUTOVACUUM-MAX-WORKERS">autovacuum_max_workers</a>) and allowed background
process (<a class="xref" href="runtime-config-resource.html#GUC-MAX-WORKER-PROCESSES">max_worker_processes</a>).
On the platforms where this option is preferred, there is no specific
kernel limit on the number of POSIX semaphores.
</p><div class="variablelist"><dl class="variablelist"><dt><span class="term"><span class="systemitem">AIX</span>
<a id="id-1.6.6.7.3.14.1.1.2" class="indexterm"></a>
</span></dt><dd><p>
It should not be necessary to do
any special configuration for parameters such as
<code class="varname">SHMMAX</code>, as <span class="systemitem">AIX</span> appears to be
configured by default to allow all memory to be used as shared memory. That is the
sort of configuration commonly used for other databases such
as <span class="application">DB/2</span>.</p><p> It might, however, be necessary to modify the global
<code class="command">ulimit</code> information in
<code class="filename">/etc/security/limits</code>, as the default hard
limits for file sizes (<code class="varname">fsize</code>) and numbers of
files (<code class="varname">nofiles</code>) might be too low.
</p></dd><dt><span class="term"><span class="systemitem">FreeBSD</span>
<a id="id-1.6.6.7.3.14.2.1.2" class="indexterm"></a>
</span></dt><dd><p>
The default shared memory settings are usually good enough, unless
you have set <code class="literal">shared_memory_type</code> to <code class="literal">sysv</code>.
System V semaphores are not used on this platform.
</p><p>
The default IPC settings can be changed using
the <code class="command">sysctl</code> or
<code class="command">loader</code> interfaces. The following
parameters can be set using <code class="command">sysctl</code>:
</p><pre class="screen">
<code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.ipc.shmall=32768</code></strong>
<code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.ipc.shmmax=134217728</code></strong>
</pre><p>
To make these settings persist over reboots, modify
<code class="filename">/etc/sysctl.conf</code>.
</p><p>
If you have set <code class="literal">shared_memory_type</code> to
<code class="literal">sysv</code>, you might also want to configure your kernel
to lock System V shared memory into RAM and prevent it from being paged
out to swap. This can be accomplished using the <code class="command">sysctl</code>
setting <code class="literal">kern.ipc.shm_use_phys</code>.
</p><p>
If running in a FreeBSD jail, you should set its
<code class="literal">sysvshm</code> parameter to <code class="literal">new</code>, so that
it has its own separate System V shared memory namespace.
(Before FreeBSD 11.0, it was necessary to enable shared access to
the host's IPC namespace from jails, and take measures to avoid
collisions.)
</p></dd><dt><span class="term"><span class="systemitem">NetBSD</span>
<a id="id-1.6.6.7.3.14.3.1.2" class="indexterm"></a>
</span></dt><dd><p>
The default shared memory settings are usually good enough, unless
you have set <code class="literal">shared_memory_type</code> to <code class="literal">sysv</code>.
You will usually want to increase <code class="literal">kern.ipc.semmni</code>
and <code class="literal">kern.ipc.semmns</code>,
as <span class="systemitem">NetBSD</span>'s default settings
for these are uncomfortably small.
</p><p>
IPC parameters can be adjusted using <code class="command">sysctl</code>,
for example:
</p><pre class="screen">
<code class="prompt">#</code> <strong class="userinput"><code>sysctl -w kern.ipc.semmni=100</code></strong>
</pre><p>
To make these settings persist over reboots, modify
<code class="filename">/etc/sysctl.conf</code>.
</p><p>
If you have set <code class="literal">shared_memory_type</code> to
<code class="literal">sysv</code>, you might also want to configure your kernel
to lock System V shared memory into RAM and prevent it from being paged
out to swap. This can be accomplished using the <code class="command">sysctl</code>
setting <code class="literal">kern.ipc.shm_use_phys</code>.
</p></dd><dt><span class="term"><span class="systemitem">OpenBSD</span>
<a id="id-1.6.6.7.3.14.4.1.2" class="indexterm"></a>
</span></dt><dd><p>
The default shared memory settings are usually good enough, unless
you have set <code class="literal">shared_memory_type</code> to <code class="literal">sysv</code>.
You will usually want to
increase <code class="literal">kern.seminfo.semmni</code>
and <code class="literal">kern.seminfo.semmns</code>,
as <span class="systemitem">OpenBSD</span>'s default settings
for these are uncomfortably small.
</p><p>
IPC parameters can be adjusted using <code class="command">sysctl</code>,
for example:
</p><pre class="screen">
<code class="prompt">#</code> <strong class="userinput"><code>sysctl kern.seminfo.semmni=100</code></strong>
</pre><p>
To make these settings persist over reboots, modify
<code class="filename">/etc/sysctl.conf</code>.
</p></dd><dt><span class="term"><span class="systemitem">Linux</span>
<a id="id-1.6.6.7.3.14.5.1.2" class="indexterm"></a>
</span></dt><dd><p>
The default shared memory settings are usually good enough, unless
you have set <code class="literal">shared_memory_type</code> to <code class="literal">sysv</code>,
and even then only on older kernel versions that shipped with low defaults.
System V semaphores are not used on this platform.
</p><p>
The shared memory size settings can be changed via the
<code class="command">sysctl</code> interface. For example, to allow 16 GB:
</p><pre class="screen">
<code class="prompt">$</code> <strong class="userinput"><code>sysctl -w kernel.shmmax=17179869184</code></strong>
<code class="prompt">$</code> <strong class="userinput"><code>sysctl -w kernel.shmall=4194304</code></strong>
</pre><p>
To make these settings persist over reboots, see
<code class="filename">/etc/sysctl.conf</code>.
</p></dd><dt><span class="term"><span class="systemitem">macOS</span>
<a id="id-1.6.6.7.3.14.6.1.2" class="indexterm"></a>
</span></dt><dd><p>
The default shared memory and semaphore settings are usually good enough, unless
you have set <code class="literal">shared_memory_type</code> to <code class="literal">sysv</code>.
</p><p>
The recommended method for configuring shared memory in macOS
is to create a file named <code class="filename">/etc/sysctl.conf</code>,
containing variable assignments such as:
</p><pre class="programlisting">
kern.sysv.shmmax=4194304
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.sysv.shmall=1024
</pre><p>
Note that in some macOS versions,
<span class="emphasis"><em>all five</em></span> shared-memory parameters must be set in
<code class="filename">/etc/sysctl.conf</code>, else the values will be ignored.
</p><p>
<code class="varname">SHMMAX</code> can only be set to a multiple of 4096.
</p><p>
<code class="varname">SHMALL</code> is measured in 4 kB pages on this platform.
</p><p>
It is possible to change all but <code class="varname">SHMMNI</code> on the fly, using
<span class="application">sysctl</span>. But it's still best to set up your preferred
values via <code class="filename">/etc/sysctl.conf</code>, so that the values will be
kept across reboots.
</p></dd><dt><span class="term"><span class="systemitem">Solaris</span><br /></span><span class="term"><span class="systemitem">illumos</span></span></dt><dd><p>
The default shared memory and semaphore settings are usually good enough for most
<span class="productname">PostgreSQL</span> applications. Solaris defaults
to a <code class="varname">SHMMAX</code> of one-quarter of system <acronym class="acronym">RAM</acronym>.
To further adjust this setting, use a project setting associated
with the <code class="literal">postgres</code> user. For example, run the
following as <code class="literal">root</code>:
</p><pre class="programlisting">
projadd -c "PostgreSQL DB User" -K "project.max-shm-memory=(privileged,8GB,deny)" -U postgres -G postgres user.postgres
</pre><p>
</p><p>
This command adds the <code class="literal">user.postgres</code> project and
sets the shared memory maximum for the <code class="literal">postgres</code>
user to 8GB, and takes effect the next time that user logs
in, or when you restart <span class="productname">PostgreSQL</span> (not reload).
The above assumes that <span class="productname">PostgreSQL</span> is run by
the <code class="literal">postgres</code> user in the <code class="literal">postgres</code>
group. No server reboot is required.
</p><p>
Other recommended kernel setting changes for database servers that will
have a large number of connections are:
</p><pre class="programlisting">
project.max-shm-ids=(priv,32768,deny)
project.max-sem-ids=(priv,4096,deny)
project.max-msg-ids=(priv,4096,deny)
</pre><p>
</p><p>
Additionally, if you are running <span class="productname">PostgreSQL</span>
inside a zone, you may need to raise the zone resource usage
limits as well. See "Chapter 2: Projects and Tasks" in the
<em class="citetitle">System Administrator's Guide</em> for more
information on <code class="literal">projects</code> and <code class="command">prctl</code>.
</p></dd></dl></div></div><div class="sect2" id="SYSTEMD-REMOVEIPC"><div class="titlepage"><div><div><h3 class="title">19.4.2. systemd RemoveIPC <a href="#SYSTEMD-REMOVEIPC" class="id_link">#</a></h3></div></div></div><a id="id-1.6.6.7.4.2" class="indexterm"></a><p>
If <span class="productname">systemd</span> is in use, some care must be taken
that IPC resources (including shared memory) are not prematurely
removed by the operating system. This is especially of concern when
installing PostgreSQL from source. Users of distribution packages of
PostgreSQL are less likely to be affected, as
the <code class="literal">postgres</code> user is then normally created as a system
user.
</p><p>
The setting <code class="literal">RemoveIPC</code>
in <code class="filename">logind.conf</code> controls whether IPC objects are
removed when a user fully logs out. System users are exempt. This
setting defaults to on in stock <span class="productname">systemd</span>, but
some operating system distributions default it to off.
</p><p>
A typical observed effect when this setting is on is that shared memory
objects used for parallel query execution are removed at apparently random
times, leading to errors and warnings while attempting to open and remove
them, like
</p><pre class="screen">
WARNING: could not remove shared memory segment "/PostgreSQL.1450751626": No such file or directory
</pre><p>
Different types of IPC objects (shared memory vs. semaphores, System V
vs. POSIX) are treated slightly differently
by <span class="productname">systemd</span>, so one might observe that some IPC
resources are not removed in the same way as others. But it is not
advisable to rely on these subtle differences.
</p><p>
A <span class="quote">“<span class="quote">user logging out</span>”</span> might happen as part of a maintenance
job or manually when an administrator logs in as
the <code class="literal">postgres</code> user or something similar, so it is hard
to prevent in general.
</p><p>
What is a <span class="quote">“<span class="quote">system user</span>”</span> is determined
at <span class="productname">systemd</span> compile time from
the <code class="symbol">SYS_UID_MAX</code> setting
in <code class="filename">/etc/login.defs</code>.
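</p><p>
As a quick check, compare the account's numeric UID against that
setting (note that <span class="productname">systemd</span> uses the value it
was compiled with, which normally, but not necessarily, matches the
current <code class="filename">/etc/login.defs</code>); the account name
<code class="literal">postgres</code> is assumed here:
</p><pre class="screen">
<code class="prompt">$</code> <strong class="userinput"><code>grep '^SYS_UID_MAX' /etc/login.defs</code></strong>
<code class="prompt">$</code> <strong class="userinput"><code>id -u postgres</code></strong>
</pre><p>
If the UID reported by <code class="command">id</code> is larger than
<code class="symbol">SYS_UID_MAX</code>, the account is not a system user and one
of the remedies described next should be applied.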
</p><p>
Packaging and deployment scripts should be careful to create
the <code class="literal">postgres</code> user as a system user by
using <code class="literal">useradd -r</code>, <code class="literal">adduser --system</code>,
or equivalent.
</p><p>
Alternatively, if the user account was created incorrectly or cannot be
changed, it is recommended to set
</p><pre class="programlisting">
RemoveIPC=no
</pre><p>
in <code class="filename">/etc/systemd/logind.conf</code> or another appropriate
configuration file.
</p><div class="caution"><h3 class="title">Caution</h3><p>
At least one of these two precautions must be taken; otherwise the
PostgreSQL server will be very unreliable.
</p></div></div><div class="sect2" id="KERNEL-RESOURCES-LIMITS"><div class="titlepage"><div><div><h3 class="title">19.4.3. Resource Limits <a href="#KERNEL-RESOURCES-LIMITS" class="id_link">#</a></h3></div></div></div><p>
Unix-like operating systems enforce various kinds of resource limits
that might interfere with the operation of your
<span class="productname">PostgreSQL</span> server. Of particular
importance are limits on the number of processes per user, the
number of open files per process, and the amount of memory available
to each process. Each of these has a <span class="quote">“<span class="quote">hard</span>”</span> and a
<span class="quote">“<span class="quote">soft</span>”</span> limit. The soft limit is what actually counts,
but it can be changed by the user up to the hard limit. The hard
limit can only be changed by the root user. The system call
<code class="function">setrlimit</code> is responsible for setting these
parameters. The shell's built-in command <code class="command">ulimit</code>
(Bourne shells) or <code class="command">limit</code> (<span class="application">csh</span>) is
used to control the resource limits from the command line. On
BSD-derived systems the file <code class="filename">/etc/login.conf</code>
controls the various resource limits set during login. See the
operating system documentation for details. The relevant
parameters are <code class="varname">maxproc</code>,
<code class="varname">openfiles</code>, and <code class="varname">datasize</code>. For
example:
</p><pre class="programlisting">
default:\
...
:datasize-cur=256M:\
:maxproc-cur=256:\
:openfiles-cur=256:\
...
</pre><p>
(<code class="literal">-cur</code> is the soft limit. Append
<code class="literal">-max</code> to set the hard limit.)
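</p><p>
On a running system, the current limits can be inspected with the
shell's <code class="command">ulimit</code> built-in (shown here for
<span class="application">bash</span>; option letters can vary between shells):
</p><pre class="screen">
<code class="prompt">$</code> <strong class="userinput"><code>ulimit -n</code></strong>
<code class="prompt">$</code> <strong class="userinput"><code>ulimit -u</code></strong>
<code class="prompt">$</code> <strong class="userinput"><code>ulimit -n 4096</code></strong>
</pre><p>
The first two commands print the soft limits on open files and user
processes; the third raises the open-file soft limit (up to the hard
limit) for the current shell and any server subsequently started from it.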
</p><p>
Kernels can also have system-wide limits on some resources.
</p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
On <span class="productname">Linux</span> the kernel parameter
<code class="varname">fs.file-max</code> determines the maximum number of open
files that the kernel will support. It can be changed with
<code class="literal">sysctl -w fs.file-max=<em class="replaceable"><code>N</code></em></code>.
To make the setting persist across reboots, add an assignment
in <code class="filename">/etc/sysctl.conf</code>.
The maximum limit of files per process is fixed at the time the
kernel is compiled; see
<code class="filename">/usr/src/linux/Documentation/proc.txt</code> for
more information.
</p></li></ul></div><p>
</p><p>
The <span class="productname">PostgreSQL</span> server uses one process
per connection so you should provide for at least as many processes
as allowed connections, in addition to what you need for the rest
of your system. This is usually not a problem but if you run
several servers on one machine things might get tight.
</p><p>
The factory default limit on open files is often set to a
<span class="quote">“<span class="quote">socially friendly</span>”</span> value that allows many users to
coexist on a machine without using an inappropriate fraction of
the system resources. If you run many servers on a machine this
is perhaps what you want, but on dedicated servers you might want to
raise this limit.
</p><p>
On the other side of the coin, some systems allow individual
processes to open large numbers of files; if more than a few
processes do so then the system-wide limit can easily be exceeded.
If you find this happening, and you do not want to alter the
system-wide limit, you can set <span class="productname">PostgreSQL</span>'s <a class="xref" href="runtime-config-resource.html#GUC-MAX-FILES-PER-PROCESS">max_files_per_process</a> configuration parameter to
limit the consumption of open files.
</p><p>
Another kernel limit that may be of concern when supporting large
numbers of client connections is the maximum socket connection queue
length. If more than that many connection requests arrive within a very
short period, some may get rejected before the <span class="productname">PostgreSQL</span> server can service
the requests, with those clients receiving unhelpful connection failure
errors such as <span class="quote">“<span class="quote">Resource temporarily unavailable</span>”</span> or
<span class="quote">“<span class="quote">Connection refused</span>”</span>. The default queue length limit is 128
on many platforms. To raise it, adjust the appropriate kernel parameter
via <span class="application">sysctl</span>, then restart the <span class="productname">PostgreSQL</span> server.
The parameter is variously named <code class="varname">net.core.somaxconn</code>
on Linux, <code class="varname">kern.ipc.soacceptqueue</code> on newer FreeBSD,
and <code class="varname">kern.ipc.somaxconn</code> on macOS and other BSD
variants.
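</p><p>
For example, on <span class="productname">Linux</span> the queue length could
be raised like this (1024 is only an illustrative value; make the
setting persistent via <code class="filename">/etc/sysctl.conf</code> as with
the other parameters discussed above):
</p><pre class="screen">
<code class="prompt">#</code> <strong class="userinput"><code>sysctl -w net.core.somaxconn=1024</code></strong>
</pre><p>
The server restart is needed because the listen queue length is fixed
when the server creates its listening sockets.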
</p></div><div class="sect2" id="LINUX-MEMORY-OVERCOMMIT"><div class="titlepage"><div><div><h3 class="title">19.4.4. Linux Memory Overcommit <a href="#LINUX-MEMORY-OVERCOMMIT" class="id_link">#</a></h3></div></div></div><a id="id-1.6.6.7.6.2" class="indexterm"></a><a id="id-1.6.6.7.6.3" class="indexterm"></a><a id="id-1.6.6.7.6.4" class="indexterm"></a><p>
The default virtual memory behavior on Linux is not
optimal for <span class="productname">PostgreSQL</span>. Because of the
way that the kernel implements memory overcommit, the kernel might
terminate the <span class="productname">PostgreSQL</span> postmaster (the
supervisor server process) if the memory demands of either
<span class="productname">PostgreSQL</span> or another process cause the
system to run out of virtual memory.
</p><p>
If this happens, you will see a kernel message that looks like
this (consult your system documentation and configuration on where
to look for such a message):
</p><pre class="programlisting">
Out of Memory: Killed process 12345 (postgres).
</pre><p>
This indicates that the <code class="filename">postgres</code> process
has been terminated due to memory pressure.
Although existing database connections will continue to function
normally, no new connections will be accepted. To recover,
<span class="productname">PostgreSQL</span> will need to be restarted.
</p><p>
One way to avoid this problem is to run
<span class="productname">PostgreSQL</span> on a machine where you can
be sure that other processes will not run the machine out of
memory. If memory is tight, increasing the swap space of the
operating system can help avoid the problem, because the
out-of-memory (OOM) killer is invoked only when physical memory and
swap space are exhausted.
</p><p>
If <span class="productname">PostgreSQL</span> itself is the cause of the
system running out of memory, you can avoid the problem by changing
your configuration. In some cases, it may help to lower memory-related
configuration parameters, particularly
<a class="link" href="runtime-config-resource.html#GUC-SHARED-BUFFERS"><code class="varname">shared_buffers</code></a>,
<a class="link" href="runtime-config-resource.html#GUC-WORK-MEM"><code class="varname">work_mem</code></a>, and
<a class="link" href="runtime-config-resource.html#GUC-HASH-MEM-MULTIPLIER"><code class="varname">hash_mem_multiplier</code></a>.
In other cases, the problem may be caused by allowing too many
connections to the database server itself. In many cases, it may
be better to reduce
<a class="link" href="runtime-config-connection.html#GUC-MAX-CONNECTIONS"><code class="varname">max_connections</code></a>
and instead make use of external connection-pooling software.
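</p><p>
    As an illustration only (the host, database name, and numbers here are
    hypothetical, not recommendations), a pooler such as
    <span class="application">PgBouncer</span> can serve many client
    connections over a small, fixed number of server connections:
</p><pre class="programlisting">
; pgbouncer.ini (sketch)
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_port = 6432
pool_mode = transaction
max_client_conn = 1000     ; many client connections ...
default_pool_size = 20     ; ... multiplexed over few server connections
</pre><p>
    With such a pooler in front, <code class="varname">max_connections</code> in
    <code class="filename">postgresql.conf</code> can be kept modest.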
</p><p>
It is possible to modify the
kernel's behavior so that it will not <span class="quote">“<span class="quote">overcommit</span>”</span> memory.
Although this setting will not prevent the <a class="ulink" href="https://lwn.net/Articles/104179/" target="_top">OOM killer</a> from being invoked
altogether, it will lower the chances significantly and will therefore
lead to more robust system behavior. This is done by selecting strict
overcommit mode via <code class="command">sysctl</code>:
</p><pre class="programlisting">
sysctl -w vm.overcommit_memory=2
</pre><p>
or placing an equivalent entry in <code class="filename">/etc/sysctl.conf</code>.
You might also wish to modify the related setting
<code class="varname">vm.overcommit_ratio</code>. For details see the kernel documentation
file <a class="ulink" href="https://www.kernel.org/doc/Documentation/vm/overcommit-accounting" target="_top">https://www.kernel.org/doc/Documentation/vm/overcommit-accounting</a>.
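</p><p>
    For example, a persistent configuration in
    <code class="filename">/etc/sysctl.conf</code> might look like this (the
    ratio shown is illustrative; choose a value appropriate for your RAM and
    swap sizes):
</p><pre class="programlisting">
vm.overcommit_memory = 2
vm.overcommit_ratio = 80
</pre><p>
    The file can be applied without rebooting by running
    <code class="command">sysctl -p</code>.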
</p><p>
Another approach, which can be used with or without altering
<code class="varname">vm.overcommit_memory</code>, is to set the process-specific
<em class="firstterm">OOM score adjustment</em> value for the postmaster process to
<code class="literal">-1000</code>, thereby guaranteeing it will not be targeted by the OOM
killer. The simplest way to do this is to execute
</p><pre class="programlisting">
echo -1000 > /proc/self/oom_score_adj
</pre><p>
in the <span class="productname">PostgreSQL</span> startup script just before
invoking <code class="filename">postgres</code>.
Note that this action must be done as root, or it will have no effect;
so a root-owned startup script is the easiest place to do it. If you
do this, you should also set these environment variables in the startup
script before invoking <code class="filename">postgres</code>:
</p><pre class="programlisting">
export PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj
export PG_OOM_ADJUST_VALUE=0
</pre><p>
These settings will cause postmaster child processes to run with the
normal OOM score adjustment of zero, so that the OOM killer can still
target them if needed. You could use some other value for
<code class="envar">PG_OOM_ADJUST_VALUE</code> if you want the child processes to run
with some other OOM score adjustment. (<code class="envar">PG_OOM_ADJUST_VALUE</code>
can also be omitted, in which case it defaults to zero.) If you do not
set <code class="envar">PG_OOM_ADJUST_FILE</code>, the child processes will run with the
same OOM score adjustment as the postmaster, which is unwise since the
whole point is to ensure that the postmaster has a preferential setting.
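</p><p>
    Putting the pieces together, the relevant fragment of a root-owned
    startup script might look like this (the installation paths and the
    <code class="literal">postgres</code> user name are illustrative):
</p><pre class="programlisting">
#!/bin/sh
# Run as root.  Shield the postmaster from the OOM killer; its child
# processes reset their own adjustment via PG_OOM_ADJUST_FILE.
echo -1000 > /proc/self/oom_score_adj
su postgres -c "PG_OOM_ADJUST_FILE=/proc/self/oom_score_adj \
PG_OOM_ADJUST_VALUE=0 \
/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data"
</pre><p>
    The child process started by <code class="command">su</code> inherits the
    OOM score adjustment written by the script, along with the two
    environment variables.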
</p></div><div class="sect2" id="LINUX-HUGE-PAGES"><div class="titlepage"><div><div><h3 class="title">19.4.5. Linux Huge Pages <a href="#LINUX-HUGE-PAGES" class="id_link">#</a></h3></div></div></div><p>
Using huge pages reduces overhead when using large contiguous chunks of
memory, as <span class="productname">PostgreSQL</span> does, particularly when
using large values of <a class="xref" href="runtime-config-resource.html#GUC-SHARED-BUFFERS">shared_buffers</a>. To use this
feature in <span class="productname">PostgreSQL</span> you need a kernel
with <code class="varname">CONFIG_HUGETLBFS=y</code> and
<code class="varname">CONFIG_HUGETLB_PAGE=y</code>. You will also have to configure
the operating system to provide enough huge pages of the desired size.
To determine the number of huge pages needed, use the
<code class="command">postgres</code> command to see the value of
<a class="xref" href="runtime-config-preset.html#GUC-SHARED-MEMORY-SIZE-IN-HUGE-PAGES">shared_memory_size_in_huge_pages</a>. Note that the
server must be shut down to view this runtime-computed parameter.
This might look like:
</p><pre class="programlisting">
$ <strong class="userinput"><code>postgres -D $PGDATA -C shared_memory_size_in_huge_pages</code></strong>
3170
$ <strong class="userinput"><code>grep ^Hugepagesize /proc/meminfo</code></strong>
Hugepagesize: 2048 kB
$ <strong class="userinput"><code>ls /sys/kernel/mm/hugepages</code></strong>
hugepages-1048576kB hugepages-2048kB
</pre><p>
In this example the default is 2MB, but you can also explicitly request
either 2MB or 1GB with <a class="xref" href="runtime-config-resource.html#GUC-HUGE-PAGE-SIZE">huge_page_size</a> to adapt
the number of pages calculated by
<code class="varname">shared_memory_size_in_huge_pages</code>.
While we need at least <code class="literal">3170</code> huge pages in this example,
a larger setting would be appropriate if other programs on the machine
also need huge pages.
We can set this with:
</p><pre class="programlisting">
# <strong class="userinput"><code>sysctl -w vm.nr_hugepages=3170</code></strong>
</pre><p>
Don't forget to add this setting to <code class="filename">/etc/sysctl.conf</code>
so that it is reapplied after reboots. For non-default huge page sizes,
we can instead use:
</p><pre class="programlisting">
# <strong class="userinput"><code>echo 3170 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages</code></strong>
</pre><p>
It is also possible to provide these settings at boot time using
kernel parameters such as <code class="literal">hugepagesz=2M hugepages=3170</code>.
</p><p>
Sometimes the kernel is not able to allocate the desired number of huge
pages immediately due to fragmentation, so it might be necessary
to repeat the command or to reboot. (Immediately after a reboot, most of
the machine's memory should be available to convert into huge pages.)
To verify the huge page allocation situation for a given size, use:
</p><pre class="programlisting">
$ <strong class="userinput"><code>cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages</code></strong>
</pre><p>
</p><p>
It may also be necessary to give the database server's operating system
user permission to use huge pages by setting
<code class="varname">vm.hugetlb_shm_group</code> via <span class="application">sysctl</span>, and/or
give permission to lock memory with <code class="command">ulimit -l</code>.
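</p><p>
    For example (the group ID, user name, and limits shown are illustrative),
    assuming the server runs as a user whose primary group has GID 1000:
</p><pre class="programlisting">
# allow that group to use huge page shared memory (as root)
sysctl -w vm.hugetlb_shm_group=1000

# raise the locked-memory limit via entries in /etc/security/limits.conf
postgres   soft   memlock   unlimited
postgres   hard   memlock   unlimited
</pre><p>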
</p><p>
The default behavior for huge pages in
<span class="productname">PostgreSQL</span> is to use them when possible, with
the system's default huge page size, and
to fall back to normal pages on failure. To enforce the use of huge
pages, you can set <a class="xref" href="runtime-config-resource.html#GUC-HUGE-PAGES">huge_pages</a>
to <code class="literal">on</code> in <code class="filename">postgresql.conf</code>.
Note that with this setting <span class="productname">PostgreSQL</span> will fail to
start if not enough huge pages are available.
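</p><p>
    For example, in <code class="filename">postgresql.conf</code>:
</p><pre class="programlisting">
huge_pages = on          # fail to start rather than fall back
#huge_page_size = 2MB    # optional: request a non-default page size
</pre><p>
    The other possible values are <code class="literal">try</code> (the
    default, which falls back to normal pages) and
    <code class="literal">off</code>.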
</p><p>
For a detailed description of the <span class="productname">Linux</span> huge
   pages feature, see
   <a class="ulink" href="https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt" target="_top">https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt</a>.
</p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="server-start.html" title="19.3. Starting the Database Server">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="runtime.html" title="Chapter 19. Server Setup and Operation">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="server-shutdown.html" title="19.5. Shutting Down the Server">Next</a></td></tr><tr><td width="40%" align="left" valign="top">19.3. Starting the Database Server </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 16.3 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 19.5. Shutting Down the Server</td></tr></table></div></body></html>