1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
|
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>27.2. Log-Shipping Standby Servers</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="different-replication-solutions.html" title="27.1. Comparison of Different Solutions" /><link rel="next" href="warm-standby-failover.html" title="27.3. Failover" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">27.2. Log-Shipping Standby Servers</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="different-replication-solutions.html" title="27.1. Comparison of Different Solutions">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="high-availability.html" title="Chapter 27. High Availability, Load Balancing, and Replication">Up</a></td><th width="60%" align="center">Chapter 27. High Availability, Load Balancing, and Replication</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="warm-standby-failover.html" title="27.3. Failover">Next</a></td></tr></table><hr /></div><div class="sect1" id="WARM-STANDBY"><div class="titlepage"><div><div><h2 class="title" style="clear: both">27.2. Log-Shipping Standby Servers</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="warm-standby.html#STANDBY-PLANNING">27.2.1. Planning</a></span></dt><dt><span class="sect2"><a href="warm-standby.html#STANDBY-SERVER-OPERATION">27.2.2. Standby Server Operation</a></span></dt><dt><span class="sect2"><a href="warm-standby.html#PREPARING-PRIMARY-FOR-STANDBY">27.2.3. Preparing the Primary for Standby Servers</a></span></dt><dt><span class="sect2"><a href="warm-standby.html#STANDBY-SERVER-SETUP">27.2.4. Setting Up a Standby Server</a></span></dt><dt><span class="sect2"><a href="warm-standby.html#STREAMING-REPLICATION">27.2.5. Streaming Replication</a></span></dt><dt><span class="sect2"><a href="warm-standby.html#STREAMING-REPLICATION-SLOTS">27.2.6. Replication Slots</a></span></dt><dt><span class="sect2"><a href="warm-standby.html#CASCADING-REPLICATION">27.2.7. Cascading Replication</a></span></dt><dt><span class="sect2"><a href="warm-standby.html#SYNCHRONOUS-REPLICATION">27.2.8. Synchronous Replication</a></span></dt><dt><span class="sect2"><a href="warm-standby.html#CONTINUOUS-ARCHIVING-IN-STANDBY">27.2.9. Continuous Archiving in Standby</a></span></dt></dl></div><p>
Continuous archiving can be used to create a <em class="firstterm">high
availability</em> (HA) cluster configuration with one or more
<em class="firstterm">standby servers</em> ready to take over operations if the
primary server fails. This capability is widely referred to as
<em class="firstterm">warm standby</em> or <em class="firstterm">log shipping</em>.
</p><p>
The primary and standby server work together to provide this capability,
though the servers are only loosely coupled. The primary server operates
in continuous archiving mode, while each standby server operates in
continuous recovery mode, reading the WAL files from the primary. No
changes to the database tables are required to enable this capability,
so it offers low administration overhead compared to some other
replication solutions. This configuration also has relatively low
performance impact on the primary server.
</p><p>
Directly moving WAL records from one database server to another
is typically described as log shipping. <span class="productname">PostgreSQL</span>
implements file-based log shipping by transferring WAL records
one file (WAL segment) at a time. WAL files (16MB) can be
shipped easily and cheaply over any distance, whether it be to an
adjacent system, another system at the same site, or another system on
the far side of the globe. The bandwidth required for this technique
varies according to the transaction rate of the primary server.
Record-based log shipping is more granular and streams WAL changes
incrementally over a network connection (see <a class="xref" href="warm-standby.html#STREAMING-REPLICATION" title="27.2.5. Streaming Replication">Section 27.2.5</a>).
</p><p>
It should be noted that log shipping is asynchronous, i.e., the WAL
records are shipped after transaction commit. As a result, there is a
window for data loss should the primary server suffer a catastrophic
failure; transactions not yet shipped will be lost. The size of the
data loss window in file-based log shipping can be limited by use of the
<code class="varname">archive_timeout</code> parameter, which can be set as low
as a few seconds. However such a low setting will
substantially increase the bandwidth required for file shipping.
Streaming replication (see <a class="xref" href="warm-standby.html#STREAMING-REPLICATION" title="27.2.5. Streaming Replication">Section 27.2.5</a>)
allows a much smaller window of data loss.
</p><p>
Recovery performance is sufficiently good that the standby will
typically be only moments away from full
availability once it has been activated. As a result, this is called
a warm standby configuration which offers high
availability. Restoring a server from an archived base backup and
rollforward will take considerably longer, so that technique only
offers a solution for disaster recovery, not high availability.
A standby server can also be used for read-only queries, in which case
it is called a <em class="firstterm">hot standby</em> server. See
<a class="xref" href="hot-standby.html" title="27.4. Hot Standby">Section 27.4</a> for more information.
</p><a id="id-1.6.14.16.7" class="indexterm"></a><a id="id-1.6.14.16.8" class="indexterm"></a><a id="id-1.6.14.16.9" class="indexterm"></a><a id="id-1.6.14.16.10" class="indexterm"></a><a id="id-1.6.14.16.11" class="indexterm"></a><a id="id-1.6.14.16.12" class="indexterm"></a><div class="sect2" id="STANDBY-PLANNING"><div class="titlepage"><div><div><h3 class="title">27.2.1. Planning</h3></div></div></div><p>
It is usually wise to create the primary and standby servers
so that they are as similar as possible, at least from the
perspective of the database server. In particular, the path names
associated with tablespaces will be passed across unmodified, so both
primary and standby servers must have the same mount paths for
tablespaces if that feature is used. Keep in mind that if
<a class="xref" href="sql-createtablespace.html" title="CREATE TABLESPACE"><span class="refentrytitle">CREATE TABLESPACE</span></a>
is executed on the primary, any new mount point needed for it must
be created on the primary and all standby servers before the command
is executed. Hardware need not be exactly the same, but experience shows
that maintaining two identical systems is easier than maintaining two
dissimilar ones over the lifetime of the application and system.
In any case the hardware architecture must be the same — shipping
from, say, a 32-bit to a 64-bit system will not work.
</p><p>
In general, log shipping between servers running different major
<span class="productname">PostgreSQL</span> release
levels is not possible. It is the policy of the PostgreSQL Global
Development Group not to make changes to disk formats during minor release
upgrades, so it is likely that running different minor release levels
on primary and standby servers will work successfully. However, no
formal support for that is offered and you are advised to keep primary
and standby servers at the same release level as much as possible.
When updating to a new minor release, the safest policy is to update
the standby servers first — a new minor release is more likely
to be able to read WAL files from a previous minor release than vice
versa.
</p></div><div class="sect2" id="STANDBY-SERVER-OPERATION"><div class="titlepage"><div><div><h3 class="title">27.2.2. Standby Server Operation</h3></div></div></div><p>
A server enters standby mode if a
<span id="FILE-STANDBY-SIGNAL"></span>
<code class="filename">standby.signal</code>
<a id="id-1.6.14.16.14.2.3" class="indexterm"></a>
file exists in the data directory when the server is started.
</p><p>
In standby mode, the server continuously applies WAL received from the
primary server. The standby server can read WAL from a WAL archive
(see <a class="xref" href="runtime-config-wal.html#GUC-RESTORE-COMMAND">restore_command</a>) or directly from the primary
over a TCP connection (streaming replication). The standby server will
also attempt to restore any WAL found in the standby cluster's
<code class="filename">pg_wal</code> directory. That typically happens after a server
restart, when the standby replays again WAL that was streamed from the
primary before the restart, but you can also manually copy files to
<code class="filename">pg_wal</code> at any time to have them replayed.
</p><p>
At startup, the standby begins by restoring all WAL available in the
archive location, calling <code class="varname">restore_command</code>. Once it
reaches the end of WAL available there and <code class="varname">restore_command</code>
fails, it tries to restore any WAL available in the <code class="filename">pg_wal</code> directory.
If that fails, and streaming replication has been configured, the
standby tries to connect to the primary server and start streaming WAL
from the last valid record found in archive or <code class="filename">pg_wal</code>. If that fails
or streaming replication is not configured, or if the connection is
later disconnected, the standby goes back to step 1 and tries to
restore the file from the archive again. This loop of retries from the
archive, <code class="filename">pg_wal</code>, and via streaming replication goes on until the server
is stopped or failover is triggered by a trigger file.
</p><p>
Standby mode is exited and the server switches to normal operation
when <code class="command">pg_ctl promote</code> is run,
<code class="function">pg_promote()</code> is called, or a trigger file is found
(<code class="varname">promote_trigger_file</code>). Before failover,
any WAL immediately available in the archive or in <code class="filename">pg_wal</code> will be
restored, but no attempt is made to connect to the primary.
</p></div><div class="sect2" id="PREPARING-PRIMARY-FOR-STANDBY"><div class="titlepage"><div><div><h3 class="title">27.2.3. Preparing the Primary for Standby Servers</h3></div></div></div><p>
Set up continuous archiving on the primary to an archive directory
accessible from the standby, as described
in <a class="xref" href="continuous-archiving.html" title="26.3. Continuous Archiving and Point-in-Time Recovery (PITR)">Section 26.3</a>. The archive location should be
accessible from the standby even when the primary is down, i.e., it should
reside on the standby server itself or another trusted server, not on
the primary server.
</p><p>
If you want to use streaming replication, set up authentication on the
primary server to allow replication connections from the standby
server(s); that is, create a role and provide a suitable entry or
entries in <code class="filename">pg_hba.conf</code> with the database field set to
<code class="literal">replication</code>. Also ensure <code class="varname">max_wal_senders</code> is set
to a sufficiently large value in the configuration file of the primary
server. If replication slots will be used,
ensure that <code class="varname">max_replication_slots</code> is set sufficiently
high as well.
</p><p>
Take a base backup as described in <a class="xref" href="continuous-archiving.html#BACKUP-BASE-BACKUP" title="26.3.2. Making a Base Backup">Section 26.3.2</a>
to bootstrap the standby server.
</p></div><div class="sect2" id="STANDBY-SERVER-SETUP"><div class="titlepage"><div><div><h3 class="title">27.2.4. Setting Up a Standby Server</h3></div></div></div><p>
To set up the standby server, restore the base backup taken from primary
server (see <a class="xref" href="continuous-archiving.html#BACKUP-PITR-RECOVERY" title="26.3.4. Recovering Using a Continuous Archive Backup">Section 26.3.4</a>). Create a file
<a class="link" href="warm-standby.html#FILE-STANDBY-SIGNAL"><code class="filename">standby.signal</code></a><a id="id-1.6.14.16.16.2.3" class="indexterm"></a>
in the standby's cluster data
directory. Set <a class="xref" href="runtime-config-wal.html#GUC-RESTORE-COMMAND">restore_command</a> to a simple command to copy files from
the WAL archive. If you plan to have multiple standby servers for high
availability purposes, make sure that <code class="varname">recovery_target_timeline</code> is set to
<code class="literal">latest</code> (the default), to make the standby server follow the timeline change
that occurs at failover to another standby.
</p><div class="note"><h3 class="title">Note</h3><p>
<a class="xref" href="runtime-config-wal.html#GUC-RESTORE-COMMAND">restore_command</a> should return immediately
if the file does not exist; the server will retry the command again if
necessary.
</p></div><p>
If you want to use streaming replication, fill in
<a class="xref" href="runtime-config-replication.html#GUC-PRIMARY-CONNINFO">primary_conninfo</a> with a libpq connection string, including
the host name (or IP address) and any additional details needed to
connect to the primary server. If the primary needs a password for
authentication, the password needs to be specified in
<a class="xref" href="runtime-config-replication.html#GUC-PRIMARY-CONNINFO">primary_conninfo</a> as well.
</p><p>
If you're setting up the standby server for high availability purposes,
set up WAL archiving, connections and authentication like the primary
server, because the standby server will work as a primary server after
failover.
</p><p>
If you're using a WAL archive, its size can be minimized using the <a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-CLEANUP-COMMAND">archive_cleanup_command</a> parameter to remove files that are no
longer required by the standby server.
The <span class="application">pg_archivecleanup</span> utility is designed specifically to
be used with <code class="varname">archive_cleanup_command</code> in typical single-standby
configurations, see <a class="xref" href="pgarchivecleanup.html" title="pg_archivecleanup"><span class="refentrytitle"><span class="application">pg_archivecleanup</span></span></a>.
Note however, that if you're using the archive for backup purposes, you
need to retain files needed to recover from at least the latest base
backup, even if they're no longer needed by the standby.
</p><p>
A simple example of configuration is:
</p><pre class="programlisting">
primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass options=''-c wal_sender_timeout=5000'''
restore_command = 'cp /path/to/archive/%f %p'
archive_cleanup_command = 'pg_archivecleanup /path/to/archive %r'
</pre><p>
</p><p>
You can have any number of standby servers, but if you use streaming
replication, make sure you set <code class="varname">max_wal_senders</code> high enough in
the primary to allow them to be connected simultaneously.
</p></div><div class="sect2" id="STREAMING-REPLICATION"><div class="titlepage"><div><div><h3 class="title">27.2.5. Streaming Replication</h3></div></div></div><a id="id-1.6.14.16.17.2" class="indexterm"></a><p>
Streaming replication allows a standby server to stay more up-to-date
than is possible with file-based log shipping. The standby connects
to the primary, which streams WAL records to the standby as they're
generated, without waiting for the WAL file to be filled.
</p><p>
Streaming replication is asynchronous by default
(see <a class="xref" href="warm-standby.html#SYNCHRONOUS-REPLICATION" title="27.2.8. Synchronous Replication">Section 27.2.8</a>), in which case there is
a small delay between committing a transaction in the primary and the
changes becoming visible in the standby. This delay is however much
smaller than with file-based log shipping, typically under one second
assuming the standby is powerful enough to keep up with the load. With
streaming replication, <code class="varname">archive_timeout</code> is not required to
reduce the data loss window.
</p><p>
If you use streaming replication without file-based continuous
archiving, the server might recycle old WAL segments before the standby
has received them. If this occurs, the standby will need to be
reinitialized from a new base backup. You can avoid this by setting
<code class="varname">wal_keep_size</code> to a value large enough to ensure that
WAL segments are not recycled too early, or by configuring a replication
slot for the standby. If you set up a WAL archive that's accessible from
the standby, these solutions are not required, since the standby can
always use the archive to catch up provided it retains enough segments.
</p><p>
To use streaming replication, set up a file-based log-shipping standby
server as described in <a class="xref" href="warm-standby.html" title="27.2. Log-Shipping Standby Servers">Section 27.2</a>. The step that
turns a file-based log-shipping standby into streaming replication
standby is setting the <code class="varname">primary_conninfo</code> setting
to point to the primary server. Set
<a class="xref" href="runtime-config-connection.html#GUC-LISTEN-ADDRESSES">listen_addresses</a> and authentication options
(see <code class="filename">pg_hba.conf</code>) on the primary so that the standby server
can connect to the <code class="literal">replication</code> pseudo-database on the primary
server (see <a class="xref" href="warm-standby.html#STREAMING-REPLICATION-AUTHENTICATION" title="27.2.5.1. Authentication">Section 27.2.5.1</a>).
</p><p>
On systems that support the keepalive socket option, setting
<a class="xref" href="runtime-config-connection.html#GUC-TCP-KEEPALIVES-IDLE">tcp_keepalives_idle</a>,
<a class="xref" href="runtime-config-connection.html#GUC-TCP-KEEPALIVES-INTERVAL">tcp_keepalives_interval</a> and
<a class="xref" href="runtime-config-connection.html#GUC-TCP-KEEPALIVES-COUNT">tcp_keepalives_count</a> helps the primary promptly
notice a broken connection.
</p><p>
Set the maximum number of concurrent connections from the standby servers
(see <a class="xref" href="runtime-config-replication.html#GUC-MAX-WAL-SENDERS">max_wal_senders</a> for details).
</p><p>
When the standby is started and <code class="varname">primary_conninfo</code> is set
correctly, the standby will connect to the primary after replaying all
WAL files available in the archive. If the connection is established
successfully, you will see a <code class="literal">walreceiver</code> in the standby, and
a corresponding <code class="literal">walsender</code> process in the primary.
</p><div class="sect3" id="STREAMING-REPLICATION-AUTHENTICATION"><div class="titlepage"><div><div><h4 class="title">27.2.5.1. Authentication</h4></div></div></div><p>
It is very important that the access privileges for replication be set up
so that only trusted users can read the WAL stream, because it is
easy to extract privileged information from it. Standby servers must
authenticate to the primary as an account that has the
<code class="literal">REPLICATION</code> privilege or a superuser. It is
recommended to create a dedicated user account with
<code class="literal">REPLICATION</code> and <code class="literal">LOGIN</code>
privileges for replication. While <code class="literal">REPLICATION</code>
privilege gives very high permissions, it does not allow the user to
modify any data on the primary system, which the
<code class="literal">SUPERUSER</code> privilege does.
</p><p>
Client authentication for replication is controlled by a
<code class="filename">pg_hba.conf</code> record specifying <code class="literal">replication</code> in the
<em class="replaceable"><code>database</code></em> field. For example, if the standby is running on
host IP <code class="literal">192.168.1.100</code> and the account name for replication
is <code class="literal">foo</code>, the administrator can add the following line to the
<code class="filename">pg_hba.conf</code> file on the primary:
</p><pre class="programlisting">
# Allow the user "foo" from host 192.168.1.100 to connect to the primary
# as a replication standby if the user's password is correctly supplied.
#
# TYPE DATABASE USER ADDRESS METHOD
host replication foo 192.168.1.100/32 md5
</pre><p>
</p><p>
The host name and port number of the primary, connection user name,
and password are specified in the <a class="xref" href="runtime-config-replication.html#GUC-PRIMARY-CONNINFO">primary_conninfo</a>.
The password can also be set in the <code class="filename">~/.pgpass</code> file on the
standby (specify <code class="literal">replication</code> in the <em class="replaceable"><code>database</code></em>
field).
For example, if the primary is running on host IP <code class="literal">192.168.1.50</code>,
port <code class="literal">5432</code>, the account name for replication is
<code class="literal">foo</code>, and the password is <code class="literal">foopass</code>, the administrator
can add the following line to the <code class="filename">postgresql.conf</code> file on the
standby:
</p><pre class="programlisting">
# The standby connects to the primary that is running on host 192.168.1.50
# and port 5432 as the user "foo" whose password is "foopass".
primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
</pre><p>
</p></div><div class="sect3" id="STREAMING-REPLICATION-MONITORING"><div class="titlepage"><div><div><h4 class="title">27.2.5.2. Monitoring</h4></div></div></div><p>
An important health indicator of streaming replication is the amount
of WAL records generated in the primary, but not yet applied in the
standby. You can calculate this lag by comparing the current WAL write
location on the primary with the last WAL location received by the
standby. These locations can be retrieved using
<code class="function">pg_current_wal_lsn</code> on the primary and
<code class="function">pg_last_wal_receive_lsn</code> on the standby,
respectively (see <a class="xref" href="functions-admin.html#FUNCTIONS-ADMIN-BACKUP-TABLE" title="Table 9.89. Backup Control Functions">Table 9.89</a> and
<a class="xref" href="functions-admin.html#FUNCTIONS-RECOVERY-INFO-TABLE" title="Table 9.90. Recovery Information Functions">Table 9.90</a> for details).
The last WAL receive location in the standby is also displayed in the
process status of the WAL receiver process, displayed using the
<code class="command">ps</code> command (see <a class="xref" href="monitoring-ps.html" title="28.1. Standard Unix Tools">Section 28.1</a> for details).
</p><p>
You can retrieve a list of WAL sender processes via the
<a class="link" href="monitoring-stats.html#MONITORING-PG-STAT-REPLICATION-VIEW" title="28.2.4. pg_stat_replication"><code class="structname">
pg_stat_replication</code></a> view. Large differences between
<code class="function">pg_current_wal_lsn</code> and the view's <code class="literal">sent_lsn</code> field
might indicate that the primary server is under heavy load, while
differences between <code class="literal">sent_lsn</code> and
<code class="function">pg_last_wal_receive_lsn</code> on the standby might indicate
network delay, or that the standby is under heavy load.
</p><p>
On a hot standby, the status of the WAL receiver process can be retrieved
via the <a class="link" href="monitoring-stats.html#MONITORING-PG-STAT-WAL-RECEIVER-VIEW" title="28.2.6. pg_stat_wal_receiver">
<code class="structname">pg_stat_wal_receiver</code></a> view. A large
difference between <code class="function">pg_last_wal_replay_lsn</code> and the
view's <code class="literal">flushed_lsn</code> indicates that WAL is being
received faster than it can be replayed.
</p></div></div><div class="sect2" id="STREAMING-REPLICATION-SLOTS"><div class="titlepage"><div><div><h3 class="title">27.2.6. Replication Slots</h3></div></div></div><a id="id-1.6.14.16.18.2" class="indexterm"></a><p>
Replication slots provide an automated way to ensure that the primary does
not remove WAL segments until they have been received by all standbys,
and that the primary does not remove rows which could cause a
<a class="link" href="hot-standby.html#HOT-STANDBY-CONFLICT" title="27.4.2. Handling Query Conflicts">recovery conflict</a> even when the
standby is disconnected.
</p><p>
In lieu of using replication slots, it is possible to prevent the removal
of old WAL segments using <a class="xref" href="runtime-config-replication.html#GUC-WAL-KEEP-SIZE">wal_keep_size</a>, or by
storing the segments in an archive using
<a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-COMMAND">archive_command</a> or <a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-LIBRARY">archive_library</a>.
However, these methods often result in retaining more WAL segments than
required, whereas replication slots retain only the number of segments
known to be needed. On the other hand, replication slots can retain so
many WAL segments that they fill up the space allocated
for <code class="literal">pg_wal</code>;
<a class="xref" href="runtime-config-replication.html#GUC-MAX-SLOT-WAL-KEEP-SIZE">max_slot_wal_keep_size</a> limits the size of WAL files
retained by replication slots.
</p><p>
Similarly, <a class="xref" href="runtime-config-replication.html#GUC-HOT-STANDBY-FEEDBACK">hot_standby_feedback</a>
and <a class="xref" href="runtime-config-replication.html#GUC-VACUUM-DEFER-CLEANUP-AGE">vacuum_defer_cleanup_age</a> provide protection against
relevant rows being removed by vacuum, but the former provides no
protection during any time period when the standby is not connected,
and the latter often needs to be set to a high value to provide adequate
protection. Replication slots overcome these disadvantages.
</p><div class="sect3" id="STREAMING-REPLICATION-SLOTS-MANIPULATION"><div class="titlepage"><div><div><h4 class="title">27.2.6.1. Querying and Manipulating Replication Slots</h4></div></div></div><p>
Each replication slot has a name, which can contain lower-case letters,
numbers, and the underscore character.
</p><p>
Existing replication slots and their state can be seen in the
<a class="link" href="view-pg-replication-slots.html" title="54.19. pg_replication_slots"><code class="structname">pg_replication_slots</code></a>
view.
</p><p>
Slots can be created and dropped either via the streaming replication
protocol (see <a class="xref" href="protocol-replication.html" title="55.4. Streaming Replication Protocol">Section 55.4</a>) or via SQL
functions (see <a class="xref" href="functions-admin.html#FUNCTIONS-REPLICATION" title="9.27.6. Replication Management Functions">Section 9.27.6</a>).
</p></div><div class="sect3" id="STREAMING-REPLICATION-SLOTS-CONFIG"><div class="titlepage"><div><div><h4 class="title">27.2.6.2. Configuration Example</h4></div></div></div><p>
You can create a replication slot like this:
</p><pre class="programlisting">
postgres=# SELECT * FROM pg_create_physical_replication_slot('node_a_slot');
slot_name | lsn
-------------+-----
node_a_slot |
postgres=# SELECT slot_name, slot_type, active FROM pg_replication_slots;
slot_name | slot_type | active
-------------+-----------+--------
node_a_slot | physical | f
(1 row)
</pre><p>
To configure the standby to use this slot, <code class="varname">primary_slot_name</code>
should be configured on the standby. Here is a simple example:
</p><pre class="programlisting">
primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
primary_slot_name = 'node_a_slot'
</pre><p>
</p></div></div><div class="sect2" id="CASCADING-REPLICATION"><div class="titlepage"><div><div><h3 class="title">27.2.7. Cascading Replication</h3></div></div></div><a id="id-1.6.14.16.19.2" class="indexterm"></a><p>
The cascading replication feature allows a standby server to accept replication
connections and stream WAL records to other standbys, acting as a relay.
This can be used to reduce the number of direct connections to the primary
and also to minimize inter-site bandwidth overheads.
</p><p>
A standby acting as both a receiver and a sender is known as a cascading
standby. Standbys that are more directly connected to the primary are known
as upstream servers, while those standby servers further away are downstream
servers. Cascading replication does not place limits on the number or
arrangement of downstream servers, though each standby connects to only
one upstream server which eventually links to a single primary server.
</p><p>
A cascading standby sends not only WAL records received from the
primary but also those restored from the archive. So even if the replication
connection in some upstream connection is terminated, streaming replication
continues downstream for as long as new WAL records are available.
</p><p>
Cascading replication is currently asynchronous. Synchronous replication
(see <a class="xref" href="warm-standby.html#SYNCHRONOUS-REPLICATION" title="27.2.8. Synchronous Replication">Section 27.2.8</a>) settings have no effect on
cascading replication at present.
</p><p>
Hot standby feedback propagates upstream, whatever the cascaded arrangement.
</p><p>
If an upstream standby server is promoted to become the new primary, downstream
servers will continue to stream from the new primary if
<code class="varname">recovery_target_timeline</code> is set to <code class="literal">'latest'</code> (the default).
</p><p>
To use cascading replication, set up the cascading standby so that it can
accept replication connections (that is, set
<a class="xref" href="runtime-config-replication.html#GUC-MAX-WAL-SENDERS">max_wal_senders</a> and <a class="xref" href="runtime-config-replication.html#GUC-HOT-STANDBY">hot_standby</a>,
and configure
<a class="link" href="auth-pg-hba-conf.html" title="21.1. The pg_hba.conf File">host-based authentication</a>).
You will also need to set <code class="varname">primary_conninfo</code> in the downstream
standby to point to the cascading standby.
</p></div><div class="sect2" id="SYNCHRONOUS-REPLICATION"><div class="titlepage"><div><div><h3 class="title">27.2.8. Synchronous Replication</h3></div></div></div><a id="id-1.6.14.16.20.2" class="indexterm"></a><p>
<span class="productname">PostgreSQL</span> streaming replication is asynchronous by
default. If the primary server
crashes then some transactions that were committed may not have been
replicated to the standby server, causing data loss. The amount
of data loss is proportional to the replication delay at the time of
failover.
</p><p>
Synchronous replication offers the ability to confirm that all changes
made by a transaction have been transferred to one or more synchronous
standby servers. This extends that standard level of durability
offered by a transaction commit. This level of protection is referred
to as 2-safe replication in computer science theory, and group-1-safe
(group-safe and 1-safe) when <code class="varname">synchronous_commit</code> is set to
<code class="literal">remote_write</code>.
</p><p>
When requesting synchronous replication, each commit of a
write transaction will wait until confirmation is
received that the commit has been written to the write-ahead log on disk
of both the primary and standby server. The only possibility that data
can be lost is if both the primary and the standby suffer crashes at the
same time. This can provide a much higher level of durability, though only
if the sysadmin is cautious about the placement and management of the two
servers. Waiting for confirmation increases the user's confidence that the
changes will not be lost in the event of server crashes but it also
necessarily increases the response time for the requesting transaction.
The minimum wait time is the round-trip time between primary and standby.
</p><p>
Read-only transactions and transaction rollbacks need not wait for
replies from standby servers. Subtransaction commits do not wait for
responses from standby servers, only top-level commits. Long
running actions such as data loading or index building do not wait
until the very final commit message. All two-phase commit actions
require commit waits, including both prepare and commit.
</p><p>
A synchronous standby can be a physical replication standby or a logical
replication subscriber. It can also be any other physical or logical WAL
replication stream consumer that knows how to send the appropriate
feedback messages. Besides the built-in physical and logical replication
systems, this includes special programs such
as <code class="command">pg_receivewal</code> and <code class="command">pg_recvlogical</code>
as well as some third-party replication systems and custom programs.
Check the respective documentation for details on synchronous replication
support.
</p><div class="sect3" id="SYNCHRONOUS-REPLICATION-CONFIG"><div class="titlepage"><div><div><h4 class="title">27.2.8.1. Basic Configuration</h4></div></div></div><p>
Once streaming replication has been configured, configuring synchronous
replication requires only one additional configuration step:
<a class="xref" href="runtime-config-replication.html#GUC-SYNCHRONOUS-STANDBY-NAMES">synchronous_standby_names</a> must be set to
a non-empty value. <code class="varname">synchronous_commit</code> must also be set to
<code class="literal">on</code>, but since this is the default value, typically no change is
required. (See <a class="xref" href="runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS" title="20.5.1. Settings">Section 20.5.1</a> and
<a class="xref" href="runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-PRIMARY" title="20.6.2. Primary Server">Section 20.6.2</a>.)
This configuration will cause each commit to wait for
confirmation that the standby has written the commit record to durable
storage.
<code class="varname">synchronous_commit</code> can be set by individual
users, so it can be configured in the configuration file, for particular
users or databases, or dynamically by applications, in order to control
the durability guarantee on a per-transaction basis.
</p><p>
After a commit record has been written to disk on the primary, the
WAL record is then sent to the standby. The standby sends reply
messages each time a new batch of WAL data is written to disk, unless
<code class="varname">wal_receiver_status_interval</code> is set to zero on the standby.
In the case that <code class="varname">synchronous_commit</code> is set to
<code class="literal">remote_apply</code>, the standby sends reply messages when the commit
record is replayed, making the transaction visible.
If the standby is chosen as a synchronous standby, according to the setting
of <code class="varname">synchronous_standby_names</code> on the primary, the reply
messages from that standby will be considered along with those from other
synchronous standbys to decide when to release transactions waiting for
confirmation that the commit record has been received. These parameters
allow the administrator to specify which standby servers should be
synchronous standbys. Note that the configuration of synchronous
replication is mainly on the primary. Named standbys must be directly
connected to the primary; the primary knows nothing about downstream
standby servers using cascaded replication.
</p><p>
Setting <code class="varname">synchronous_commit</code> to <code class="literal">remote_write</code> will
cause each commit to wait for confirmation that the standby has received
the commit record and written it out to its own operating system, but not
for the data to be flushed to disk on the standby. This
setting provides a weaker guarantee of durability than <code class="literal">on</code>
does: the standby could lose the data in the event of an operating system
crash, though not a <span class="productname">PostgreSQL</span> crash.
However, it's a useful setting in practice
because it can decrease the response time for the transaction.
Data loss could only occur if both the primary and the standby crash and
the database of the primary gets corrupted at the same time.
</p><p>
Setting <code class="varname">synchronous_commit</code> to <code class="literal">remote_apply</code> will
cause each commit to wait until the current synchronous standbys report
that they have replayed the transaction, making it visible to user
queries. In simple cases, this allows for load balancing with causal
consistency.
</p><p>
Users will stop waiting if a fast shutdown is requested. However, as
when using asynchronous replication, the server will not fully
shutdown until all outstanding WAL records are transferred to the currently
connected standby servers.
</p></div><div class="sect3" id="SYNCHRONOUS-REPLICATION-MULTIPLE-STANDBYS"><div class="titlepage"><div><div><h4 class="title">27.2.8.2. Multiple Synchronous Standbys</h4></div></div></div><p>
Synchronous replication supports one or more synchronous standby servers;
transactions will wait until all the standby servers which are considered
as synchronous confirm receipt of their data. The number of synchronous
standbys that transactions must wait for replies from is specified in
<code class="varname">synchronous_standby_names</code>. This parameter also specifies
a list of standby names and the method (<code class="literal">FIRST</code> and
<code class="literal">ANY</code>) to choose synchronous standbys from the listed ones.
</p><p>
The method <code class="literal">FIRST</code> specifies a priority-based synchronous
replication and makes transaction commits wait until their WAL records are
replicated to the requested number of synchronous standbys chosen based on
their priorities. The standbys whose names appear earlier in the list are
given higher priority and will be considered as synchronous. Other standby
servers appearing later in this list represent potential synchronous
standbys. If any of the current synchronous standbys disconnects for
whatever reason, it will be replaced immediately with the
next-highest-priority standby.
</p><p>
An example of <code class="varname">synchronous_standby_names</code> for
a priority-based multiple synchronous standbys is:
</p><pre class="programlisting">
synchronous_standby_names = 'FIRST 2 (s1, s2, s3)'
</pre><p>
In this example, if four standby servers <code class="literal">s1</code>, <code class="literal">s2</code>,
<code class="literal">s3</code> and <code class="literal">s4</code> are running, the two standbys
<code class="literal">s1</code> and <code class="literal">s2</code> will be chosen as synchronous standbys
because their names appear early in the list of standby names.
<code class="literal">s3</code> is a potential synchronous standby and will take over
the role of synchronous standby when either of <code class="literal">s1</code> or
<code class="literal">s2</code> fails. <code class="literal">s4</code> is an asynchronous standby since
its name is not in the list.
</p><p>
The method <code class="literal">ANY</code> specifies a quorum-based synchronous
replication and makes transaction commits wait until their WAL records
are replicated to <span class="emphasis"><em>at least</em></span> the requested number of
synchronous standbys in the list.
</p><p>
An example of <code class="varname">synchronous_standby_names</code> for
a quorum-based multiple synchronous standbys is:
</p><pre class="programlisting">
synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
</pre><p>
In this example, if four standby servers <code class="literal">s1</code>, <code class="literal">s2</code>,
<code class="literal">s3</code> and <code class="literal">s4</code> are running, transaction commits will
wait for replies from at least any two standbys of <code class="literal">s1</code>,
<code class="literal">s2</code> and <code class="literal">s3</code>. <code class="literal">s4</code> is an asynchronous
standby since its name is not in the list.
</p><p>
The synchronous states of standby servers can be viewed using
the <code class="structname">pg_stat_replication</code> view.
</p></div><div class="sect3" id="SYNCHRONOUS-REPLICATION-PERFORMANCE"><div class="titlepage"><div><div><h4 class="title">27.2.8.3. Planning for Performance</h4></div></div></div><p>
Synchronous replication usually requires carefully planned and placed
standby servers to ensure applications perform acceptably. Waiting
doesn't utilize system resources, but transaction locks continue to be
held until the transfer is confirmed. As a result, incautious use of
synchronous replication will reduce performance for database
applications because of increased response times and higher contention.
</p><p>
<span class="productname">PostgreSQL</span> allows the application developer
to specify the durability level required via replication. This can be
specified for the system overall, though it can also be specified for
specific users or connections, or even individual transactions.
</p><p>
For example, an application workload might consist of:
10% of changes are important customer details, while
90% of changes are less important data that the business can more
easily survive if it is lost, such as chat messages between users.
</p><p>
With synchronous replication options specified at the application level
(on the primary) we can offer synchronous replication for the most
important changes, without slowing down the bulk of the total workload.
Application level options are an important and practical tool for allowing
the benefits of synchronous replication for high performance applications.
</p><p>
You should consider that the network bandwidth must be higher than
the rate of generation of WAL data.
</p></div><div class="sect3" id="SYNCHRONOUS-REPLICATION-HA"><div class="titlepage"><div><div><h4 class="title">27.2.8.4. Planning for High Availability</h4></div></div></div><p>
<code class="varname">synchronous_standby_names</code> specifies the number and
names of synchronous standbys that transaction commits made when
<code class="varname">synchronous_commit</code> is set to <code class="literal">on</code>,
<code class="literal">remote_apply</code> or <code class="literal">remote_write</code> will wait for
responses from. Such transaction commits may never be completed
if any one of synchronous standbys should crash.
</p><p>
The best solution for high availability is to ensure you keep as many
synchronous standbys as requested. This can be achieved by naming multiple
potential synchronous standbys using <code class="varname">synchronous_standby_names</code>.
</p><p>
In a priority-based synchronous replication, the standbys whose names
appear earlier in the list will be used as synchronous standbys.
Standbys listed after these will take over the role of synchronous standby
if one of current ones should fail.
</p><p>
In a quorum-based synchronous replication, all the standbys appearing
in the list will be used as candidates for synchronous standbys.
Even if one of them should fail, the other standbys will keep performing
the role of candidates of synchronous standby.
</p><p>
When a standby first attaches to the primary, it will not yet be properly
synchronized. This is described as <code class="literal">catchup</code> mode. Once
the lag between standby and primary reaches zero for the first time
we move to real-time <code class="literal">streaming</code> state.
The catch-up duration may be long immediately after the standby has
been created. If the standby is shut down, then the catch-up period
will increase according to the length of time the standby has been down.
The standby is only able to become a synchronous standby
once it has reached <code class="literal">streaming</code> state.
This state can be viewed using
the <code class="structname">pg_stat_replication</code> view.
</p><p>
If primary restarts while commits are waiting for acknowledgment, those
waiting transactions will be marked fully committed once the primary
database recovers.
There is no way to be certain that all standbys have received all
outstanding WAL data at time of the crash of the primary. Some
transactions may not show as committed on the standby, even though
they show as committed on the primary. The guarantee we offer is that
the application will not receive explicit acknowledgment of the
successful commit of a transaction until the WAL data is known to be
safely received by all the synchronous standbys.
</p><p>
If you really cannot keep as many synchronous standbys as requested
then you should decrease the number of synchronous standbys that
transaction commits must wait for responses from
in <code class="varname">synchronous_standby_names</code> (or disable it) and
reload the configuration file on the primary server.
</p><p>
If the primary is isolated from remaining standby servers you should
fail over to the best candidate of those other remaining standby servers.
</p><p>
If you need to re-create a standby server while transactions are
waiting, make sure that the commands pg_backup_start() and
pg_backup_stop() are run in a session with
<code class="varname">synchronous_commit</code> = <code class="literal">off</code>, otherwise those
requests will wait forever for the standby to appear.
</p></div></div><div class="sect2" id="CONTINUOUS-ARCHIVING-IN-STANDBY"><div class="titlepage"><div><div><h3 class="title">27.2.9. Continuous Archiving in Standby</h3></div></div></div><a id="id-1.6.14.16.21.2" class="indexterm"></a><p>
When continuous WAL archiving is used in a standby, there are two
different scenarios: the WAL archive can be shared between the primary
and the standby, or the standby can have its own WAL archive. When
the standby has its own WAL archive, set <code class="varname">archive_mode</code>
to <code class="literal">always</code>, and the standby will call the archive
command for every WAL segment it receives, whether it's by restoring
from the archive or by streaming replication. The shared archive can
be handled similarly, but the <code class="varname">archive_command</code> or <code class="varname">archive_library</code> must
test if the file being archived exists already, and if the existing file
has identical contents. This requires more care in the
<code class="varname">archive_command</code> or <code class="varname">archive_library</code>, as it must
be careful to not overwrite an existing file with different contents,
but return success if the exactly same file is archived twice. And
all that must be done free of race conditions, if two servers attempt
to archive the same file at the same time.
</p><p>
If <code class="varname">archive_mode</code> is set to <code class="literal">on</code>, the
archiver is not enabled during recovery or standby mode. If the standby
server is promoted, it will start archiving after the promotion, but
will not archive any WAL or timeline history files that
it did not generate itself. To get a complete
series of WAL files in the archive, you must ensure that all WAL is
archived, before it reaches the standby. This is inherently true with
file-based log shipping, as the standby can only restore files that
are found in the archive, but not if streaming replication is enabled.
When a server is not in recovery mode, there is no difference between
<code class="literal">on</code> and <code class="literal">always</code> modes.
</p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="different-replication-solutions.html" title="27.1. Comparison of Different Solutions">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="high-availability.html" title="Chapter 27. High Availability, Load Balancing, and Replication">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="warm-standby-failover.html" title="27.3. Failover">Next</a></td></tr><tr><td width="40%" align="left" valign="top">27.1. Comparison of Different Solutions </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.4 Documentation">Home</a></td><td width="40%" align="right" valign="top"> 27.3. Failover</td></tr></table></div></body></html>
|