summaryrefslogtreecommitdiffstats
path: root/doc/src/sgml/html/continuous-archiving.html
diff options
context:
space:
mode:
Diffstat (limited to 'doc/src/sgml/html/continuous-archiving.html')
-rw-r--r--doc/src/sgml/html/continuous-archiving.html757
1 files changed, 757 insertions, 0 deletions
diff --git a/doc/src/sgml/html/continuous-archiving.html b/doc/src/sgml/html/continuous-archiving.html
new file mode 100644
index 0000000..adc7ae7
--- /dev/null
+++ b/doc/src/sgml/html/continuous-archiving.html
@@ -0,0 +1,757 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>26.3. Continuous Archiving and Point-in-Time Recovery (PITR)</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="backup-file.html" title="26.2. File System Level Backup" /><link rel="next" href="high-availability.html" title="Chapter 27. High Availability, Load Balancing, and Replication" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">26.3. Continuous Archiving and Point-in-Time Recovery (PITR)</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="backup-file.html" title="26.2. File System Level Backup">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="backup.html" title="Chapter 26. Backup and Restore">Up</a></td><th width="60%" align="center">Chapter 26. Backup and Restore</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.5 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="high-availability.html" title="Chapter 27. High Availability, Load Balancing, and Replication">Next</a></td></tr></table><hr /></div><div class="sect1" id="CONTINUOUS-ARCHIVING"><div class="titlepage"><div><div><h2 class="title" style="clear: both">26.3. Continuous Archiving and Point-in-Time Recovery (PITR)</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-ARCHIVING-WAL">26.3.1. Setting Up WAL Archiving</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-BASE-BACKUP">26.3.2. Making a Base Backup</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP">26.3.3. Making a Base Backup Using the Low Level API</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-PITR-RECOVERY">26.3.4. Recovering Using a Continuous Archive Backup</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-TIMELINES">26.3.5. Timelines</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-TIPS">26.3.6. Tips and Examples</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#CONTINUOUS-ARCHIVING-CAVEATS">26.3.7. Caveats</a></span></dt></dl></div><a id="id-1.6.13.7.2" class="indexterm"></a><a id="id-1.6.13.7.3" class="indexterm"></a><a id="id-1.6.13.7.4" class="indexterm"></a><p>
+ At all times, <span class="productname">PostgreSQL</span> maintains a
+ <em class="firstterm">write ahead log</em> (WAL) in the <code class="filename">pg_wal/</code>
+ subdirectory of the cluster's data directory. The log records
+ every change made to the database's data files. This log exists
+ primarily for crash-safety purposes: if the system crashes, the
+ database can be restored to consistency by <span class="quote">“<span class="quote">replaying</span>”</span> the
+ log entries made since the last checkpoint. However, the existence
+ of the log makes it possible to use a third strategy for backing up
+ databases: we can combine a file-system-level backup with backup of
+ the WAL files. If recovery is needed, we restore the file system backup and
+ then replay from the backed-up WAL files to bring the system to a
+ current state. This approach is more complex to administer than
+ either of the previous approaches, but it has some significant
+ benefits:
+ </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+ We do not need a perfectly consistent file system backup as the starting point.
+ Any internal inconsistency in the backup will be corrected by log
+ replay (this is not significantly different from what happens during
+ crash recovery). So we do not need a file system snapshot capability,
+ just <span class="application">tar</span> or a similar archiving tool.
+ </p></li><li class="listitem"><p>
+ Since we can combine an indefinitely long sequence of WAL files
+ for replay, continuous backup can be achieved simply by continuing to archive
+ the WAL files. This is particularly valuable for large databases, where
+ it might not be convenient to take a full backup frequently.
+ </p></li><li class="listitem"><p>
+ It is not necessary to replay the WAL entries all the
+ way to the end. We could stop the replay at any point and have a
+ consistent snapshot of the database as it was at that time. Thus,
+ this technique supports <em class="firstterm">point-in-time recovery</em>: it is
+ possible to restore the database to its state at any time since your base
+ backup was taken.
+ </p></li><li class="listitem"><p>
+ If we continuously feed the series of WAL files to another
+ machine that has been loaded with the same base backup file, we
+ have a <em class="firstterm">warm standby</em> system: at any point we can bring up
+ the second machine and it will have a nearly-current copy of the
+ database.
+ </p></li></ul></div><p>
+ </p><div class="note"><h3 class="title">Note</h3><p>
+ <span class="application">pg_dump</span> and
+ <span class="application">pg_dumpall</span> do not produce file-system-level
+ backups and cannot be used as part of a continuous-archiving solution.
+ Such dumps are <span class="emphasis"><em>logical</em></span> and do not contain enough
+ information to be used by WAL replay.
+ </p></div><p>
+ As with the plain file-system-backup technique, this method can only
+ support restoration of an entire database cluster, not a subset.
+ Also, it requires a lot of archival storage: the base backup might be bulky,
+ and a busy system will generate many megabytes of WAL traffic that
+ have to be archived. Still, it is the preferred backup technique in
+ many situations where high reliability is needed.
+ </p><p>
+ To recover successfully using continuous archiving (also called
+ <span class="quote">“<span class="quote">online backup</span>”</span> by many database vendors), you need a continuous
+ sequence of archived WAL files that extends back at least as far as the
+ start time of your backup. So to get started, you should set up and test
+ your procedure for archiving WAL files <span class="emphasis"><em>before</em></span> you take your
+ first base backup. Accordingly, we first discuss the mechanics of
+ archiving WAL files.
+ </p><div class="sect2" id="BACKUP-ARCHIVING-WAL"><div class="titlepage"><div><div><h3 class="title">26.3.1. Setting Up WAL Archiving</h3></div></div></div><p>
+ In an abstract sense, a running <span class="productname">PostgreSQL</span> system
+ produces an indefinitely long sequence of WAL records. The system
+ physically divides this sequence into WAL <em class="firstterm">segment
+ files</em>, which are normally 16MB apiece (although the segment size
+ can be altered during <span class="application">initdb</span>). The segment
+ files are given numeric names that reflect their position in the
+ abstract WAL sequence. When not using WAL archiving, the system
+ normally creates just a few segment files and then
+ <span class="quote">“<span class="quote">recycles</span>”</span> them by renaming no-longer-needed segment files
+ to higher segment numbers. It's assumed that segment files whose
+ contents precede the last checkpoint are no longer of
+ interest and can be recycled.
+ </p><p>
+ When archiving WAL data, we need to capture the contents of each segment
+ file once it is filled, and save that data somewhere before the segment
+ file is recycled for reuse. Depending on the application and the
+ available hardware, there could be many different ways of <span class="quote">“<span class="quote">saving
+ the data somewhere</span>”</span>: we could copy the segment files to an NFS-mounted
+ directory on another machine, write them onto a tape drive (ensuring that
+ you have a way of identifying the original name of each file), or batch
+ them together and burn them onto CDs, or something else entirely. To
+ provide the database administrator with flexibility,
+ <span class="productname">PostgreSQL</span> tries not to make any assumptions about how
+ the archiving will be done. Instead, <span class="productname">PostgreSQL</span> lets
+ the administrator specify a shell command or an archive library to be executed to copy a
+ completed segment file to wherever it needs to go. This could be as simple
+ as a shell command that uses <code class="literal">cp</code>, or it could invoke a
+ complex C function — it's all up to you.
+ </p><p>
+ To enable WAL archiving, set the <a class="xref" href="runtime-config-wal.html#GUC-WAL-LEVEL">wal_level</a>
+ configuration parameter to <code class="literal">replica</code> or higher,
+ <a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-MODE">archive_mode</a> to <code class="literal">on</code>,
+ specify the shell command to use in the <a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-COMMAND">archive_command</a> configuration parameter
+ or specify the library to use in the <a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-LIBRARY">archive_library</a> configuration parameter. In practice
+ these settings will always be placed in the
+ <code class="filename">postgresql.conf</code> file.
+ </p><p>
+ In <code class="varname">archive_command</code>,
+ <code class="literal">%p</code> is replaced by the path name of the file to
+ archive, while <code class="literal">%f</code> is replaced by only the file name.
+ (The path name is relative to the current working directory,
+ i.e., the cluster's data directory.)
+ Use <code class="literal">%%</code> if you need to embed an actual <code class="literal">%</code>
+ character in the command. The simplest useful command is something
+ like:
+</p><pre class="programlisting">
+archive_command = 'test ! -f /mnt/server/archivedir/%f &amp;&amp; cp %p /mnt/server/archivedir/%f' # Unix
+archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
+</pre><p>
+ which will copy archivable WAL segments to the directory
+ <code class="filename">/mnt/server/archivedir</code>. (This is an example, not a
+ recommendation, and might not work on all platforms.) After the
+ <code class="literal">%p</code> and <code class="literal">%f</code> parameters have been replaced,
+ the actual command executed might look like this:
+</p><pre class="programlisting">
+test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065
+</pre><p>
+ A similar command will be generated for each new file to be archived.
+ </p><p>
+ The archive command will be executed under the ownership of the same
+ user that the <span class="productname">PostgreSQL</span> server is running as. Since
+ the series of WAL files being archived contains effectively everything
+ in your database, you will want to be sure that the archived data is
+ protected from prying eyes; for example, archive into a directory that
+ does not have group or world read access.
+ </p><p>
+ It is important that the archive command return zero exit status if and
+ only if it succeeds. Upon getting a zero result,
+ <span class="productname">PostgreSQL</span> will assume that the file has been
+ successfully archived, and will remove or recycle it. However, a nonzero
+ status tells <span class="productname">PostgreSQL</span> that the file was not archived;
+ it will try again periodically until it succeeds.
+ </p><p>
+ Another way to archive is to use a custom archive module as the
+ <code class="varname">archive_library</code>. Since such modules are written in
+ <code class="literal">C</code>, creating your own may require considerably more effort
+ than writing a shell command. However, archive modules can be more
+ performant than archiving via shell, and they will have access to many
+ useful server resources. For more information about archive modules, see
+ <a class="xref" href="archive-modules.html" title="Chapter 51. Archive Modules">Chapter 51</a>.
+ </p><p>
+ When the archive command is terminated by a signal (other than
+ <span class="systemitem">SIGTERM</span> that is used as part of a server
+ shutdown) or an error by the shell with an exit status greater than
+ 125 (such as command not found), or if the archive function emits an
+ <code class="literal">ERROR</code> or <code class="literal">FATAL</code>, the archiver process
+ aborts and gets restarted by the postmaster. In such cases, the failure is
+ not reported in <a class="xref" href="monitoring-stats.html#PG-STAT-ARCHIVER-VIEW" title="Table 28.22. pg_stat_archiver View">pg_stat_archiver</a>.
+ </p><p>
+ Archive commands and libraries should generally be designed to refuse to overwrite
+ any pre-existing archive file. This is an important safety feature to
+ preserve the integrity of your archive in case of administrator error
+ (such as sending the output of two different servers to the same archive
+ directory).
+ </p><p>
+ It is advisable to test your proposed archive command or library to ensure that it
+ indeed does not overwrite an existing file, <span class="emphasis"><em>and that it returns
+ nonzero status or <code class="literal">false</code>, respectively, in this case</em></span>.
+ The example command above for Unix ensures this by including a separate
+ <code class="command">test</code> step. On some Unix platforms, <code class="command">cp</code> has
+ switches such as <code class="option">-i</code> that can be used to do the same thing
+ less verbosely, but you should not rely on these without verifying that
+ the right exit status is returned. (In particular, GNU <code class="command">cp</code>
+ will return status zero when <code class="option">-i</code> is used and the target file
+ already exists, which is <span class="emphasis"><em>not</em></span> the desired behavior.)
+ </p><p>
+ While designing your archiving setup, consider what will happen if
+ the archive command or library fails repeatedly because some aspect requires
+ operator intervention or the archive runs out of space. For example, this
+ could occur if you write to tape without an autochanger; when the tape
+ fills, nothing further can be archived until the tape is swapped.
+ You should ensure that any error condition or request to a human operator
+ is reported appropriately so that the situation can be
+ resolved reasonably quickly. The <code class="filename">pg_wal/</code> directory will
+ continue to fill with WAL segment files until the situation is resolved.
+ (If the file system containing <code class="filename">pg_wal/</code> fills up,
+ <span class="productname">PostgreSQL</span> will do a PANIC shutdown. No committed
+ transactions will be lost, but the database will remain offline until
+ you free some space.)
+ </p><p>
+ The speed of the archive command or library is unimportant as long as it can keep up
+ with the average rate at which your server generates WAL data. Normal
+ operation continues even if the archiving process falls a little behind.
+ If archiving falls significantly behind, this will increase the amount of
+ data that would be lost in the event of a disaster. It will also mean that
+ the <code class="filename">pg_wal/</code> directory will contain large numbers of
+ not-yet-archived segment files, which could eventually exceed available
+ disk space. You are advised to monitor the archiving process to ensure that
+ it is working as you intend.
+ </p><p>
+ In writing your archive command or library, you should assume that the file names to
+ be archived can be up to 64 characters long and can contain any
+ combination of ASCII letters, digits, and dots. It is not necessary to
+ preserve the original relative path (<code class="literal">%p</code>) but it is necessary to
+ preserve the file name (<code class="literal">%f</code>).
+ </p><p>
+ Note that although WAL archiving will allow you to restore any
+ modifications made to the data in your <span class="productname">PostgreSQL</span> database,
+ it will not restore changes made to configuration files (that is,
+ <code class="filename">postgresql.conf</code>, <code class="filename">pg_hba.conf</code> and
+ <code class="filename">pg_ident.conf</code>), since those are edited manually rather
+ than through SQL operations.
+ You might wish to keep the configuration files in a location that will
+ be backed up by your regular file system backup procedures. See
+ <a class="xref" href="runtime-config-file-locations.html" title="20.2. File Locations">Section 20.2</a> for how to relocate the
+ configuration files.
+ </p><p>
+ The archive command or function is only invoked on completed WAL segments. Hence,
+ if your server generates only little WAL traffic (or has slack periods
+ where it does so), there could be a long delay between the completion
+ of a transaction and its safe recording in archive storage. To put
+ a limit on how old unarchived data can be, you can set
+ <a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-TIMEOUT">archive_timeout</a> to force the server to switch
+ to a new WAL segment file at least that often. Note that archived
+ files that are archived early due to a forced switch are still the same
+ length as completely full files. It is therefore unwise to set a very
+ short <code class="varname">archive_timeout</code> — it will bloat your archive
+ storage. <code class="varname">archive_timeout</code> settings of a minute or so are
+ usually reasonable.
+ </p><p>
+ Also, you can force a segment switch manually with
+ <code class="function">pg_switch_wal</code> if you want to ensure that a
+ just-finished transaction is archived as soon as possible. Other utility
+ functions related to WAL management are listed in <a class="xref" href="functions-admin.html#FUNCTIONS-ADMIN-BACKUP-TABLE" title="Table 9.89. Backup Control Functions">Table 9.89</a>.
+ </p><p>
+ When <code class="varname">wal_level</code> is <code class="literal">minimal</code> some SQL commands
+ are optimized to avoid WAL logging, as described in <a class="xref" href="populate.html#POPULATE-PITR" title="14.4.7. Disable WAL Archival and Streaming Replication">Section 14.4.7</a>. If archiving or streaming replication were
+ turned on during execution of one of these statements, WAL would not
+ contain enough information for archive recovery. (Crash recovery is
+ unaffected.) For this reason, <code class="varname">wal_level</code> can only be changed at
+ server start. However, <code class="varname">archive_command</code> and <code class="varname">archive_library</code> can be changed with a
+ configuration file reload. If you are archiving via shell and wish to
+ temporarily stop archiving,
+ one way to do it is to set <code class="varname">archive_command</code> to the empty
+ string (<code class="literal">''</code>).
+ This will cause WAL files to accumulate in <code class="filename">pg_wal/</code> until a
+ working <code class="varname">archive_command</code> is re-established.
+ </p></div><div class="sect2" id="BACKUP-BASE-BACKUP"><div class="titlepage"><div><div><h3 class="title">26.3.2. Making a Base Backup</h3></div></div></div><p>
+ The easiest way to perform a base backup is to use the
+ <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a> tool. It can create
+ a base backup either as regular files or as a tar archive. If more
+ flexibility than <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a> can provide is
+ required, you can also make a base backup using the low level API
+ (see <a class="xref" href="continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP" title="26.3.3. Making a Base Backup Using the Low Level API">Section 26.3.3</a>).
+ </p><p>
+ It is not necessary to be concerned about the amount of time it takes
+ to make a base backup. However, if you normally run the
+ server with <code class="varname">full_page_writes</code> disabled, you might notice a drop
+ in performance while the backup runs since <code class="varname">full_page_writes</code> is
+ effectively forced on during backup mode.
+ </p><p>
+ To make use of the backup, you will need to keep all the WAL
+ segment files generated during and after the file system backup.
+ To aid you in doing this, the base backup process
+ creates a <em class="firstterm">backup history file</em> that is immediately
+ stored into the WAL archive area. This file is named after the first
+ WAL segment file that you need for the file system backup.
+ For example, if the starting WAL file is
+ <code class="literal">0000000100001234000055CD</code> the backup history file will be
+ named something like
+ <code class="literal">0000000100001234000055CD.007C9330.backup</code>. (The second
+ part of the file name stands for an exact position within the WAL
+ file, and can ordinarily be ignored.) Once you have safely archived
+ the file system backup and the WAL segment files used during the
+ backup (as specified in the backup history file), all archived WAL
+ segments with names numerically less are no longer needed to recover
+ the file system backup and can be deleted. However, you should
+ consider keeping several backup sets to be absolutely certain that
+ you can recover your data.
+ </p><p>
+ The backup history file is just a small text file. It contains the
+ label string you gave to <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a>, as well as
+ the starting and ending times and WAL segments of the backup.
+ If you used the label to identify the associated dump file,
+ then the archived history file is enough to tell you which dump file to
+ restore.
+ </p><p>
+ Since you have to keep around all the archived WAL files back to your
+ last base backup, the interval between base backups should usually be
+ chosen based on how much storage you want to expend on archived WAL
+ files. You should also consider how long you are prepared to spend
+ recovering, if recovery should be necessary — the system will have to
+ replay all those WAL segments, and that could take awhile if it has
+ been a long time since the last base backup.
+ </p></div><div class="sect2" id="BACKUP-LOWLEVEL-BASE-BACKUP"><div class="titlepage"><div><div><h3 class="title">26.3.3. Making a Base Backup Using the Low Level API</h3></div></div></div><p>
+ The procedure for making a base backup using the low level
+ APIs contains a few more steps than
+ the <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a> method, but is relatively
+ simple. It is very important that these steps are executed in
+ sequence, and that the success of a step is verified before
+ proceeding to the next step.
+ </p><p>
+ Multiple backups are able to be run concurrently (both those
+ started using this backup API and those started using
+ <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a>).
+ </p><p>
+ </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>
+ Ensure that WAL archiving is enabled and working.
+ </p></li><li class="listitem"><p>
+ Connect to the server (it does not matter which database) as a user with
+ rights to run <code class="function">pg_backup_start</code> (superuser,
+ or a user who has been granted <code class="literal">EXECUTE</code> on the
+ function) and issue the command:
+</p><pre class="programlisting">
+SELECT pg_backup_start(label =&gt; 'label', fast =&gt; false);
+</pre><p>
+ where <code class="literal">label</code> is any string you want to use to uniquely
+ identify this backup operation. The connection
+ calling <code class="function">pg_backup_start</code> must be maintained until the end of
+ the backup, or the backup will be automatically aborted.
+ </p><p>
+ Online backups are always started at the beginning of a checkpoint.
+ By default, <code class="function">pg_backup_start</code> will wait for the next
+ regularly scheduled checkpoint to complete, which may take a long time (see the
+ configuration parameters <a class="xref" href="runtime-config-wal.html#GUC-CHECKPOINT-TIMEOUT">checkpoint_timeout</a> and
+ <a class="xref" href="runtime-config-wal.html#GUC-CHECKPOINT-COMPLETION-TARGET">checkpoint_completion_target</a>). This is
+ usually preferable as it minimizes the impact on the running system. If you
+ want to start the backup as soon as possible, pass <code class="literal">true</code> as
+ the second parameter to <code class="function">pg_backup_start</code> and it will
+ request an immediate checkpoint, which will finish as fast as possible using
+ as much I/O as possible.
+ </p></li><li class="listitem"><p>
+ Perform the backup, using any convenient file-system-backup tool
+ such as <span class="application">tar</span> or <span class="application">cpio</span> (not
+ <span class="application">pg_dump</span> or
+ <span class="application">pg_dumpall</span>). It is neither
+ necessary nor desirable to stop normal operation of the database
+ while you do this. See
+ <a class="xref" href="continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-DATA" title="26.3.3.1. Backing Up the Data Directory">Section 26.3.3.1</a> for things to
+ consider during this backup.
+ </p></li><li class="listitem"><p>
+ In the same connection as before, issue the command:
+</p><pre class="programlisting">
+SELECT * FROM pg_backup_stop(wait_for_archive =&gt; true);
+</pre><p>
+ This terminates backup mode. On a primary, it also performs an automatic
+ switch to the next WAL segment. On a standby, it is not possible to
+ automatically switch WAL segments, so you may wish to run
+ <code class="function">pg_switch_wal</code> on the primary to perform a manual
+ switch. The reason for the switch is to arrange for
+ the last WAL segment file written during the backup interval to be
+ ready to archive.
+ </p><p>
+ <code class="function">pg_backup_stop</code> will return one row with three
+ values. The second of these fields should be written to a file named
+ <code class="filename">backup_label</code> in the root directory of the backup. The
+ third field should be written to a file named
+ <code class="filename">tablespace_map</code> unless the field is empty. These files are
+ vital to the backup working and must be written byte for byte without
+ modification, which may require opening the file in binary mode.
+ </p></li><li class="listitem"><p>
+ Once the WAL segment files active during the backup are archived, you are
+ done. The file identified by <code class="function">pg_backup_stop</code>'s first return
+ value is the last segment that is required to form a complete set of
+ backup files. On a primary, if <code class="varname">archive_mode</code> is enabled and the
+ <code class="literal">wait_for_archive</code> parameter is <code class="literal">true</code>,
+ <code class="function">pg_backup_stop</code> does not return until the last segment has
+ been archived.
+ On a standby, <code class="varname">archive_mode</code> must be <code class="literal">always</code> in order
+ for <code class="function">pg_backup_stop</code> to wait.
+ Archiving of these files happens automatically since you have
+ already configured <code class="varname">archive_command</code> or <code class="varname">archive_library</code>.
+ In most cases this happens quickly, but you are advised to monitor your
+ archive system to ensure there are no delays.
+ If the archive process has fallen behind because of failures of the
+ archive command or library, it will keep retrying
+ until the archive succeeds and the backup is complete.
+ If you wish to place a time limit on the execution of
+ <code class="function">pg_backup_stop</code>, set an appropriate
+ <code class="varname">statement_timeout</code> value, but make note that if
+ <code class="function">pg_backup_stop</code> terminates because of this your backup
+ may not be valid.
+ </p><p>
+ If the backup process monitors and ensures that all WAL segment files
+ required for the backup are successfully archived then the
+ <code class="literal">wait_for_archive</code> parameter (which defaults to true) can be set
+ to false to have
+ <code class="function">pg_backup_stop</code> return as soon as the stop backup record is
+ written to the WAL. By default, <code class="function">pg_backup_stop</code> will wait
+ until all WAL has been archived, which can take some time. This option
+ must be used with caution: if WAL archiving is not monitored correctly
+ then the backup might not include all of the WAL files and will
+ therefore be incomplete and not able to be restored.
+ </p></li></ol></div><p>
+ </p><div class="sect3" id="BACKUP-LOWLEVEL-BASE-BACKUP-DATA"><div class="titlepage"><div><div><h4 class="title">26.3.3.1. Backing Up the Data Directory</h4></div></div></div><p>
+ Some file system backup tools emit warnings or errors
+ if the files they are trying to copy change while the copy proceeds.
+ When taking a base backup of an active database, this situation is normal
+ and not an error. However, you need to ensure that you can distinguish
+ complaints of this sort from real errors. For example, some versions
+ of <span class="application">rsync</span> return a separate exit code for
+ <span class="quote">“<span class="quote">vanished source files</span>”</span>, and you can write a driver script to
+ accept this exit code as a non-error case. Also, some versions of
+ GNU <span class="application">tar</span> return an error code indistinguishable from
+ a fatal error if a file was truncated while <span class="application">tar</span> was
+ copying it. Fortunately, GNU <span class="application">tar</span> versions 1.16 and
+ later exit with 1 if a file was changed during the backup,
+ and 2 for other errors. With GNU <span class="application">tar</span> version 1.23 and
+ later, you can use the warning options <code class="literal">--warning=no-file-changed
+ --warning=no-file-removed</code> to hide the related warning messages.
+ </p><p>
+ Be certain that your backup includes all of the files under
+ the database cluster directory (e.g., <code class="filename">/usr/local/pgsql/data</code>).
+ If you are using tablespaces that do not reside underneath this directory,
+ be careful to include them as well (and be sure that your backup
+ archives symbolic links as links, otherwise the restore will corrupt
+ your tablespaces).
+ </p><p>
+ You should, however, omit from the backup the files within the
+ cluster's <code class="filename">pg_wal/</code> subdirectory. This
+ slight adjustment is worthwhile because it reduces the risk
+ of mistakes when restoring. This is easy to arrange if
+ <code class="filename">pg_wal/</code> is a symbolic link pointing to someplace outside
+ the cluster directory, which is a common setup anyway for performance
+ reasons. You might also want to exclude <code class="filename">postmaster.pid</code>
+ and <code class="filename">postmaster.opts</code>, which record information
+ about the running <span class="application">postmaster</span>, not about the
+ <span class="application">postmaster</span> which will eventually use this backup.
+ (These files can confuse <span class="application">pg_ctl</span>.)
+ </p><p>
+ It is often a good idea to also omit from the backup the files
+ within the cluster's <code class="filename">pg_replslot/</code> directory, so that
+ replication slots that exist on the primary do not become part of the
+ backup. Otherwise, the subsequent use of the backup to create a standby
+ may result in indefinite retention of WAL files on the standby, and
+ possibly bloat on the primary if hot standby feedback is enabled, because
+ the clients that are using those replication slots will still be connecting
+ to and updating the slots on the primary, not the standby. Even if the
+ backup is only intended for use in creating a new primary, copying the
+ replication slots isn't expected to be particularly useful, since the
+ contents of those slots will likely be badly out of date by the time
+ the new primary comes on line.
+ </p><p>
+ The contents of the directories <code class="filename">pg_dynshmem/</code>,
+ <code class="filename">pg_notify/</code>, <code class="filename">pg_serial/</code>,
+ <code class="filename">pg_snapshots/</code>, <code class="filename">pg_stat_tmp/</code>,
+ and <code class="filename">pg_subtrans/</code> (but not the directories themselves) can be
+ omitted from the backup as they will be initialized on postmaster startup.
+ </p><p>
+ Any file or directory beginning with <code class="filename">pgsql_tmp</code> can be
+ omitted from the backup. These files are removed on postmaster start and
+ the directories will be recreated as needed.
+ </p><p>
+ <code class="filename">pg_internal.init</code> files can be omitted from the
+ backup whenever a file of that name is found. These files contain
+ relation cache data that is always rebuilt when recovering.
+ </p><p>
+ The backup label
+ file includes the label string you gave to <code class="function">pg_backup_start</code>,
+ as well as the time at which <code class="function">pg_backup_start</code> was run, and
+ the name of the starting WAL file. In case of confusion it is therefore
+ possible to look inside a backup file and determine exactly which
+ backup session the dump file came from. The tablespace map file includes
+ the symbolic link names as they exist in the directory
+ <code class="filename">pg_tblspc/</code> and the full path of each symbolic link.
+ These files are not merely for your information; their presence and
+ contents are critical to the proper operation of the system's recovery
+ process.
+ </p><p>
+ It is also possible to make a backup while the server is
+ stopped. In this case, you obviously cannot use
+ <code class="function">pg_backup_start</code> or <code class="function">pg_backup_stop</code>, and
+ you will therefore be left to your own devices to keep track of which
+ backup is which and how far back the associated WAL files go.
+ It is generally better to follow the continuous archiving procedure above.
+ </p></div></div><div class="sect2" id="BACKUP-PITR-RECOVERY"><div class="titlepage"><div><div><h3 class="title">26.3.4. Recovering Using a Continuous Archive Backup</h3></div></div></div><p>
+ Okay, the worst has happened and you need to recover from your backup.
+ Here is the procedure:
+ </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>
+ Stop the server, if it's running.
+ </p></li><li class="listitem"><p>
+ If you have the space to do so,
+ copy the whole cluster data directory and any tablespaces to a temporary
+ location in case you need them later. Note that this precaution will
+ require that you have enough free space on your system to hold two
+ copies of your existing database. If you do not have enough space,
+ you should at least save the contents of the cluster's <code class="filename">pg_wal</code>
+ subdirectory, as it might contain logs which
+ were not archived before the system went down.
+ </p></li><li class="listitem"><p>
+ Remove all existing files and subdirectories under the cluster data
+ directory and under the root directories of any tablespaces you are using.
+ </p></li><li class="listitem"><p>
+ Restore the database files from your file system backup. Be sure that they
+ are restored with the right ownership (the database system user, not
+ <code class="literal">root</code>!) and with the right permissions. If you are using
+ tablespaces,
+ you should verify that the symbolic links in <code class="filename">pg_tblspc/</code>
+ were correctly restored.
+ </p></li><li class="listitem"><p>
+ Remove any files present in <code class="filename">pg_wal/</code>; these came from the
+ file system backup and are therefore probably obsolete rather than current.
+ If you didn't archive <code class="filename">pg_wal/</code> at all, then recreate
+ it with proper permissions,
+ being careful to ensure that you re-establish it as a symbolic link
+ if you had it set up that way before.
+ </p></li><li class="listitem"><p>
+ If you have unarchived WAL segment files that you saved in step 2,
+ copy them into <code class="filename">pg_wal/</code>. (It is best to copy them,
+ not move them, so you still have the unmodified files if a
+ problem occurs and you have to start over.)
+ </p></li><li class="listitem"><p>
+ Set recovery configuration settings in
+ <code class="filename">postgresql.conf</code> (see <a class="xref" href="runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVE-RECOVERY" title="20.5.5. Archive Recovery">Section 20.5.5</a>) and create a file
+ <code class="filename">recovery.signal</code> in the cluster
+ data directory. You might
+ also want to temporarily modify <code class="filename">pg_hba.conf</code> to prevent
+ ordinary users from connecting until you are sure the recovery was successful.
+ </p></li><li class="listitem"><p>
+ Start the server. The server will go into recovery mode and
+ proceed to read through the archived WAL files it needs. Should the
+ recovery be terminated because of an external error, the server can
+ simply be restarted and it will continue recovery. Upon completion
+ of the recovery process, the server will remove
+ <code class="filename">recovery.signal</code> (to prevent
+ accidentally re-entering recovery mode later) and then
+ commence normal database operations.
+ </p></li><li class="listitem"><p>
+ Inspect the contents of the database to ensure you have recovered to
+ the desired state. If not, return to step 1. If all is well,
+ allow your users to connect by restoring <code class="filename">pg_hba.conf</code> to normal.
+ </p></li></ol></div><p>
+ </p><p>
+ The key part of all this is to set up a recovery configuration that
+ describes how you want to recover and how far the recovery should
+ run. The one thing that you absolutely must specify is the <code class="varname">restore_command</code>,
+ which tells <span class="productname">PostgreSQL</span> how to retrieve archived
+ WAL file segments. Like the <code class="varname">archive_command</code>, this is
+ a shell command string. It can contain <code class="literal">%f</code>, which is
+ replaced by the name of the desired log file, and <code class="literal">%p</code>,
+ which is replaced by the path name to copy the log file to.
+ (The path name is relative to the current working directory,
+ i.e., the cluster's data directory.)
+ Write <code class="literal">%%</code> if you need to embed an actual <code class="literal">%</code>
+ character in the command. The simplest useful command is
+ something like:
+</p><pre class="programlisting">
+restore_command = 'cp /mnt/server/archivedir/%f %p'
+</pre><p>
+ which will copy previously archived WAL segments from the directory
+ <code class="filename">/mnt/server/archivedir</code>. Of course, you can use something
+ much more complicated, perhaps even a shell script that requests the
+ operator to mount an appropriate tape.
+ </p><p>
+ It is important that the command return nonzero exit status on failure.
+ The command <span class="emphasis"><em>will</em></span> be called requesting files that are not
+ present in the archive; it must return nonzero when so asked. This is not
+ an error condition. An exception is that if the command was terminated by
+ a signal (other than <span class="systemitem">SIGTERM</span>, which is used as
+ part of a database server shutdown) or an error by the shell (such as
+ command not found), then recovery will abort and the server will not start
+ up.
+ </p><p>
+ Not all of the requested files will be WAL segment
+ files; you should also expect requests for files with a suffix of
+ <code class="literal">.history</code>. Also be aware that
+ the base name of the <code class="literal">%p</code> path will be different from
+ <code class="literal">%f</code>; do not expect them to be interchangeable.
+ </p><p>
+ WAL segments that cannot be found in the archive will be sought in
+ <code class="filename">pg_wal/</code>; this allows use of recent un-archived segments.
+ However, segments that are available from the archive will be used in
+ preference to files in <code class="filename">pg_wal/</code>.
+ </p><p>
+ Normally, recovery will proceed through all available WAL segments,
+ thereby restoring the database to the current point in time (or as
+ close as possible given the available WAL segments). Therefore, a normal
+ recovery will end with a <span class="quote">“<span class="quote">file not found</span>”</span> message, the exact text
+ of the error message depending upon your choice of
+ <code class="varname">restore_command</code>. You may also see an error message
+ at the start of recovery for a file named something like
+ <code class="filename">00000001.history</code>. This is also normal and does not
+ indicate a problem in simple recovery situations; see
+ <a class="xref" href="continuous-archiving.html#BACKUP-TIMELINES" title="26.3.5. Timelines">Section 26.3.5</a> for discussion.
+ </p><p>
+ If you want to recover to some previous point in time (say, right before
+ the junior DBA dropped your main transaction table), just specify the
+ required <a class="link" href="runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY-TARGET" title="20.5.6. Recovery Target">stopping point</a>. You can specify
+ the stop point, known as the <span class="quote">“<span class="quote">recovery target</span>”</span>, either by
+ date/time, named restore point or by completion of a specific transaction
+ ID. As of this writing only the date/time and named restore point options
+ are very usable, since there are no tools to help you identify with any
+ accuracy which transaction ID to use.
+ </p><div class="note"><h3 class="title">Note</h3><p>
+ The stop point must be after the ending time of the base backup, i.e.,
+ the end time of <code class="function">pg_backup_stop</code>. You cannot use a base backup
+ to recover to a time when that backup was in progress. (To
+ recover to such a time, you must go back to your previous base backup
+ and roll forward from there.)
+ </p></div><p>
+ If recovery finds corrupted WAL data, recovery will
+ halt at that point and the server will not start. In such a case the
+ recovery process could be re-run from the beginning, specifying a
+ <span class="quote">“<span class="quote">recovery target</span>”</span> before the point of corruption so that recovery
+ can complete normally.
+ If recovery fails for an external reason, such as a system crash or
+ if the WAL archive has become inaccessible, then the recovery can simply
+ be restarted and it will restart almost from where it failed.
+ Recovery restart works much like checkpointing in normal operation:
+ the server periodically forces all its state to disk, and then updates
+ the <code class="filename">pg_control</code> file to indicate that the already-processed
+ WAL data need not be scanned again.
+ </p></div><div class="sect2" id="BACKUP-TIMELINES"><div class="titlepage"><div><div><h3 class="title">26.3.5. Timelines</h3></div></div></div><a id="id-1.6.13.7.13.2" class="indexterm"></a><p>
+ The ability to restore the database to a previous point in time creates
+ some complexities that are akin to science-fiction stories about time
+ travel and parallel universes. For example, in the original history of the database,
+ suppose you dropped a critical table at 5:15PM on Tuesday evening, but
+ didn't realize your mistake until Wednesday noon.
+ Unfazed, you get out your backup, restore to the point-in-time 5:14PM
+ Tuesday evening, and are up and running. In <span class="emphasis"><em>this</em></span> history of
+ the database universe, you never dropped the table. But suppose
+ you later realize this wasn't such a great idea, and would like
+ to return to sometime Wednesday morning in the original history.
+ You won't be able
+ to if, while your database was up-and-running, it overwrote some of the
+ WAL segment files that led up to the time you now wish you
+ could get back to. Thus, to avoid this, you need to distinguish the series of
+ WAL records generated after you've done a point-in-time recovery from
+ those that were generated in the original database history.
+ </p><p>
+ To deal with this problem, <span class="productname">PostgreSQL</span> has a notion
+ of <em class="firstterm">timelines</em>. Whenever an archive recovery completes,
+ a new timeline is created to identify the series of WAL records
+ generated after that recovery. The timeline
+ ID number is part of WAL segment file names so a new timeline does
+ not overwrite the WAL data generated by previous timelines. It is
+ in fact possible to archive many different timelines. While that might
+ seem like a useless feature, it's often a lifesaver. Consider the
+ situation where you aren't quite sure what point-in-time to recover to,
+ and so have to do several point-in-time recoveries by trial and error
+ until you find the best place to branch off from the old history. Without
+ timelines this process would soon generate an unmanageable mess. With
+ timelines, you can recover to <span class="emphasis"><em>any</em></span> prior state, including
+ states in timeline branches that you abandoned earlier.
+ </p><p>
+ Every time a new timeline is created, <span class="productname">PostgreSQL</span> creates
+ a <span class="quote">“<span class="quote">timeline history</span>”</span> file that shows which timeline it branched
+ off from and when. These history files are necessary to allow the system
+ to pick the right WAL segment files when recovering from an archive that
+ contains multiple timelines. Therefore, they are archived into the WAL
+ archive area just like WAL segment files. The history files are just
+ small text files, so it's cheap and appropriate to keep them around
+ indefinitely (unlike the segment files which are large). You can, if
+ you like, add comments to a history file to record your own notes about
+ how and why this particular timeline was created. Such comments will be
+ especially valuable when you have a thicket of different timelines as
+ a result of experimentation.
+ </p><p>
+ The default behavior of recovery is to recover to the latest timeline found
+ in the archive. If you wish to recover to the timeline that was current
+ when the base backup was taken or into a specific child timeline (that
+ is, you want to return to some state that was itself generated after a
+ recovery attempt), you need to specify <code class="literal">current</code> or the
+ target timeline ID in <a class="xref" href="runtime-config-wal.html#GUC-RECOVERY-TARGET-TIMELINE">recovery_target_timeline</a>. You
+ cannot recover into timelines that branched off earlier than the base backup.
+ </p></div><div class="sect2" id="BACKUP-TIPS"><div class="titlepage"><div><div><h3 class="title">26.3.6. Tips and Examples</h3></div></div></div><p>
+ Some tips for configuring continuous archiving are given here.
+ </p><div class="sect3" id="BACKUP-STANDALONE"><div class="titlepage"><div><div><h4 class="title">26.3.6.1. Standalone Hot Backups</h4></div></div></div><p>
+ It is possible to use <span class="productname">PostgreSQL</span>'s backup facilities to
+ produce standalone hot backups. These are backups that cannot be used
+ for point-in-time recovery, yet are typically much faster to backup and
+ restore than <span class="application">pg_dump</span> dumps. (They are also much larger
+ than <span class="application">pg_dump</span> dumps, so in some cases the speed advantage
+ might be negated.)
+ </p><p>
+ As with base backups, the easiest way to produce a standalone
+ hot backup is to use the <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a>
+ tool. If you include the <code class="literal">-X</code> parameter when calling
+ it, all the write-ahead log required to use the backup will be
+ included in the backup automatically, and no special action is
+ required to restore the backup.
+ </p></div><div class="sect3" id="COMPRESSED-ARCHIVE-LOGS"><div class="titlepage"><div><div><h4 class="title">26.3.6.2. Compressed Archive Logs</h4></div></div></div><p>
+ If archive storage size is a concern, you can use
+ <span class="application">gzip</span> to compress the archive files:
+</p><pre class="programlisting">
+archive_command = 'gzip &lt; %p &gt; /mnt/server/archivedir/%f.gz'
+</pre><p>
+ You will then need to use <span class="application">gunzip</span> during recovery:
+</p><pre class="programlisting">
+restore_command = 'gunzip &lt; /mnt/server/archivedir/%f.gz &gt; %p'
+</pre><p>
+ </p></div><div class="sect3" id="BACKUP-SCRIPTS"><div class="titlepage"><div><div><h4 class="title">26.3.6.3. <code class="varname">archive_command</code> Scripts</h4></div></div></div><p>
+ Many people choose to use scripts to define their
+ <code class="varname">archive_command</code>, so that their
+ <code class="filename">postgresql.conf</code> entry looks very simple:
+</p><pre class="programlisting">
+archive_command = 'local_backup_script.sh "%p" "%f"'
+</pre><p>
+ Using a separate script file is advisable any time you want to use
+ more than a single command in the archiving process.
+ This allows all complexity to be managed within the script, which
+ can be written in a popular scripting language such as
+ <span class="application">bash</span> or <span class="application">perl</span>.
+ </p><p>
+ Examples of requirements that might be solved within a script include:
+ </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+ Copying data to secure off-site data storage
+ </p></li><li class="listitem"><p>
+ Batching WAL files so that they are transferred every three hours,
+ rather than one at a time
+ </p></li><li class="listitem"><p>
+ Interfacing with other backup and recovery software
+ </p></li><li class="listitem"><p>
+ Interfacing with monitoring software to report errors
+ </p></li></ul></div><p>
+ </p><div class="tip"><h3 class="title">Tip</h3><p>
+ When using an <code class="varname">archive_command</code> script, it's desirable
+ to enable <a class="xref" href="runtime-config-logging.html#GUC-LOGGING-COLLECTOR">logging_collector</a>.
+ Any messages written to <span class="systemitem">stderr</span> from the script will then
+ appear in the database server log, allowing complex configurations to
+ be diagnosed easily if they fail.
+ </p></div></div></div><div class="sect2" id="CONTINUOUS-ARCHIVING-CAVEATS"><div class="titlepage"><div><div><h3 class="title">26.3.7. Caveats</h3></div></div></div><p>
+ At this writing, there are several limitations of the continuous archiving
+ technique. These will probably be fixed in future releases:
+
+ </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+ If a <a class="link" href="sql-createdatabase.html" title="CREATE DATABASE"><code class="command">CREATE DATABASE</code></a>
+ command is executed while a base backup is being taken, and then
+ the template database that the <code class="command">CREATE DATABASE</code> copied
+ is modified while the base backup is still in progress, it is
+ possible that recovery will cause those modifications to be
+ propagated into the created database as well. This is of course
+ undesirable. To avoid this risk, it is best not to modify any
+ template databases while taking a base backup.
+ </p></li><li class="listitem"><p>
+ <a class="link" href="sql-createtablespace.html" title="CREATE TABLESPACE"><code class="command">CREATE TABLESPACE</code></a>
+ commands are WAL-logged with the literal absolute path, and will
+ therefore be replayed as tablespace creations with the same
+ absolute path. This might be undesirable if the log is being
+ replayed on a different machine. It can be dangerous even if the
+ log is being replayed on the same machine, but into a new data
+ directory: the replay will still overwrite the contents of the
+ original tablespace. To avoid potential gotchas of this sort,
+ the best practice is to take a new base backup after creating or
+ dropping tablespaces.
+ </p></li></ul></div><p>
+ </p><p>
+ It should also be noted that the default <acronym class="acronym">WAL</acronym>
+ format is fairly bulky since it includes many disk page snapshots.
+ These page snapshots are designed to support crash recovery, since
+ we might need to fix partially-written disk pages. Depending on
+ your system hardware and software, the risk of partial writes might
+ be small enough to ignore, in which case you can significantly
+ reduce the total volume of archived logs by turning off page
+ snapshots using the <a class="xref" href="runtime-config-wal.html#GUC-FULL-PAGE-WRITES">full_page_writes</a>
+ parameter. (Read the notes and warnings in <a class="xref" href="wal.html" title="Chapter 30. Reliability and the Write-Ahead Log">Chapter 30</a>
+ before you do so.) Turning off page snapshots does not prevent
+ use of the logs for PITR operations. An area for future
+ development is to compress archived WAL data by removing
+ unnecessary page copies even when <code class="varname">full_page_writes</code> is
+ on. In the meantime, administrators might wish to reduce the number
+ of page snapshots included in WAL by increasing the checkpoint
+ interval parameters as much as feasible.
+ </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="backup-file.html" title="26.2. File System Level Backup">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="backup.html" title="Chapter 26. Backup and Restore">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="high-availability.html" title="Chapter 27. High Availability, Load Balancing, and Replication">Next</a></td></tr><tr><td width="40%" align="left" valign="top">26.2. File System Level Backup </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.5 Documentation">Home</a></td><td width="40%" align="right" valign="top"> Chapter 27. High Availability, Load Balancing, and Replication</td></tr></table></div></body></html> \ No newline at end of file