1 files changed, 757 insertions, 0 deletions
diff --git a/doc/src/sgml/html/continuous-archiving.html b/doc/src/sgml/html/continuous-archiving.html
new file mode 100644
index 0000000..adc7ae7
--- /dev/null
+++ b/doc/src/sgml/html/continuous-archiving.html
@@ -0,0 +1,757 @@
+<?xml version="1.0" encoding="UTF-8" standalone="no"?>
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>26.3. Continuous Archiving and Point-in-Time Recovery (PITR)</title><link rel="stylesheet" type="text/css" href="stylesheet.css" /><link rev="made" href="pgsql-docs@lists.postgresql.org" /><meta name="generator" content="DocBook XSL Stylesheets Vsnapshot" /><link rel="prev" href="backup-file.html" title="26.2. File System Level Backup" /><link rel="next" href="high-availability.html" title="Chapter 27. High Availability, Load Balancing, and Replication" /></head><body id="docContent" class="container-fluid col-10"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="5" align="center">26.3. Continuous Archiving and Point-in-Time Recovery (PITR)</th></tr><tr><td width="10%" align="left"><a accesskey="p" href="backup-file.html" title="26.2. File System Level Backup">Prev</a> </td><td width="10%" align="left"><a accesskey="u" href="backup.html" title="Chapter 26. Backup and Restore">Up</a></td><th width="60%" align="center">Chapter 26. Backup and Restore</th><td width="10%" align="right"><a accesskey="h" href="index.html" title="PostgreSQL 15.5 Documentation">Home</a></td><td width="10%" align="right"> <a accesskey="n" href="high-availability.html" title="Chapter 27. High Availability, Load Balancing, and Replication">Next</a></td></tr></table><hr /></div><div class="sect1" id="CONTINUOUS-ARCHIVING"><div class="titlepage"><div><div><h2 class="title" style="clear: both">26.3. Continuous Archiving and Point-in-Time Recovery (PITR)</h2></div></div></div><div class="toc"><dl class="toc"><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-ARCHIVING-WAL">26.3.1. Setting Up WAL Archiving</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-BASE-BACKUP">26.3.2. Making a Base Backup</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP">26.3.3. Making a Base Backup Using the Low Level API</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-PITR-RECOVERY">26.3.4. Recovering Using a Continuous Archive Backup</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-TIMELINES">26.3.5. Timelines</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#BACKUP-TIPS">26.3.6. Tips and Examples</a></span></dt><dt><span class="sect2"><a href="continuous-archiving.html#CONTINUOUS-ARCHIVING-CAVEATS">26.3.7. Caveats</a></span></dt></dl></div><a id="id-1.6.13.7.2" class="indexterm"></a><a id="id-1.6.13.7.3" class="indexterm"></a><a id="id-1.6.13.7.4" class="indexterm"></a><p>
+   At all times, <span class="productname">PostgreSQL</span> maintains a
+   <em class="firstterm">write ahead log</em> (WAL) in the <code class="filename">pg_wal/</code>
+   subdirectory of the cluster's data directory. The log records
+   every change made to the database's data files.  This log exists
+   primarily for crash-safety purposes: if the system crashes, the
+   database can be restored to consistency by <span class="quote">“<span class="quote">replaying</span>”</span> the
+   log entries made since the last checkpoint.  However, the existence
+   of the log makes it possible to use a third strategy for backing up
+   databases: we can combine a file-system-level backup with backup of
+   the WAL files.  If recovery is needed, we restore the file system backup and
+   then replay from the backed-up WAL files to bring the system to a
+   current state.  This approach is more complex to administer than
+   either of the previous approaches, but it has some significant
+   benefits:
+  </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+     We do not need a perfectly consistent file system backup as the starting point.
+     Any internal inconsistency in the backup will be corrected by log
+     replay (this is not significantly different from what happens during
+     crash recovery).  So we do not need a file system snapshot capability,
+     just <span class="application">tar</span> or a similar archiving tool.
+    </p></li><li class="listitem"><p>
+     Since we can combine an indefinitely long sequence of WAL files
+     for replay, continuous backup can be achieved simply by continuing to archive
+     the WAL files.  This is particularly valuable for large databases, where
+     it might not be convenient to take a full backup frequently.
+    </p></li><li class="listitem"><p>
+     It is not necessary to replay the WAL entries all the
+     way to the end.  We could stop the replay at any point and have a
+     consistent snapshot of the database as it was at that time.  Thus,
+     this technique supports <em class="firstterm">point-in-time recovery</em>: it is
+     possible to restore the database to its state at any time since your base
+     backup was taken.
+    </p></li><li class="listitem"><p>
+     If we continuously feed the series of WAL files to another
+     machine that has been loaded with the same base backup file, we
+     have a <em class="firstterm">warm standby</em> system: at any point we can bring up
+     the second machine and it will have a nearly-current copy of the
+     database.
+    </p></li></ul></div><p>
+  </p><div class="note"><h3 class="title">Note</h3><p>
+    <span class="application">pg_dump</span> and
+    <span class="application">pg_dumpall</span> do not produce file-system-level
+    backups and cannot be used as part of a continuous-archiving solution.
+    Such dumps are <span class="emphasis"><em>logical</em></span> and do not contain enough
+    information to be used by WAL replay.
+   </p></div><p>
+   As with the plain file-system-backup technique, this method can only
+   support restoration of an entire database cluster, not a subset.
+   Also, it requires a lot of archival storage: the base backup might be bulky,
+   and a busy system will generate many megabytes of WAL traffic that
+   have to be archived.  Still, it is the preferred backup technique in
+   many situations where high reliability is needed.
+  </p><p>
+   To recover successfully using continuous archiving (also called
+   <span class="quote">“<span class="quote">online backup</span>”</span> by many database vendors), you need a continuous
+   sequence of archived WAL files that extends back at least as far as the
+   start time of your backup.  So to get started, you should set up and test
+   your procedure for archiving WAL files <span class="emphasis"><em>before</em></span> you take your
+   first base backup.  Accordingly, we first discuss the mechanics of
+   archiving WAL files.
+  </p><div class="sect2" id="BACKUP-ARCHIVING-WAL"><div class="titlepage"><div><div><h3 class="title">26.3.1. Setting Up WAL Archiving</h3></div></div></div><p>
+    In an abstract sense, a running <span class="productname">PostgreSQL</span> system
+    produces an indefinitely long sequence of WAL records.  The system
+    physically divides this sequence into WAL <em class="firstterm">segment
+    files</em>, which are normally 16MB apiece (although the segment size
+    can be altered during <span class="application">initdb</span>).  The segment
+    files are given numeric names that reflect their position in the
+    abstract WAL sequence.  When not using WAL archiving, the system
+    normally creates just a few segment files and then
+    <span class="quote">“<span class="quote">recycles</span>”</span> them by renaming no-longer-needed segment files
+    to higher segment numbers.  It's assumed that segment files whose
+    contents precede the last checkpoint are no longer of
+    interest and can be recycled.
+   </p><p>
+    When archiving WAL data, we need to capture the contents of each segment
+    file once it is filled, and save that data somewhere before the segment
+    file is recycled for reuse.  Depending on the application and the
+    available hardware, there could be many different ways of <span class="quote">“<span class="quote">saving
+    the data somewhere</span>”</span>: we could copy the segment files to an NFS-mounted
+    directory on another machine, write them onto a tape drive (ensuring that
+    you have a way of identifying the original name of each file), or batch
+    them together and burn them onto CDs, or something else entirely.  To
+    provide the database administrator with flexibility,
+    <span class="productname">PostgreSQL</span> tries not to make any assumptions about how
+    the archiving will be done.  Instead, <span class="productname">PostgreSQL</span> lets
+    the administrator specify a shell command or an archive library to be executed to copy a
+    completed segment file to wherever it needs to go.  This could be as simple
+    as a shell command that uses <code class="literal">cp</code>, or it could invoke a
+    complex C function — it's all up to you.
+   </p><p>
+    To enable WAL archiving, set the <a class="xref" href="runtime-config-wal.html#GUC-WAL-LEVEL">wal_level</a>
+    configuration parameter to <code class="literal">replica</code> or higher,
+    <a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-MODE">archive_mode</a> to <code class="literal">on</code>,
+    specify the shell command to use in the <a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-COMMAND">archive_command</a> configuration parameter
+    or specify the library to use in the <a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-LIBRARY">archive_library</a> configuration parameter.  In practice
+    these settings will always be placed in the
+    <code class="filename">postgresql.conf</code> file.
+   </p><p>
+    In <code class="varname">archive_command</code>,
+    <code class="literal">%p</code> is replaced by the path name of the file to
+    archive, while <code class="literal">%f</code> is replaced by only the file name.
+    (The path name is relative to the current working directory,
+    i.e., the cluster's data directory.)
+    Use <code class="literal">%%</code> if you need to embed an actual <code class="literal">%</code>
+    character in the command.  The simplest useful command is something
+    like:
+</p><pre class="programlisting">
+archive_command = 'test ! -f /mnt/server/archivedir/%f &amp;&amp; cp %p /mnt/server/archivedir/%f'  # Unix
+archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"'  # Windows
+</pre><p>
+    which will copy archivable WAL segments to the directory
+    <code class="filename">/mnt/server/archivedir</code>.  (This is an example, not a
+    recommendation, and might not work on all platforms.)  After the
+    <code class="literal">%p</code> and <code class="literal">%f</code> parameters have been replaced,
+    the actual command executed might look like this:
+</p><pre class="programlisting">
+test ! -f /mnt/server/archivedir/00000001000000A900000065 &amp;&amp; cp pg_wal/00000001000000A900000065 /mnt/server/archivedir/00000001000000A900000065
+</pre><p>
+    A similar command will be generated for each new file to be archived.
+   </p><p>
+    The archive command will be executed under the ownership of the same
+    user that the <span class="productname">PostgreSQL</span> server is running as.  Since
+    the series of WAL files being archived contains effectively everything
+    in your database, you will want to be sure that the archived data is
+    protected from prying eyes; for example, archive into a directory that
+    does not have group or world read access.
+   </p><p>
+    It is important that the archive command return zero exit status if and
+    only if it succeeds.  Upon getting a zero result,
+    <span class="productname">PostgreSQL</span> will assume that the file has been
+    successfully archived, and will remove or recycle it.  However, a nonzero
+    status tells <span class="productname">PostgreSQL</span> that the file was not archived;
+    it will try again periodically until it succeeds.
+   </p><p>
+    Another way to archive is to use a custom archive module as the
+    <code class="varname">archive_library</code>.  Since such modules are written in
+    <code class="literal">C</code>, creating your own may require considerably more effort
+    than writing a shell command.  However, archive modules can be more
+    performant than archiving via shell, and they will have access to many
+    useful server resources.  For more information about archive modules, see
+    <a class="xref" href="archive-modules.html" title="Chapter 51. Archive Modules">Chapter 51</a>.
+   </p><p>
+    When the archive command is terminated by a signal (other than
+    <span class="systemitem">SIGTERM</span> that is used as part of a server
+    shutdown) or an error by the shell with an exit status greater than
+    125 (such as command not found), or if the archive function emits an
+    <code class="literal">ERROR</code> or <code class="literal">FATAL</code>, the archiver process
+    aborts and gets restarted by the postmaster. In such cases, the failure is
+    not reported in <a class="xref" href="monitoring-stats.html#PG-STAT-ARCHIVER-VIEW" title="Table 28.22. pg_stat_archiver View">pg_stat_archiver</a>.
+   </p><p>
+    Archive commands and libraries should generally be designed to refuse to overwrite
+    any pre-existing archive file.  This is an important safety feature to
+    preserve the integrity of your archive in case of administrator error
+    (such as sending the output of two different servers to the same archive
+    directory).
+   </p><p>
+    It is advisable to test your proposed archive command or library to ensure that it
+    indeed does not overwrite an existing file, <span class="emphasis"><em>and that it returns
+    nonzero status or <code class="literal">false</code>, respectively, in this case</em></span>.
+    The example command above for Unix ensures this by including a separate
+    <code class="command">test</code> step.  On some Unix platforms, <code class="command">cp</code> has
+    switches such as <code class="option">-i</code> that can be used to do the same thing
+    less verbosely, but you should not rely on these without verifying that
+    the right exit status is returned.  (In particular, GNU <code class="command">cp</code>
+    will return status zero when <code class="option">-i</code> is used and the target file
+    already exists, which is <span class="emphasis"><em>not</em></span> the desired behavior.)
+   </p><p>
+    While designing your archiving setup, consider what will happen if
+    the archive command or library fails repeatedly because some aspect requires
+    operator intervention or the archive runs out of space. For example, this
+    could occur if you write to tape without an autochanger; when the tape
+    fills, nothing further can be archived until the tape is swapped.
+    You should ensure that any error condition or request to a human operator
+    is reported appropriately so that the situation can be
+    resolved reasonably quickly. The <code class="filename">pg_wal/</code> directory will
+    continue to fill with WAL segment files until the situation is resolved.
+    (If the file system containing <code class="filename">pg_wal/</code> fills up,
+    <span class="productname">PostgreSQL</span> will do a PANIC shutdown.  No committed
+    transactions will be lost, but the database will remain offline until
+    you free some space.)
+   </p><p>
+    The speed of the archive command or library is unimportant as long as it can keep up
+    with the average rate at which your server generates WAL data.  Normal
+    operation continues even if the archiving process falls a little behind.
+    If archiving falls significantly behind, this will increase the amount of
+    data that would be lost in the event of a disaster. It will also mean that
+    the <code class="filename">pg_wal/</code> directory will contain large numbers of
+    not-yet-archived segment files, which could eventually exceed available
+    disk space. You are advised to monitor the archiving process to ensure that
+    it is working as you intend.
+   </p><p>
+    In writing your archive command or library, you should assume that the file names to
+    be archived can be up to 64 characters long and can contain any
+    combination of ASCII letters, digits, and dots.  It is not necessary to
+    preserve the original relative path (<code class="literal">%p</code>) but it is necessary to
+    preserve the file name (<code class="literal">%f</code>).
+   </p><p>
+    Note that although WAL archiving will allow you to restore any
+    modifications made to the data in your <span class="productname">PostgreSQL</span> database,
+    it will not restore changes made to configuration files (that is,
+    <code class="filename">postgresql.conf</code>, <code class="filename">pg_hba.conf</code> and
+    <code class="filename">pg_ident.conf</code>), since those are edited manually rather
+    than through SQL operations.
+    You might wish to keep the configuration files in a location that will
+    be backed up by your regular file system backup procedures.  See
+    <a class="xref" href="runtime-config-file-locations.html" title="20.2. File Locations">Section 20.2</a> for how to relocate the
+    configuration files.
+   </p><p>
+    The archive command or function is only invoked on completed WAL segments.  Hence,
+    if your server generates only little WAL traffic (or has slack periods
+    where it does so), there could be a long delay between the completion
+    of a transaction and its safe recording in archive storage.  To put
+    a limit on how old unarchived data can be, you can set
+    <a class="xref" href="runtime-config-wal.html#GUC-ARCHIVE-TIMEOUT">archive_timeout</a> to force the server to switch
+    to a new WAL segment file at least that often.  Note that archived
+    files that are archived early due to a forced switch are still the same
+    length as completely full files.  It is therefore unwise to set a very
+    short <code class="varname">archive_timeout</code> — it will bloat your archive
+    storage.  <code class="varname">archive_timeout</code> settings of a minute or so are
+    usually reasonable.
+   </p><p>
+    Also, you can force a segment switch manually with
+    <code class="function">pg_switch_wal</code> if you want to ensure that a
+    just-finished transaction is archived as soon as possible.  Other utility
+    functions related to WAL management are listed in <a class="xref" href="functions-admin.html#FUNCTIONS-ADMIN-BACKUP-TABLE" title="Table 9.89. Backup Control Functions">Table 9.89</a>.
+   </p><p>
+    When <code class="varname">wal_level</code> is <code class="literal">minimal</code> some SQL commands
+    are optimized to avoid WAL logging, as described in <a class="xref" href="populate.html#POPULATE-PITR" title="14.4.7. Disable WAL Archival and Streaming Replication">Section 14.4.7</a>.  If archiving or streaming replication were
+    turned on during execution of one of these statements, WAL would not
+    contain enough information for archive recovery.  (Crash recovery is
+    unaffected.)  For this reason, <code class="varname">wal_level</code> can only be changed at
+    server start.  However, <code class="varname">archive_command</code> and <code class="varname">archive_library</code> can be changed with a
+    configuration file reload.  If you are archiving via shell and wish to
+    temporarily stop archiving,
+    one way to do it is to set <code class="varname">archive_command</code> to the empty
+    string (<code class="literal">''</code>).
+    This will cause WAL files to accumulate in <code class="filename">pg_wal/</code> until a
+    working <code class="varname">archive_command</code> is re-established.
+   </p></div><div class="sect2" id="BACKUP-BASE-BACKUP"><div class="titlepage"><div><div><h3 class="title">26.3.2. Making a Base Backup</h3></div></div></div><p>
+    The easiest way to perform a base backup is to use the
+    <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a> tool. It can create
+    a base backup either as regular files or as a tar archive. If more
+    flexibility than <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a> can provide is
+    required, you can also make a base backup using the low level API
+    (see <a class="xref" href="continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP" title="26.3.3. Making a Base Backup Using the Low Level API">Section 26.3.3</a>).
+   </p><p>
+    It is not necessary to be concerned about the amount of time it takes
+    to make a base backup. However, if you normally run the
+    server with <code class="varname">full_page_writes</code> disabled, you might notice a drop
+    in performance while the backup runs since <code class="varname">full_page_writes</code> is
+    effectively forced on during backup mode.
+   </p><p>
+    To make use of the backup, you will need to keep all the WAL
+    segment files generated during and after the file system backup.
+    To aid you in doing this, the base backup process
+    creates a <em class="firstterm">backup history file</em> that is immediately
+    stored into the WAL archive area. This file is named after the first
+    WAL segment file that you need for the file system backup.
+    For example, if the starting WAL file is
+    <code class="literal">0000000100001234000055CD</code> the backup history file will be
+    named something like
+    <code class="literal">0000000100001234000055CD.007C9330.backup</code>. (The second
+    part of the file name stands for an exact position within the WAL
+    file, and can ordinarily be ignored.) Once you have safely archived
+    the file system backup and the WAL segment files used during the
+    backup (as specified in the backup history file), all archived WAL
+    segments with names numerically less are no longer needed to recover
+    the file system backup and can be deleted. However, you should
+    consider keeping several backup sets to be absolutely certain that
+    you can recover your data.
+   </p><p>
+    The backup history file is just a small text file. It contains the
+    label string you gave to <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a>, as well as
+    the starting and ending times and WAL segments of the backup.
+    If you used the label to identify the associated dump file,
+    then the archived history file is enough to tell you which dump file to
+    restore.
+   </p><p>
+    Since you have to keep around all the archived WAL files back to your
+    last base backup, the interval between base backups should usually be
+    chosen based on how much storage you want to expend on archived WAL
+    files.  You should also consider how long you are prepared to spend
+    recovering, if recovery should be necessary — the system will have to
+    replay all those WAL segments, and that could take awhile if it has
+    been a long time since the last base backup.
+   </p></div><div class="sect2" id="BACKUP-LOWLEVEL-BASE-BACKUP"><div class="titlepage"><div><div><h3 class="title">26.3.3. Making a Base Backup Using the Low Level API</h3></div></div></div><p>
+    The procedure for making a base backup using the low level
+    APIs contains a few more steps than
+    the <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a> method, but is relatively
+    simple. It is very important that these steps are executed in
+    sequence, and that the success of a step is verified before
+    proceeding to the next step.
+   </p><p>
+     Multiple backups are able to be run concurrently (both those
+     started using this backup API and those started using
+     <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a>).
+    </p><p>
+  </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>
+     Ensure that WAL archiving is enabled and working.
+    </p></li><li class="listitem"><p>
+     Connect to the server (it does not matter which database) as a user with
+     rights to run <code class="function">pg_backup_start</code> (superuser,
+     or a user who has been granted <code class="literal">EXECUTE</code> on the
+     function) and issue the command:
+</p><pre class="programlisting">
+SELECT pg_backup_start(label =&gt; 'label', fast =&gt; false);
+</pre><p>
+     where <code class="literal">label</code> is any string you want to use to uniquely
+     identify this backup operation. The connection
+     calling <code class="function">pg_backup_start</code> must be maintained until the end of
+     the backup, or the backup will be automatically aborted.
+    </p><p>
+     Online backups are always started at the beginning of a checkpoint.
+     By default, <code class="function">pg_backup_start</code> will wait for the next
+     regularly scheduled checkpoint to complete, which may take a long time (see the
+     configuration parameters <a class="xref" href="runtime-config-wal.html#GUC-CHECKPOINT-TIMEOUT">checkpoint_timeout</a> and
+     <a class="xref" href="runtime-config-wal.html#GUC-CHECKPOINT-COMPLETION-TARGET">checkpoint_completion_target</a>).  This is
+     usually preferable as it minimizes the impact on the running system.  If you
+     want to start the backup as soon as possible, pass <code class="literal">true</code> as
+     the second parameter to <code class="function">pg_backup_start</code> and it will
+     request an immediate checkpoint, which will finish as fast as possible using
+     as much I/O as possible.
+    </p></li><li class="listitem"><p>
+     Perform the backup, using any convenient file-system-backup tool
+     such as <span class="application">tar</span> or <span class="application">cpio</span> (not
+     <span class="application">pg_dump</span> or
+     <span class="application">pg_dumpall</span>).  It is neither
+     necessary nor desirable to stop normal operation of the database
+     while you do this. See
+     <a class="xref" href="continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-DATA" title="26.3.3.1. Backing Up the Data Directory">Section 26.3.3.1</a> for things to
+     consider during this backup.
+    </p></li><li class="listitem"><p>
+     In the same connection as before, issue the command:
+</p><pre class="programlisting">
+SELECT * FROM pg_backup_stop(wait_for_archive =&gt; true);
+</pre><p>
+     This terminates backup mode. On a primary, it also performs an automatic
+     switch to the next WAL segment.  On a standby, it is not possible to
+     automatically switch WAL segments, so you may wish to run
+     <code class="function">pg_switch_wal</code> on the primary to perform a manual
+     switch. The reason for the switch is to arrange for
+     the last WAL segment file written during the backup interval to be
+     ready to archive.
+    </p><p>
+     <code class="function">pg_backup_stop</code> will return one row with three
+     values. The second of these fields should be written to a file named
+     <code class="filename">backup_label</code> in the root directory of the backup. The
+     third field should be written to a file named
+     <code class="filename">tablespace_map</code> unless the field is empty. These files are
+     vital to the backup working and must be written byte for byte without
+     modification, which may require opening the file in binary mode.
+    </p></li><li class="listitem"><p>
+     Once the WAL segment files active during the backup are archived, you are
+     done.  The file identified by <code class="function">pg_backup_stop</code>'s first return
+     value is the last segment that is required to form a complete set of
+     backup files.  On a primary, if <code class="varname">archive_mode</code> is enabled and the
+     <code class="literal">wait_for_archive</code> parameter is <code class="literal">true</code>,
+     <code class="function">pg_backup_stop</code> does not return until the last segment has
+     been archived.
+     On a standby, <code class="varname">archive_mode</code> must be <code class="literal">always</code> in order
+     for <code class="function">pg_backup_stop</code> to wait.
+     Archiving of these files happens automatically since you have
+     already configured <code class="varname">archive_command</code> or <code class="varname">archive_library</code>.
+     In most cases this happens quickly, but you are advised to monitor your
+     archive system to ensure there are no delays.
+     If the archive process has fallen behind because of failures of the
+     archive command or library, it will keep retrying
+     until the archive succeeds and the backup is complete.
+     If you wish to place a time limit on the execution of
+     <code class="function">pg_backup_stop</code>, set an appropriate
+     <code class="varname">statement_timeout</code> value, but make note that if
+     <code class="function">pg_backup_stop</code> terminates because of this your backup
+     may not be valid.
+    </p><p>
+     If the backup process monitors and ensures that all WAL segment files
+     required for the backup are successfully archived then the
+     <code class="literal">wait_for_archive</code> parameter (which defaults to true) can be set
+     to false to have
+     <code class="function">pg_backup_stop</code> return as soon as the stop backup record is
+     written to the WAL.  By default, <code class="function">pg_backup_stop</code> will wait
+     until all WAL has been archived, which can take some time.  This option
+     must be used with caution: if WAL archiving is not monitored correctly
+     then the backup might not include all of the WAL files and will
+     therefore be incomplete and not able to be restored.
+    </p></li></ol></div><p>
+    </p><div class="sect3" id="BACKUP-LOWLEVEL-BASE-BACKUP-DATA"><div class="titlepage"><div><div><h4 class="title">26.3.3.1. Backing Up the Data Directory</h4></div></div></div><p>
+    Some file system backup tools emit warnings or errors
+    if the files they are trying to copy change while the copy proceeds.
+    When taking a base backup of an active database, this situation is normal
+    and not an error.  However, you need to ensure that you can distinguish
+    complaints of this sort from real errors.  For example, some versions
+    of <span class="application">rsync</span> return a separate exit code for
+    <span class="quote">“<span class="quote">vanished source files</span>”</span>, and you can write a driver script to
+    accept this exit code as a non-error case.  Also, some versions of
+    GNU <span class="application">tar</span> return an error code indistinguishable from
+    a fatal error if a file was truncated while <span class="application">tar</span> was
+    copying it.  Fortunately, GNU <span class="application">tar</span> versions 1.16 and
+    later exit with 1 if a file was changed during the backup,
+    and 2 for other errors.  With GNU <span class="application">tar</span> version 1.23 and
+    later, you can use the warning options <code class="literal">--warning=no-file-changed
+    --warning=no-file-removed</code> to hide the related warning messages.
+   </p><p>
+    Be certain that your backup includes all of the files under
+    the database cluster directory (e.g., <code class="filename">/usr/local/pgsql/data</code>).
+    If you are using tablespaces that do not reside underneath this directory,
+    be careful to include them as well (and be sure that your backup
+    archives symbolic links as links, otherwise the restore will corrupt
+    your tablespaces).
+   </p><p>
+    You should, however, omit from the backup the files within the
+    cluster's <code class="filename">pg_wal/</code> subdirectory.  This
+    slight adjustment is worthwhile because it reduces the risk
+    of mistakes when restoring.  This is easy to arrange if
+    <code class="filename">pg_wal/</code> is a symbolic link pointing to someplace outside
+    the cluster directory, which is a common setup anyway for performance
+    reasons.  You might also want to exclude <code class="filename">postmaster.pid</code>
+    and <code class="filename">postmaster.opts</code>, which record information
+    about the running <span class="application">postmaster</span>, not about the
+    <span class="application">postmaster</span> which will eventually use this backup.
+    (These files can confuse <span class="application">pg_ctl</span>.)
+   </p><p>
+    It is often a good idea to also omit from the backup the files
+    within the cluster's <code class="filename">pg_replslot/</code> directory, so that
+    replication slots that exist on the primary do not become part of the
+    backup.  Otherwise, the subsequent use of the backup to create a standby
+    may result in indefinite retention of WAL files on the standby, and
+    possibly bloat on the primary if hot standby feedback is enabled, because
+    the clients that are using those replication slots will still be connecting
+    to and updating the slots on the primary, not the standby.  Even if the
+    backup is only intended for use in creating a new primary, copying the
+    replication slots isn't expected to be particularly useful, since the
+    contents of those slots will likely be badly out of date by the time
+    the new primary comes on line.
+   </p><p>
+    The contents of the directories <code class="filename">pg_dynshmem/</code>,
+    <code class="filename">pg_notify/</code>, <code class="filename">pg_serial/</code>,
+    <code class="filename">pg_snapshots/</code>, <code class="filename">pg_stat_tmp/</code>,
+    and <code class="filename">pg_subtrans/</code> (but not the directories themselves) can be
+    omitted from the backup as they will be initialized on postmaster startup.
+   </p><p>
+    Any file or directory beginning with <code class="filename">pgsql_tmp</code> can be
+    omitted from the backup.  These files are removed on postmaster start and
+    the directories will be recreated as needed.
+   </p><p>
+    <code class="filename">pg_internal.init</code> files can be omitted from the
+    backup whenever a file of that name is found.  These files contain
+    relation cache data that is always rebuilt when recovering.
+   </p><p>
+    The backup label
+    file includes the label string you gave to <code class="function">pg_backup_start</code>,
+    as well as the time at which <code class="function">pg_backup_start</code> was run, and
+    the name of the starting WAL file.  In case of confusion it is therefore
+    possible to look inside a backup file and determine exactly which
+    backup session the dump file came from.  The tablespace map file includes
+    the symbolic link names as they exist in the directory
+    <code class="filename">pg_tblspc/</code> and the full path of each symbolic link.
+    These files are not merely for your information; their presence and
+    contents are critical to the proper operation of the system's recovery
+    process.
+   </p><p>
+    It is also possible to make a backup while the server is
+    stopped.  In this case, you obviously cannot use
+    <code class="function">pg_backup_start</code> or <code class="function">pg_backup_stop</code>, and
+    you will therefore be left to your own devices to keep track of which
+    backup is which and how far back the associated WAL files go.
+    It is generally better to follow the continuous archiving procedure above.
+   </p></div></div><div class="sect2" id="BACKUP-PITR-RECOVERY"><div class="titlepage"><div><div><h3 class="title">26.3.4. Recovering Using a Continuous Archive Backup</h3></div></div></div><p>
+    Okay, the worst has happened and you need to recover from your backup.
+    Here is the procedure:
+  </p><div class="orderedlist"><ol class="orderedlist" type="1"><li class="listitem"><p>
+     Stop the server, if it's running.
+    </p></li><li class="listitem"><p>
+     If you have the space to do so,
+     copy the whole cluster data directory and any tablespaces to a temporary
+     location in case you need them later. Note that this precaution will
+     require that you have enough free space on your system to hold two
+     copies of your existing database. If you do not have enough space,
+     you should at least save the contents of the cluster's <code class="filename">pg_wal</code>
+     subdirectory, as it might contain logs which
+     were not archived before the system went down.
+    </p></li><li class="listitem"><p>
+     Remove all existing files and subdirectories under the cluster data
+     directory and under the root directories of any tablespaces you are using.
+    </p></li><li class="listitem"><p>
+     Restore the database files from your file system backup.  Be sure that they
+     are restored with the right ownership (the database system user, not
+     <code class="literal">root</code>!) and with the right permissions.  If you are using
+     tablespaces,
+     you should verify that the symbolic links in <code class="filename">pg_tblspc/</code>
+     were correctly restored.
+    </p></li><li class="listitem"><p>
+     Remove any files present in <code class="filename">pg_wal/</code>; these came from the
+     file system backup and are therefore probably obsolete rather than current.
+     If you didn't archive <code class="filename">pg_wal/</code> at all, then recreate
+     it with proper permissions,
+     being careful to ensure that you re-establish it as a symbolic link
+     if you had it set up that way before.
+    </p></li><li class="listitem"><p>
+     If you have unarchived WAL segment files that you saved in step 2,
+     copy them into <code class="filename">pg_wal/</code>.  (It is best to copy them,
+     not move them, so you still have the unmodified files if a
+     problem occurs and you have to start over.)
+    </p></li><li class="listitem"><p>
+     Set recovery configuration settings in
+     <code class="filename">postgresql.conf</code> (see <a class="xref" href="runtime-config-wal.html#RUNTIME-CONFIG-WAL-ARCHIVE-RECOVERY" title="20.5.5. Archive Recovery">Section 20.5.5</a>) and create a file
+     <code class="filename">recovery.signal</code> in the cluster
+     data directory. You might
+     also want to temporarily modify <code class="filename">pg_hba.conf</code> to prevent
+     ordinary users from connecting until you are sure the recovery was successful.
+    </p></li><li class="listitem"><p>
+     Start the server.  The server will go into recovery mode and
+     proceed to read through the archived WAL files it needs.  Should the
+     recovery be terminated because of an external error, the server can
+     simply be restarted and it will continue recovery.  Upon completion
+     of the recovery process, the server will remove
+     <code class="filename">recovery.signal</code> (to prevent
+     accidentally re-entering recovery mode later) and then
+     commence normal database operations.
+    </p></li><li class="listitem"><p>
+     Inspect the contents of the database to ensure you have recovered to
+     the desired state.  If not, return to step 1.  If all is well,
+     allow your users to connect by restoring <code class="filename">pg_hba.conf</code> to normal.
+    </p></li></ol></div><p>
+   </p><p>
+    The key part of all this is to set up a recovery configuration that
+    describes how you want to recover and how far the recovery should
+    run.  The one thing that you absolutely must specify is the <code class="varname">restore_command</code>,
+    which tells <span class="productname">PostgreSQL</span> how to retrieve archived
+    WAL file segments.  Like the <code class="varname">archive_command</code>, this is
+    a shell command string.  It can contain <code class="literal">%f</code>, which is
+    replaced by the name of the desired log file, and <code class="literal">%p</code>,
+    which is replaced by the path name to copy the log file to.
+    (The path name is relative to the current working directory,
+    i.e., the cluster's data directory.)
+    Write <code class="literal">%%</code> if you need to embed an actual <code class="literal">%</code>
+    character in the command.  The simplest useful command is
+    something like:
+</p><pre class="programlisting">
+restore_command = 'cp /mnt/server/archivedir/%f %p'
+</pre><p>
+    which will copy previously archived WAL segments from the directory
+    <code class="filename">/mnt/server/archivedir</code>.  Of course, you can use something
+    much more complicated, perhaps even a shell script that requests the
+    operator to mount an appropriate tape.
+   </p><p>
+    It is important that the command return nonzero exit status on failure.
+    The command <span class="emphasis"><em>will</em></span> be called requesting files that are not
+    present in the archive; it must return nonzero when so asked.  This is not
+    an error condition.  An exception is that if the command was terminated by
+    a signal (other than <span class="systemitem">SIGTERM</span>, which is used as
+    part of a database server shutdown) or an error by the shell (such as
+    command not found), then recovery will abort and the server will not start
+    up.
+   </p><p>
+    Not all of the requested files will be WAL segment
+    files; you should also expect requests for files with a suffix of
+    <code class="literal">.history</code>. Also be aware that
+    the base name of the <code class="literal">%p</code> path will be different from
+    <code class="literal">%f</code>; do not expect them to be interchangeable.
+   </p><p>
+    WAL segments that cannot be found in the archive will be sought in
+    <code class="filename">pg_wal/</code>; this allows use of recent un-archived segments.
+    However, segments that are available from the archive will be used in
+    preference to files in <code class="filename">pg_wal/</code>.
+   </p><p>
+    Normally, recovery will proceed through all available WAL segments,
+    thereby restoring the database to the current point in time (or as
+    close as possible given the available WAL segments).  Therefore, a normal
+    recovery will end with a <span class="quote">“<span class="quote">file not found</span>”</span> message, the exact text
+    of the error message depending upon your choice of
+    <code class="varname">restore_command</code>.  You may also see an error message
+    at the start of recovery for a file named something like
+    <code class="filename">00000001.history</code>.  This is also normal and does not
+    indicate a problem in simple recovery situations; see
+    <a class="xref" href="continuous-archiving.html#BACKUP-TIMELINES" title="26.3.5. Timelines">Section 26.3.5</a> for discussion.
+   </p><p>
+    If you want to recover to some previous point in time (say, right before
+    the junior DBA dropped your main transaction table), just specify the
+    required <a class="link" href="runtime-config-wal.html#RUNTIME-CONFIG-WAL-RECOVERY-TARGET" title="20.5.6. Recovery Target">stopping point</a>.  You can specify
+    the stop point, known as the <span class="quote">“<span class="quote">recovery target</span>”</span>, either by
+    date/time, named restore point or by completion of a specific transaction
+    ID.  As of this writing only the date/time and named restore point options
+    are very usable, since there are no tools to help you identify with any
+    accuracy which transaction ID to use.
+   </p><div class="note"><h3 class="title">Note</h3><p>
+      The stop point must be after the ending time of the base backup, i.e.,
+      the end time of <code class="function">pg_backup_stop</code>.  You cannot use a base backup
+      to recover to a time when that backup was in progress.  (To
+      recover to such a time, you must go back to your previous base backup
+      and roll forward from there.)
+     </p></div><p>
+    If recovery finds corrupted WAL data, recovery will
+    halt at that point and the server will not start. In such a case the
+    recovery process could be re-run from the beginning, specifying a
+    <span class="quote">“<span class="quote">recovery target</span>”</span> before the point of corruption so that recovery
+    can complete normally.
+    If recovery fails for an external reason, such as a system crash or
+    if the WAL archive has become inaccessible, then the recovery can simply
+    be restarted and it will restart almost from where it failed.
+    Recovery restart works much like checkpointing in normal operation:
+    the server periodically forces all its state to disk, and then updates
+    the <code class="filename">pg_control</code> file to indicate that the already-processed
+    WAL data need not be scanned again.
+   </p></div><div class="sect2" id="BACKUP-TIMELINES"><div class="titlepage"><div><div><h3 class="title">26.3.5. Timelines</h3></div></div></div><a id="id-1.6.13.7.13.2" class="indexterm"></a><p>
+    The ability to restore the database to a previous point in time creates
+    some complexities that are akin to science-fiction stories about time
+    travel and parallel universes.  For example, in the original history of the database,
+    suppose you dropped a critical table at 5:15PM on Tuesday evening, but
+    didn't realize your mistake until Wednesday noon.
+    Unfazed, you get out your backup, restore to the point-in-time 5:14PM
+    Tuesday evening, and are up and running.  In <span class="emphasis"><em>this</em></span> history of
+    the database universe, you never dropped the table.  But suppose
+    you later realize this wasn't such a great idea, and would like
+    to return to sometime Wednesday morning in the original history.
+    You won't be able
+    to if, while your database was up-and-running, it overwrote some of the
+    WAL segment files that led up to the time you now wish you
+    could get back to.  Thus, to avoid this, you need to distinguish the series of
+    WAL records generated after you've done a point-in-time recovery from
+    those that were generated in the original database history.
+   </p><p>
+    To deal with this problem, <span class="productname">PostgreSQL</span> has a notion
+    of <em class="firstterm">timelines</em>.  Whenever an archive recovery completes,
+    a new timeline is created to identify the series of WAL records
+    generated after that recovery.  The timeline
+    ID number is part of WAL segment file names so a new timeline does
+    not overwrite the WAL data generated by previous timelines.  It is
+    in fact possible to archive many different timelines.  While that might
+    seem like a useless feature, it's often a lifesaver.  Consider the
+    situation where you aren't quite sure what point-in-time to recover to,
+    and so have to do several point-in-time recoveries by trial and error
+    until you find the best place to branch off from the old history.  Without
+    timelines this process would soon generate an unmanageable mess.  With
+    timelines, you can recover to <span class="emphasis"><em>any</em></span> prior state, including
+    states in timeline branches that you abandoned earlier.
+   </p><p>
+    Every time a new timeline is created, <span class="productname">PostgreSQL</span> creates
+    a <span class="quote">“<span class="quote">timeline history</span>”</span> file that shows which timeline it branched
+    off from and when.  These history files are necessary to allow the system
+    to pick the right WAL segment files when recovering from an archive that
+    contains multiple timelines.  Therefore, they are archived into the WAL
+    archive area just like WAL segment files.  The history files are just
+    small text files, so it's cheap and appropriate to keep them around
+    indefinitely (unlike the segment files which are large).  You can, if
+    you like, add comments to a history file to record your own notes about
+    how and why this particular timeline was created.  Such comments will be
+    especially valuable when you have a thicket of different timelines as
+    a result of experimentation.
+   </p><p>
+    The default behavior of recovery is to recover to the latest timeline found
+    in the archive. If you wish to recover to the timeline that was current
+    when the base backup was taken or into a specific child timeline (that
+    is, you want to return to some state that was itself generated after a
+    recovery attempt), you need to specify <code class="literal">current</code> or the
+    target timeline ID in <a class="xref" href="runtime-config-wal.html#GUC-RECOVERY-TARGET-TIMELINE">recovery_target_timeline</a>. You
+    cannot recover into timelines that branched off earlier than the base backup.
+   </p></div><div class="sect2" id="BACKUP-TIPS"><div class="titlepage"><div><div><h3 class="title">26.3.6. Tips and Examples</h3></div></div></div><p>
+    Some tips for configuring continuous archiving are given here.
+   </p><div class="sect3" id="BACKUP-STANDALONE"><div class="titlepage"><div><div><h4 class="title">26.3.6.1. Standalone Hot Backups</h4></div></div></div><p>
+      It is possible to use <span class="productname">PostgreSQL</span>'s backup facilities to
+      produce standalone hot backups. These are backups that cannot be used
+      for point-in-time recovery, yet are typically much faster to backup and
+      restore than <span class="application">pg_dump</span> dumps.  (They are also much larger
+      than <span class="application">pg_dump</span> dumps, so in some cases the speed advantage
+      might be negated.)
+     </p><p>
+      As with base backups, the easiest way to produce a standalone
+      hot backup is to use the <a class="xref" href="app-pgbasebackup.html" title="pg_basebackup"><span class="refentrytitle"><span class="application">pg_basebackup</span></span></a>
+      tool. If you include the <code class="literal">-X</code> parameter when calling
+      it, all the write-ahead log required to use the backup will be
+      included in the backup automatically, and no special action is
+      required to restore the backup.
+     </p></div><div class="sect3" id="COMPRESSED-ARCHIVE-LOGS"><div class="titlepage"><div><div><h4 class="title">26.3.6.2. Compressed Archive Logs</h4></div></div></div><p>
+      If archive storage size is a concern, you can use
+      <span class="application">gzip</span> to compress the archive files:
+</p><pre class="programlisting">
+archive_command = 'gzip &lt; %p &gt; /mnt/server/archivedir/%f.gz'
+</pre><p>
+      You will then need to use <span class="application">gunzip</span> during recovery:
+</p><pre class="programlisting">
+restore_command = 'gunzip &lt; /mnt/server/archivedir/%f.gz &gt; %p'
+</pre><p>
+     </p></div><div class="sect3" id="BACKUP-SCRIPTS"><div class="titlepage"><div><div><h4 class="title">26.3.6.3. <code class="varname">archive_command</code> Scripts</h4></div></div></div><p>
+      Many people choose to use scripts to define their
+      <code class="varname">archive_command</code>, so that their
+      <code class="filename">postgresql.conf</code> entry looks very simple:
+</p><pre class="programlisting">
+archive_command = 'local_backup_script.sh "%p" "%f"'
+</pre><p>
+      Using a separate script file is advisable any time you want to use
+      more than a single command in the archiving process.
+      This allows all complexity to be managed within the script, which
+      can be written in a popular scripting language such as
+      <span class="application">bash</span> or <span class="application">perl</span>.
+     </p><p>
+      Examples of requirements that might be solved within a script include:
+      </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+         Copying data to secure off-site data storage
+        </p></li><li class="listitem"><p>
+         Batching WAL files so that they are transferred every three hours,
+         rather than one at a time
+        </p></li><li class="listitem"><p>
+         Interfacing with other backup and recovery software
+        </p></li><li class="listitem"><p>
+         Interfacing with monitoring software to report errors
+        </p></li></ul></div><p>
+     </p><div class="tip"><h3 class="title">Tip</h3><p>
+       When using an <code class="varname">archive_command</code> script, it's desirable
+       to enable <a class="xref" href="runtime-config-logging.html#GUC-LOGGING-COLLECTOR">logging_collector</a>.
+       Any messages written to <span class="systemitem">stderr</span> from the script will then
+       appear in the database server log, allowing complex configurations to
+       be diagnosed easily if they fail.
+      </p></div></div></div><div class="sect2" id="CONTINUOUS-ARCHIVING-CAVEATS"><div class="titlepage"><div><div><h3 class="title">26.3.7. Caveats</h3></div></div></div><p>
+    At this writing, there are several limitations of the continuous archiving
+    technique.  These will probably be fixed in future releases:
+
+  </p><div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; "><li class="listitem"><p>
+     If a <a class="link" href="sql-createdatabase.html" title="CREATE DATABASE"><code class="command">CREATE DATABASE</code></a>
+     command is executed while a base backup is being taken, and then
+     the template database that the <code class="command">CREATE DATABASE</code> copied
+     is modified while the base backup is still in progress, it is
+     possible that recovery will cause those modifications to be
+     propagated into the created database as well.  This is of course
+     undesirable.  To avoid this risk, it is best not to modify any
+     template databases while taking a base backup.
+    </p></li><li class="listitem"><p>
+     <a class="link" href="sql-createtablespace.html" title="CREATE TABLESPACE"><code class="command">CREATE TABLESPACE</code></a>
+     commands are WAL-logged with the literal absolute path, and will
+     therefore be replayed as tablespace creations with the same
+     absolute path.  This might be undesirable if the log is being
+     replayed on a different machine.  It can be dangerous even if the
+     log is being replayed on the same machine, but into a new data
+     directory: the replay will still overwrite the contents of the
+     original tablespace.  To avoid potential gotchas of this sort,
+     the best practice is to take a new base backup after creating or
+     dropping tablespaces.
+    </p></li></ul></div><p>
+   </p><p>
+    It should also be noted that the default <acronym class="acronym">WAL</acronym>
+    format is fairly bulky since it includes many disk page snapshots.
+    These page snapshots are designed to support crash recovery, since
+    we might need to fix partially-written disk pages.  Depending on
+    your system hardware and software, the risk of partial writes might
+    be small enough to ignore, in which case you can significantly
+    reduce the total volume of archived logs by turning off page
+    snapshots using the <a class="xref" href="runtime-config-wal.html#GUC-FULL-PAGE-WRITES">full_page_writes</a>
+    parameter.  (Read the notes and warnings in <a class="xref" href="wal.html" title="Chapter 30. Reliability and the Write-Ahead Log">Chapter 30</a>
+    before you do so.)  Turning off page snapshots does not prevent
+    use of the logs for PITR operations.  An area for future
+    development is to compress archived WAL data by removing
+    unnecessary page copies even when <code class="varname">full_page_writes</code> is
+    on.  In the meantime, administrators might wish to reduce the number
+    of page snapshots included in WAL by increasing the checkpoint
+    interval parameters as much as feasible.
+   </p></div></div><div class="navfooter"><hr /><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="backup-file.html" title="26.2. File System Level Backup">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="backup.html" title="Chapter 26. Backup and Restore">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="high-availability.html" title="Chapter 27. High Availability, Load Balancing, and Replication">Next</a></td></tr><tr><td width="40%" align="left" valign="top">26.2. File System Level Backup </td><td width="20%" align="center"><a accesskey="h" href="index.html" title="PostgreSQL 15.5 Documentation">Home</a></td><td width="40%" align="right" valign="top"> Chapter 27. High Availability, Load Balancing, and Replication</td></tr></table></div></body></html>
+\ No newline at end of file