<!-- doc/src/sgml/backup-manifest.sgml -->

<chapter id="backup-manifest-format">
 <title>Backup Manifest Format</title>

  <indexterm>
   <primary>Backup Manifest</primary>
  </indexterm>

  <para>
   The backup manifest generated by <xref linkend="app-pgbasebackup" /> is
   primarily intended to permit the backup to be verified using
   <xref linkend="app-pgverifybackup" />. However, it is
   also possible for other tools to read the backup manifest file and use
   the information contained therein for their own purposes. To that end,
   this chapter describes the format of the backup manifest file.
  </para>

  <para>
   A backup manifest is a JSON document encoded as UTF-8. (Although in
   general JSON documents are required to be Unicode, PostgreSQL permits
   the <type>json</type> and <type>jsonb</type> data types to be used with any
   supported server encoding. There is no similar exception for backup
   manifests.) The JSON document is always an object; the keys that are present
   in this object are described in the next section.
  </para>
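
  <para>
   As an illustration only, a client written in Python might load a manifest
   along the following lines. The function name
   <function>load_manifest</function> is hypothetical and is not part of any
   <productname>PostgreSQL</productname> tool. The sketch decodes the input
   strictly as UTF-8 and rejects a document whose top-level value is not an
   object.
  </para>

<programlisting><![CDATA[
```python
import json


def load_manifest(raw: bytes) -> dict:
    """Decode a backup manifest from raw bytes and return the top-level object."""
    # Manifests are always UTF-8, so decode strictly instead of guessing.
    text = raw.decode("utf-8", errors="strict")
    manifest = json.loads(text)
    # The top-level JSON value is expected to be an object.
    if not isinstance(manifest, dict):
        raise ValueError("backup manifest must be a JSON object")
    return manifest
```
]]></programlisting>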

 <sect1 id="backup-manifest-toplevel">
  <title>Backup Manifest Top-level Object</title>

  <para>
   The backup manifest JSON document contains the following keys.
  </para>

  <variablelist>
   <varlistentry>
    <term><literal>PostgreSQL-Backup-Manifest-Version</literal></term>
    <listitem>
     <para>
      The associated value is always the integer 1.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>Files</literal></term>
    <listitem>
     <para>
      The associated value is always a list of objects, each describing one
      file that is present in the backup. No entries are present in this
      list for the WAL files that are needed in order to use the backup,
      or for the backup manifest itself.  The structure of each object in the
      list is described in <xref linkend="backup-manifest-files" />.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>WAL-Ranges</literal></term>
    <listitem>
     <para>
      The associated value is always a list of objects, each describing a
      range of WAL records that must be readable from a particular timeline
      in order to make use of the backup.  The structure of these objects is
      further described in <xref linkend="backup-manifest-wal-ranges" />.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>Manifest-Checksum</literal></term>
    <listitem>
     <para>
      This key is always present on the last line of the backup manifest file.
      The associated value is a SHA256 checksum of all the preceding lines.
      We use a fixed checksum method here to make it possible for clients
      to do incremental parsing of the manifest. Although a SHA256 checksum
      is significantly more expensive to compute than a CRC32C checksum, the
      manifest is normally small enough that the extra cost is negligible.
     </para>
    </listitem>
   </varlistentry>
  </variablelist>
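
  <para>
   The <literal>Manifest-Checksum</literal> rule can be sketched in Python as
   shown below. Several details are assumptions rather than statements of
   this chapter: that the file ends with a newline, that the checksum covers
   every byte up to and including the newline preceding the last line, and
   that the value is stored as hexadecimal digits. The function name
   <function>verify_manifest_checksum</function> is hypothetical. For brevity
   the sketch parses the whole document at once instead of incrementally.
  </para>

<programlisting><![CDATA[
```python
import hashlib
import json


def verify_manifest_checksum(raw: bytes) -> bool:
    """Return True if Manifest-Checksum matches the lines that precede it."""
    # Assumption: the manifest ends with a newline, and the checksum covers
    # every byte up to and including the newline before the last line.
    if not raw.endswith(b"\n"):
        return False
    # Search for the previous newline, ignoring the final one.
    split_at = raw.rfind(b"\n", 0, len(raw) - 1) + 1
    if split_at == 0:
        return False  # a single line, so nothing precedes the checksum
    expected = json.loads(raw).get("Manifest-Checksum")
    if not isinstance(expected, str):
        return False
    # Assumption: the checksum is stored as hexadecimal digits.
    actual = hashlib.sha256(raw[:split_at]).hexdigest()
    return actual == expected.lower()
```
]]></programlisting>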
 </sect1>

 <sect1 id="backup-manifest-files">
  <title>Backup Manifest File Object</title>

  <para>
   The object which describes a single file contains either a
   <literal>Path</literal> key or an <literal>Encoded-Path</literal> key.
   Normally, the <literal>Path</literal> key will be present. The
   associated string value is the path of the file relative to the root
   of the backup directory. Files located in a user-defined tablespace
   will have paths whose first two components are <filename>pg_tblspc</filename> and the OID
   of the tablespace. If the path is not a string that is legal in UTF-8,
   or if the user requests that encoded paths be used for all files, then
   the <literal>Encoded-Path</literal> key will be present instead.  This
   stores the same data, but it is encoded as a string of hexadecimal
   digits. Each pair of hexadecimal digits in the string represents a
   single octet.
  </para>
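
  <para>
   A reader of the manifest has to handle both key variants. The following
   Python sketch returns the path as raw bytes in either case; the function
   name <function>file_path</function> and the parameter
   <literal>entry</literal> (one parsed object from the
   <literal>Files</literal> list) are hypothetical.
  </para>

<programlisting><![CDATA[
```python
def file_path(entry: dict) -> bytes:
    """Return the file's path, relative to the backup root, as raw bytes."""
    if "Path" in entry:
        # A plain Path is a JSON string; encode it back to UTF-8 bytes.
        return entry["Path"].encode("utf-8")
    # Encoded-Path holds two hexadecimal digits per octet.
    return bytes.fromhex(entry["Encoded-Path"])
```
]]></programlisting>

  <para>
   Returning bytes rather than a string keeps the two cases uniform, since
   an encoded path need not be valid UTF-8.
  </para>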

  <para>
   The following two keys are always present:
  </para>

  <variablelist>
   <varlistentry>
    <term><literal>Size</literal></term>
    <listitem>
     <para>
      The expected size of this file, as an integer.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>Last-Modified</literal></term>
    <listitem>
     <para>
      The last modification time of the file as reported by the server at
      the time of the backup. Unlike the other fields stored in the backup
      manifest, this field is not used by
      <xref linkend="app-pgverifybackup" />. It is included only for
      informational purposes.
     </para>
    </listitem>
   </varlistentry>
  </variablelist>

  <para>
   If the backup was taken with file checksums enabled, the following
   keys will be present:
  </para>

  <variablelist>
   <varlistentry>
    <term><literal>Checksum-Algorithm</literal></term>
    <listitem>
     <para>
      The checksum algorithm used to compute a checksum for this file.
      Currently, this will be the same for every file in the backup
      manifest, but this may change in future releases. At present, the
      supported checksum algorithms are <literal>CRC32C</literal>,
      <literal>SHA224</literal>,
      <literal>SHA256</literal>,
      <literal>SHA384</literal>, and
      <literal>SHA512</literal>.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>Checksum</literal></term>
    <listitem>
     <para>
      The checksum computed for this file, stored as a series of
      hexadecimal characters, two for each byte of the checksum.
     </para>
    </listitem>
   </varlistentry>
  </variablelist>
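
  <para>
   As a rough sketch of how a third-party tool might use these keys, the
   Python function below compares a file's contents with its manifest entry.
   The names <function>check_file</function>, <literal>entry</literal>, and
   <literal>data</literal> are hypothetical. <literal>CRC32C</literal> is
   deliberately not handled, because the Python standard library provides no
   CRC-32C implementation, and the sketch reads the whole file into memory
   where a real tool would probably stream it.
  </para>

<programlisting><![CDATA[
```python
import hashlib

# Maps manifest algorithm names to hashlib names. CRC32C is omitted because
# the Python standard library has no CRC-32C implementation.
_HASHLIB_NAMES = {
    "SHA224": "sha224",
    "SHA256": "sha256",
    "SHA384": "sha384",
    "SHA512": "sha512",
}


def check_file(entry: dict, data: bytes) -> bool:
    """Compare a file's contents with the Size and Checksum in its entry."""
    if len(data) != entry["Size"]:
        return False
    algorithm = entry.get("Checksum-Algorithm")
    if algorithm is None:
        return True  # no file checksums in this backup; only Size is checked
    if algorithm not in _HASHLIB_NAMES:
        raise NotImplementedError(f"unsupported checksum algorithm: {algorithm}")
    digest = hashlib.new(_HASHLIB_NAMES[algorithm], data).hexdigest()
    # The manifest stores two hexadecimal characters per checksum byte.
    return digest == entry["Checksum"].lower()
```
]]></programlisting>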
 </sect1>

 <sect1 id="backup-manifest-wal-ranges">
  <title>Backup Manifest WAL Range Object</title>

  <para>
   The object which describes a WAL range always has three keys:
  </para>

  <variablelist>
   <varlistentry>
    <term><literal>Timeline</literal></term>
    <listitem>
     <para>
      The timeline for this range of WAL records, as an integer.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>Start-LSN</literal></term>
    <listitem>
     <para>
      The LSN at which replay must begin on the indicated timeline in order to
      make use of this backup.  The LSN is stored in the format normally used
      by <productname>PostgreSQL</productname>; that is, it is a string
      consisting of two strings of hexadecimal characters, each with a length
      of between 1 and 8, separated by a slash.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>End-LSN</literal></term>
    <listitem>
     <para>
      The earliest LSN at which replay on the indicated timeline may end when
      making use of this backup. This is stored in the same format as
      <literal>Start-LSN</literal>.
     </para>
    </listitem>
   </varlistentry>
  </variablelist>

  <para>
   Ordinarily, there will be only a single WAL range. However, if a backup is
   taken from a standby which switches timelines during the backup due to an
   upstream promotion, it is possible for multiple ranges to be present, each
   with a different timeline. There will never be multiple WAL ranges present
   for the same timeline.
  </para>
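
  <para>
   The LSN format and the one-range-per-timeline rule lend themselves to a
   short Python sketch. The function names <function>parse_lsn</function> and
   <function>check_wal_ranges</function> are hypothetical. The sketch treats
   the part before the slash as the high 32 bits of the LSN and the part
   after it as the low 32 bits.
  </para>

<programlisting><![CDATA[
```python
import string


def parse_lsn(text: str) -> int:
    """Convert an LSN string such as '0/2000028' into a 64-bit integer."""
    high, sep, low = text.partition("/")
    if not sep or not (1 <= len(high) <= 8 and 1 <= len(low) <= 8):
        raise ValueError(f"malformed LSN: {text!r}")
    if any(c not in string.hexdigits for c in high + low):
        raise ValueError(f"malformed LSN: {text!r}")
    # High 32 bits come before the slash, low 32 bits after it.
    return (int(high, 16) << 32) | int(low, 16)


def check_wal_ranges(ranges: list) -> None:
    """Raise ValueError if a timeline appears in more than one WAL range."""
    seen = set()
    for wal_range in ranges:
        timeline = wal_range["Timeline"]
        if timeline in seen:
            raise ValueError(f"duplicate WAL range for timeline {timeline}")
        seen.add(timeline)
        # Parsing both LSNs also validates their format.
        parse_lsn(wal_range["Start-LSN"])
        parse_lsn(wal_range["End-LSN"])
```
]]></programlisting>

  <para>
   Converting LSNs to integers makes them directly comparable, which a tool
   would need in order to decide whether the WAL it has available covers a
   given range.
  </para>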
 </sect1>
</chapter>