summaryrefslogtreecommitdiffstats
path: root/src/bin/pg_upgrade/IMPLEMENTATION
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-05-04 12:15:05 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-05-04 12:15:05 +0000
commit46651ce6fe013220ed397add242004d764fc0153 (patch)
tree6e5299f990f88e60174a1d3ae6e48eedd2688b2b /src/bin/pg_upgrade/IMPLEMENTATION
parentInitial commit. (diff)
downloadpostgresql-14-46651ce6fe013220ed397add242004d764fc0153.tar.xz
postgresql-14-46651ce6fe013220ed397add242004d764fc0153.zip
Adding upstream version 14.5.upstream/14.5upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'src/bin/pg_upgrade/IMPLEMENTATION')
-rw-r--r--src/bin/pg_upgrade/IMPLEMENTATION98
1 files changed, 98 insertions, 0 deletions
diff --git a/src/bin/pg_upgrade/IMPLEMENTATION b/src/bin/pg_upgrade/IMPLEMENTATION
new file mode 100644
index 0000000..69fcd70
--- /dev/null
+++ b/src/bin/pg_upgrade/IMPLEMENTATION
@@ -0,0 +1,98 @@
+------------------------------------------------------------------------------
+PG_UPGRADE: IN-PLACE UPGRADES FOR POSTGRESQL
+------------------------------------------------------------------------------
+
+Upgrading a PostgreSQL database from one major release to another can be
+an expensive process. For minor upgrades, you can simply install new
+executables and forget about upgrading existing data. But for major
+upgrades, you have to export all of your data using pg_dump, install the
+new release, run initdb to create a new cluster, and then import your
+old data. If you have a lot of data, that can take a considerable amount
+of time. If you have too much data, you may have to buy more storage
+since you need enough room to hold the original data plus the exported
+data. pg_upgrade can reduce the amount of time and disk space required
+for many upgrades.
+
+The URL http://momjian.us/main/writings/pgsql/pg_upgrade.pdf contains a
+presentation about pg_upgrade internals that mirrors the text
+description below.
+
+------------------------------------------------------------------------------
+WHAT IT DOES
+------------------------------------------------------------------------------
+
+pg_upgrade is a tool that performs an in-place upgrade of existing
+data. Some upgrades change the on-disk representation of data;
+pg_upgrade cannot help in those upgrades. However, many upgrades do
+not change the on-disk representation of a user-defined table. In those
+cases, pg_upgrade can move existing user-defined tables from the old
+database cluster into the new cluster.
+
+There are two factors that determine whether an in-place upgrade is
+practical.
+
+Every table in a cluster shares the same on-disk representation of the
+table headers and trailers and the on-disk representation of tuple
+headers. If this changes between the old version of PostgreSQL and the
+new version, pg_upgrade cannot move existing tables to the new cluster;
+you will have to pg_dump the old data and then import that data into the
+new cluster.
+
+Second, all data types should have the same binary representation
+between the two major PostgreSQL versions.
+
+------------------------------------------------------------------------------
+HOW IT WORKS
+------------------------------------------------------------------------------
+
+To use pg_upgrade during an upgrade, start by installing a fresh
+cluster using the newest version in a new directory. When you've
+finished installation, the new cluster will contain the new executables
+and the usual template0, template1, and postgres, but no user-defined
+tables. At this point, you can shut down the old and new postmasters and
+invoke pg_upgrade.
+
+When pg_upgrade starts, it ensures that all required executables are
+present and contain the expected version numbers. The verification
+process also checks the old and new $PGDATA directories to ensure that
+the expected files and subdirectories are in place. If the verification
+process succeeds, pg_upgrade starts the old postmaster and runs
+pg_dumpall --schema-only to capture the metadata contained in the old
+cluster. The script produced by pg_dumpall will be used in a later step
+to recreate all user-defined objects in the new cluster.
+
+Note that the script produced by pg_dumpall will only recreate
+user-defined objects, not system-defined objects. The new cluster will
+contain the system-defined objects created by the latest version of
+PostgreSQL.
+
+Once pg_upgrade has extracted the metadata from the old cluster, it
+performs a number of bookkeeping tasks required to 'sync up' the new
+cluster with the existing data.
+
+First, pg_upgrade copies the commit status information and 'next
+transaction ID' from the old cluster to the new cluster. This step
+ensures that the proper tuples are visible from the new cluster.
+Remember, pg_upgrade does not export/import the content of user-defined
+tables so the transaction IDs in the new cluster must match the
+transaction IDs in the old data. pg_upgrade also copies the starting
+address for write-ahead logs from the old cluster to the new cluster.
+
+Now pg_upgrade begins reconstructing the metadata obtained from the old
+cluster using the first part of the pg_dumpall output.
+
+Next, pg_upgrade executes the remainder of the script produced earlier
+by pg_dumpall --- this script effectively creates the complete
+user-defined metadata from the old cluster to the new cluster. It
+preserves the relfilenode numbers so TOAST and other references
+to relfilenodes in user data is preserved. (See binary-upgrade usage
+in pg_dump).
+
+Finally, pg_upgrade links or copies each user-defined table and its
+supporting indexes and toast tables from the old cluster to the new
+cluster.
+
+An important feature of the pg_upgrade design is that it leaves the
+original cluster intact --- if a problem occurs during the upgrade, you
+can still run the previous version, after renaming the tablespaces back
+to the original names.