diff options
Diffstat (limited to 'doc/src/sgml/logical-replication.sgml')
-rw-r--r-- | doc/src/sgml/logical-replication.sgml | 1677 |
1 files changed, 1677 insertions, 0 deletions
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml new file mode 100644 index 0000000..bc3f5ec --- /dev/null +++ b/doc/src/sgml/logical-replication.sgml @@ -0,0 +1,1677 @@ +<!-- doc/src/sgml/logical-replication.sgml --> + +<chapter id="logical-replication"> + <title>Logical Replication</title> + + <para> + Logical replication is a method of replicating data objects and their + changes, based upon their replication identity (usually a primary key). We + use the term logical in contrast to physical replication, which uses exact + block addresses and byte-by-byte replication. PostgreSQL supports both + mechanisms concurrently, see <xref linkend="high-availability"/>. Logical + replication allows fine-grained control over both data replication and + security. + </para> + + <para> + Logical replication uses a <firstterm>publish</firstterm> + and <firstterm>subscribe</firstterm> model with one or + more <firstterm>subscribers</firstterm> subscribing to one or more + <firstterm>publications</firstterm> on a <firstterm>publisher</firstterm> + node. Subscribers pull data from the publications they subscribe to and may + subsequently re-publish data to allow cascading replication or more complex + configurations. + </para> + + <para> + Logical replication of a table typically starts with taking a snapshot + of the data on the publisher database and copying that to the subscriber. + Once that is done, the changes on the publisher are sent to the subscriber + as they occur in real-time. The subscriber applies the data in the same + order as the publisher so that transactional consistency is guaranteed for + publications within a single subscription. This method of data replication + is sometimes referred to as transactional replication. + </para> + + <para> + The typical use-cases for logical replication are: + + <itemizedlist> + <listitem> + <para> + Sending incremental changes in a single database or a subset of a + database to subscribers as they occur. + </para> + </listitem> + + <listitem> + <para> + Firing triggers for individual changes as they arrive on the + subscriber. + </para> + </listitem> + + <listitem> + <para> + Consolidating multiple databases into a single one (for example for + analytical purposes). + </para> + </listitem> + + <listitem> + <para> + Replicating between different major versions of PostgreSQL. + </para> + </listitem> + + <listitem> + <para> + Replicating between PostgreSQL instances on different platforms (for + example Linux to Windows) + </para> + </listitem> + + <listitem> + <para> + Giving access to replicated data to different groups of users. + </para> + </listitem> + + <listitem> + <para> + Sharing a subset of the database between multiple databases. + </para> + </listitem> + </itemizedlist> + </para> + + <para> + The subscriber database behaves in the same way as any other PostgreSQL + instance and can be used as a publisher for other databases by defining its + own publications. When the subscriber is treated as read-only by + application, there will be no conflicts from a single subscription. On the + other hand, if there are other writes done either by an application or by other + subscribers to the same set of tables, conflicts can arise. + </para> + + <sect1 id="logical-replication-publication"> + <title>Publication</title> + + <para> + A <firstterm>publication</firstterm> can be defined on any physical + replication primary. The node where a publication is defined is referred to + as <firstterm>publisher</firstterm>. A publication is a set of changes + generated from a table or a group of tables, and might also be described as + a change set or replication set. Each publication exists in only one database. + </para> + + <para> + Publications are different from schemas and do not affect how the table is + accessed. Each table can be added to multiple publications if needed. + Publications may currently only contain tables and all tables in schema. + Objects must be added explicitly, except when a publication is created for + <literal>ALL TABLES</literal>. + </para> + + <para> + Publications can choose to limit the changes they produce to + any combination of <command>INSERT</command>, <command>UPDATE</command>, + <command>DELETE</command>, and <command>TRUNCATE</command>, similar to how triggers are fired by + particular event types. By default, all operation types are replicated. + These publication specifications apply only for DML operations; they do not affect the initial + data synchronization copy. (Row filters have no effect for + <command>TRUNCATE</command>. See <xref linkend="logical-replication-row-filter"/>). + </para> + + <para> + A published table must have a <quote>replica identity</quote> configured in + order to be able to replicate <command>UPDATE</command> + and <command>DELETE</command> operations, so that appropriate rows to + update or delete can be identified on the subscriber side. By default, + this is the primary key, if there is one. Another unique index (with + certain additional requirements) can also be set to be the replica + identity. If the table does not have any suitable key, then it can be set + to replica identity <quote>full</quote>, which means the entire row becomes + the key. This, however, is very inefficient and should only be used as a + fallback if no other solution is possible. If a replica identity other + than <quote>full</quote> is set on the publisher side, a replica identity + comprising the same or fewer columns must also be set on the subscriber + side. See <xref linkend="sql-altertable-replica-identity"/> for details on + how to set the replica identity. If a table without a replica identity is + added to a publication that replicates <command>UPDATE</command> + or <command>DELETE</command> operations then + subsequent <command>UPDATE</command> or <command>DELETE</command> + operations will cause an error on the publisher. <command>INSERT</command> + operations can proceed regardless of any replica identity. + </para> + + <para> + Every publication can have multiple subscribers. + </para> + + <para> + A publication is created using the <link linkend="sql-createpublication"><command>CREATE PUBLICATION</command></link> + command and may later be altered or dropped using corresponding commands. + </para> + + <para> + The individual tables can be added and removed dynamically using + <link linkend="sql-alterpublication"><command>ALTER PUBLICATION</command></link>. Both the <literal>ADD + TABLE</literal> and <literal>DROP TABLE</literal> operations are + transactional; so the table will start or stop replicating at the correct + snapshot once the transaction has committed. + </para> + </sect1> + + <sect1 id="logical-replication-subscription"> + <title>Subscription</title> + + <para> + A <firstterm>subscription</firstterm> is the downstream side of logical + replication. The node where a subscription is defined is referred to as + the <firstterm>subscriber</firstterm>. A subscription defines the connection + to another database and set of publications (one or more) to which it wants + to subscribe. + </para> + + <para> + The subscriber database behaves in the same way as any other PostgreSQL + instance and can be used as a publisher for other databases by defining its + own publications. + </para> + + <para> + A subscriber node may have multiple subscriptions if desired. It is + possible to define multiple subscriptions between a single + publisher-subscriber pair, in which case care must be taken to ensure + that the subscribed publication objects don't overlap. + </para> + + <para> + Each subscription will receive changes via one replication slot (see + <xref linkend="streaming-replication-slots"/>). Additional replication + slots may be required for the initial data synchronization of + pre-existing table data and those will be dropped at the end of data + synchronization. + </para> + + <para> + A logical replication subscription can be a standby for synchronous + replication (see <xref linkend="synchronous-replication"/>). The standby + name is by default the subscription name. An alternative name can be + specified as <literal>application_name</literal> in the connection + information of the subscription. + </para> + + <para> + Subscriptions are dumped by <command>pg_dump</command> if the current user + is a superuser. Otherwise a warning is written and subscriptions are + skipped, because non-superusers cannot read all subscription information + from the <structname>pg_subscription</structname> catalog. + </para> + + <para> + The subscription is added using <link linkend="sql-createsubscription"><command>CREATE SUBSCRIPTION</command></link> and + can be stopped/resumed at any time using the + <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION</command></link> command and removed using + <link linkend="sql-dropsubscription"><command>DROP SUBSCRIPTION</command></link>. + </para> + + <para> + When a subscription is dropped and recreated, the synchronization + information is lost. This means that the data has to be resynchronized + afterwards. + </para> + + <para> + The schema definitions are not replicated, and the published tables must + exist on the subscriber. Only regular tables may be + the target of replication. For example, you can't replicate to a view. + </para> + + <para> + The tables are matched between the publisher and the subscriber using the + fully qualified table name. Replication to differently-named tables on the + subscriber is not supported. + </para> + + <para> + Columns of a table are also matched by name. The order of columns in the + subscriber table does not need to match that of the publisher. The data + types of the columns do not need to match, as long as the text + representation of the data can be converted to the target type. For + example, you can replicate from a column of type <type>integer</type> to a + column of type <type>bigint</type>. The target table can also have + additional columns not provided by the published table. Any such columns + will be filled with the default value as specified in the definition of the + target table. + </para> + + <sect2 id="logical-replication-subscription-slot"> + <title>Replication Slot Management</title> + + <para> + As mentioned earlier, each (active) subscription receives changes from a + replication slot on the remote (publishing) side. + </para> + <para> + Additional table synchronization slots are normally transient, created + internally to perform initial table synchronization and dropped + automatically when they are no longer needed. These table synchronization + slots have generated names: <quote><literal>pg_%u_sync_%u_%llu</literal></quote> + (parameters: Subscription <parameter>oid</parameter>, + Table <parameter>relid</parameter>, system identifier <parameter>sysid</parameter>) + </para> + <para> + Normally, the remote replication slot is created automatically when the + subscription is created using <command>CREATE SUBSCRIPTION</command> and it + is dropped automatically when the subscription is dropped using + <command>DROP SUBSCRIPTION</command>. In some situations, however, it can + be useful or necessary to manipulate the subscription and the underlying + replication slot separately. Here are some scenarios: + + <itemizedlist> + <listitem> + <para> + When creating a subscription, the replication slot already exists. In + that case, the subscription can be created using + the <literal>create_slot = false</literal> option to associate with the + existing slot. + </para> + </listitem> + + <listitem> + <para> + When creating a subscription, the remote host is not reachable or in an + unclear state. In that case, the subscription can be created using + the <literal>connect = false</literal> option. The remote host will then not + be contacted at all. This is what <application>pg_dump</application> + uses. The remote replication slot will then have to be created + manually before the subscription can be activated. + </para> + </listitem> + + <listitem> + <para> + When dropping a subscription, the replication slot should be kept. + This could be useful when the subscriber database is being moved to a + different host and will be activated from there. In that case, + disassociate the slot from the subscription using <command>ALTER + SUBSCRIPTION</command> before attempting to drop the subscription. + </para> + </listitem> + + <listitem> + <para> + When dropping a subscription, the remote host is not reachable. In + that case, disassociate the slot from the subscription + using <command>ALTER SUBSCRIPTION</command> before attempting to drop + the subscription. If the remote database instance no longer exists, no + further action is then necessary. If, however, the remote database + instance is just unreachable, the replication slot (and any still + remaining table synchronization slots) should then be + dropped manually; otherwise it/they would continue to reserve WAL and might + eventually cause the disk to fill up. Such cases should be carefully + investigated. + </para> + </listitem> + </itemizedlist> + </para> + </sect2> + + <sect2 id="logical-replication-subscription-examples"> + <title>Examples</title> + + <para> + Create some test tables on the publisher. +<programlisting> +test_pub=# CREATE TABLE t1(a int, b text, PRIMARY KEY(a)); +CREATE TABLE +test_pub=# CREATE TABLE t2(c int, d text, PRIMARY KEY(c)); +CREATE TABLE +test_pub=# CREATE TABLE t3(e int, f text, PRIMARY KEY(e)); +CREATE TABLE +</programlisting></para> + + <para> + Create the same tables on the subscriber. +<programlisting> +test_sub=# CREATE TABLE t1(a int, b text, PRIMARY KEY(a)); +CREATE TABLE +test_sub=# CREATE TABLE t2(c int, d text, PRIMARY KEY(c)); +CREATE TABLE +test_sub=# CREATE TABLE t3(e int, f text, PRIMARY KEY(e)); +CREATE TABLE +</programlisting></para> + + <para> + Insert data to the tables at the publisher side. +<programlisting> +test_pub=# INSERT INTO t1 VALUES (1, 'one'), (2, 'two'), (3, 'three'); +INSERT 0 3 +test_pub=# INSERT INTO t2 VALUES (1, 'A'), (2, 'B'), (3, 'C'); +INSERT 0 3 +test_pub=# INSERT INTO t3 VALUES (1, 'i'), (2, 'ii'), (3, 'iii'); +INSERT 0 3 +</programlisting></para> + + <para> + Create publications for the tables. The publications <literal>pub2</literal> + and <literal>pub3a</literal> disallow some <literal>publish</literal> + operations. The publication <literal>pub3b</literal> has a row filter (see + <xref linkend="logical-replication-row-filter"/>). +<programlisting> +test_pub=# CREATE PUBLICATION pub1 FOR TABLE t1; +CREATE PUBLICATION +test_pub=# CREATE PUBLICATION pub2 FOR TABLE t2 WITH (publish = 'truncate'); +CREATE PUBLICATION +test_pub=# CREATE PUBLICATION pub3a FOR TABLE t3 WITH (publish = 'truncate'); +CREATE PUBLICATION +test_pub=# CREATE PUBLICATION pub3b FOR TABLE t3 WHERE (e > 5); +CREATE PUBLICATION +</programlisting></para> + + <para> + Create subscriptions for the publications. The subscription + <literal>sub3</literal> subscribes to both <literal>pub3a</literal> and + <literal>pub3b</literal>. All subscriptions will copy initial data by default. +<programlisting> +test_sub=# CREATE SUBSCRIPTION sub1 +test_sub-# CONNECTION 'host=localhost dbname=test_pub application_name=sub1' +test_sub-# PUBLICATION pub1; +CREATE SUBSCRIPTION +test_sub=# CREATE SUBSCRIPTION sub2 +test_sub-# CONNECTION 'host=localhost dbname=test_pub application_name=sub2' +test_sub-# PUBLICATION pub2; +CREATE SUBSCRIPTION +test_sub=# CREATE SUBSCRIPTION sub3 +test_sub-# CONNECTION 'host=localhost dbname=test_pub application_name=sub3' +test_sub-# PUBLICATION pub3a, pub3b; +CREATE SUBSCRIPTION +</programlisting></para> + + <para> + Observe that initial table data is copied, regardless of the + <literal>publish</literal> operation of the publication. +<programlisting> +test_sub=# SELECT * FROM t1; + a | b +---+------- + 1 | one + 2 | two + 3 | three +(3 rows) + +test_sub=# SELECT * FROM t2; + c | d +---+--- + 1 | A + 2 | B + 3 | C +(3 rows) +</programlisting></para> + + <para> + Furthermore, because the initial data copy ignores the <literal>publish</literal> + operation, and because publication <literal>pub3a</literal> has no row filter, + it means the copied table <literal>t3</literal> contains all rows even when + they do not match the row filter of publication <literal>pub3b</literal>. +<programlisting> +test_sub=# SELECT * FROM t3; + e | f +---+----- + 1 | i + 2 | ii + 3 | iii +(3 rows) +</programlisting></para> + + <para> + Insert more data to the tables at the publisher side. +<programlisting> +test_pub=# INSERT INTO t1 VALUES (4, 'four'), (5, 'five'), (6, 'six'); +INSERT 0 3 +test_pub=# INSERT INTO t2 VALUES (4, 'D'), (5, 'E'), (6, 'F'); +INSERT 0 3 +test_pub=# INSERT INTO t3 VALUES (4, 'iv'), (5, 'v'), (6, 'vi'); +INSERT 0 3 +</programlisting></para> + + <para> + Now the publisher side data looks like: +<programlisting> +test_pub=# SELECT * FROM t1; + a | b +---+------- + 1 | one + 2 | two + 3 | three + 4 | four + 5 | five + 6 | six +(6 rows) + +test_pub=# SELECT * FROM t2; + c | d +---+--- + 1 | A + 2 | B + 3 | C + 4 | D + 5 | E + 6 | F +(6 rows) + +test_pub=# SELECT * FROM t3; + e | f +---+----- + 1 | i + 2 | ii + 3 | iii + 4 | iv + 5 | v + 6 | vi +(6 rows) +</programlisting></para> + + <para> + Observe that during normal replication the appropriate + <literal>publish</literal> operations are used. This means publications + <literal>pub2</literal> and <literal>pub3a</literal> will not replicate the + <literal>INSERT</literal>. Also, publication <literal>pub3b</literal> will + only replicate data that matches the row filter of <literal>pub3b</literal>. + Now the subscriber side data looks like: +<programlisting> +test_sub=# SELECT * FROM t1; + a | b +---+------- + 1 | one + 2 | two + 3 | three + 4 | four + 5 | five + 6 | six +(6 rows) + +test_sub=# SELECT * FROM t2; + c | d +---+--- + 1 | A + 2 | B + 3 | C +(3 rows) + +test_sub=# SELECT * FROM t3; + e | f +---+----- + 1 | i + 2 | ii + 3 | iii + 6 | vi +(4 rows) +</programlisting></para> + </sect2> + + </sect1> + + <sect1 id="logical-replication-row-filter"> + <title>Row Filters</title> + + <para> + By default, all data from all published tables will be replicated to the + appropriate subscribers. The replicated data can be reduced by using a + <firstterm>row filter</firstterm>. A user might choose to use row filters + for behavioral, security or performance reasons. If a published table sets a + row filter, a row is replicated only if its data satisfies the row filter + expression. This allows a set of tables to be partially replicated. The row + filter is defined per table. Use a <literal>WHERE</literal> clause after the + table name for each published table that requires data to be filtered out. + The <literal>WHERE</literal> clause must be enclosed by parentheses. See + <xref linkend="sql-createpublication"/> for details. + </para> + + <sect2 id="logical-replication-row-filter-rules"> + <title>Row Filter Rules</title> + + <para> + Row filters are applied <emphasis>before</emphasis> publishing the changes. + If the row filter evaluates to <literal>false</literal> or <literal>NULL</literal> + then the row is not replicated. The <literal>WHERE</literal> clause expression + is evaluated with the same role used for the replication connection (i.e. + the role specified in the <literal>CONNECTION</literal> clause of the + <xref linkend="sql-createsubscription"/>). Row filters have no effect for + <command>TRUNCATE</command> command. + </para> + + </sect2> + + <sect2 id="logical-replication-row-filter-restrictions"> + <title>Expression Restrictions</title> + + <para> + The <literal>WHERE</literal> clause allows only simple expressions. It + cannot contain user-defined functions, operators, types, and collations, + system column references or non-immutable built-in functions. + </para> + + <para> + If a publication publishes <command>UPDATE</command> or + <command>DELETE</command> operations, the row filter <literal>WHERE</literal> + clause must contain only columns that are covered by the replica identity + (see <xref linkend="sql-altertable-replica-identity"/>). If a publication + publishes only <command>INSERT</command> operations, the row filter + <literal>WHERE</literal> clause can use any column. + </para> + + </sect2> + + <sect2 id="logical-replication-row-filter-transformations"> + <title>UPDATE Transformations</title> + + <para> + Whenever an <command>UPDATE</command> is processed, the row filter + expression is evaluated for both the old and new row (i.e. using the data + before and after the update). If both evaluations are <literal>true</literal>, + it replicates the <command>UPDATE</command> change. If both evaluations are + <literal>false</literal>, it doesn't replicate the change. If only one of + the old/new rows matches the row filter expression, the <command>UPDATE</command> + is transformed to <command>INSERT</command> or <command>DELETE</command>, to + avoid any data inconsistency. The row on the subscriber should reflect what + is defined by the row filter expression on the publisher. + </para> + + <para> + If the old row satisfies the row filter expression (it was sent to the + subscriber) but the new row doesn't, then, from a data consistency + perspective the old row should be removed from the subscriber. + So the <command>UPDATE</command> is transformed into a <command>DELETE</command>. + </para> + + <para> + If the old row doesn't satisfy the row filter expression (it wasn't sent + to the subscriber) but the new row does, then, from a data consistency + perspective the new row should be added to the subscriber. + So the <command>UPDATE</command> is transformed into an <command>INSERT</command>. + </para> + + <para> + <xref linkend="logical-replication-row-filter-transformations-summary"/> + summarizes the applied transformations. + </para> + + <table id="logical-replication-row-filter-transformations-summary"> + <title><command>UPDATE</command> Transformation Summary</title> + <tgroup cols="3"> + <thead> + <row> + <entry>Old row</entry><entry>New row</entry><entry>Transformation</entry> + </row> + </thead> + <tbody> + <row> + <entry>no match</entry><entry>no match</entry><entry>don't replicate</entry> + </row> + <row> + <entry>no match</entry><entry>match</entry><entry><literal>INSERT</literal></entry> + </row> + <row> + <entry>match</entry><entry>no match</entry><entry><literal>DELETE</literal></entry> + </row> + <row> + <entry>match</entry><entry>match</entry><entry><literal>UPDATE</literal></entry> + </row> + </tbody> + </tgroup> + </table> + + </sect2> + + <sect2 id="logical-replication-row-filter-partitioned-table"> + <title>Partitioned Tables</title> + + <para> + If the publication contains a partitioned table, the publication parameter + <literal>publish_via_partition_root</literal> determines which row filter + is used. If <literal>publish_via_partition_root</literal> is <literal>true</literal>, + the <emphasis>root partitioned table's</emphasis> row filter is used. Otherwise, + if <literal>publish_via_partition_root</literal> is <literal>false</literal> + (default), each <emphasis>partition's</emphasis> row filter is used. + </para> + + </sect2> + + <sect2 id="logical-replication-row-filter-initial-data-sync"> + <title>Initial Data Synchronization</title> + + <para> + If the subscription requires copying pre-existing table data + and a publication contains <literal>WHERE</literal> clauses, only data that + satisfies the row filter expressions is copied to the subscriber. + </para> + + <para> + If the subscription has several publications in which a table has been + published with different <literal>WHERE</literal> clauses, rows that satisfy + <emphasis>any</emphasis> of the expressions will be copied. See + <xref linkend="logical-replication-row-filter-combining"/> for details. + </para> + + <warning> + <para> + Because initial data synchronization does not take into account the + <literal>publish</literal> parameter when copying existing table data, + some rows may be copied that would not be replicated using DML. Refer to + <xref linkend="logical-replication-snapshot"/>, and see + <xref linkend="logical-replication-subscription-examples"/> for examples. + </para> + </warning> + + <note> + <para> + If the subscriber is in a release prior to 15, copy pre-existing data + doesn't use row filters even if they are defined in the publication. + This is because old releases can only copy the entire table data. + </para> + </note> + + </sect2> + + <sect2 id="logical-replication-row-filter-combining"> + <title>Combining Multiple Row Filters</title> + + <para> + If the subscription has several publications in which the same table has + been published with different row filters (for the same <literal>publish</literal> + operation), those expressions get ORed together, so that rows satisfying + <emphasis>any</emphasis> of the expressions will be replicated. This means all + the other row filters for the same table become redundant if: + <itemizedlist> + <listitem> + <para> + One of the publications has no row filter. + </para> + </listitem> + <listitem> + <para> + One of the publications was created using <literal>FOR ALL TABLES</literal>. + This clause does not allow row filters. + </para> + </listitem> + <listitem> + <para> + One of the publications was created using + <literal>FOR TABLES IN SCHEMA</literal> and the table belongs to + the referred schema. This clause does not allow row filters. + </para> + </listitem> + </itemizedlist></para> + + </sect2> + + <sect2 id="logical-replication-row-filter-examples"> + <title>Examples</title> + + <para> + Create some tables to be used in the following examples. +<programlisting> +test_pub=# CREATE TABLE t1(a int, b int, c text, PRIMARY KEY(a,c)); +CREATE TABLE +test_pub=# CREATE TABLE t2(d int, e int, f int, PRIMARY KEY(d)); +CREATE TABLE +test_pub=# CREATE TABLE t3(g int, h int, i int, PRIMARY KEY(g)); +CREATE TABLE +</programlisting></para> + + <para> + Create some publications. Publication <literal>p1</literal> has one table + (<literal>t1</literal>) and that table has a row filter. Publication + <literal>p2</literal> has two tables. Table <literal>t1</literal> has no row + filter, and table <literal>t2</literal> has a row filter. Publication + <literal>p3</literal> has two tables, and both of them have a row filter. +<programlisting> +test_pub=# CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a > 5 AND c = 'NSW'); +CREATE PUBLICATION +test_pub=# CREATE PUBLICATION p2 FOR TABLE t1, t2 WHERE (e = 99); +CREATE PUBLICATION +test_pub=# CREATE PUBLICATION p3 FOR TABLE t2 WHERE (d = 10), t3 WHERE (g = 10); +CREATE PUBLICATION +</programlisting></para> + + <para> + <command>psql</command> can be used to show the row filter expressions (if + defined) for each publication. +<programlisting> +test_pub=# \dRp+ + Publication p1 + Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root +----------+------------+---------+---------+---------+-----------+---------- + postgres | f | t | t | t | t | f +Tables: + "public.t1" WHERE ((a > 5) AND (c = 'NSW'::text)) + + Publication p2 + Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root +----------+------------+---------+---------+---------+-----------+---------- + postgres | f | t | t | t | t | f +Tables: + "public.t1" + "public.t2" WHERE (e = 99) + + Publication p3 + Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root +----------+------------+---------+---------+---------+-----------+---------- + postgres | f | t | t | t | t | f +Tables: + "public.t2" WHERE (d = 10) + "public.t3" WHERE (g = 10) +</programlisting></para> + + <para> + <command>psql</command> can be used to show the row filter expressions (if + defined) for each table. See that table <literal>t1</literal> is a member + of two publications, but has a row filter only in <literal>p1</literal>. + See that table <literal>t2</literal> is a member of two publications, and + has a different row filter in each of them. +<programlisting> +test_pub=# \d t1 + Table "public.t1" + Column | Type | Collation | Nullable | Default +--------+---------+-----------+----------+--------- + a | integer | | not null | + b | integer | | | + c | text | | not null | +Indexes: + "t1_pkey" PRIMARY KEY, btree (a, c) +Publications: + "p1" WHERE ((a > 5) AND (c = 'NSW'::text)) + "p2" + +test_pub=# \d t2 + Table "public.t2" + Column | Type | Collation | Nullable | Default +--------+---------+-----------+----------+--------- + d | integer | | not null | + e | integer | | | + f | integer | | | +Indexes: + "t2_pkey" PRIMARY KEY, btree (d) +Publications: + "p2" WHERE (e = 99) + "p3" WHERE (d = 10) + +test_pub=# \d t3 + Table "public.t3" + Column | Type | Collation | Nullable | Default +--------+---------+-----------+----------+--------- + g | integer | | not null | + h | integer | | | + i | integer | | | +Indexes: + "t3_pkey" PRIMARY KEY, btree (g) +Publications: + "p3" WHERE (g = 10) +</programlisting></para> + + <para> + On the subscriber node, create a table <literal>t1</literal> with the same + definition as the one on the publisher, and also create the subscription + <literal>s1</literal> that subscribes to the publication <literal>p1</literal>. +<programlisting> +test_sub=# CREATE TABLE t1(a int, b int, c text, PRIMARY KEY(a,c)); +CREATE TABLE +test_sub=# CREATE SUBSCRIPTION s1 +test_sub-# CONNECTION 'host=localhost dbname=test_pub application_name=s1' +test_sub-# PUBLICATION p1; +CREATE SUBSCRIPTION +</programlisting></para> + + <para> + Insert some rows. Only the rows satisfying the <literal>t1 WHERE</literal> + clause of publication <literal>p1</literal> are replicated. +<programlisting> +test_pub=# INSERT INTO t1 VALUES (2, 102, 'NSW'); +INSERT 0 1 +test_pub=# INSERT INTO t1 VALUES (3, 103, 'QLD'); +INSERT 0 1 +test_pub=# INSERT INTO t1 VALUES (4, 104, 'VIC'); +INSERT 0 1 +test_pub=# INSERT INTO t1 VALUES (5, 105, 'ACT'); +INSERT 0 1 +test_pub=# INSERT INTO t1 VALUES (6, 106, 'NSW'); +INSERT 0 1 +test_pub=# INSERT INTO t1 VALUES (7, 107, 'NT'); +INSERT 0 1 +test_pub=# INSERT INTO t1 VALUES (8, 108, 'QLD'); +INSERT 0 1 +test_pub=# INSERT INTO t1 VALUES (9, 109, 'NSW'); +INSERT 0 1 + +test_pub=# SELECT * FROM t1; + a | b | c +---+-----+----- + 2 | 102 | NSW + 3 | 103 | QLD + 4 | 104 | VIC + 5 | 105 | ACT + 6 | 106 | NSW + 7 | 107 | NT + 8 | 108 | QLD + 9 | 109 | NSW +(8 rows) +</programlisting> +<programlisting> +test_sub=# SELECT * FROM t1; + a | b | c +---+-----+----- + 6 | 106 | NSW + 9 | 109 | NSW +(2 rows) +</programlisting></para> + + <para> + Update some data, where the old and new row values both + satisfy the <literal>t1 WHERE</literal> clause of publication + <literal>p1</literal>. The <command>UPDATE</command> replicates + the change as normal. +<programlisting> +test_pub=# UPDATE t1 SET b = 999 WHERE a = 6; +UPDATE 1 + +test_pub=# SELECT * FROM t1; + a | b | c +---+-----+----- + 2 | 102 | NSW + 3 | 103 | QLD + 4 | 104 | VIC + 5 | 105 | ACT + 7 | 107 | NT + 8 | 108 | QLD + 9 | 109 | NSW + 6 | 999 | NSW +(8 rows) +</programlisting> +<programlisting> +test_sub=# SELECT * FROM t1; + a | b | c +---+-----+----- + 9 | 109 | NSW + 6 | 999 | NSW +(2 rows) +</programlisting></para> + + <para> + Update some data, where the old row values did not satisfy + the <literal>t1 WHERE</literal> clause of publication <literal>p1</literal>, + but the new row values do satisfy it. The <command>UPDATE</command> is + transformed into an <command>INSERT</command> and the change is replicated. + See the new row on the subscriber. +<programlisting> +test_pub=# UPDATE t1 SET a = 555 WHERE a = 2; +UPDATE 1 + +test_pub=# SELECT * FROM t1; + a | b | c +-----+-----+----- + 3 | 103 | QLD + 4 | 104 | VIC + 5 | 105 | ACT + 7 | 107 | NT + 8 | 108 | QLD + 9 | 109 | NSW + 6 | 999 | NSW + 555 | 102 | NSW +(8 rows) +</programlisting> +<programlisting> +test_sub=# SELECT * FROM t1; + a | b | c +-----+-----+----- + 9 | 109 | NSW + 6 | 999 | NSW + 555 | 102 | NSW +(3 rows) +</programlisting></para> + + <para> + Update some data, where the old row values satisfied + the <literal>t1 WHERE</literal> clause of publication <literal>p1</literal>, + but the new row values do not satisfy it. The <command>UPDATE</command> is + transformed into a <command>DELETE</command> and the change is replicated. + See that the row is removed from the subscriber. +<programlisting> +test_pub=# UPDATE t1 SET c = 'VIC' WHERE a = 9; +UPDATE 1 + +test_pub=# SELECT * FROM t1; + a | b | c +-----+-----+----- + 3 | 103 | QLD + 4 | 104 | VIC + 5 | 105 | ACT + 7 | 107 | NT + 8 | 108 | QLD + 6 | 999 | NSW + 555 | 102 | NSW + 9 | 109 | VIC +(8 rows) +</programlisting> +<programlisting> +test_sub=# SELECT * FROM t1; + a | b | c +-----+-----+----- + 6 | 999 | NSW + 555 | 102 | NSW +(2 rows) +</programlisting></para> + + <para> + The following examples show how the publication parameter + <literal>publish_via_partition_root</literal> determines whether the row + filter of the parent or child table will be used in the case of partitioned + tables. + </para> + + <para> + Create a partitioned table on the publisher. +<programlisting> +test_pub=# CREATE TABLE parent(a int PRIMARY KEY) PARTITION BY RANGE(a); +CREATE TABLE +test_pub=# CREATE TABLE child PARTITION OF parent DEFAULT; +CREATE TABLE +</programlisting> + Create the same tables on the subscriber. +<programlisting> +test_sub=# CREATE TABLE parent(a int PRIMARY KEY) PARTITION BY RANGE(a); +CREATE TABLE +test_sub=# CREATE TABLE child PARTITION OF parent DEFAULT; +CREATE TABLE +</programlisting></para> + + <para> + Create a publication <literal>p4</literal>, and then subscribe to it. The + publication parameter <literal>publish_via_partition_root</literal> is set + as true. There are row filters defined on both the partitioned table + (<literal>parent</literal>), and on the partition (<literal>child</literal>). +<programlisting> +test_pub=# CREATE PUBLICATION p4 FOR TABLE parent WHERE (a < 5), child WHERE (a >= 5) +test_pub-# WITH (publish_via_partition_root=true); +CREATE PUBLICATION +</programlisting> +<programlisting> +test_sub=# CREATE SUBSCRIPTION s4 +test_sub-# CONNECTION 'host=localhost dbname=test_pub application_name=s4' +test_sub-# PUBLICATION p4; +CREATE SUBSCRIPTION +</programlisting></para> + + <para> + Insert some values directly into the <literal>parent</literal> and + <literal>child</literal> tables. They replicate using the row filter of + <literal>parent</literal> (because <literal>publish_via_partition_root</literal> + is true). +<programlisting> +test_pub=# INSERT INTO parent VALUES (2), (4), (6); +INSERT 0 3 +test_pub=# INSERT INTO child VALUES (3), (5), (7); +INSERT 0 3 + +test_pub=# SELECT * FROM parent ORDER BY a; + a +--- + 2 + 3 + 4 + 5 + 6 + 7 +(6 rows) +</programlisting> +<programlisting> +test_sub=# SELECT * FROM parent ORDER BY a; + a +--- + 2 + 3 + 4 +(3 rows) +</programlisting></para> + + <para> + Repeat the same test, but with a different value for <literal>publish_via_partition_root</literal>. + The publication parameter <literal>publish_via_partition_root</literal> is + set as false. A row filter is defined on the partition (<literal>child</literal>). +<programlisting> +test_pub=# DROP PUBLICATION p4; +DROP PUBLICATION +test_pub=# CREATE PUBLICATION p4 FOR TABLE parent, child WHERE (a >= 5) +test_pub-# WITH (publish_via_partition_root=false); +CREATE PUBLICATION +</programlisting> +<programlisting> +test_sub=# ALTER SUBSCRIPTION s4 REFRESH PUBLICATION; +ALTER SUBSCRIPTION +</programlisting></para> + + <para> + Do the inserts on the publisher same as before. They replicate using the + row filter of <literal>child</literal> (because + <literal>publish_via_partition_root</literal> is false). +<programlisting> +test_pub=# TRUNCATE parent; +TRUNCATE TABLE +test_pub=# INSERT INTO parent VALUES (2), (4), (6); +INSERT 0 3 +test_pub=# INSERT INTO child VALUES (3), (5), (7); +INSERT 0 3 + +test_pub=# SELECT * FROM parent ORDER BY a; + a +--- + 2 + 3 + 4 + 5 + 6 + 7 +(6 rows) +</programlisting> +<programlisting> +test_sub=# SELECT * FROM child ORDER BY a; + a +--- + 5 + 6 + 7 +(3 rows) +</programlisting></para> + + </sect2> + + </sect1> + + <sect1 id="logical-replication-col-lists"> + <title>Column Lists</title> + + <para> + Each publication can optionally specify which columns of each table are + replicated to subscribers. The table on the subscriber side must have at + least all the columns that are published. If no column list is specified, + then all columns on the publisher are replicated. + See <xref linkend="sql-createpublication"/> for details on the syntax. + </para> + + <para> + The choice of columns can be based on behavioral or performance reasons. + However, do not rely on this feature for security: a malicious subscriber + is able to obtain data from columns that are not specifically + published. If security is a consideration, protections can be applied + at the publisher side. + </para> + + <para> + If no column list is specified, any columns added later are automatically + replicated. This means that having a column list which names all columns + is not the same as having no column list at all. + </para> + + <para> + A column list can contain only simple column references. The order + of columns in the list is not preserved. + </para> + + <para> + Specifying a column list when the publication also publishes + <literal>FOR TABLES IN SCHEMA</literal> is not supported. + </para> + + <para> + For partitioned tables, the publication parameter + <literal>publish_via_partition_root</literal> determines which column list + is used. If <literal>publish_via_partition_root</literal> is + <literal>true</literal>, the root partitioned table's column list is used. + Otherwise, if <literal>publish_via_partition_root</literal> is + <literal>false</literal> (the default), each partition's column list is used. + </para> + + <para> + If a publication publishes <command>UPDATE</command> or + <command>DELETE</command> operations, any column list must include the + table's replica identity columns (see + <xref linkend="sql-altertable-replica-identity"/>). + If a publication publishes only <command>INSERT</command> operations, then + the column list may omit replica identity columns. + </para> + + <para> + Column lists have no effect for the <literal>TRUNCATE</literal> command. + </para> + + <para> + During initial data synchronization, only the published columns are + copied. However, if the subscriber is from a release prior to 15, then + all the columns in the table are copied during initial data synchronization, + ignoring any column lists. + </para> + + <warning id="logical-replication-col-list-combining"> + <title>Warning: Combining Column Lists from Multiple Publications</title> + <para> + There's currently no support for subscriptions comprising several + publications where the same table has been published with different + column lists. <xref linkend="sql-createsubscription"/> disallows + creating such subscriptions, but it is still possible to get into + that situation by adding or altering column lists on the publication + side after a subscription has been created. + </para> + <para> + This means changing the column lists of tables on publications that are + already subscribed could lead to errors being thrown on the subscriber + side. + </para> + <para> + If a subscription is affected by this problem, the only way to resume + replication is to adjust one of the column lists on the publication + side so that they all match; and then either recreate the subscription, + or use <literal>ALTER SUBSCRIPTION ... DROP PUBLICATION</literal> to + remove one of the offending publications and add it again. + </para> + </warning> + + <sect2 id="logical-replication-col-list-examples"> + <title>Examples</title> + + <para> + Create a table <literal>t1</literal> to be used in the following example. +<programlisting> +test_pub=# CREATE TABLE t1(id int, a text, b text, c text, d text, e text, PRIMARY KEY(id)); +CREATE TABLE +</programlisting></para> + + <para> + Create a publication <literal>p1</literal>. A column list is defined for + table <literal>t1</literal> to reduce the number of columns that will be + replicated. Notice that the order of column names in the column list does + not matter. +<programlisting> +test_pub=# CREATE PUBLICATION p1 FOR TABLE t1 (id, b, a, d); +CREATE PUBLICATION +</programlisting></para> + + <para> + <literal>psql</literal> can be used to show the column lists (if defined) + for each publication. +<programlisting> +test_pub=# \dRp+ + Publication p1 + Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root +----------+------------+---------+---------+---------+-----------+---------- + postgres | f | t | t | t | t | f +Tables: + "public.t1" (id, a, b, d) +</programlisting></para> + + <para> + <literal>psql</literal> can be used to show the column lists (if defined) + for each table. +<programlisting> +test_pub=# \d t1 + Table "public.t1" + Column | Type | Collation | Nullable | Default +--------+---------+-----------+----------+--------- + id | integer | | not null | + a | text | | | + b | text | | | + c | text | | | + d | text | | | + e | text | | | +Indexes: + "t1_pkey" PRIMARY KEY, btree (id) +Publications: + "p1" (id, a, b, d) +</programlisting></para> + + <para> + On the subscriber node, create a table <literal>t1</literal> which now + only needs a subset of the columns that were on the publisher table + <literal>t1</literal>, and also create the subscription + <literal>s1</literal> that subscribes to the publication + <literal>p1</literal>. +<programlisting> +test_sub=# CREATE TABLE t1(id int, b text, a text, d text, PRIMARY KEY(id)); +CREATE TABLE +test_sub=# CREATE SUBSCRIPTION s1 +test_sub-# CONNECTION 'host=localhost dbname=test_pub application_name=s1' +test_sub-# PUBLICATION p1; +CREATE SUBSCRIPTION +</programlisting></para> + + <para> + On the publisher node, insert some rows to table <literal>t1</literal>. +<programlisting> +test_pub=# INSERT INTO t1 VALUES(1, 'a-1', 'b-1', 'c-1', 'd-1', 'e-1'); +INSERT 0 1 +test_pub=# INSERT INTO t1 VALUES(2, 'a-2', 'b-2', 'c-2', 'd-2', 'e-2'); +INSERT 0 1 +test_pub=# INSERT INTO t1 VALUES(3, 'a-3', 'b-3', 'c-3', 'd-3', 'e-3'); +INSERT 0 1 +test_pub=# SELECT * FROM t1 ORDER BY id; + id | a | b | c | d | e +----+-----+-----+-----+-----+----- + 1 | a-1 | b-1 | c-1 | d-1 | e-1 + 2 | a-2 | b-2 | c-2 | d-2 | e-2 + 3 | a-3 | b-3 | c-3 | d-3 | e-3 +(3 rows) +</programlisting></para> + + <para> + Only data from the column list of publication <literal>p1</literal> is + replicated. +<programlisting> +test_sub=# SELECT * FROM t1 ORDER BY id; + id | b | a | d +----+-----+-----+----- + 1 | b-1 | a-1 | d-1 + 2 | b-2 | a-2 | d-2 + 3 | b-3 | a-3 | d-3 +(3 rows) +</programlisting></para> + + </sect2> + + </sect1> + + <sect1 id="logical-replication-conflicts"> + <title>Conflicts</title> + + <para> + Logical replication behaves similarly to normal DML operations in that + the data will be updated even if it was changed locally on the subscriber + node. If incoming data violates any constraints the replication will + stop. This is referred to as a <firstterm>conflict</firstterm>. When + replicating <command>UPDATE</command> or <command>DELETE</command> + operations, missing data will not produce a conflict and such operations + will simply be skipped. + </para> + + <para> + Logical replication operations are performed with the privileges of the role + which owns the subscription. Permissions failures on target tables will + cause replication conflicts, as will enabled + <link linkend="ddl-rowsecurity">row-level security</link> on target tables + that the subscription owner is subject to, without regard to whether any + policy would ordinarily reject the <command>INSERT</command>, + <command>UPDATE</command>, <command>DELETE</command> or + <command>TRUNCATE</command> which is being replicated. This restriction on + row-level security may be lifted in a future version of + <productname>PostgreSQL</productname>. + </para> + + <para> + A conflict will produce an error and will stop the replication; it must be + resolved manually by the user. Details about the conflict can be found in + the subscriber's server log. + </para> + + <para> + The resolution can be done either by changing data or permissions on the subscriber so + that it does not conflict with the incoming change or by skipping the + transaction that conflicts with the existing data. When a conflict produces + an error, the replication won't proceed, and the logical replication worker will + emit the following kind of message to the subscriber's server log: +<screen> +ERROR: duplicate key value violates unique constraint "test_pkey" +DETAIL: Key (c)=(1) already exists. +CONTEXT: processing remote data for replication origin "pg_16395" during "INSERT" for replication target relation "public.test" in transaction 725 finished at 0/14C0378 +</screen> + The LSN of the transaction that contains the change violating the constraint and + the replication origin name can be found from the server log (LSN 0/14C0378 and + replication origin <literal>pg_16395</literal> in the above case). The + transaction that produced the conflict can be skipped by using + <command>ALTER SUBSCRIPTION ... SKIP</command> with the finish LSN + (i.e., LSN 0/14C0378). The finish LSN could be an LSN at which the transaction + is committed or prepared on the publisher. Alternatively, the transaction can + also be skipped by calling the <link linkend="pg-replication-origin-advance"> + <function>pg_replication_origin_advance()</function></link> function. + Before using this function, the subscription needs to be disabled temporarily + either by <command>ALTER SUBSCRIPTION ... DISABLE</command> or, the + subscription can be used with the <literal>disable_on_error</literal> option. + Then, you can use <function>pg_replication_origin_advance()</function> function + with the <parameter>node_name</parameter> (i.e., <literal>pg_16395</literal>) + and the next LSN of the finish LSN (i.e., 0/14C0379). The current position of + origins can be seen in the <link linkend="view-pg-replication-origin-status"> + <structname>pg_replication_origin_status</structname></link> system view. + Please note that skipping the whole transaction includes skipping changes that + might not violate any constraint. This can easily make the subscriber + inconsistent. + </para> + </sect1> + + <sect1 id="logical-replication-restrictions"> + <title>Restrictions</title> + + <para> + Logical replication currently has the following restrictions or missing + functionality. These might be addressed in future releases. + </para> + + <itemizedlist> + <listitem> + <para> + The database schema and DDL commands are not replicated. The initial + schema can be copied by hand using <literal>pg_dump + --schema-only</literal>. Subsequent schema changes would need to be kept + in sync manually. (Note, however, that there is no need for the schemas + to be absolutely the same on both sides.) Logical replication is robust + when schema definitions change in a live database: When the schema is + changed on the publisher and replicated data starts arriving at the + subscriber but does not fit into the table schema, replication will error + until the schema is updated. In many cases, intermittent errors can be + avoided by applying additive schema changes to the subscriber first. + </para> + </listitem> + + <listitem> + <para> + Sequence data is not replicated. The data in serial or identity columns + backed by sequences will of course be replicated as part of the table, + but the sequence itself would still show the start value on the + subscriber. If the subscriber is used as a read-only database, then this + should typically not be a problem. If, however, some kind of switchover + or failover to the subscriber database is intended, then the sequences + would need to be updated to the latest values, either by copying the + current data from the publisher (perhaps + using <command>pg_dump</command>) or by determining a sufficiently high + value from the tables themselves. + </para> + </listitem> + + <listitem> + <para> + Replication of <command>TRUNCATE</command> commands is supported, but + some care must be taken when truncating groups of tables connected by + foreign keys. When replicating a truncate action, the subscriber will + truncate the same group of tables that was truncated on the publisher, + either explicitly specified or implicitly collected via + <literal>CASCADE</literal>, minus tables that are not part of the + subscription. This will work correctly if all affected tables are part + of the same subscription. But if some tables to be truncated on the + subscriber have foreign-key links to tables that are not part of the same + (or any) subscription, then the application of the truncate action on the + subscriber will fail. + </para> + </listitem> + + <listitem> + <para> + Large objects (see <xref linkend="largeobjects"/>) are not replicated. + There is no workaround for that, other than storing data in normal + tables. + </para> + </listitem> + + <listitem> + <para> + Replication is only supported by tables, including partitioned tables. + Attempts to replicate other types of relations, such as views, materialized + views, or foreign tables, will result in an error. + </para> + </listitem> + + <listitem> + <para> + When replicating between partitioned tables, the actual replication + originates, by default, from the leaf partitions on the publisher, so + partitions on the publisher must also exist on the subscriber as valid + target tables. (They could either be leaf partitions themselves, or they + could be further subpartitioned, or they could even be independent + tables.) Publications can also specify that changes are to be replicated + using the identity and schema of the partitioned root table instead of + that of the individual leaf partitions in which the changes actually + originate (see <link linkend="sql-createpublication"><command>CREATE PUBLICATION</command></link>). + </para> + </listitem> + </itemizedlist> + </sect1> + + <sect1 id="logical-replication-architecture"> + <title>Architecture</title> + + <para> + Logical replication starts by copying a snapshot of the data on the + publisher database. Once that is done, changes on the publisher are sent + to the subscriber as they occur in real time. The subscriber applies data + in the order in which commits were made on the publisher so that + transactional consistency is guaranteed for the publications within any + single subscription. + </para> + + <para> + Logical replication is built with an architecture similar to physical + streaming replication (see <xref linkend="streaming-replication"/>). It is + implemented by <quote>walsender</quote> and <quote>apply</quote> + processes. The walsender process starts logical decoding (described + in <xref linkend="logicaldecoding"/>) of the WAL and loads the standard + logical decoding plugin (pgoutput). The plugin transforms the changes read + from WAL to the logical replication protocol + (see <xref linkend="protocol-logical-replication"/>) and filters the data + according to the publication specification. The data is then continuously + transferred using the streaming replication protocol to the apply worker, + which maps the data to local tables and applies the individual changes as + they are received, in correct transactional order. + </para> + + <para> + The apply process on the subscriber database always runs with + <link linkend="guc-session-replication-role"><varname>session_replication_role</varname></link> + set to <literal>replica</literal>. This means that, by default, + triggers and rules will not fire on a subscriber. Users can optionally choose to + enable triggers and rules on a table using the + <link linkend="sql-altertable"><command>ALTER TABLE</command></link> command + and the <literal>ENABLE TRIGGER</literal> and <literal>ENABLE RULE</literal> + clauses. + </para> + + <para> + The logical replication apply process currently only fires row triggers, + not statement triggers. The initial table synchronization, however, is + implemented like a <command>COPY</command> command and thus fires both row + and statement triggers for <command>INSERT</command>. + </para> + + <sect2 id="logical-replication-snapshot"> + <title>Initial Snapshot</title> + <para> + The initial data in existing subscribed tables are snapshotted and + copied in a parallel instance of a special kind of apply process. + This process will create its own replication slot and copy the existing + data. As soon as the copy is finished the table contents will become + visible to other backends. Once existing data is copied, the worker + enters synchronization mode, which ensures that the table is brought + up to a synchronized state with the main apply process by streaming + any changes that happened during the initial data copy using standard + logical replication. During this synchronization phase, the changes + are applied and committed in the same order as they happened on the + publisher. Once synchronization is done, control of the + replication of the table is given back to the main apply process where + replication continues as normal. + </para> + <note> + <para> + The publication <literal>publish</literal> parameter only affects what + DML operations will be replicated. The initial data synchronization does + not take this parameter into account when copying the existing table data. + </para> + </note> + </sect2> + </sect1> + + <sect1 id="logical-replication-monitoring"> + <title>Monitoring</title> + + <para> + Because logical replication is based on a similar architecture as + <link linkend="streaming-replication">physical streaming replication</link>, + the monitoring on a publication node is similar to monitoring of a + physical replication primary + (see <xref linkend="streaming-replication-monitoring"/>). + </para> + + <para> + The monitoring information about subscription is visible in + <link linkend="monitoring-pg-stat-subscription"> + <structname>pg_stat_subscription</structname></link>. + This view contains one row for every subscription worker. A subscription + can have zero or more active subscription workers depending on its state. + </para> + + <para> + Normally, there is a single apply process running for an enabled + subscription. A disabled subscription or a crashed subscription will have + zero rows in this view. If the initial data synchronization of any + table is in progress, there will be additional workers for the tables + being synchronized. + </para> + </sect1> + + <sect1 id="logical-replication-security"> + <title>Security</title> + + <para> + A user able to modify the schema of subscriber-side tables can execute + arbitrary code as the role which owns any subscription which modifies those tables. Limit ownership + and <literal>TRIGGER</literal> privilege on such tables to trusted roles. + Moreover, if untrusted users can create tables, use only + publications that list tables explicitly. That is to say, create a + subscription <literal>FOR ALL TABLES</literal> or + <literal>FOR TABLES IN SCHEMA</literal> only when superusers trust + every user permitted to create a non-temp table on the publisher or the + subscriber. + </para> + + <para> + The role used for the replication connection must have + the <literal>REPLICATION</literal> attribute (or be a superuser). If the + role lacks <literal>SUPERUSER</literal> and <literal>BYPASSRLS</literal>, + publisher row security policies can execute. If the role does not trust + all table owners, include <literal>options=-crow_security=off</literal> in + the connection string; if a table owner then adds a row security policy, + that setting will cause replication to halt rather than execute the policy. + Access for the role must be configured in <filename>pg_hba.conf</filename> + and it must have the <literal>LOGIN</literal> attribute. + </para> + + <para> + In order to be able to copy the initial table data, the role used for the + replication connection must have the <literal>SELECT</literal> privilege on + a published table (or be a superuser). + </para> + + <para> + To create a publication, the user must have the <literal>CREATE</literal> + privilege in the database. + </para> + + <para> + To add tables to a publication, the user must have ownership rights on the + table. To add all tables in schema to a publication, the user must be a + superuser. To create a publication that publishes all tables or all tables in + schema automatically, the user must be a superuser. + </para> + + <para> + To create a subscription, the user must be a superuser. + </para> + + <para> + The subscription apply process will run in the local database with the + privileges of the subscription owner. + </para> + + <para> + On the publisher, privileges are only checked once at the start of a + replication connection and are not re-checked as each change record is read. + </para> + + <para> + On the subscriber, the subscription owner's privileges are re-checked for + each transaction when applied. If a worker is in the process of applying a + transaction when the ownership of the subscription is changed by a + concurrent transaction, the application of the current transaction will + continue under the old owner's privileges. + </para> + </sect1> + + <sect1 id="logical-replication-config"> + <title>Configuration Settings</title> + + <para> + Logical replication requires several configuration options to be set. + </para> + + <para> + On the publisher side, <varname>wal_level</varname> must be set to + <literal>logical</literal>, and <varname>max_replication_slots</varname> + must be set to at least the number of subscriptions expected to connect, + plus some reserve for table synchronization. And + <varname>max_wal_senders</varname> should be set to at least the same as + <varname>max_replication_slots</varname> plus the number of physical + replicas that are connected at the same time. + </para> + + <para> + <varname>max_replication_slots</varname> must also be set on the subscriber. + It should be set to at least the number of subscriptions that will be added + to the subscriber, plus some reserve for table synchronization. + <varname>max_logical_replication_workers</varname> must be set to at least + the number of subscriptions, again plus some reserve for the table + synchronization. Additionally the <varname>max_worker_processes</varname> + may need to be adjusted to accommodate for replication workers, at least + (<varname>max_logical_replication_workers</varname> + + <literal>1</literal>). Note that some extensions and parallel queries + also take worker slots from <varname>max_worker_processes</varname>. + </para> + </sect1> + + <sect1 id="logical-replication-quick-setup"> + <title>Quick Setup</title> + + <para> + First set the configuration options in <filename>postgresql.conf</filename>: +<programlisting> +wal_level = logical +</programlisting> + The other required settings have default values that are sufficient for a + basic setup. + </para> + + <para> + <filename>pg_hba.conf</filename> needs to be adjusted to allow replication + (the values here depend on your actual network configuration and user you + want to use for connecting): +<programlisting> +host all repuser 0.0.0.0/0 md5 +</programlisting> + </para> + + <para> + Then on the publisher database: +<programlisting> +CREATE PUBLICATION mypub FOR TABLE users, departments; +</programlisting> + </para> + + <para> + And on the subscriber database: +<programlisting> +CREATE SUBSCRIPTION mysub CONNECTION 'dbname=foo host=bar user=repuser' PUBLICATION mypub; +</programlisting> + </para> + + <para> + The above will start the replication process, which synchronizes the + initial table contents of the tables <literal>users</literal> and + <literal>departments</literal> and then starts replicating + incremental changes to those tables. + </para> + </sect1> +</chapter> |