PG 16.1 standby logical decoding

原创大表哥 All China Database Union 2023-12-12

842

大家好，今天我们聊一聊PG16的新特性： logical decoding on standby server.

官方文档：

业界著名厂商crunchydata 早在今年6月份的blog中也公布这个PG16的新特性：

1.可以在只读的standby server上进行逻辑解码了
2.减少了primary server的压力
3.为下游数据同步和系统审计提供了新的高可用的方式

我们来手动做下面的一个实验：

Server IP	port	PG version	DB role
10.67.38.170	6100	16.1	Primary
10.67.38.171	6100	16.1	Standby (streaming replication)
10.67.38.173	6100	16.1	Primary (logical replication from standby)

我们选用2节点170和171作为1主1从的stream replication： wal_level是logical

postgres=# select version();
                                               version                                                
--------------------------------------------------------------------------------------------------------
PostgreSQL 16.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5), 64-bit
(1 row)


postgres=# show wal_level;
wal_level
-----------
logical
(1 row)


postgres=# select * from pg_stat_replication;
 pid  | usesysid | usename | application_name | client_addr  | client_hostname | client_port |         backen
d_start         | backend_xmin |   state   | sent_lsn  | write_lsn | flush_lsn | replay_lsn |    write_lag    
|    flush_lag    |   replay_lag    | sync_priority | sync_state |          reply_time          
-------+----------+---------+------------------+--------------+-----------------+-------------+---------------
----------------+--------------+-----------+-----------+-----------+-----------+------------+-----------------
+-----------------+-----------------+---------------+------------+-------------------------------
56985 |    16384 | repmgr  | pref171          | 10.67.38.171 |                 |       47208 | 2023-12-10 17:
17:43.222998+08 |              | streaming | 0/E004B38 | 0/E004B38 | 0/E004B38 | 0/E004B38  | 00:00:00.000321
| 00:00:00.000743 | 00:00:00.000809 |             0 | async      | 2023-12-10 17:18:17.885619+08
(1 row)

我们先登录主库：创建一个全新的数据库，同步的表，以及同步的账户

postgres=# create database db_pg16_local_decode template template0 encoding UTF8 lc_collate 'C' lc_ctype 'C.UTF-8';
CREATE DATABASE
postgres=# \c db_pg16_local_decode
You are now connected to database "db_pg16_local_decode" as user "postgres".
postgres=# create user logical_user replication login connection limit 10 encrypted password 'logical_user';
CREATE ROLE
postgres=# create table t_order(id int  not null primary key , good_name varchar(200), location varchar(20));
CREATE TABLE
postgres=# grant select on t_order to logical_user;
GRANT

我们继续在主库创建一个发布者 publication:

db_pg16_local_decode=# CREATE PUBLICATION pg16_pub FOR TABLE t_order;
CREATE PUBLICATION

db_pg16_local_decode=# select  * from pg_publication;
 oid  | pubname  | pubowner | puballtables | pubinsert | pubupdate | pubdelete | pubtruncate | pubviaroot
-------+----------+----------+--------------+-----------+-----------+-----------+-------------+------------
16627 | pg16_pub |       10 | f            | t         | t         | t         | t           | f
(1 row)

我们登陆streaming replication的从库，设置一下 hot_standby_feedback = on
（这个参数的打开，会产生一些负面的影响，比如影响主库的vacuum , 可以参考 https://www.cybertec-postgresql.com/en/what-hot_standby_feedback-in-postgresql-really-does/ ）

postgres=# alter system set hot_standby_feedback = 'on';
ALTER SYSTEM
postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

postgres=# show hot_standby_feedback;
hot_standby_feedback
----------------------
on
(1 row)

我们登录reporting的PG实例，创建目标数据库和同步的目标表

postgres=# create database pg16_target_report_db template template0 encoding UTF8 LC_Collate 'C' LC_CTYPE 'C.UTF-8';
CREATE DATABASE
postgres=# \c pg16_target_report_db
You are now connected to database "pg16_target_report_db" as user "postgres".
pg16_target_report_db=# create table t_order(id int  not null primary key , good_name varchar(200), location varchar(20));
CREATE TABLE

接下来，我们创建一个订阅者：这里我们指定的host的地址是standby server

pg16_target_report_db=# CREATE SUBSCRIPTION pg16_sub CONNECTION 'host=10.67.38.171 port=6100 dbname=db_pg16_local_decode application_name=logical_replication user=logical_user password=logical_user' publication pg16_pub;
...

这个命令会卡主，我们需要在主库上执行：

db_pg16_local_decode=# SELECT pg_log_standby_snapshot();
pg_log_standby_snapshot
-------------------------
0/10AF1340
(1 row)

这时reporting的PG，会返回创建订阅者成功的消息：

NOTICE:  created replication slot "pg16_sub" on publisher
CREATE SUBSCRIPTION

这个时候，我们登陆standby database , 查看复制流的信息：

postgres=# SELECT pid, application_name, state, sync_state FROM pg_stat_replication;
 pid  |  application_name   |   state   | sync_state
-------+---------------------+-----------+------------
27777 | logical_replication | streaming | async
(1 row)

查看standby database上的逻辑复制槽的信息

postgres=# select slot_name,slot_type,database ,confirmed_flush_lsn from pg_replication_slots ;
slot_name | slot_type |       database       | confirmed_flush_lsn
-----------+-----------+----------------------+---------------------
pg16_sub  | logical   | db_pg16_local_decode | 0/10B1A950
(1 row)

我们来插入数据测试一下：

登陆主库

db_pg16_local_decode=# insert into t_order (id, good_name,location) values (1,'豆腐脑','TJ');
INSERT 0 1
db_pg16_local_decode=# insert into t_order (id, good_name,location) values (2,'煎饼果子','TJ');
INSERT 0 1
db_pg16_local_decode=# insert into t_order (id, good_name,location) values (3,'卤煮','BJ');
INSERT 0 1
db_pg16_local_decode=# insert into t_order (id, good_name,location) values (4,'小笼包','SH');
INSERT 0 1

登陆reporting 的PG：我们发现数据已经同步到位

pg16_target_report_db=# select * from t_order;
id | good_name | location
----+-----------+----------
 1 | 豆腐脑    | TJ
 2 | 煎饼果子  | TJ
 3 | 卤煮      | BJ
 4 | 小笼包    | SH
(4 rows)

下面我们测试另外一个场景：

PG16之前的版本只能在主库创建逻辑槽，当数据库发生switchover或者failover的时候，新promote的主库上之前逻辑槽并不存在，需要DBA重新创建。

下面我们测试一下，数据库一旦发生主从切换，standby database promote成为主库之后，逻辑复制槽依然正常工作的场景。

我们手动进行一个switchover: 这里我们采用repmgr的swithover的命令

INFRA [postgres@ljzdccapp007 repmgr]# /opt/pgsql-16/bin/repmgr standby switchover -f /data/prod6100/repmgr/repmgr.conf --siblings-follow
NOTICE: executing switchover on node "pref171" (ID: 2)
WARNING: option "--sibling-nodes" specified, but no sibling nodes exist
NOTICE: attempting to pause repmgrd on 2 nodes
NOTICE: local node "pref171" (ID: 2) will be promoted to primary; current primary "pref170" (ID: 1) will be demoted to standby
NOTICE: stopping current primary node "pref170" (ID: 1)
NOTICE: issuing CHECKPOINT on node "pref170" (ID: 1)
DETAIL: executing server command "/opt/pgsql-16/bin/pg_ctl  -D '/data/prod6100/data' -W -m fast stop"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/11000028
NOTICE: promoting standby to primary
DETAIL: promoting server "pref171" (ID: 2) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
NOTICE: STANDBY PROMOTE successful
DETAIL: server "pref171" (ID: 2) was successfully promoted to primary
NOTICE: node "pref171" (ID: 2) promoted to primary, node "pref170" (ID: 1) demoted to standby
NOTICE: switchover was successful
DETAIL: node "pref171" is now primary and node "pref170" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully

我们查看复制集的状态：我们看到主从已经切换成功

INFRA [postgres@ljzdccapp007 repmgr]# /opt/pgsql-16/bin/repmgr  -f /data/prod6100/repmgr/repmgr.conf cluster show
ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                      
----+---------+---------+-----------+----------+----------+----------+----------+--------------------------------------------------------------------------
1  | pref170 | standby |   running | pref171  | default  | 100      | 3        | host=10.67.38.170 port=6100 user=repmgr  dbname=repmgr connect_timeout=2
2  | pref171 | primary | * running |          | default  | 100      | 4        | host=10.67.38.171 port=6100 user=repmgr  dbname=repmgr connect_timeout=2

我们去新的主库观察逻辑复制槽：confirmed_flush_lsn 一直在不停的变化说明同步一直在进行

postgres=# select slot_name,slot_type,database ,confirmed_flush_lsn from pg_replication_slots ;
slot_name | slot_type |       database       | confirmed_flush_lsn
-----------+-----------+----------------------+---------------------
pg16_sub  | logical   | db_pg16_local_decode | 0/1100A3D8
(1 row)


postgres=# select slot_name,slot_type,database ,confirmed_flush_lsn from pg_replication_slots ;
slot_name | slot_type |       database       | confirmed_flush_lsn
-----------+-----------+----------------------+---------------------
pg16_sub  | logical   | db_pg16_local_decode | 0/1100AE58
(1 row)

我们新插入一些数据进行测试：

postgres=# \c db_pg16_local_decode
You are now connected to database "db_pg16_local_decode" as user "postgres".
db_pg16_local_decode=# insert into t_order (id, good_name,location) values (5,'车仔面','HK');
INSERT 0 1
db_pg16_local_decode=# insert into t_order (id, good_name,location) values (6,'蚵仔煎','TW');
INSERT 0 1

我们查看目标reporting的PG实例，车仔面和蚵仔煎已经同步到位 🙂

pg16_target_report_db=# select * from t_order;
id | good_name | location
----+-----------+----------
 1 | 豆腐脑    | TJ
 2 | 煎饼果子  | TJ
 3 | 卤煮      | BJ
 4 | 小笼包    | SH
 5 | 车仔面    | HK
 6 | 蚵仔煎    | TW
(6 rows)

这个时候可能有的同学会问，同步的架构又变成了主库 logical decoding 到 reporting 数据库：

那么想要继续恢复 standby logical decoding 到 reporting 数据库的架构该如何做呢？

大致需要如下步骤：

1.我们创建一下disable状态的订阅者指向new standby WITH (enabled=false,copy_data=false)

pg16_target_report_db=# CREATE SUBSCRIPTION pg16_sub_new CONNECTION 'host=10.67.38.170 port=6100 dbname=db_pg16_local_decode application_name=logical_replication user=logical_user password=logical_user' publication pg16_pub WITH (enabled=false,copy_data=false);
NOTICE:  created replication slot "pg16_sub_new" on publisher
CREATE SUBSCRIPTION

我们暂停指向主库的订阅者

pg16_target_report_db=# alter  SUBSCRIPTION pg16_sub disable;
ALTER SUBSCRIPTION

3.查看同步的LSN的位置: 记录下LSN = 0/113449D8

confirmed_flush_lsn：表示订阅者消费wal中的LSN的位置点

db_pg16_local_decode=# select slot_name,slot_type,database ,confirmed_flush_lsn from pg_replication_slots ;
slot_name | slot_type |       database       | confirmed_flush_lsn
-----------+-----------+----------------------+---------------------
pg16_sub  | logical   | db_pg16_local_decode | 0/113449D8
(1 row)

我们可以在这个时候，模拟一些数据插入

db_pg16_local_decode=# insert into t_order (id, good_name,location) values (8,'叉烧饭','GZ');
INSERT 0 1

4.我们设置新的订阅者的 original LSN的位置是 0/113449D8

pg16_target_report_db=# select roident, subname, roname from pg_subscription sub, pg_replication_origin ro where 'pg_' || sub.oid = ro.roname;
roident |   subname    |  roname  
---------+--------------+----------
       1 | pg16_sub     | pg_24744
       2 | pg16_sub_new | pg_24746
(2 rows)

pg16_target_report_db=# select pg_replication_origin_advance('pg_24750', '0/113449D8'::pg_lsn);
pg_replication_origin_advance
-------------------------------
(1 row)

5.激活新的订阅者

pg16_target_report_db=# alter SUBSCRIPTION pg16_sub_new enable;
ALTER SUBSCRIPTION

6.查看数据同步：我们发现叉烧饭已经同步过来

pg16_target_report_db=# select * from t_order;
id | good_name | location
----+-----------+----------
  1 | 豆腐脑    | TJ
  2 | 煎饼果子  | TJ
  3 | 卤煮      | BJ
  4 | 小笼包    | SH
  5 | 车仔面    | HK
  6 | 蚵仔煎    | TW
  8 | 叉烧饭    | GZ
(7 rows)

7.观察new standby 上的逻辑槽的运行状态正常

db_pg16_local_decode=# select slot_name,slot_type,database ,confirmed_flush_lsn from pg_replication_slots ;
  slot_name   | slot_type |       database       | confirmed_flush_lsn
--------------+-----------+----------------------+---------------------
pg16_sub_new | logical   | db_pg16_local_decode | 0/113748A8
(1 row)


db_pg16_local_decode=# select slot_name,slot_type,database ,confirmed_flush_lsn from pg_replication_slots ;
  slot_name   | slot_type |       database       | confirmed_flush_lsn
--------------+-----------+----------------------+---------------------
pg16_sub_new | logical   | db_pg16_local_decode | 0/11374A78
(1 row)

8.最后我们来删除指向主库的逻辑复制订阅者

pg16_target_report_db=# drop SUBSCRIPTION pg16_sub;
NOTICE:  dropped replication slot "pg16_sub" on publisher
DROP SUBSCRIPTION

最后我们看看关于逻辑槽的监控：

在实际工作中，逻辑槽的消费订阅者有可能是不同版本的PG数据库，也有可能是flink CDC 这种程序来消费。

我们可以监控逻辑槽中的LSN的位置：

replication_lag 查询 from主库：pg_current_wal_lsn() - confirmed_flush_lsn
replication_lag 查询 from从库：pg_last_wal_replay_lsn() - confirmed_flush_lsn

db_pg16_local_decode=# select slot_name,slot_type,database,pg_last_wal_replay_lsn(),confirmed_flush_lsn, pg_last_wal_replay_lsn() - confirmed_flush_lsn AS replication_lag FROM pg_replication_slots;
  slot_name   | slot_type |       database       | pg_last_wal_replay_lsn | confirmed_flush_lsn | replication_lag
--------------+-----------+----------------------+------------------------+---------------------+-----------------
pg16_sub_new | logical   | db_pg16_local_decode | 0/113F6890             | 0/113F6890          |               0
(1 row)

我们也可以监控一下复制集中wal 的 replay time:

db_pg16_local_decode=# select  usename,application_name ,now() - reply_time as replay_lag from pg_stat_replication;
   usename    |  application_name   |   replay_lag    
--------------+---------------------+-----------------
logical_user | logical_replication | 00:00:01.475631
(1 row)

最后我们总结一下：

1)PG 16版本开始支持 standby logical decoding, 当 standby promote 成主库的时候，逻辑复制一般情况下不受影响（我们测试了switchover的情况）。

2)逻辑复制如果想要从primary上迁移到standby 上的步骤：
a.创建一个新的standby 的订阅者，初始状态是disable的，并且设置copy_data=false
b.停止原先指向主库的订阅者，并记录下之前同步截止的LSN （confirmed_flush_lsn from pg_replication_slots）
c.通过函数pg_replication_origin_advance 设置 standby 订阅者同步起点LSN的位置。（pg_replication_origin_advance 这个函数还可以解决逻辑同步冲突的问题）
d.激活standby 订阅者
e.查看standby订阅者的状态
f.删除老的指向primary的订阅者。

3）从LSN 和 replay_time 2个维度，减少了逻辑复制的延时lag的监控。

Have a fun 🙂 !

本文配图来自于 PG 业界著名厂商 Crunchy data。
参考文档 References:

https://www.postgresql.org/docs/current/sql-createsubscription.html#SQL-CREATESUBSCRIPTION-NOTES
https://www.crunchydata.com/blog/data-to-go-postgres-logical-replication
https://www.crunchydata.com/blog/logical-replication-on-standbys-in-postgres-16
https://www.percona.com/blog/how-to-start-logical-replication-in-postgresql-for-specific-tables-based-on-a-pg_dump/
https://www.postgresql.fastware.com/blog/addressing-replication-conflicts-using-alter-subscription-skip
https://www.postgresql.fastware.com/blog/how-to-handle-logical-replication-conflicts-in-postgresql

最后修改时间：2023-12-13 09:21:33

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

PG 16.1 standby logical decoding

评论