暂无图片
暂无图片
3
暂无图片
暂无图片
1
暂无图片

PostgreSQL17新特性之关于参数standby_slot_names的一些理解

原创 仙人掌 2024-05-31
330
概述

上一篇中提到在配置复制槽同步的时候,强烈建议在主库配置参数standby_slot_names=主备间的物理复制槽,来确保复制槽同步的一致性,防止订阅端超前消费。

今天又看了这个参数的详细解释,其中提到如果指定的插槽不存在或无效,则逻辑复制将不会继续。那就意味着,如果备库关闭之后,逻辑复制也将暂停。https://www.postgresql.org/docs/17/runtime-config-replication.html

验证
1、配置standby_slot_names,停止备库,逻辑复制暂停

会话1-备库

停止备库

[postgres@mydb1a ~]$ pg_ctl -D /app/pg17_beta1/data stop waiting for server to shut down....2024-05-31 10:16:52.206 CST [3768891] LOG: received fast shutdown request 2024-05-31 10:16:52.207 CST [3768891] LOG: aborting any active transactions 2024-05-31 10:16:52.207 CST [3768896] FATAL: terminating connection due to administrator command 2024-05-31 10:16:52.208 CST [3768895] FATAL: terminating walreceiver process due to administrator command 2024-05-31 10:16:52.208 CST [3768892] LOG: shutting down 2024-05-31 10:16:52.208 CST [3768892] LOG: restartpoint starting: shutdown immediate 2024-05-31 10:16:52.213 CST [3768892] LOG: restartpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.002 s, sync=0.001 s, total=0.005 s; sync files=2, longest=0.001 s, average=0.001 s; distance=2 kB, estimate=2 kB; lsn=0/C01D168, redo lsn=0/C01D110 2024-05-31 10:16:52.213 CST [3768892] LOG: recovery restart point at 0/C01D110 2024-05-31 10:16:52.213 CST [3768892] DETAIL: Last completed transaction was at log time 2024-05-31 10:13:40.231112+08. 2024-05-31 10:16:52.215 CST [3768891] LOG: database system is shut down done server stopped
复制

会话2-主库

主库对发布表插入一条数据

postgres=# select now(); now ------------------------------- 2024-05-31 10:16:58.055527+08 (1 row) postgres=# insert into test select 1; INSERT 0 1
复制

会话3-pg_recvlogical

逻辑复制在发布表插入数据后并没有同步,提示信息在等待使用了复制槽“s_5417”的备库反馈,且在会话1备库重启后继续同步

[postgres@mydb1b ~]$ pg_recvlogical -p 5417 -S test_slot -d postgres -U postgres -h mydb1b -P test_decoding -f - --start BEGIN 814 table public.test: INSERT: id[integer]:1 COMMIT 814 WARNING: replication slot "s_5417" specified in parameter standby_slot_names does not have active_pid DETAIL: Logical replication is waiting on the standby associated with "s_5417". HINT: Consider starting standby associated with "s_5417" or amend parameter standby_slot_names. BEGIN 815 table public.test: INSERT: id[integer]:1 COMMIT 815
复制
2、主库修改standby_slot_names,逻辑复制继续

当备库关闭后,在主库修改参数standby_slot_names=default,逻辑复制依然可以继续

会话1-主库

发布表插入数据

postgres=# insert into test select 6; INSERT 0 1 postgres=# alter system set standby_slot_names TO default; ALTER SYSTEM postgres=# select pg_reload_conf(); pg_reload_conf ---------------- t (1 row) postgres=#
复制

会话2-pg_recvlogical

当前备库已关闭,主库发布表插入数据后,逻辑复制暂停,在主库修改standby_slot_names后逻辑复制继续

BEGIN 816 table public.test: INSERT: id[integer]:1 COMMIT 816 WARNING: replication slot "s_5417" specified in parameter standby_slot_names does not have active_pid DETAIL: Logical replication is waiting on the standby associated with "s_5417". HINT: Consider starting standby associated with "s_5417" or amend parameter standby_slot_names. WARNING: replication slot "s_5417" specified in parameter standby_slot_names does not have active_pid DETAIL: Logical replication is waiting on the standby associated with "s_5417". HINT: Consider starting standby associated with "s_5417" or amend parameter standby_slot_names. WARNING: replication slot "s_5417" specified in parameter standby_slot_names does not have active_pid DETAIL: Logical replication is waiting on the standby associated with "s_5417". HINT: Consider starting standby associated with "s_5417" or amend parameter standby_slot_names. BEGIN 817 table public.test: INSERT: id[integer]:6 COMMIT 817
复制
3、逻辑复制可以从指定lsn(比confirmed_flush_lsn较新的lsn)开始复制

如果逻辑复制可以指定比复制槽confirmed_flush_lsn较新的位点开始,那么我们就可以记录同步的LSN位点信息,而不是依赖使用参数standby_slot_names来保证failover后的数据一致,这样即使复制槽的位点信息未及时同步到备库,备库提升后,只要有复制槽存在,逻辑复制就可以通过记录的位点信息接续上

首先查看复制槽当前位点,并插入几条数据

postgres=# select slot_name,catalog_xmin , restart_lsn , confirmed_flush_lsn from pg_replication_slots where slot_type='logical'; slot_name | catalog_xmin | restart_lsn | confirmed_flush_lsn -----------+--------------+-------------+--------------------- test_slot | 832 | 0/C021728 | 0/C021760 (1 row) postgres=# insert into test select 9; INSERT 0 1 postgres=# insert into test select 10; INSERT 0 1 postgres=# insert into test select 11; INSERT 0 1 postgres=# insert into test select 12; INSERT 0 1 postgres=# insert into test select 13; INSERT 0 1 postgres=# select xmin,id from test where id>8; xmin | id ------+---- 833 | 9 834 | 10 835 | 11 836 | 12 837 | 13
复制

假如我们想要从id=12 开始同步,那解析wal日志找到xid=836前的lsn,并从该lsn开始逻辑复制

[postgres@mydb1b pg_wal]$ pg_waldump 00000002000000000000000C|grep -w 836 pg_waldump: error: error in WAL record at 0/C022648: invalid record length at 0/C022680: expected at least 24, got 0 rmgr: Standby len (rec/tot): 50/ 50, tx: 0, lsn: 0/0C022530, prev 0/0C022500, desc: RUNNING_XACTS nextXid 836 latestCompletedXid 835 oldestRunningXid 836 rmgr: Heap len (rec/tot): 59/ 59, tx: 836, lsn: 0/0C022568, prev 0/0C022530, desc: INSERT off: 81, flags: 0x08, blkref #0: rel 1663/5/16394 blk 0 rmgr: Transaction len (rec/tot): 46/ 46, tx: 836, lsn: 0/0C0225A8, prev 0/0C022568, desc: COMMIT 2024-05-31 11:19:53.561730 CST [postgres@mydb1b pg_wal]$ [postgres@mydb1b pg_wal]$ pg_recvlogical -p 5417 -S test_slot -d postgres -U postgres -h mydb1b -P test_decoding -f - -I 0/0C022530 --start BEGIN 836 table public.test: INSERT: id[integer]:12 COMMIT 836 BEGIN 837 table public.test: INSERT: id[integer]:13 COMMIT 837
复制

逻辑复制可以指定比复制槽confirmed_flush_lsn较新的位点开始

总结
  • 主库配置参数standby_slot_names,为了保证逻辑复制在failover后数据的一致性,但备库关闭逻辑复制就会暂停
  • 如果备库长时间无法恢复的情况下,可以修改standby_slot_names去掉相关备库使用的复制槽,逻辑复制可恢复
  • 数据同步工具在使用逻辑复制的时候,可以在工具内部记录同步的LSN位点信息,而不是依赖使用参数standby_slot_names来保证failover后的数据一致,这样即使复制槽的位点信息未及时同步到备库,备库提升后,只要有复制槽存在,逻辑复制就可以通过工具内部记录的位点信息接续上(理论上备库位点不会比主库超前,如果备库位点超过了主库,那即使记录了位点信息,也无法复制confirmed_flush_lsn前的数据)
最后修改时间:2024-05-31 18:21:01
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

zy
暂无图片
10月前
评论
暂无图片 0
主库配置参数standby_slot_names,为了保证逻辑复制在failover后数据的一致性,但备库关闭逻辑复制就会暂停
10月前
暂无图片 点赞
评论