1、主备搭建
1.1 主库配置
# pg_hba.conf host replication all 0/0 md5 # postgresql.conf listen_addresses='*' max_wal_senders = 10 wal_level = replica min_wal_size = 800M # 默认值80M通常太小,容易导致备库失败
复制
1.2 构建基础备份
在备库构建基础备份
[postgres@devin1 ~]$ pg_basebackup -h 192.168.56.102 -U postgres -F p -P -X stream -R -D $PGDATA -l backup20211124 Password: 5454432/5454432 kB (100%), 1/1 tablespace
复制
1.3 启动备库
[postgres@devin1 data]$ pg_ctl start
waiting for server to start....2021-11-24 09:55:40.278 CST [22528] FATAL: could not access file "pg_stat_statements": No such file or directory
2021-11-24 09:55:40.278 CST [22528] LOG: database system is shut down
stopped waiting
pg_ctl: could not start server
Examine the log output.
# 因为主库安装了pg_stat_statements插件,启动失败,备库重新编译安装即可
[postgres@devin1 ~]$ cd postgresql-13.2/contrib/pg_stat_statements/
[postgres@devin1 pg_stat_statements]$ make
make -C ../../src/backend generated-headers
make[1]: Entering directory `/home/postgres/postgresql-13.2/src/backend'
make -C catalog distprep generated-header-symlinks
make[2]: Entering directory `/home/postgres/postgresql-13.2/src/backend/catalog'
make[2]: Nothing to be done for `distprep'.
make[2]: Nothing to be done for `generated-header-symlinks'.
make[2]: Leaving directory `/home/postgres/postgresql-13.2/src/backend/catalog'
make -C utils distprep generated-header-symlinks
make[2]: Entering directory `/home/postgres/postgresql-13.2/src/backend/utils'
make[2]: Nothing to be done for `distprep'.
make[2]: Nothing to be done for `generated-header-symlinks'.
make[2]: Leaving directory `/home/postgres/postgresql-13.2/src/backend/utils'
make[1]: Leaving directory `/home/postgres/postgresql-13.2/src/backend'
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I. -I../../src/include -D_GNU_SOURCE -c -o pg_stat_statements.o pg_stat_statements.c
gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -shared -o pg_stat_statements.so pg_stat_statements.o -L../../src/port -L../../src/common -Wl,--as-needed -Wl,-rpath,'/home/postgres/pg13/lib',--enable-new-dtags -lm
[postgres@devin1 pg_stat_statements]$ make install
make -C ../../src/backend generated-headers
make[1]: Entering directory `/home/postgres/postgresql-13.2/src/backend'
make -C catalog distprep generated-header-symlinks
make[2]: Entering directory `/home/postgres/postgresql-13.2/src/backend/catalog'
make[2]: Nothing to be done for `distprep'.
make[2]: Nothing to be done for `generated-header-symlinks'.
make[2]: Leaving directory `/home/postgres/postgresql-13.2/src/backend/catalog'
make -C utils distprep generated-header-symlinks
make[2]: Entering directory `/home/postgres/postgresql-13.2/src/backend/utils'
make[2]: Nothing to be done for `distprep'.
make[2]: Nothing to be done for `generated-header-symlinks'.
make[2]: Leaving directory `/home/postgres/postgresql-13.2/src/backend/utils'
make[1]: Leaving directory `/home/postgres/postgresql-13.2/src/backend'
/bin/mkdir -p '/home/postgres/pg13/lib'
/bin/mkdir -p '/home/postgres/pg13/share/extension'
/bin/mkdir -p '/home/postgres/pg13/share/extension'
/bin/install -c -m 755 pg_stat_statements.so '/home/postgres/pg13/lib/pg_stat_statements.so'
/bin/install -c -m 644 ./pg_stat_statements.control '/home/postgres/pg13/share/extension/'
/bin/install -c -m 644 ./pg_stat_statements--1.4.sql ./pg_stat_statements--1.7--1.8.sql ./pg_stat_statements--1.6--1.7.sql ./pg_stat_statements--1.5--1.6.sql ./pg_stat_statements--1.4--1.5.sql ./pg_stat_statements--1.3--1.4.sql ./pg_stat_statements--1.2--1.3.sql ./pg_stat_statements--1.1--1.2.sql ./pg_stat_statements--1.0--1.1.sql '/home/postgres/pg13/share/extension/'
# 重新启动备库
[postgres@devin1 pg_stat_statements]$ pg_ctl start
waiting for server to start....2021-11-24 09:56:56.703 CST [22555] LOG: redirecting log output to logging collector process
2021-11-24 09:56:56.703 CST [22555] HINT: Future log output will appear in directory "/home/postgres/pg13/log".
... done
server started
复制
1.4 主备复制状态检查
# 主库
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent
_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-------+----------+----------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+------
------+------------+------------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
21989 | 10 | postgres | walreceiver | 192.168.56.103 | | 55120 | 2021-11-24 09:57:00.829921+08 | | streaming | 7/D50
00148 | 7/D5000148 | 7/D5000148 | 7/D5000148 | | | | 0 | async | 2021-11-24 10:00:08.582396+08
(1 row)
devin=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
# 备库
devin=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
复制
2、主备切换
2.1 停止主备库
# 主库
[postgres@devin-enmo data]$ pg_ctl stop
waiting for server to shut down.... done
server stopped
# 备库
[postgres@devin1 pg_stat_statements]$ pg_ctl stop
waiting for server to shut down....... done
server stopped
复制
2.2 源主库配置
# 创建standby.signal文件
[postgres@devin-enmo data]$ touch standby.signal
# 配置postgresql.conf
primary_conninfo = 'user=postgres password=xjlove1224 channel_binding=disable host=192.168.56.103 port=5432 sslmode=disable sslcompression=0 ssl_min_protocol_versi
on=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
复制
2.3 原备库配置
# 删除standby.signal文件
[postgres@devin1 data]$ rm -rf standby.signal
复制
2.4 启动
# 新主库(原备库)
[postgres@devin1 data]$ pg_ctl start
waiting for server to start....2021-11-24 10:20:12.050 CST [30755] LOG: redirecting log output to logging collector process
2021-11-24 10:20:12.050 CST [30755] HINT: Future log output will appear in directory "/home/postgres/pg13/log".
done
server started
# 新备库(原主库)
[postgres@devin-enmo data]$ pg_ctl start
waiting for server to start....2021-11-24 10:20:16.956 CST [23406] LOG: redirecting log output to logging collector process
2021-11-24 10:20:16.956 CST [23406] HINT: Future log output will appear in directory "/home/postgres/pg13/log".
done
server started
复制
2.5 主备状态检查
# 新主库
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent
_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-------+----------+----------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+------
------+------------+------------+------------+-----------+-----------+------------+---------------+------------+-------------------------------
30768 | 10 | postgres | walreceiver | 192.168.56.102 | | 45698 | 2021-11-24 10:20:16.953641+08 | | streaming | 7/D60
000D8 | 7/D60000D8 | 7/D60000D8 | 7/D60000D8 | | | | 0 | async | 2021-11-24 10:20:47.327088+08
(1 row)
# 新备库
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
复制
3、故障切换
3.1 模拟主库宕机
[postgres@devin1 data]$ exit
logout
[root@devin1 postgres]# shutdown -P now
Remote side unexpectedly closed network connection
────────────────────────────────────────────────────────────────────────────────────────────────────────
Session stopped
- Press <return> to exit tab
- Press R to restart session
- Press S to save terminal output to file
复制
3.2 提升备库为主库
[postgres@devin-enmo data]$ pg_ctl promote
waiting for server to promote.... done
server promoted
[postgres@devin-enmo data]$ psql
psql (13.2)
Type "help" for help.
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
复制
3.3 源主库修复后加入备库
# 需打开checksum,wal_log_hint设置成on
[postgres@devin1 ~]$ pg_rewind -D $PGDATA --source-server='host=192.168.56.102 user=postgres password=xjlove1224'
pg_rewind: servers diverged at WAL location 7/D60000D8 on timeline 2
pg_rewind: rewinding from last common checkpoint at 7/D6000028 on timeline 2
pg_rewind: Done!
复制
3.4 备库配置
[postgres@devin1 data]$ touch standby.signal
primary_conninfo = 'user=postgres password=xjlove1224 channel_binding=disable host=192.168.56.102 port=5432 sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
复制
3.5 启动备库
[postgres@devin1 data]$ pg_ctl start
waiting for server to start....2021-11-24 10:45:49.333 CST [13029] LOG: redirecting log output to logging collector process
2021-11-24 10:45:49.333 CST [13029] HINT: Future log output will appear in directory "/home/postgres/pg13/log".
done
server started
复制
3.6 主备状态检查
# 主库
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent
_lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time
-------+----------+----------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+------
------+------------+------------+------------+-----------------+-----------------+-----------------+---------------+------------+-------------------------------
24952 | 10 | postgres | walreceiver | 192.168.56.103 | | 55392 | 2021-11-24 10:46:49.794563+08 | | streaming | 7/D70
000A0 | 7/D70000A0 | 7/D70000A0 | 7/D70000A0 | 00:00:00.000676 | 00:00:00.000676 | 00:00:00.000676 | 0 | async | 2021-11-24 10:46:49.736116+08
(1 row)
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
f
(1 row)
# 备库
postgres=# select pg_is_in_recovery();
pg_is_in_recovery
-------------------
t
(1 row)
复制
4、同步流复制
4.1 架构
同步流复制要求wal日志写入standby数据库后commit才返回成功,所以,standby出现故障时,主库会hang住。解决这个问题的方法是启动两个standby数据库,这两个standby只要有一个是正常的就不会让主库hang住。
4.2 同步复制的配置
# 主库
synchronous_standby_names:该参数指定多个standby的名称,用逗号分开,而standby的名称是由参数aplication_name指定,pg10开始设置预计quorum的方式配置,示例:
synchronous_standby_names='ANY 2(standby01,standby02,standby03)':只要wal传到了任意两个备库,commit就可以返回
synchronous_standby_names='standby01,standby02,standby03':只要wal传到任意一个备库,commit就可以返回
synchronous_commit:
remote_apply:wal日志被传到备库并被apply,commit才返回
on:wal日志被传到备库并被持久化(不必apply),commit才返回
remote_write:wal日志被传到备库的内存中(不必持久化),commit才返回
local:wal日志被本地持久化,不管是否传到远程,commit返回
off:不必等wal日志被本地持久化,也不管是否传到远程,commit立即返回
所以,同步复制此参数的可选范围为on,remote_apply,remote_write
# 备库
primary_conninfo:需要添加application_name参数
复制
5、检查备库及流复制情况
# 检查异步复制情况
postgres=# select pid,state,client_addr,sync_priority,sync_state from pg_stat_replication;
pid | state | client_addr | sync_priority | sync_state
-------+-----------+----------------+---------------+------------
24952 | streaming | 192.168.56.103 | 0 | async
# 查看备库落后主库多少字节的wal日志
postgres=# select pg_wal_lsn_diff(pg_current_wal_lsn(),replay_lsn) from pg_stat_replication;
pg_wal_lsn_diff
-----------------
0
(1 row)
# 检查同步复制情况
postgres=# select pid,state,client_addr,sync_priority,sync_state from pg_stat_replication;
# 查看备库状态
postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_wal_receiver;
-[ RECORD 1 ]---------+---------------------------------------------------------------------------------
pid | 13050
status | streaming
receive_start_lsn | 7/D7000000
receive_start_tli | 3
written_lsn | 7/D70001C0
flushed_lsn | 7/D70001C0
received_tli | 3
last_msg_send_time | 2021-11-24 11:05:29.139785+08
last_msg_receipt_time | 2021-11-24 11:05:28.976063+08
latest_end_lsn | 7/D70001C0
latest_end_time | 2021-11-24 10:51:55.975756+08
slot_name |
sender_host | 192.168.56.102
sender_port | 5432
conninfo | user=postgres password=******** channel_binding=disable dbname=replication host=192.168.56.102 port=5432 fallback_application_name=walreceiver sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any
# 查看主备角色状态
select pg_is_in_recovery()
主库为false,备库为true
复制
6、hot standby查询冲突处理
主库上的一些操作会与hot standby备库上的查询产生冲突,导致冲突的原因有一下几个:
- 主库上运行vacuum清理掉了备库上的查询需要的多版本数据
- 主库上执行lock命令或各种DDL语句会在表上产生exclusive锁,而在备库上对这些表进行查询时会有冲突
- 在主库上删除了一个表空间,而备库上的查询需要存放一些临时文件在此表空间
- 在主库上删除了一个数据库,而备库上有很多session还连在该数据库上
当发生冲突时,解决的方法有一下几种:
- 让备库上的应用wal日志的过程等待一段时间
- 取消备库上正在执行的查询
相关参数:
-
max_standby_archive_delay:备库从wal归档中读取时的最大延迟,默认30s,如果为-1,则一直等待
-
max_standby_streaming_delay:备库从流复制中读取wal时的最大延迟,默认30上,如果为-1,则一直等待
如果备库用作高可用,可以把上面参数设置小一些,如果备库用来执行一些大的查询,则相反。
大多数冲突发生的原因是主库上把备库需要的多版本数据给清理掉了,这时可以通过设置hot_standby_feedback为true来缓解此问题,设置为true后,备库会通知主库,哪些多版本数据在备库上还需要,这样主库auto vacuum则不会清理这些数据,还有一个办法就是把vacuum_defer_cleanup_age设置大一些,延迟清理多版本数据。
在备库上因为冲突而取消的SQL命令的数量可以在pg_stat_database_conflicts中查询,建议应用程序最好能检测到这个错误并重新执行。
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。
评论
同步复制的状态应该是sync,文章中同步复制的状态是async
5月前

评论
同步复制的状态应该是sync,文章中同步复制的状态是async
2年前

评论

2年前

评论
您好,您的文章已入选合格奖,10墨值奖励已经到账请查收!
❤️我们还会实时派发您的流量收益。
3年前

评论
相关阅读
外国CTO也感兴趣的开源数据库项目——openHalo
小满未满、
1241次阅读
2025-04-21 16:58:09
9.9 分高危漏洞,尽快升级到 pgAdmin 4 v9.2 进行修复
严少安
382次阅读
2025-04-11 10:43:23
3月“墨力原创作者计划”获奖名单公布
墨天轮编辑部
366次阅读
2025-04-15 14:48:05
openHalo问世,全球首款基于PostgreSQL兼容MySQL协议的国产开源数据库
严少安
356次阅读
2025-04-07 12:14:29
转发有奖 | PostgreSQL 16 PGCM高级认证课程直播班招生中!
墨天轮小教习
173次阅读
2025-04-14 15:58:34
SQL 优化之 OR 子句改写
xiongcc
123次阅读
2025-04-21 00:08:06
融合Redis缓存的PostgreSQL高可用架构
梧桐
109次阅读
2025-04-08 06:35:40
PostgreSQL拓展PGQ实现解析
chirpyli
107次阅读
2025-04-07 11:23:17
中国PostgreSQL培训认证体系新增PGAI应用工程师方向
开源软件联盟PostgreSQL分会
90次阅读
2025-05-06 10:21:13
PostgreSQL的dblink扩展模块使用方法
szrsu
90次阅读
2025-04-24 17:39:30