暂无图片
暂无图片
3
暂无图片
暂无图片
4
暂无图片

PostgreSQL-主备

原创 肖杰 2021-11-24
1420

1、主备搭建

1.1 主库配置

# pg_hba.conf host replication all 0/0 md5 # postgresql.conf listen_addresses='*' max_wal_senders = 10 wal_level = replica min_wal_size = 800M # 默认值80M通常太小,容易导致备库失败
复制

1.2 构建基础备份

在备库构建基础备份

[postgres@devin1 ~]$ pg_basebackup -h 192.168.56.102 -U postgres -F p -P -X stream -R -D $PGDATA -l backup20211124
Password:
5454432/5454432 kB (100%), 1/1 tablespace

复制

1.3 启动备库

[postgres@devin1 data]$ pg_ctl start waiting for server to start....2021-11-24 09:55:40.278 CST [22528] FATAL: could not access file "pg_stat_statements": No such file or directory 2021-11-24 09:55:40.278 CST [22528] LOG: database system is shut down stopped waiting pg_ctl: could not start server Examine the log output. # 因为主库安装了pg_stat_statements插件,启动失败,备库重新编译安装即可 [postgres@devin1 ~]$ cd postgresql-13.2/contrib/pg_stat_statements/ [postgres@devin1 pg_stat_statements]$ make make -C ../../src/backend generated-headers make[1]: Entering directory `/home/postgres/postgresql-13.2/src/backend' make -C catalog distprep generated-header-symlinks make[2]: Entering directory `/home/postgres/postgresql-13.2/src/backend/catalog' make[2]: Nothing to be done for `distprep'. make[2]: Nothing to be done for `generated-header-symlinks'. make[2]: Leaving directory `/home/postgres/postgresql-13.2/src/backend/catalog' make -C utils distprep generated-header-symlinks make[2]: Entering directory `/home/postgres/postgresql-13.2/src/backend/utils' make[2]: Nothing to be done for `distprep'. make[2]: Nothing to be done for `generated-header-symlinks'. make[2]: Leaving directory `/home/postgres/postgresql-13.2/src/backend/utils' make[1]: Leaving directory `/home/postgres/postgresql-13.2/src/backend' gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -I. -I. -I../../src/include -D_GNU_SOURCE -c -o pg_stat_statements.o pg_stat_statements.c gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werror=vla -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -O2 -fPIC -shared -o pg_stat_statements.so pg_stat_statements.o -L../../src/port -L../../src/common -Wl,--as-needed -Wl,-rpath,'/home/postgres/pg13/lib',--enable-new-dtags -lm [postgres@devin1 pg_stat_statements]$ make install make -C ../../src/backend generated-headers make[1]: Entering directory `/home/postgres/postgresql-13.2/src/backend' make -C catalog distprep generated-header-symlinks make[2]: Entering directory `/home/postgres/postgresql-13.2/src/backend/catalog' make[2]: Nothing to be done for `distprep'. make[2]: Nothing to be done for `generated-header-symlinks'. make[2]: Leaving directory `/home/postgres/postgresql-13.2/src/backend/catalog' make -C utils distprep generated-header-symlinks make[2]: Entering directory `/home/postgres/postgresql-13.2/src/backend/utils' make[2]: Nothing to be done for `distprep'. make[2]: Nothing to be done for `generated-header-symlinks'. make[2]: Leaving directory `/home/postgres/postgresql-13.2/src/backend/utils' make[1]: Leaving directory `/home/postgres/postgresql-13.2/src/backend' /bin/mkdir -p '/home/postgres/pg13/lib' /bin/mkdir -p '/home/postgres/pg13/share/extension' /bin/mkdir -p '/home/postgres/pg13/share/extension' /bin/install -c -m 755 pg_stat_statements.so '/home/postgres/pg13/lib/pg_stat_statements.so' /bin/install -c -m 644 ./pg_stat_statements.control '/home/postgres/pg13/share/extension/' /bin/install -c -m 644 ./pg_stat_statements--1.4.sql ./pg_stat_statements--1.7--1.8.sql ./pg_stat_statements--1.6--1.7.sql ./pg_stat_statements--1.5--1.6.sql ./pg_stat_statements--1.4--1.5.sql ./pg_stat_statements--1.3--1.4.sql ./pg_stat_statements--1.2--1.3.sql ./pg_stat_statements--1.1--1.2.sql ./pg_stat_statements--1.0--1.1.sql '/home/postgres/pg13/share/extension/' # 重新启动备库 [postgres@devin1 pg_stat_statements]$ pg_ctl start waiting for server to start....2021-11-24 09:56:56.703 CST [22555] LOG: redirecting log output to logging collector process 2021-11-24 09:56:56.703 CST [22555] HINT: Future log output will appear in directory "/home/postgres/pg13/log". ... done server started
复制

1.4 主备复制状态检查

# 主库 postgres=# select * from pg_stat_replication; pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent _lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time -------+----------+----------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+------ ------+------------+------------+------------+-----------+-----------+------------+---------------+------------+------------------------------- 21989 | 10 | postgres | walreceiver | 192.168.56.103 | | 55120 | 2021-11-24 09:57:00.829921+08 | | streaming | 7/D50 00148 | 7/D5000148 | 7/D5000148 | 7/D5000148 | | | | 0 | async | 2021-11-24 10:00:08.582396+08 (1 row) devin=# select pg_is_in_recovery(); pg_is_in_recovery ------------------- f (1 row) # 备库 devin=# select pg_is_in_recovery(); pg_is_in_recovery ------------------- t (1 row)
复制

2、主备切换

2.1 停止主备库

# 主库 [postgres@devin-enmo data]$ pg_ctl stop waiting for server to shut down.... done server stopped # 备库 [postgres@devin1 pg_stat_statements]$ pg_ctl stop waiting for server to shut down....... done server stopped
复制

2.2 源主库配置

# 创建standby.signal文件 [postgres@devin-enmo data]$ touch standby.signal # 配置postgresql.conf primary_conninfo = 'user=postgres password=xjlove1224 channel_binding=disable host=192.168.56.103 port=5432 sslmode=disable sslcompression=0 ssl_min_protocol_versi on=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
复制

2.3 原备库配置

# 删除standby.signal文件 [postgres@devin1 data]$ rm -rf standby.signal
复制

2.4 启动

# 新主库(原备库) [postgres@devin1 data]$ pg_ctl start waiting for server to start....2021-11-24 10:20:12.050 CST [30755] LOG: redirecting log output to logging collector process 2021-11-24 10:20:12.050 CST [30755] HINT: Future log output will appear in directory "/home/postgres/pg13/log". done server started # 新备库(原主库) [postgres@devin-enmo data]$ pg_ctl start waiting for server to start....2021-11-24 10:20:16.956 CST [23406] LOG: redirecting log output to logging collector process 2021-11-24 10:20:16.956 CST [23406] HINT: Future log output will appear in directory "/home/postgres/pg13/log". done server started
复制

2.5 主备状态检查

# 新主库 postgres=# select pg_is_in_recovery(); pg_is_in_recovery ------------------- f (1 row) postgres=# select * from pg_stat_replication; pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent _lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time -------+----------+----------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+------ ------+------------+------------+------------+-----------+-----------+------------+---------------+------------+------------------------------- 30768 | 10 | postgres | walreceiver | 192.168.56.102 | | 45698 | 2021-11-24 10:20:16.953641+08 | | streaming | 7/D60 000D8 | 7/D60000D8 | 7/D60000D8 | 7/D60000D8 | | | | 0 | async | 2021-11-24 10:20:47.327088+08 (1 row) # 新备库 postgres=# select pg_is_in_recovery(); pg_is_in_recovery ------------------- t (1 row)
复制

3、故障切换

3.1 模拟主库宕机

[postgres@devin1 data]$ exit logout [root@devin1 postgres]# shutdown -P now Remote side unexpectedly closed network connection ──────────────────────────────────────────────────────────────────────────────────────────────────────── Session stopped - Press <return> to exit tab - Press R to restart session - Press S to save terminal output to file
复制

3.2 提升备库为主库

[postgres@devin-enmo data]$ pg_ctl promote waiting for server to promote.... done server promoted [postgres@devin-enmo data]$ psql psql (13.2) Type "help" for help. postgres=# select pg_is_in_recovery(); pg_is_in_recovery ------------------- f (1 row)
复制

3.3 源主库修复后加入备库

# 需打开checksum,wal_log_hint设置成on [postgres@devin1 ~]$ pg_rewind -D $PGDATA --source-server='host=192.168.56.102 user=postgres password=xjlove1224' pg_rewind: servers diverged at WAL location 7/D60000D8 on timeline 2 pg_rewind: rewinding from last common checkpoint at 7/D6000028 on timeline 2 pg_rewind: Done!
复制

3.4 备库配置

[postgres@devin1 data]$ touch standby.signal primary_conninfo = 'user=postgres password=xjlove1224 channel_binding=disable host=192.168.56.102 port=5432 sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any'
复制

3.5 启动备库

[postgres@devin1 data]$ pg_ctl start waiting for server to start....2021-11-24 10:45:49.333 CST [13029] LOG: redirecting log output to logging collector process 2021-11-24 10:45:49.333 CST [13029] HINT: Future log output will appear in directory "/home/postgres/pg13/log". done server started
复制

3.6 主备状态检查

# 主库 postgres=# select * from pg_stat_replication; pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent _lsn | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state | reply_time -------+----------+----------+------------------+----------------+-----------------+-------------+-------------------------------+--------------+-----------+------ ------+------------+------------+------------+-----------------+-----------------+-----------------+---------------+------------+------------------------------- 24952 | 10 | postgres | walreceiver | 192.168.56.103 | | 55392 | 2021-11-24 10:46:49.794563+08 | | streaming | 7/D70 000A0 | 7/D70000A0 | 7/D70000A0 | 7/D70000A0 | 00:00:00.000676 | 00:00:00.000676 | 00:00:00.000676 | 0 | async | 2021-11-24 10:46:49.736116+08 (1 row) postgres=# select pg_is_in_recovery(); pg_is_in_recovery ------------------- f (1 row) # 备库 postgres=# select pg_is_in_recovery(); pg_is_in_recovery ------------------- t (1 row)
复制

4、同步流复制

4.1 架构

同步流复制要求wal日志写入standby数据库后commit才返回成功,所以,standby出现故障时,主库会hang住。解决这个问题的方法是启动两个standby数据库,这两个standby只要有一个是正常的就不会让主库hang住。

4.2 同步复制的配置

# 主库 synchronous_standby_names:该参数指定多个standby的名称,用逗号分开,而standby的名称是由参数aplication_name指定,pg10开始设置预计quorum的方式配置,示例: synchronous_standby_names='ANY 2(standby01,standby02,standby03)':只要wal传到了任意两个备库,commit就可以返回 synchronous_standby_names='standby01,standby02,standby03':只要wal传到任意一个备库,commit就可以返回 synchronous_commit: remote_apply:wal日志被传到备库并被apply,commit才返回 on:wal日志被传到备库并被持久化(不必apply),commit才返回 remote_write:wal日志被传到备库的内存中(不必持久化),commit才返回 local:wal日志被本地持久化,不管是否传到远程,commit返回 off:不必等wal日志被本地持久化,也不管是否传到远程,commit立即返回 所以,同步复制此参数的可选范围为on,remote_apply,remote_write # 备库 primary_conninfo:需要添加application_name参数
复制

5、检查备库及流复制情况

# 检查异步复制情况 postgres=# select pid,state,client_addr,sync_priority,sync_state from pg_stat_replication; pid | state | client_addr | sync_priority | sync_state -------+-----------+----------------+---------------+------------ 24952 | streaming | 192.168.56.103 | 0 | async # 查看备库落后主库多少字节的wal日志 postgres=# select pg_wal_lsn_diff(pg_current_wal_lsn(),replay_lsn) from pg_stat_replication; pg_wal_lsn_diff ----------------- 0 (1 row) # 检查同步复制情况 postgres=# select pid,state,client_addr,sync_priority,sync_state from pg_stat_replication; # 查看备库状态 postgres=# \x Expanded display is on. postgres=# select * from pg_stat_wal_receiver; -[ RECORD 1 ]---------+--------------------------------------------------------------------------------- pid | 13050 status | streaming receive_start_lsn | 7/D7000000 receive_start_tli | 3 written_lsn | 7/D70001C0 flushed_lsn | 7/D70001C0 received_tli | 3 last_msg_send_time | 2021-11-24 11:05:29.139785+08 last_msg_receipt_time | 2021-11-24 11:05:28.976063+08 latest_end_lsn | 7/D70001C0 latest_end_time | 2021-11-24 10:51:55.975756+08 slot_name | sender_host | 192.168.56.102 sender_port | 5432 conninfo | user=postgres password=******** channel_binding=disable dbname=replication host=192.168.56.102 port=5432 fallback_application_name=walreceiver sslmode=disable sslcompression=0 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any # 查看主备角色状态 select pg_is_in_recovery() 主库为false,备库为true
复制

6、hot standby查询冲突处理

主库上的一些操作会与hot standby备库上的查询产生冲突,导致冲突的原因有一下几个:

  • 主库上运行vacuum清理掉了备库上的查询需要的多版本数据
  • 主库上执行lock命令或各种DDL语句会在表上产生exclusive锁,而在备库上对这些表进行查询时会有冲突
  • 在主库上删除了一个表空间,而备库上的查询需要存放一些临时文件在此表空间
  • 在主库上删除了一个数据库,而备库上有很多session还连在该数据库上

当发生冲突时,解决的方法有一下几种:

  • 让备库上的应用wal日志的过程等待一段时间
  • 取消备库上正在执行的查询

相关参数:

  • max_standby_archive_delay:备库从wal归档中读取时的最大延迟,默认30s,如果为-1,则一直等待

  • max_standby_streaming_delay:备库从流复制中读取wal时的最大延迟,默认30上,如果为-1,则一直等待

    如果备库用作高可用,可以把上面参数设置小一些,如果备库用来执行一些大的查询,则相反。

​ 大多数冲突发生的原因是主库上把备库需要的多版本数据给清理掉了,这时可以通过设置hot_standby_feedback为true来缓解此问题,设置为true后,备库会通知主库,哪些多版本数据在备库上还需要,这样主库auto vacuum则不会清理这些数据,还有一个办法就是把vacuum_defer_cleanup_age设置大一些,延迟清理多版本数据。

​ 在备库上因为冲突而取消的SQL命令的数量可以在pg_stat_database_conflicts中查询,建议应用程序最好能检测到这个错误并重新执行。

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

冷狼
暂无图片
5月前
评论
暂无图片 0
同步复制的状态应该是sync,文章中同步复制的状态是async
5月前
暂无图片 点赞
评论
哈哈😄
暂无图片
2年前
评论
暂无图片 0
同步复制的状态应该是sync,文章中同步复制的状态是async
2年前
暂无图片 点赞
评论
柚子身上依
暂无图片
2年前
评论
暂无图片 0
2年前
暂无图片 点赞
评论
墨天轮福利君
暂无图片
3年前
评论
暂无图片 0
您好,您的文章已入选合格奖,10墨值奖励已经到账请查收! ❤️我们还会实时派发您的流量收益。
3年前
暂无图片 点赞
评论
目录
  • 1、主备搭建
    • 1.1 主库配置
    • 1.2 构建基础备份
    • 1.3 启动备库
    • 1.4 主备复制状态检查
  • 2、主备切换
    • 2.1 停止主备库
    • 2.2 源主库配置
    • 2.3 原备库配置
    • 2.4 启动
    • 2.5 主备状态检查
  • 3、故障切换
    • 3.1 模拟主库宕机
    • 3.2 提升备库为主库
    • 3.3 源主库修复后加入备库
    • 3.4 备库配置
    • 3.5 启动备库
    • 3.6 主备状态检查
  • 4、同步流复制
    • 4.1 架构
    • 4.2 同步复制的配置
  • 5、检查备库及流复制情况
  • 6、hot standby查询冲突处理