PostgreSQL部署|基于Stream复制的手工主从切换

DigOps 2022-04-17

979

微信号：digops

加关注哦

延续上一篇写到【PostgreSQL部署|基于Stream复制的高可用部署】搭建的一主两从的环境，继续完成如何手工切换。本文的方法适用PostgreSQL 12及以上的版本，12以下的版本略有差异。

复制技术：PostgreSQL Stream
主从架构：一主两从
数据库版本：13.6
操作系统：Centos 7.6

IP地址	操作系统	数据库版本	角色	说明
192.168.56.180	Centos 7.6	13.6	主	初始角色
192.168.56.181	Centos 7.6	13.6	从	提升为主
192.168.56.182	Centos 7.6	13.6	从	初始角色

本文以实验方式进行验证，涉及三种切换方式：

方式一，主备均正常停止（先主后备），数据停止后是保证一致，且数据库不会进行recover，然后进行主备切换。
方式二，主备数据可能不一致情况，通过pg_basebackup工具做重新同步，这种方式相当于重做复制环境。
方式三，主备数据可能不一致情况，通过pg_rewind工具做重新同步，相较于pg_basebackup，只比对不一致数据。

三种方式的切换流程基本上一致的。只是在重建主从关系时，因方法不同，而实施的过程有些变化。

需要提前了解的信息

1. 主备切换流程的一般步骤

停主库
将其中一台备库提升为主库
在源主库配置主从关系（源主上的PGDATA下创建standby.signal
文件，并修改postgres.conf
及postgres.auto.conf
中的复制信息）
在源主上通过pg_basebackup
或pg_rewind
将数据拉至一致。
启动源主数据库，此时已变为新从节点。
按照3-5步骤，将另一台备节点变成新主的被节点。

2. 如何确认PostgreSQL的节点是主还是从？

通过pg_controldata
可以查Database cluster state的值：

in archive recovery
：为从节点
in production
：为主节点

language-shell

[root@opengaussdb2 opt]# pg_controldata -D /data/pgdata|grep 'Database cluster state'
Database cluster state:               in archive recovery
[root@opengaussdb1 opt]# pg_controldata -D /data/pgdata|grep 'Database cluster state'
Database cluster state:               in production

如果确认了主节点，想看其有哪些从节点，可以通过视图pg_stat_replication
来查看，用psql工具让每一行数据可以更好的观看，用\x
进行转化。

language-sql

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 2413
usesysid         | 16384
usename          | repusr
application_name | pg_digops
client_addr      | 192.168.56.181
client_hostname  | 
client_port      | 60830
backend_start    | 2022-04-13 16:09:28.102455+08
backend_xmin     | 
state            | streaming
sent_lsn         | 0/1501BAA8
write_lsn        | 0/1501BAA8
flush_lsn        | 0/1501BAA8
replay_lsn       | 0/1501BAA8
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 0
sync_state       | async
reply_time       | 2022-04-13 16:59:52.208028+08
-[ RECORD 2 ]----+------------------------------
pid              | 2622
usesysid         | 16384
usename          | repusr
application_name | pg_digops03
client_addr      | 192.168.56.182
client_hostname  | 
client_port      | 33194
backend_start    | 2022-04-13 16:14:42.575754+08
backend_xmin     | 
state            | streaming
sent_lsn         | 0/1501BAA8
write_lsn        | 0/1501BAA8
flush_lsn        | 0/1501BAA8
replay_lsn       | 0/1501BAA8
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 0
sync_state       | async
reply_time       | 2022-04-13 16:59:52.527587+08

方式一

1. 停主库

pg_ctl
命令用于启动和关闭数据库，-D
参数要指定PGDATA目录。如果在环境变量中已设置$PGDATA，可不用在此指定。

language-shell

-bash-4.2$ pg_ctl stop -D /data/pgdata
waiting for server to shut down....2022-04-13 17:03:09.053 CST [2401] LOG:  received fast shutdown request
2022-04-13 17:03:09.067 CST [2401] LOG:  aborting any active transactions
2022-04-13 17:03:09.067 CST [2611] FATAL:  terminating connection due to administrator command
2022-04-13 17:03:09.070 CST [2401] LOG:  background worker "logical replication launcher" (PID 2409) exited with exit code 1
2022-04-13 17:03:09.070 CST [2403] LOG:  shutting down
2022-04-13 17:03:09.712 CST [4195] FATAL:  the database system is shutting down
2022-04-13 17:03:09.744 CST [2401] LOG:  database system is shut down
 done
server stopped

2. 将其中一台备库提升为主库

可通过pg_ctl
的promote
选项或者数据库中pg_promote()
函数来提升

pg_ctl
的promote
选项

language-sql

-bash-4.2$ pg_ctl promote -D /data/pgdata
waiting for server to promote.... done
server promoted

pg_promote()
函数

language-sql

postgres=# select pg_promote(true,10);
2022-04-13 17:46:22.631 CST [28596] LOG:  received promote request
2022-04-13 17:46:22.631 CST [31100] FATAL:  terminating walreceiver process due to administrator command
2022-04-13 17:46:22.632 CST [28596] LOG:  invalid record length at 0/16000BC0: wanted 24, got 0
2022-04-13 17:46:22.632 CST [28596] LOG:  redo done at 0/16000B88
2022-04-13 17:46:22.632 CST [28596] LOG:  last completed transaction was at log time 2022-04-13 16:15:22.115665+08
2022-04-13 17:46:22.637 CST [28596] LOG:  selected new timeline ID: 8
2022-04-13 17:46:22.680 CST [28596] LOG:  archive recovery complete
2022-04-13 17:46:22.700 CST [28595] LOG:  database system is ready to accept connections
 pg_promote 
------------
 t
(1 row)

数字10单位是秒，表示等待10秒钟反馈切换结果，默认值是60。

日志上的体现

在停掉主库时，一致回现实无法连接主库，当执行了提升后，打印了新的信息表示已经可以连接访问了。

language-shell

2022-04-13 17:06:44.991 CST [23674] FATAL:  could not connect to the primary server: could not connect to server: Connection refused
                Is the server running on host "192.168.56.180" and accepting
                TCP/IP connections on port 5432?
2022-04-13 17:06:50.000 CST [23675] FATAL:  could not connect to the primary server: could not connect to server: Connection refused
                Is the server running on host "192.168.56.180" and accepting
                TCP/IP connections on port 5432?
2022-04-13 17:06:50.455 CST [21326] LOG:  received promote request
2022-04-13 17:06:50.455 CST [21326] LOG:  redo done at 0/16000028
2022-04-13 17:06:50.456 CST [21326] LOG:  last completed transaction was at log time 2022-04-13 16:15:22.115665+08
2022-04-13 17:06:50.469 CST [21326] LOG:  selected new timeline ID: 7
2022-04-13 17:06:50.515 CST [21326] LOG:  archive recovery complete
2022-04-13 17:06:50.534 CST [21325] LOG:  database system is ready to accept connections

pg_controldata
的Database cluster state: 由in archive recovery
转变为in production

3. 在源主库配置主从关系

包括源主上的PGDATA下创建standby.signal
文件，并修改postgres.conf
及postgres.auto.conf
中的复制信息

本节内容均用postgres用户操作

3.1 postgres.conf
修改

language-shell

# recovery.conf
primary_conninfo = 'user=repusr passfile=/var/lib/pgsql/pgpass host=192.168.56.181 port=5432 sslmode=prefer application_name=pg_digops01 gssencmode=prefer channel_binding=prefer'
primary_slot_name = 'pg_digops01'

user：复制用户
passfile：如果有密码文件可直接指定，如果没有可使用password项来代替passfile，如password=repusr.
language-shell
```
# /var/lib/pgsql/pgpass
192.168.56.181:5432:*:repusr:repusr
```
host：主节点IP地址
application_name：本机主机名
primary_slot_name：在主节点上的事务槽名称，可随便起名。

3.2 postgres.auto.conf
修改

没有内容，则不用动，如果因之前通过别的方式在里面写入了primary_conninfo
，删除内容即可。

3.3 PGDATA下创建standby.signal

language-shell

cd $PGDATA
touch standby.signal

4. 启动源主数据库，此时已变为新从节点。

language-shell

pg_ctl start -D /data/pgdata

这时启动数据库后，会一直打印FATAL日志，找不到复制槽。处理方式是，登录新主库，创建复制槽。

FATAL日志

language-shell

2022-04-13 17:27:48.460 CST [4977] FATAL:  could not start WAL streaming: ERROR:  replication slot "pg_digops01" does not exist
2022-04-13 17:27:53.484 CST [4979] FATAL:  could not start WAL streaming: ERROR:  replication slot "pg_digops01" does not exist

新主上创建复制槽

language-sql

-- 创建
postgres=#  SELECT * FROM pg_create_physical_replication_slot('pg_digops01');
  slot_name  | lsn 
-------------+-----
 pg_digops01 | 
(1 row)
 -- 查看
postgres=#  SELECT slot_name, slot_type, active FROM pg_replication_slots;
  slot_name  | slot_type | active 
-------------+-----------+--------
 pg_digops01 | physical  | t
(1 row)

5. 验证主备关系

主库查看pg_stat_replication
，这时，已经看到源主已经变成从节点加入到新主集群。按照3-4步骤，将另一个节点也加入到新复制集群。

language-sql

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 24539
usesysid         | 16384
usename          | repusr
application_name | pg_digops01
client_addr      | 192.168.56.180
client_hostname  | 
client_port      | 33072
backend_start    | 2022-04-13 17:32:33.727991+08
backend_xmin     | 
state            | streaming
sent_lsn         | 0/16000BC0
write_lsn        | 0/16000BC0
flush_lsn        | 0/16000BC0
replay_lsn       | 0/16000BC0
write_lag        | 
flush_lag        | 
replay_lag       | 
sync_priority    | 0
sync_state       | async
reply_time       | 2022-04-13 17:35:54.243317+08

方式二

1. 停主库

（略，参考方式一）

2. 将其中一台备库提升为主库

（略，参考方式一）

3. 在源主库配置主从关系

这一步没有方式一那么麻烦，只需执行pg_basebackup命令，就可以完成postgres.conf
、postgres.auto.conf
、standby.signal
的创建和修改。

language-shell

-bash-4.2$ pg_basebackup -h 192.168.56.182 -U repusr -D /data/pgdata -X stream  -P -R -S pg_digops01
Password: 
24371/24371 kB (100%), 1/1 tablespace

这里需要注意，-S
参数定义的复制槽，复制槽一定要先在新主库中创建好了的名字才可以。否则会报如下错误。

language-shell

-bash-4.2$ pg_basebacku p -h 192.168.56.182 -U repusr -D /data/pgdata -X stream  -P -R -S pg_digops1
Password: 
pg_basebackup: error: could not send replication command "START_REPLICATION": ERROR:  replication slot "pg_digops1" does not exist
24371/24371 kB (100%), 1/1 tablespace
pg_basebackup: error: child process exited with exit code 1
pg_basebackup: removing data directory "/data/pgdata"

4. 启动源主数据库，此时已变为新从节点。

language-shell

-bash-4.2$ pg_ctl start -D /data/pgdata
waiting for server to start....2022-04-13 18:05:45.140 CST [6267] LOG:  starting PostgreSQL 13.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2022-04-13 18:05:45.140 CST [6267] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2022-04-13 18:05:45.145 CST [6267] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-04-13 18:05:45.152 CST [6268] LOG:  database system was interrupted; last known up at 2022-04-13 18:05:13 CST
2022-04-13 18:05:45.603 CST [6268] LOG:  entering standby mode
2022-04-13 18:05:45.609 CST [6268] LOG:  redo starts at 0/1D000028
2022-04-13 18:05:45.614 CST [6268] LOG:  consistent recovery state reached at 0/1D000100
2022-04-13 18:05:45.614 CST [6267] LOG:  database system is ready to accept read only connections
2022-04-13 18:05:45.627 CST [6272] LOG:  started streaming WAL from primary at 0/1E000000 on timeline 8
 done

5. 验证主备关系

(略，参考方式一)

方式三

1. 停主库

（略，参考方式一）

2. 将其中一台备库提升为主库

提升

language-sql

postgres=# select pg_promote(true,10);
2022-04-13 19:57:49.946 CST [6268] LOG:  received promote request
2022-04-13 19:57:49.946 CST [6272] FATAL:  terminating walreceiver process due to administrator command
2022-04-13 19:57:49.952 CST [6268] LOG:  invalid record length at 0/1E000148: wanted 24, got 0
2022-04-13 19:57:49.952 CST [6268] LOG:  redo done at 0/1E000110
2022-04-13 19:57:49.983 CST [6268] LOG:  selected new timeline ID: 9
2022-04-13 19:57:50.043 CST [6268] LOG:  archive recovery complete
2022-04-13 19:57:50.059 CST [6267] LOG:  database system is ready to accept connections
 pg_promote 
------------
 t
(1 row)

在提升为主库的数据库上创建复制槽

language-sql

SELECT * FROM pg_create_physical_replication_slot('pg_digops03');
SELECT * FROM pg_create_physical_replication_slot('pg_digops02');
postgres=# SELECT slot_name, slot_type, active FROM pg_replication_slots;
  slot_name  | slot_type | active 
-------------+-----------+--------
 pg_digops03 | physical  | f
 pg_digops02 | physical  | f
(2 rows)

3. 在源主库配置主从关系

这一步没有方式一那么麻烦，只需执行pg_basebackup命令，就可以完成postgres.conf
、postgres.auto.conf
、standby.signal
的创建和修改。

language-shell

-bash-4.2$  pg_rewind --target-pgdata /data/pgdata --source-server='host=192.168.56.180 port=5432 user=postgres dbname=postgres password=123456' -P -R
pg_rewind: connected to server
pg_rewind: servers diverged at WAL location 0/1E000148 on timeline 8
pg_rewind: rewinding from last common checkpoint at 0/1E000098 on timeline 8
pg_rewind: reading source file list
pg_rewind: reading target file list
pg_rewind: reading WAL in target
pg_rewind: need to copy 51 MB (total source directory size is 71 MB)
52937/52937 kB (100%) copied
pg_rewind: creating backup label and updating control file
pg_rewind: syncing target data directory
pg_rewind: Done!
-bash-4.2$

可以看到源目录大小是71MB，需要拷贝51MB。

postgres.conf
中增加如下内容

language-shell

primary_conninfo = 'user=repusr password=repusr host=192.168.56.180 port=5432 sslmode=prefer application_name=pg_digops03 gssencmode=prefer channel_binding=prefer'
primary_slot_name = 'pg_digops03'

postgresql.auto.conf中删除如下内容

language-shell

primary_conninfo = 'xxx'
primary_slot_name = 'xxx'

4. 启动源主数据库，此时已变为新从节点。

language-shell

-bash-4.2$ pg_ctl start -D /data/pgdata
waiting for server to start....2022-04-13 20:23:17.297 CST [30055] LOG:  starting PostgreSQL 13.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit
2022-04-13 20:23:17.297 CST [30055] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2022-04-13 20:23:17.306 CST [30055] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-04-13 20:23:17.313 CST [30056] LOG:  database system was interrupted while in recovery at log time 2022-04-13 20:07:50 CST
2022-04-13 20:23:17.313 CST [30056] HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2022-04-13 20:23:17.740 CST [30056] LOG:  entering standby mode
2022-04-13 20:23:17.745 CST [30056] LOG:  redo starts at 0/1E000060
2022-04-13 20:23:17.749 CST [30056] LOG:  consistent recovery state reached at 0/1E0301C0
2022-04-13 20:23:17.749 CST [30056] LOG:  invalid record length at 0/1E0301C0: wanted 24, got 0
2022-04-13 20:23:17.750 CST [30055] LOG:  database system is ready to accept read only connections
2022-04-13 20:23:17.760 CST [30060] LOG:  started streaming WAL from primary at 0/1E000000 on timeline 9
 done
server started

5. 验证主备关系

(略，参考方式一)

6. 常见问题

6.1 pg_rewind后，日志仍缺失

language-shell

-bash-4.2$ pg_rewind --target-pgdata /data/pgdata --source-server='host=192.168.56.180 port=5432 user=postgres dbname=postgres password=123456' -P -R
pg_rewind: connected to server
pg_rewind: source and target cluster are on the same timeline
pg_rewind: no rewind required

缺日志报错

language-shell

2022-04-13 20:26:44.706 CST [4452] LOG:  started streaming WAL from primary at 0/21000000 on timeline 9
2022-04-13 20:26:44.706 CST [4452] FATAL:  could not receive data from WAL stream: ERROR:  requested starting point 0/21000000 is ahead of the WAL flush position of this server 0/1E030F08

这种情况只能通过pg_basebackup再重新初始化同步解决了。

6.2 从库未生成standby.signal
无法建立主从关系

未生成standby.signal
文件，启动从库时仍然是可访问状态，不是read only，且无法建立主从关系。

postgresql

文章转载自DigOps，如果涉嫌侵权，请发送邮件至：contact@modb.pro进行举报，并提供相关证据，一经查实，墨天轮将立刻删除相关内容。

PostgreSQL部署|基于Stream复制的手工主从切换

需要提前了解的信息

1. 主备切换流程的一般步骤

2. 如何确认PostgreSQL的节点是主还是从？

方式一

1. 停主库

2. 将其中一台备库提升为主库

3. 在源主库配置主从关系

4. 启动源主数据库，此时已变为新从节点。

5. 验证主备关系

方式二

1. 停主库

2. 将其中一台备库提升为主库

3. 在源主库配置主从关系

4. 启动源主数据库，此时已变为新从节点。

5. 验证主备关系

方式三

1. 停主库

2. 将其中一台备库提升为主库

3. 在源主库配置主从关系

4. 启动源主数据库，此时已变为新从节点。

5. 验证主备关系

6. 常见问题

评论