[postgresql] repmgr集群手动切换报错 "data_directory" parameter in repmgr.conf on "XXX" is incorrectly

原创手机用户0512 2023-05-26

291

尝试postgrsql数据库repmgr集群的搭建部署后，查看集群状态正常：


[pg@node2 conf]$ repmgr cluster show -f /home/pg/conf/repmgr.conf
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                           
----+-------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
 1  | node1 | primary | * running |          | default  | 100      | 1        | host=192.168.5.101 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | standby |   running | node1    | default  | 100      | 1        | host=192.168.5.102 user=repmgr dbname=repmgr connect_timeout=2


复制

尝试在备库机器上执行手动切换：

[pg@node2 conf]$ repmgr standby switchover -U repmgr --verbose -f /home/pg/conf/repmgr.conf
NOTICE: using provided configuration file "/home/pg/conf/repmgr.conf"
WARNING: following problems with command line parameters detected:
  database connection parameters not required when executing STANDBY SWITCHOVER
NOTICE: executing switchover on node "node2" (ID: 2)
INFO: searching for primary node
INFO: checking if node 1 is primary
INFO: current primary node is 1
INFO: SSH connection to host "192.168.5.101" succeeded
ERROR: "data_directory" parameter in repmgr.conf on "node1" is incorrectly configured
HINT: execute "repmgr node check --data-directory-config" on "node1" to diagnose the issue



复制

按报错提示在node1执行检查，配置正确：
repmgr node check --data-directory-config -f /home/pg/conf/repmgr.conf

[pg@node1 ~]$ repmgr node check --data-directory-config -f /home/pg/conf/repmgr.conf
OK (configured "data_directory" is "/home/pg/data")
复制

联想使用pg，多版本共存，一个方法是登录后指定不同环境变量，如source 11.sh 14.sh
cat 11.sh
cat 14.sh

[pg@node1 ~]$ cat 11.sh
export PGPORT=9999
export PGDATA=/home/pg/data
export PGHOME=/home/pg/soft
export MANPATH=$PGHOME/share/man:$MANPATH
export PATH=$PGHOME/bin:$PATH

复制

[pg@node1 ~]$ cat 14.sh
export PGPORT=1414
export PGDATA=/home/pg/data
export PGHOME=/home/pg/soft
export MANPATH=$PGHOME/share/man:$MANPATH
export PATH=$PGHOME/bin:$PATH

复制

又可能出现环境变量使用错误的情况，或是脚本无法识别用户环境变量。
(.bash_profile 需要用户登录后才有效原理一样)

脚本可以加入声明export，或修改/etc/bashrc
或用户目录下.bashrc 。区别是/etc/bashrc对所以用户生效。还是用户目录下.bashrc合适。

tail .bashrc

[pg@node1 ~]$ tail .bashrc
# export SYSTEMD_PAGER=

# User specific aliases and functions


export PGPORT=5666
export PGDATA=/home/pg/data
export PGHOME=/home/pg/soft
export MANPATH=$PGHOME/share/man:$MANPATH
export PATH=$PGHOME/bin:$PATH

[pg@node1 ~]$ source .bashrc
[pg@node1 ~]$ 
复制

再次执行切换：

[pg@node2 ~]$ repmgr standby switchover -U repmgr --verbose -f /home/pg/conf/repmgr.conf
NOTICE: using provided configuration file "/home/pg/conf/repmgr.conf"
WARNING: following problems with command line parameters detected:
  database connection parameters not required when executing STANDBY SWITCHOVER
NOTICE: executing switchover on node "node1" (ID: 1)
INFO: searching for primary node
INFO: checking if node 2 is primary
INFO: current primary node is 2
INFO: SSH connection to host "192.168.5.102" succeeded
INFO: 0 pending archive files
INFO: replication lag on this standby is 0 seconds
NOTICE: local node "node1" (ID: 1) will be promoted to primary; current primary "node2" (ID: 2) will be demoted to standby
NOTICE: stopping current primary node "node2" (ID: 2)
NOTICE: issuing CHECKPOINT on node "node2" (ID: 2)
DETAIL: executing server command "/home/pg/soft/bin/pg_ctl  -D '/home/pg/data' -W -m fast stop"
INFO: checking for primary shutdown; 1 of 100 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 2 of 100 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 0/5000028
NOTICE: promoting standby to primary
DETAIL: promoting server "node1" (ID: 1) using pg_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
INFO: standby promoted to primary after 1 second(s)
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node1" (ID: 1) was successfully promoted to primary
INFO: local node 2 can attach to rejoin target node 1
DETAIL: local node's recovery point: 0/5000028; rejoin target node's fork point: 0/50000A0
NOTICE: setting node 2's upstream to node 1
WARNING: unable to ping "host=192.168.5.102 user=repmgr dbname=repmgr connect_timeout=2"
DETAIL: PQping() returned "PQPING_NO_RESPONSE"
NOTICE: starting server using "/home/pg/soft/bin/pg_ctl  -w -D '/home/pg/data' start"
NOTICE: NODE REJOIN successful
DETAIL: node 2 is now attached to node 1
INFO: node "node2" (ID: 2) is pingable
INFO: node "node2" (ID: 2) has attached to its upstream node
NOTICE: node  "node1" (ID: 1) promoted to primary, node "node2" (ID: 2) demoted to standby
NOTICE: switchover was successful
DETAIL: node "node1" is now primary and node "node2" is attached as standby
NOTICE: STANDBY SWITCHOVER has completed successfully

复制

确认切换成功：

[pg@node2 ~]$ repmgr cluster show -f /home/pg/conf/repmgr.conf
 ID | Name  | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                           
----+-------+---------+-----------+----------+----------+----------+----------+----------------------------------------------------------------
 1  | node1 | standby |   running | node2    | default  | 100      | 1        | host=192.168.5.101 user=repmgr dbname=repmgr connect_timeout=2
 2  | node2 | primary | * running |          | default  | 100      | 2        | host=192.168.5.102 user=repmgr dbname=repmgr connect_timeout=2

复制

结论：
repmgr集群手动切换报错 “data_directory” parameter in repmgr.conf on “XXX” is incorrectly

请尝试配置~/.bashrc /etc/bashrc
添加环境变量，便于repmgr远程连接节点识别。

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

[postgresql] repmgr集群手动切换报错 "data_directory" parameter in repmgr.conf on "XXX" is incorrectly

评论

相关阅读