暂无图片
Pacemaker的PostgreSQL一主多从读
我来答
分享
李石岩
2020-03-12
Pacemaker的PostgreSQL一主多从读

作者搭建了一套postgresql集群,Pacemaker的PostgreSQL一主多从。在在做主节点网卡测试停启的过程中出现问题,作者node4为主节点,node5 node6 为从节点,node4在做了ifconfig ens33 down以后,发现node5为主节点了,node6也会首先down下然后启动,这种情况下,node4 ifconfig ens33 up 以后,对其进行cls_repair_by_pg_rewind 恢复,发现恢复不成功,出现以下问题
[root@node4 data]# cls_repair_by_pg_rewind
resource msPostgresql is NOT running
resource msPostgresql is NOT running
resource msPostgresql is NOT running
connected to server

could not fetch remote file “global/pg_control”: ERROR: permission denied for function pg_read_binary_file
Failure, exiting
failed to call “/usr/pgsql/bin/pg_rewind -D /data/pgsql/data --source-server=‘host=10.10.10.19’ port=5432 -P”

从问题上看,是权限除了问题,由于作者刚刚学习这个,不太明白怎么解决。请大神帮忙指导指导。

我来答
添加附件
收藏
分享
问题补充
6条回答
默认
最新
李石岩

以下是各节点的状态
[root@node4 data]# cls_status
resource msPostgresql is NOT running
Stack: corosync
Current DC: node5 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Thu Mar 12 12:40:34 2020
Last change: Thu Mar 12 12:39:54 2020 by root via cibadmin on node4

3 nodes configured
9 resources configured

Online: [ node4 node5 node6 ]

Full list of resources:

vip-master (ocf:💓IPaddr2): Started node5
vip-slave (ocf:💓IPaddr2): Started node6
Master/Slave Set: msPostgresql [pgsql]
Masters: [ node5 ]
Slaves: [ node6 ]
Stopped: [ node4 ]
lvsdr (ocf:💓lvsdr): Started node6
Clone Set: lvsdr-realsvr-clone [lvsdr-realsvr]
Started: [ node6 ]
Stopped: [ node4 node5 ]

Node Attributes:

  • Node node4:
    • master-pgsql : -INFINITY
    • pgsql-data-status : DISCONNECT
    • pgsql-status : STOP
  • Node node5:
    • master-pgsql : 1000
    • pgsql-data-status : LATEST
    • pgsql-master-baseline : 00000000050001E0
    • pgsql-status : PRI
  • Node node6:
    • pgsql-data-status : STREAMING|SYNC
    • pgsql-status : HS:sync

Migration Summary:

  • Node node6:
  • Node node5:
  • Node node4:
    pgsql: migration-threshold=3 fail-count=1000000 last-failure=‘Thu Mar 12 12:29:32 2020’

Failed Actions:

  • pgsql_start_0 on node4 ‘unknown error’ (1): call=55, status=complete, exitreason=‘The master’s timeline forked off current database system timeline 2 before latest checkpoint location 00000000050001E0, REPL_INF’,
    last-rc-change=‘Thu Mar 12 12:29:31 2020’, queued=0ms, exec=267ms

pgsql_REPL_INFO:node5|2|00000000050001E0

暂无图片 评论
暂无图片 有用 0
打赏 0
李石岩

经过测试,我将/data/pgsql/data目录给删除掉,然后在root用户下执行cls_rebuild_slave,node4进行恢复了。但是rewind这种方法仍然没有解决。

暂无图片 评论
暂无图片 有用 0
打赏 0
李石岩

[root@node5 pha4pgsql]# cls_repair_by_pg_rewind
resource msPostgresql is NOT running
resource msPostgresql is NOT running
resource msPostgresql is NOT running
connected to server

could not fetch remote file “global/pg_control”: ERROR: permission denied for function pg_read_binary_file
Failure, exiting
failed to call “/usr/pgsql/bin/pg_rewind -D /data/pgsql/data --source-server=‘host=10.10.10.19’ port=5432 -P”
[root@node5 pha4pgsql]# rm /data/pgsql/data/ -rf
[root@node5 pha4pgsql]# cls_
cls_cleanup cls_master cls_repair_by_pg_rewind cls_start cls_unmaintenance
cls_maintenance cls_online_switch cls_reset_master cls_status cls_unmaintenance_node
cls_maintenance_node cls_rebuild_slave cls_standby_node cls_stop cls_unstandby_node
[root@node5 pha4pgsql]# cls_rebuild_slave
resource msPostgresql is NOT running
resource msPostgresql is NOT running
resource msPostgresql is NOT running
24559/24559 kB (100%), 1/1 tablespace
pg_basebackup complete!
resource msPostgresql is NOT running
resource msPostgresql is NOT running
Cleaned up all resources on all nodes
Waiting for 1 replies from the CRMd. OK
wait for recovery complete

slave recovery of node5 successed
[root@node5 pha4pgsql]# su - postgres
上一次登录:四 3月 12 16:42:55 CST 2020pts/0 上
postgres@node5-> pg_controldata|grep cluster
Database cluster state: in archive recovery

暂无图片 评论
暂无图片 有用 0
打赏 0
李石岩

对主节点进行了killall postgres 发现并没有出现掉线的状态,在显示日志出现了报错。可能与配置文件中的migration-threshold=3这个参数设置有关,出现三次的话,就可能会切换了。
[root@node6 tmp]# cls_status
Stack: corosync
Current DC: node5 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Fri Mar 13 09:35:33 2020
Last change: Thu Mar 12 23:43:35 2020 by root via crm_attribute on node4

3 nodes configured
9 resources configured

Online: [ node4 node5 node6 ]

Full list of resources:

vip-master (ocf:💓IPaddr2): Started node4
vip-slave (ocf:💓IPaddr2): Started node5
Master/Slave Set: msPostgresql [pgsql]
Masters: [ node4 ]
Slaves: [ node5 node6 ]
lvsdr (ocf:💓lvsdr): Started node5
Clone Set: lvsdr-realsvr-clone [lvsdr-realsvr]
Started: [ node5 node6 ]
Stopped: [ node4 ]

Node Attributes:

  • Node node4:
    • master-pgsql : 1000
    • pgsql-data-status : LATEST
    • pgsql-master-baseline : 000000000B000258
    • pgsql-status : PRI
  • Node node5:
    • master-pgsql : 100
    • pgsql-data-status : STREAMING|SYNC
    • pgsql-status : HS:sync
  • Node node6:
    • master-pgsql : -INFINITY
    • pgsql-data-status : STREAMING|ASYNC
    • pgsql-status : HS:async

Migration Summary:

  • Node node6:
  • Node node5:
  • Node node4:
    pgsql: migration-threshold=3 fail-count=2 last-failure=‘Thu Mar 12 23:43:24 2020’

Failed Actions:

  • pgsql_monitor_3000 on node4 ‘not running’ (7): call=161, status=complete, exitreason=’’,
    last-rc-change=‘Thu Mar 12 23:43:24 2020’, queued=0ms, exec=0ms

pgsql_REPL_INFO:node4|4|0000000009034248
[root@node6 tmp]# pcs resource cleanup msPostgresql
Cleaned up pgsql:0 on node6
Cleaned up pgsql:1 on node5
Cleaned up pgsql:2 on node4
Waiting for 1 replies from the CRMd. OK
[root@node6 tmp]# cls_status
Stack: corosync
Current DC: node5 (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Fri Mar 13 09:37:05 2020
Last change: Fri Mar 13 09:37:02 2020 by hacluster via crmd on node4

3 nodes configured
9 resources configured

Online: [ node4 node5 node6 ]

Full list of resources:

vip-master (ocf:💓IPaddr2): Started node4
vip-slave (ocf:💓IPaddr2): Started node5
Master/Slave Set: msPostgresql [pgsql]
Masters: [ node4 ]
Slaves: [ node5 node6 ]
lvsdr (ocf:💓lvsdr): Started node5
Clone Set: lvsdr-realsvr-clone [lvsdr-realsvr]
Started: [ node5 node6 ]
Stopped: [ node4 ]

Node Attributes:

  • Node node4:
    • master-pgsql : 1000
    • pgsql-data-status : LATEST
    • pgsql-master-baseline : 000000000B000258
    • pgsql-status : PRI
  • Node node5:
    • master-pgsql : 100
    • pgsql-data-status : STREAMING|SYNC
    • pgsql-status : HS:sync
  • Node node6:
    • master-pgsql : -INFINITY
    • pgsql-data-status : STREAMING|ASYNC
    • pgsql-status : HS:async

Migration Summary:

  • Node node6:
  • Node node5:
  • Node node4:

pgsql_REPL_INFO:node4|4|0000000009034248
[root@node6 tmp]#

暂无图片 评论
暂无图片 有用 0
打赏 0
李石岩

跟pger朋友聊天,关闭主节点,主节点启动不起来,可以试着采用以下方法
pcs resource cleanup master-group
pcs resource cleanup msPostgresql
rm /var/lib/pgsql/tmp/PGSQL.lock

https://blog.csdn.net/joshua1830/article/details/51943999

暂无图片 评论
暂无图片 有用 0
打赏 0
手机用户6325

大佬,这种cls_*命令在哪查文档

暂无图片 评论
暂无图片 有用 0
打赏 0
李石岩
题主
2025-02-25
这个你找找吧。
回答交流
Markdown


请输入正文
提交
相关推荐
流复制备库归档问题
回答 1
级联复制一般都设置wal归档
postgreSQL 如何知道Range分区表的From和To的范围?
回答 2
可以参考下面的语句:SELECTc.oid::pgcatalog.regclass,pgcatalog.pggetexpr(c.relpartbound,c.oid)FROMpgcatalog.pgc
PG如何修改函数的名字呢?
回答 1
修改名字不就是新的了,就相当于重建一个
可以推荐一个开源的paid工具展示执行计划吗?
回答 1
不太懂~,能说的更详细一些嘛
pg中的analyze的疑问??
回答 1
1、从我自己的测试结果看,analyze没法只收集索引的统计信息。2、表和索引的一起收集pg的analyze和oracle的analyze不一样,跟mysql的比较像,功能没有那么丰富,就是表和索引一
PostgreSQL登陆时出现提示:“invalid suffix config” “invalid user name when get shadow information”
回答 4
暂无文字回复
哪里有postgreSQL与其他几款数据库的性能对比报告吗?
回答 2
性能对比报告可以在官网或者各大技术社区上找到,比如GitHub、StackOverflow、HackerNews等社区来进行搜索,也可以在Google上搜索查看。例如,PostgreSQL与MySQL
各位大佬,有在linux下,用pyinstaller打包过. py文件吗
回答 1
是有过,主要是用于爬虫项目,要是包含其他exe文件,需要自己放进去。网上也有很多可以参考的,自己要试试才可以
vacuum freeze是否受未提交事务的影响?
回答 1
不受未提交事务的影响
怎么用一个数学指标来描述postgreSQL数据库中的表,哪个字段适合作为索引?
回答 1
不考虑业务场景,那一定是区分度。非常直观的一个数学指标