
Ceph PG State: Peered

Original | Oracle | 2023-02-03

3.2 Peered
3.2.1 Description
Peering has completed, but the PG's current Acting Set is smaller than the minimum number of replicas (min_size) required by the pool.
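
The rule above can be sketched as a toy shell function. This is only an illustrative model, not Ceph's actual state machine: the function name pg_state is hypothetical, recovery and backfill states are ignored, and the pool size of 3 is inferred from the "26/39 objects degraded" figure in section 3.2.2.

```shell
# Toy model of the rule described above; NOT Ceph's real state machine.
# pg_state ACTING SIZE MIN_SIZE
#   ACTING   - number of OSDs currently in the PG's acting set
#   SIZE     - replica count configured on the pool
#   MIN_SIZE - replicas the pool requires before it serves client I/O
pg_state() {
    acting="$1"; size="$2"; min_size="$3"
    if [ "$acting" -ge "$size" ]; then
        echo "active+clean"
    elif [ "$acting" -ge "$min_size" ]; then
        echo "active+undersized+degraded"
    else
        echo "undersized+degraded+peered"
    fi
}

pg_state 1 3 2   # one replica left, min_size=2: prints undersized+degraded+peered
pg_state 1 3 1   # one replica left, min_size=1: prints active+undersized+degraded
```

The two calls correspond to the situation before and after min_size is lowered in the failure simulation.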
3.2.2 Failure simulation
a. Stop two replicas, osd.1 and osd.0

$ systemctl stop ceph-osd@1
$ systemctl stop ceph-osd@0
b. Check the cluster health status

$ bin/ceph health detail
HEALTH_WARN 1 osds down; Reduced data availability: 4 pgs inactive; Degraded data redundancy: 26/39 objects degraded (66.667%), 20 pgs unclean, 20 pgs degraded; application not enabled on 1 pool(s)
OSD_DOWN 1 osds down
osd.0 (root=default,host=ceph-xx-cc00) is down
PG_AVAILABILITY Reduced data availability: 4 pgs inactive
pg 1.6 is stuck inactive for 516.741081, current state undersized+degraded+peered, last acting [2]
pg 1.10 is stuck inactive for 516.737888, current state undersized+degraded+peered, last acting [2]
pg 1.11 is stuck inactive for 516.737408, current state undersized+degraded+peered, last acting [2]
pg 1.12 is stuck inactive for 516.736955, current state undersized+degraded+peered, last acting [2]
PG_DEGRADED Degraded data redundancy: 26/39 objects degraded (66.667%), 20 pgs unclean, 20 pgs degraded
pg 1.0 is undersized+degraded+peered, acting [2]
pg 1.1 is undersized+degraded+peered, acting [2]
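
To pull just the affected PG IDs out of output like the above, a small filter can help. This is a minimal sketch: the helper name peered_pgs is hypothetical, and it assumes the "pg <id> is ..." line layout shown in this transcript, which may differ between Ceph releases.

```shell
# Hypothetical helper: read `ceph health detail` text on stdin and print the
# IDs of PGs whose reported state includes "peered" (one per line, de-duplicated).
# Assumes lines shaped like: "pg 1.6 is ... undersized+degraded+peered, ..."
peered_pgs() {
    awk '$1 == "pg" && /peered/ { print $2 }' | sort -u
}

# Usage against a live cluster (not run here):
#   bin/ceph health detail | peered_pgs
```

Lines for PGs in the active+undersized+degraded state do not contain "peered" and are skipped.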
c. Client I/O (hangs)

# Read the object into a file; the I/O hangs
$ bin/rados -p test_pool get myobject ceph.conf.old
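Failure summary:

The PG now survives only on osd.2, and it has picked up an extra state: peered. The English verb "peer" means to look closely; here it is better read as negotiation, meaning the OSDs holding the PG have agreed on its state.
A read issued at this point hangs indefinitely. The reason is that the pool is configured with min_size=2: when fewer than 2 replicas are alive (here only 1), the PG does not serve client I/O.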
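
Because the read blocks rather than failing, it can be convenient to bound it with a deadline when testing. The wrapper below is a sketch that assumes GNU coreutils `timeout` is installed; the name try_io is hypothetical.

```shell
# Hypothetical wrapper: run a command with a time limit and report when it
# was cut off. GNU `timeout` exits with status 124 if the limit is reached.
# try_io SECONDS COMMAND [ARGS...]
try_io() {
    limit="$1"; shift
    timeout "$limit" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "blocked: no reply within ${limit}s" >&2
    fi
    return "$rc"
}

# Usage against the pool from this walkthrough (not run here):
#   try_io 10 bin/rados -p test_pool get myobject ceph.conf.old
```

A return status of 124 then suggests the request was still waiting when the limit expired, which is what the Peered state would produce.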
d. Setting min_size=1 resolves the hung I/O

# Set min_size = 1
$ bin/ceph osd pool set test_pool min_size 1
set pool 1 min_size to 1
e. Check the cluster health status

$ bin/ceph health detail
HEALTH_WARN 1 osds down; Degraded data redundancy: 26/39 objects degraded (66.667%), 20 pgs unclean, 20 pgs degraded, 20 pgs undersized; application not enabled on 1 pool(s)
OSD_DOWN 1 osds down
osd.0 (root=default,host=ceph-xx-cc00) is down
PG_DEGRADED Degraded data redundancy: 26/39 objects degraded (66.667%), 20 pgs unclean, 20 pgs degraded, 20 pgs undersized
pg 1.0 is stuck undersized for 65.958983, current state active+undersized+degraded, last acting [2]
pg 1.1 is stuck undersized for 65.960092, current state active+undersized+degraded, last acting [2]
pg 1.2 is stuck undersized for 65.960974, current state active+undersized+degraded, last acting [2]
f. Client I/O

# Files produced by reading the object
$ ll -lh ceph.conf*
-rw-r--r-- 1 root root 6.1K Jun 25 14:01 ceph.conf
-rw-r--r-- 1 root root 6.1K Jul 3 20:11 ceph.conf.old
-rw-r--r-- 1 root root 6.1K Jul 3 20:11 ceph.conf.old.1
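Failure summary:

The Peered state is gone from the PGs, and client reads and writes work normally again.
With min_size=1, the pool serves client I/O as long as a single replica is alive.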
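
With min_size=1, writes are accepted while only one copy exists, so this setting is usually treated as a temporary measure. Once the stopped OSDs are back, the original value can be restored. The helper below is a sketch: restore_min_size and the CEPH variable are hypothetical names, and it assumes `ceph osd pool get <pool> min_size` prints a line of the form "min_size: N".

```shell
# Hypothetical helper: set a pool's min_size back to the wanted value,
# but only if it currently differs.
# restore_min_size POOL WANTED
# CEPH may point at the ceph binary; it defaults to bin/ceph as used above.
restore_min_size() {
    pool="$1"; want="$2"
    ceph_bin="${CEPH:-bin/ceph}"
    # Assumed output format: "min_size: 1"
    current="$("$ceph_bin" osd pool get "$pool" min_size | awk '{ print $2 }')"
    if [ "$current" != "$want" ]; then
        "$ceph_bin" osd pool set "$pool" min_size "$want"
    fi
}

# Usage (not run here):
#   restore_min_size test_pool 2
```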
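3.2.3 Summary
The Peered state can be understood as the PG waiting for its other replicas to come back online.
With min_size = 2, the Peered state clears once at least two replicas are alive.
A PG in the Peered state cannot serve client requests, and its I/O is suspended.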



 

