
A misconfigured GPCC metrics_collector parameter causes Greenplum startup to fail

DB宝 2023-01-28

Symptom

[gpadmin@mdw1 ~]$ gpstart -a
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[INFO]:-Starting gpstart with args: -a
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[INFO]:-Gathering information and validating the environment...
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 6.19.1 build commit:0e314744a460630073b46cea7b7cf20a81e3da63 Open Source'
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[INFO]:-Greenplum Catalog Version: '301908232'
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[INFO]:-Starting Master instance in admin mode
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[CRITICAL]:-Failed to start Master instance in admin mode
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[CRITICAL]:-Error occurred: non-zero rc: 1
 Command was: 'env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /data/gpdb/master/gpseg-1/ -l /data/gpdb/master/gpseg-1//pg_log/startup.log -w -t 600 -o " -p 5432 -c gp_role=utility " start'
rc=1, stdout='waiting for server to start.... stopped waiting
', stderr='pg_ctl: could not start server
Examine the log output.
'
[gpadmin@mdw1 ~]$ tailf /data/gpdb/master/gpseg-1//pg_log/startup.log
2023-01-16 12:58:59.464993 CST,,,p8992,th834783360,,,,0,,,seg-1,,,,,"LOG","00000","registering background worker ""sweeper process""",,,,,,,,"RegisterBackgroundWorker","bgworker.c",774,
2023-01-16 12:58:59.465304 CST,,,p8992,th834783360,,,,0,,,seg-1,,,,,"FATAL","58P01","could not access file ""metrics_collector"": No such file or directory",,,,,,,,"internal_load_library","dfmgr.c",202,1    0xbef3fc postgres errstart (elog.c:557)
2    0xbf456d postgres <symbol not found> (dfmgr.c:199)
3    0xbf4f54 postgres load_file (dfmgr.c:156)
4    0xc083a4 postgres process_shared_preload_libraries (miscinit.c:1378)
5    0xa0d6e3 postgres PostmasterMain (postmaster.c:1151)
6    0x6b0871 postgres main (main.c:205)
7    0x7f522e7ed3d5 libc.so.6 __libc_start_main + 0xf5
8    0x6bc58c postgres <symbol not found> + 0x6bc58c


Analysis

The startup log entry "FATAL","58P01","could not access file ""metrics_collector"": No such file or directory" (raised from internal_load_library in dfmgr.c, called by process_shared_preload_libraries) shows that the problem lies with metrics_collector. This is the value of shared_preload_libraries in the parameter file postgresql.conf, and it enables GPCC metrics monitoring.

The error most likely arose because the GPCC installation went wrong and the database was then restarted.

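If GPCC was installed successfully, the library files are present in the locations below. If they are missing, do not restart Greenplum casually, because it will fail to start: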

[root@lhrgp40 /]# find /usr/local -name metrics_collector*
/usr/local/greenplum-db-6.19.3/share/postgresql/extension/metrics_collector--1.0.sql
/usr/local/greenplum-db-6.19.3/share/postgresql/extension/metrics_collector.control
/usr/local/greenplum-db-6.19.3/lib/postgresql/metrics_collector.so
[root@lhrgp40 /]#
[gpadmin@lhrgp40 ~]$ ll $GPHOME/share/postgresql/extension/gp_wlm*
-rw-r--r-- 1 gpadmin gpadmin 856 Dec  6 12:27 /usr/local/greenplum-db-6.19.3/share/postgresql/extension/gp_wlm--0.1.sql
-rw-r--r-- 1 gpadmin gpadmin 232 Dec  6 12:27 /usr/local/greenplum-db-6.19.3/share/postgresql/extension/gp_wlm.control
[gpadmin@lhrgp40 ~]$ ll $GPHOME/share/postgresql/extension/metrics_collector*
-rw-r--r-- 1 gpadmin gpadmin 846 Dec  6 12:27 /usr/local/greenplum-db-6.19.3/share/postgresql/extension/metrics_collector--1.0.sql
-rw-r--r-- 1 gpadmin gpadmin 233 Dec  6 12:27 /usr/local/greenplum-db-6.19.3/share/postgresql/extension/metrics_collector.control
[gpadmin@lhrgp40 ~]$ ll $GPHOME/lib/postgresql/metrics_collector.so
-rwxr-xr-x 1 gpadmin gpadmin 3357064 Dec  6 12:27 /usr/local/greenplum-db-6.19.3/lib/postgresql/metrics_collector.so
[gpadmin@lhrgp40 ~]$
[gpadmin@lhrgp40 ~]$ gppkg -q --all
20230116:14:58:39:020317 gppkg:lhrgp40:gpadmin-[INFO]:-Starting gppkg with args: -q --all
MetricsCollector-6.8.3_gp_6.19.3
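
The same check can be scripted so that it runs before any planned restart. The following is only a minimal sketch: check_preload_libs is a hypothetical helper name, not a Greenplum utility, and it assumes the value is written in single quotes (as in the postgresql.conf shown in this article) and that each entry is a bare library name resolved as <libdir>/<name>.so.

```shell
# Sketch (assumption: hypothetical helper, not part of Greenplum).
# Reports every library listed in shared_preload_libraries that has
# no matching <libdir>/<name>.so file. Returns 1 if any is missing.
check_preload_libs() {
    local conf="$1" libdir="$2" value lib missing=0

    # Extract the single-quoted value of the last uncommented assignment;
    # when a parameter appears more than once, the last setting wins.
    value=$(sed -n "s/^[[:space:]]*shared_preload_libraries[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" "$conf" | tail -n 1)

    # Split the comma-separated list and test each library file.
    local IFS=','
    for lib in $value; do
        lib=$(printf '%s' "$lib" | tr -d '[:space:]')
        [ -z "$lib" ] && continue
        if [ ! -f "$libdir/$lib.so" ]; then
            echo "missing: $libdir/$lib.so"
            missing=1
        fi
    done
    return "$missing"
}
```

A call might look like check_preload_libs /opt/greenplum/data/master/gpseg-1/postgresql.conf "$GPHOME/lib/postgresql". A non-zero return status means a restart would probably hit the FATAL error shown above.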

Solution

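1. First fix the master instance: clear the value of shared_preload_libraries in its postgresql.conf.

2. Then fix the segment (primary) instances: clear the value of shared_preload_libraries in each postgresql.conf.

3. Start the Greenplum cluster as soon as possible with gpstart -a.

4. Then fix the parameter files of the mirror instances: clear the value of shared_preload_libraries in each postgresql.conf.

5. Finally, start each mirror instance individually, for example: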

nohup /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/mirror/gpseg5 -p 7002 &

The segment configuration can be queried on the master instance:

select * from gp_segment_configuration order by 2,1;

Finally, reinstall GPCC. For details, see: https://www.xmmup.com/greenplumguanfangjiankonggongjugpcc-6deanzhuanghexiezai.html

Location of the postgresql.conf parameter file

[gpadmin@lhrgp40 ~]$ ps -ef|grep green
gpadmin    520     1  0 14:28 pts/0    00:00:07 /usr/local/greenplum-cc-6.8.3/bin/gpccws -W masterport5432e
gpadmin    672     1  0 14:28 ?        00:00:02 /usr/local/greenplum-cc-6.8.3/bin/ccagent -udpport 9898 -rpcaddr lhrgp40:8899 masterport5432e
gpadmin   1845     1  0 14:33 ?        00:00:21 /usr/local/greenplum-db-6.19.3/bin/postgres -D /opt/greenplum/data/master/gpseg-1 -p 5432 -E
gpadmin  15037 15036  0 15:28 ?        00:00:00 addr2line -s -e /usr/local/greenplum-db-6.19.3/bin/postgres 0xbefe0c 0xbf2e08 0xa12c84 0x9fd127 0xa08dd0 0x6ac32e 0xa0e592 0x6b09e1 0x7f969816e555 0x6bc6fc
gpadmin  15039 15724  0 15:28 pts/0    00:00:00 grep --color=auto green
[gpadmin@lhrgp40 ~]$ ll /opt/greenplum/data/master/gpseg-1/postgresql.conf
-rw------- 1 gpadmin gpadmin 23762 Jan 16 14:31 /opt/greenplum/data/master/gpseg-1/postgresql.conf
[gpadmin@lhrgp40 ~]$ more postgresql.conf^C
[gpadmin@lhrgp40 ~]$ more /opt/greenplum/data/master/gpseg-1/postgresql.conf | grep shared_preload_libraries
#shared_preload_libraries = ''          # (change requires restart)
shared_preload_libraries='metrics_collector'
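
Clearing the value amounts to rewriting the uncommented shared_preload_libraries line in each affected postgresql.conf. One possible way to do this is sketched below. disable_preload_libs is a hypothetical helper name, and both the .bak backup suffix and the choice to set the value to an empty string are assumptions of this sketch. Check the result by hand before starting the instance.

```shell
# Sketch (assumption: hypothetical helper, not part of Greenplum).
# Sets every uncommented shared_preload_libraries assignment in the
# given postgresql.conf to an empty value, keeping a .bak backup.
disable_preload_libs() {
    local conf="$1"

    # Keep a copy of the original file so the change can be reverted.
    cp -p "$conf" "$conf.bak" || return 1

    # Commented-out lines start with '#', so they do not match and are kept.
    # Redirecting into the existing file preserves its owner and mode.
    sed "s/^[[:space:]]*shared_preload_libraries[[:space:]]*=.*/shared_preload_libraries = ''/" \
        "$conf.bak" > "$conf"
}
```

For the master shown above, the call would be disable_preload_libs /opt/greenplum/data/master/gpseg-1/postgresql.conf. Once GPCC has been reinstalled correctly, the original setting can be restored from the .bak copy.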

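A single host may run several primary and mirror instances, and the parameter file of every one of them must be modified. In the example below, the parameter files of six instances need to be changed: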

[root@hdw ~]# ps -ef|grep green
gpadmin   3120     1  0 13:47 ?        00:00:00 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/mirror/gpseg3 -p 7000
gpadmin   3138     1  0 13:47 ?        00:00:00 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/mirror/gpseg4 -p 7001
gpadmin   7256     1  0 13:53 ?        00:00:00 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/mirror/gpseg5 -p 7002
gpadmin  27039     1  0 13:19 ?        00:00:30 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/primary/gpseg7 -p 6001
gpadmin  27041     1  0 13:19 ?        00:00:30 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/primary/gpseg8 -p 6002
gpadmin  27042     1  0 13:19 ?        00:00:30 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/primary/gpseg6 -p 6000
[root@hdw5 ~]#
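
When a host runs many instances, the data directories can be pulled out of the ps listing rather than read off by eye. This is a rough sketch only: list_datadirs is a hypothetical helper name, and it assumes each instance appears as a process whose command ends in /postgres followed by -D and the data directory, as in the listing above. It does not help for instances that are not running, in which case the directories have to come from gp_segment_configuration or from the known layout.

```shell
# Sketch (assumption: hypothetical helper, not part of Greenplum).
# Reads `ps -ef` output on stdin and prints the data directory of each
# postgres instance, one per line, sorted and without duplicates.
list_datadirs() {
    awk '{
        # Look for "<...>/postgres -D <datadir>" anywhere on the line.
        for (i = 2; i < NF; i++)
            if ($i == "-D" && $(i - 1) ~ /\/postgres$/)
                print $(i + 1)
    }' | sort -u
}
```

For example, ps -ef | list_datadirs | while read -r dir; do grep -n shared_preload_libraries "$dir/postgresql.conf"; done shows the current setting of every running instance on the host.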


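This article is reposted from DB宝. If it is suspected of infringement, please email contact@modb.pro to report it and provide the relevant evidence; once verified, 墨天轮 will remove the content immediately.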
