Today, while checking an 11g RAC cluster, I found that only node 2 was up; node 1 reported CRS-0184: Cannot communicate with the CRS daemon.
Check the CRS status on node 2:
[code][root@node2 ~]# /oracle/app/grid/bin/crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    node2
ora....ER.lsnr ora....er.type ONLINE    ONLINE    node2
ora....N1.lsnr ora....er.type ONLINE    ONLINE    node2
ora.asm        ora.asm.type   ONLINE    ONLINE    node2
ora.eons       ora.eons.type  ONLINE    ONLINE    node2
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    node2
ora.node1.vip  ora....t1.type ONLINE    ONLINE    node2
ora....SM2.asm application    ONLINE    ONLINE    node2
ora....E2.lsnr application    ONLINE    ONLINE    node2
ora.node2.gsd  application    OFFLINE   OFFLINE
ora.node2.ons  application    ONLINE    ONLINE    node2
ora.node2.vip  ora....t1.type ONLINE    ONLINE    node2
ora.oc4j       ora.oc4j.type  OFFLINE   OFFLINE
ora.ons        ora.ons.type   ONLINE    ONLINE    node2
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    node2[/code]
Check the CRS status on node 1:
[root@node1 ~]# /oracle/app/grid/bin/crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.
Check the OCR:
[root@node1 bin]# ./ocrcheck
PROT-602: Failed to retrieve data from the cluster registry
PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=8, opn=kgfolclcpi1, dep=204, loc=kgfokge
AMDU-00204: Disk N0001 is in currently mounted diskgroup DATA
AMDU-00201: Disk N0001: 'ORCL:DISK1'
] [8]
Check the CRS log on node 1:
[code][root@node1 ~]# tail -100 /oracle/app/grid/log/node1/crsd/crsd.log
2013-08-16 09:54:06.336: [ OCRASM][549824240]proprasmo: Error in open/create file in dg [DATA]
[ OCRASM][549824240]SLOS : SLOS: cat=7, opn=kgfoAl06, dep=27140, loc=kgfokge
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operat
2013-08-16 09:54:06.365: [ OCRASM][549824240]proprasmo: kgfoCheckMount returned [7]
2013-08-16 09:54:06.365: [ OCRASM][549824240]proprasmo: The ASM instance is down
2013-08-16 09:54:06.366: [ OCRRAW][549824240]proprioo: Failed to open [+DATA]. Returned proprasmo() with [26]. Marking location as UNAVAILABLE.
2013-08-16 09:54:06.366: [ OCRRAW][549824240]proprioo: No OCR/OLR devices are usable
2013-08-16 09:54:06.366: [ OCRASM][549824240]proprasmcl: asmhandle is NULL
2013-08-16 09:54:06.366: [ OCRRAW][549824240]proprinit: Could not open raw device
2013-08-16 09:54:06.366: [ OCRASM][549824240]proprasmcl: asmhandle is NULL
2013-08-16 09:54:06.367: [ OCRAPI][549824240]a_init:16!: Backend init unsuccessful : [26]
2013-08-16 09:54:06.367: [ CRSOCR][549824240] OCR context init failure. Error: PROC-26: Error while accessing the physical storage ASM error [SLOS: cat=7, opn=kgfoAl06, dep=27140, loc=kgfokge
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operat
] [7]
2013-08-16 09:54:06.367: [ CRSD][549824240][PANIC] CRSD exiting: Could not init OCR, code: 26
2013-08-16 09:54:06.367: [ CRSD][549824240] Done.
[ clsdmt][1114937664]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=node1DBG_CRSD))
2013-08-16 09:54:07.295: [ clsdmt][1114937664]PID for the Process [4251], connkey 1
2013-08-16 09:54:07.295: [ clsdmt][1114937664]Creating PID [4251] file for home /oracle/app/grid host node1 bin crs to /oracle/app/grid/crs/init/
2013-08-16 09:54:07.295: [ clsdmt][1114937664]Writing PID [4251] to the file [/oracle/app/grid/crs/init/node1.pid][/code]
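When a crsd.log is long, listing the distinct error codes first can make the relevant ones (here ORA-27140, ORA-27300, ORA-27301 and PROC-26) easier to spot. The following is only a sketch: the function name `list_error_codes` and the list of prefixes are my own illustrative choices, not an Oracle utility, and the log path is simply the one used in this post.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct ORA-/PROC-/PROT-/CRS-/AMDU- code
# that appears in the given log file, one per line, sorted.
list_error_codes() {
  grep -oE '(ORA|PROC|PROT|CRS|AMDU)-[0-9]+' "$1" | LC_ALL=C sort -u
}

# Path taken from this post; adjust it to your own Grid home and host name.
log=/oracle/app/grid/log/node1/crsd/crsd.log
if [ -r "$log" ]; then
  list_error_codes "$log"
fi
```

Each code can then be looked up on MOS, which is how the permission problem described below was found.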
The errors point to ASM, so I checked the state of the ASM instance and of the disk group:
[code][root@node1 ~]# oracleasm listdisks
DISK1
DISK2
DISK3
DISK4
[root@node1 bin]# ls -l /dev/sd*
brw-rw---- 1 grid asmadmin 8, 0 Aug 16 10:01 /dev/sda
brw-rw---- 1 grid asmadmin 8, 1 Aug 16 10:02 /dev/sda1
brw-rw---- 1 grid asmadmin 8, 2 Aug 16 10:01 /dev/sda2
brw-rw---- 1 grid asmadmin 8, 16 Aug 16 10:01 /dev/sdb
brw-rw---- 1 grid asmadmin 8, 17 Aug 16 10:02 /dev/sdb1
brw-rw---- 1 grid asmadmin 8, 32 Aug 16 10:01 /dev/sdc
brw-rw---- 1 grid asmadmin 8, 33 Aug 16 10:02 /dev/sdc1
brw-rw---- 1 grid asmadmin 8, 48 Aug 16 10:01 /dev/sdd
brw-rw---- 1 grid asmadmin 8, 49 Aug 16 10:02 /dev/sdd1
brw-rw---- 1 grid asmadmin 8, 64 Aug 16 10:01 /dev/sde
brw-rw---- 1 grid asmadmin 8, 65 Aug 16 10:02 /dev/sde1
[root@node1 bin]# ps -ef |grep smon
grid 4018 1 0 10:31 ? 00:00:00 asm_smon_+ASM1
root 4156 3682 0 10:31 pts/1 00:00:00 grep smon
[grid@node1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 11.2.0.1.0 Production on Fri Aug 16 11:37:57 2013
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> select name,state,total_mb,free_mb from v$asm_diskgroup;
NAME                           STATE         TOTAL_MB    FREE_MB
------------------------------ ----------- ---------- ----------
DATA                           MOUNTED           8188       7786
SQL> exit
Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options[/code]
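Everything looked normal.
Going back to the CRS log, I searched MOS (My Oracle Support) for the errors below; the answer was that the permissions on the oracle binary under the bin directory were wrong.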
ORA-27140: attach to post/wait facility failed
ORA-27300: OS system dependent operation:invalid_egid failed with status: 1
ORA-27301: OS failure message: Operat
1. The permissions, owner, and group of grid/bin/oracle should be:
-rwsr-s--x grid oinstall
2. The permissions, owner, and group of oracle/db1/bin/oracle should be:
-rwsr-s--x oracle asmadmin
Checking the oracle binary under my own Grid home:
[root@node1 ~]# ls -l /oracle/app/grid/bin/oracle
-rwxr-x--x 1 grid asmadmin 184286251 Aug 14 14:43 /oracle/app/grid/bin/oracle
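The permissions are wrong (the setuid and setgid bits are missing), and so is the group (asmadmin instead of oinstall).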
[root@node1 ~]# chown grid:oinstall /oracle/app/grid/bin/oracle
[root@node1 ~]# chmod 6751 /oracle/app/grid/bin/oracle
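For readers less familiar with four-digit octal modes: in 6751 the leading 6 is setuid (4) plus setgid (2), and 751 is rwx for the owner, r-x for the group, and --x for others. A small sketch that spells this out is shown here; `mode_to_symbolic` is a hypothetical helper written for this post, not a standard command, and it only handles regular files.

```shell
#!/bin/sh
# Hypothetical helper: render an octal mode (e.g. 6751) the way `ls -l`
# displays it for a regular file. Runs in a subshell so variables do not leak.
mode_to_symbolic() (
  mode=$((0$1))                    # a leading 0 makes the shell read octal
  special=$(( (mode >> 9) & 7 ))   # 4 = setuid, 2 = setgid, 1 = sticky
  out=-
  for i in 2 1 0; do               # 2 = owner, 1 = group, 0 = others
    bits=$(( (mode >> (i * 3)) & 7 ))
    r=-; w=-; x=-
    [ $((bits & 4)) -ne 0 ] && r=r
    [ $((bits & 2)) -ne 0 ] && w=w
    [ $((bits & 1)) -ne 0 ] && x=x
    # The special bit for this triplet replaces the execute character.
    if [ $(( (special >> i) & 1 )) -ne 0 ]; then
      if [ "$i" -eq 0 ]; then s=t; S=T; else s=s; S=S; fi
      if [ "$x" = x ]; then x=$s; else x=$S; fi
    fi
    out="$out$r$w$x"
  done
  printf '%s\n' "$out"
)

mode_to_symbolic 0751   # the mode found on the Grid binary before the fix
mode_to_symbolic 6751   # the mode after chmod 6751
```

The two calls print -rwxr-x--x and -rwsr-s--x, which correspond to the broken and the expected `ls -l` output in this post.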
Checking the oracle binary under the database home:
[root@node1 bin]# ls -l /oracle/app/oracle/db1/bin/oracle
-rwsr-s--x 1 oracle oinstall 173515880 Aug 14 17:09 /oracle/app/oracle/db1/bin/oracle
The group is wrong (oinstall instead of asmadmin).
[root@node1 bin]# chown oracle:asmadmin /oracle/app/oracle/db1/bin/oracle
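After the changes (on Linux, chown can clear the setuid/setgid bits, so if they are missing here, run chmod 6751 again):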
[root@node1 bin]# ls -l /oracle/app/oracle/db1/bin/oracle
-rwsr-s--x 1 oracle asmadmin 173515880 Aug 14 17:09 /oracle/app/oracle/db1/bin/oracle
[root@node1 bin]# ls -l /oracle/app/grid/bin/oracle
-rwsr-s--x 1 grid oinstall 184286251 Aug 14 14:43 /oracle/app/grid/bin/oracle
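The same check can be scripted so that it is easy to repeat on every node. This is a minimal sketch, assuming GNU `stat` (as on Linux) and the home directories used in this post; `check_binary` is a hypothetical helper name, and the expected values are simply the ones quoted above from MOS.

```shell
#!/bin/sh
# Hypothetical helper: compare a file's octal mode, owner and group with the
# expected values. Relies on GNU stat's -c option.
check_binary() (
  path=$1
  want="$2 $3 $4"
  have=$(stat -c '%a %U %G' "$path") || exit 2
  if [ "$have" = "$want" ]; then
    echo "OK       $path ($have)"
  else
    echo "MISMATCH $path: have '$have', want '$want'"
    exit 1
  fi
)

# Paths are this cluster's homes; change them to match your installation.
for spec in \
  "/oracle/app/grid/bin/oracle 6751 grid oinstall" \
  "/oracle/app/oracle/db1/bin/oracle 6751 oracle asmadmin"
do
  set -- $spec
  if [ -e "$1" ]; then
    check_binary "$@" || true
  fi
done
```

The function returns a non-zero status on a mismatch, so it could also be called from a periodic health check.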
After restarting node 1, everything was OK:
[code][root@node1 bin]# ./crs_stat -t
Name           Type           Target    State     Host
------------------------------------------------------------
ora.DATA.dg    ora....up.type ONLINE    ONLINE    node1
ora....ER.lsnr ora....er.type ONLINE    ONLINE    node1
ora....N1.lsnr ora....er.type ONLINE    ONLINE    node2
ora.asm        ora.asm.type   ONLINE    ONLINE    node1
ora.eons       ora.eons.type  ONLINE    ONLINE    node1
ora.gsd        ora.gsd.type   OFFLINE   OFFLINE
ora....network ora....rk.type ONLINE    ONLINE    node1
ora....SM1.asm application    ONLINE    ONLINE    node1
ora....E1.lsnr application    ONLINE    ONLINE    node1
ora.node1.gsd  application    OFFLINE   OFFLINE
ora.node1.ons  application    ONLINE    ONLINE    node1
ora.node1.vip  ora....t1.type ONLINE    ONLINE    node1
ora....SM2.asm application    ONLINE    ONLINE    node2
ora....E2.lsnr application    ONLINE    ONLINE    node2
ora.node2.gsd  application    OFFLINE   OFFLINE
ora.node2.ons  application    ONLINE    ONLINE    node2
ora.node2.vip  ora....t1.type ONLINE    ONLINE    node2
ora.oc4j       ora.oc4j.type  OFFLINE   OFFLINE
ora.ons        ora.ons.type   ONLINE    ONLINE    node1
ora.scan1.vip  ora....ip.type ONLINE    ONLINE    node2[/code]