- A freshly installed 19c RAC IPv6 environment had its OCR disk group replaced. When we then tried to restart the cluster, it simply would not stop. The error output is below:
[root@xydb5node1 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'xydb5node1'
CRS-2673: Attempting to stop 'ora.crsd' on 'xydb5node1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on server 'xydb5node1'
CRS-2679: Attempting to clean 'ora.xydb5node1.vip' on 'xydb5node1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'xydb5node1'
CRS-33673: Attempting to stop resource group 'ora.asmgroup' on server 'xydb5node1'
CRS-2673: Attempting to stop 'ora.CRSDG.dg' on 'xydb5node1'
CRS-2673: Attempting to stop 'ora.DATADG1.dg' on 'xydb5node1'
CRS-2673: Attempting to stop 'ora.FRADG.dg' on 'xydb5node1'
CRS-2673: Attempting to stop 'ora.OCRDG.dg' on 'xydb5node1'
CRS-2681: Clean of 'ora.xydb5node1.vip' on 'xydb5node1' succeeded
CRS-2677: Stop of 'ora.DATADG1.dg' on 'xydb5node1' succeeded
CRS-2677: Stop of 'ora.CRSDG.dg' on 'xydb5node1' succeeded
CRS-2677: Stop of 'ora.FRADG.dg' on 'xydb5node1' succeeded
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'xydb5node1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'xydb5node1'
CRS-2677: Stop of 'ora.OCRDG.dg' on 'xydb5node1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'xydb5node1'
CRS-2677: Stop of 'ora.asm' on 'xydb5node1' succeeded
CRS-2673: Attempting to stop 'ora.ASMNET1LSNR_ASM.lsnr' on 'xydb5node1'
CRS-2677: Stop of 'ora.ASMNET1LSNR_ASM.lsnr' on 'xydb5node1' succeeded
CRS-2673: Attempting to stop 'ora.asmnet1.asmnetwork' on 'xydb5node1'
CRS-2677: Stop of 'ora.asmnet1.asmnetwork' on 'xydb5node1' succeeded
CRS-33677: Stop of resource group 'ora.asmgroup' on server 'xydb5node1' succeeded.
Action for VIP aborted
CRS-2675: Stop of 'ora.scan1.vip' on 'xydb5node1' failed
CRS-2679: Attempting to clean 'ora.scan1.vip' on 'xydb5node1'
CRS-2678: 'ora.scan1.vip' on 'xydb5node1' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-2672: Attempting to start 'ora.xydb5node1.vip' on 'xydb5node2'
CRS-5005: IP Address: 2409:8760:1282:0001:0f11:0000:0000:0044 is already in use in the network
CRS-2674: Start of 'ora.xydb5node1.vip' on 'xydb5node2' failed
CRS-2799: Failed to shut down resource 'ora.scan1.vip' on 'xydb5node1'
CRS-2794: Shutdown of Cluster Ready Services-managed resources on 'xydb5node1' has failed
CRS-2675: Stop of 'ora.crsd' on 'xydb5node1' failed
CRS-2799: Failed to shut down resource 'ora.crsd' on 'xydb5node1'
CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'xydb5node1' has failed
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.
Checking the CRS alert log, here is an excerpt of the errors:
/u01/app/grid/diag/crs/xydb5node1/crs/trace/alert.log
2020-02-27 18:21:12.138 [CRSD(286254)]CRS-2758: Resource 'ora.scan1.vip' is in an unknown state.
2020-02-27 18:21:12.138 [CRSD(286254)]CRS-2769: Unable to failover resource 'ora.net1.network'.
2020-02-27 18:21:12.404 [ORAROOTAGENT(425250)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 425250
2020-02-27 18:21:17.816 [OHASD(284843)]CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'xydb5node1' has failed
2020-02-27 18:21:39.564 [OHASD(284843)]CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'xydb5node1'
2020-02-27 18:22:39.581 [ORAROOTAGENT(425250)]CRS-5818: Aborted command 'stop' for resource 'ora.xydb5node1.vip'. Details at (:CRSAGF00113:) {1:42136:7050} in /u01/app/grid/diag/crs/xydb5node1/crs/trace/crsd_orarootagent_root.trc.
2020-02-27 18:22:39.600 [CRSD(286254)]CRS-2757: Command 'Stop' timed out waiting for response from the resource 'ora.xydb5node1.vip'. Details at (:CRSPE00221:) {1:42136:7050} in /u01/app/grid/diag/crs/xydb5node1/crs/trace/crsd.trc.
2020-02-27 18:23:41.601 [ORAROOTAGENT(425250)]CRS-5818: Aborted command 'clean' for resource 'ora.xydb5node1.vip'. Details at (:CRSAGF00113:) {1:42136:7050} in /u01/app/grid/diag/crs/xydb5node1/crs/trace/crsd_orarootagent_root.trc.
2020-02-27 18:24:41.890 [ORAROOTAGENT(435394)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 435394
2020-02-27 18:25:42.261 [ORAROOTAGENT(435394)]CRS-5818: Aborted command 'clean' for resource 'ora.xydb5node1.vip'. Details at (:CRSAGF00113:) {0:8:2} in /u01/app/grid/diag/crs/xydb5node1/crs/trace/crsd_orarootagent_root.trc.
2020-02-27 18:26:42.550 [ORAROOTAGENT(436268)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 436268
2020-02-27 18:27:42.922 [ORAROOTAGENT(436268)]CRS-5818: Aborted command 'clean' for resource 'ora.xydb5node1.vip'. Details at (:CRSAGF00113:) {0:9:2} in /u01/app/grid/diag/crs/xydb5node1/crs/trace/crsd_orarootagent_root.trc.
2020-02-27 18:28:42.949 [OHASD(284843)]CRS-2795: Shutdown of Oracle High Availability Services-managed resources on 'xydb5node1' has failed
2020-02-27 18:28:42.942 [CRSD(286254)]CRS-2758: Resource 'ora.xydb5node1.vip' is in an unknown state.
2020-02-27 18:28:43.209 [ORAROOTAGENT(438555)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 438555
2020-02-27 18:28:43.739 [ORAAGENT(438579)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 438579
2020-02-27 18:55:14.545 [OHASD(284843)]CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'xydb5node1'
As shown above, the alert log only tells us that the shutdown is stuck stopping the cluster VIP resource; nothing else can be pinpointed yet, so we dig further into the trace log:
/u01/app/grid/diag/crs/xydb5node1/crs/trace/crsd_orarootagent_root.trc
2020-02-27 19:17:13.902 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] Failed to delete 2409:8760:1282:0001:0f11:0000:0000:0045 on bond0
2020-02-27 19:17:13.902 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] (null) category: -2, operation: ioctl, loc: SIOCDIFADDR, OS error: 99, other: failed to delete address
2020-02-27 19:17:13.902 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] VipActions::stopIpV6 }
2020-02-27 19:17:14.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] VipActions::stopIpV6 {
2020-02-27 19:17:14.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] Deleting ipv6 address '2409:8760:1282:0001:0f11:0000:0000:0045', on the interface name 'bond0'
2020-02-27 19:17:14.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] sclsideladdrsv6 returned
2020-02-27 19:17:14.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] Failed to delete 2409:8760:1282:0001:0f11:0000:0000:0045 on bond0
2020-02-27 19:17:14.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] (null) category: -2, operation: ioctl, loc: SIOCDIFADDR, OS error: 99, other: failed to delete address
2020-02-27 19:17:14.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] VipActions::stopIpV6 }
2020-02-27 19:17:15.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] VipActions::stopIpV6 {
2020-02-27 19:17:15.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] Deleting ipv6 address '2409:8760:1282:0001:0f11:0000:0000:0045', on the interface name 'bond0'
2020-02-27 19:17:15.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] sclsideladdrsv6 returned
2020-02-27 19:17:15.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] Failed to delete 2409:8760:1282:0001:0f11:0000:0000:0045 on bond0
2020-02-27 19:17:15.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] (null) category: -2, operation: ioctl, loc: SIOCDIFADDR, OS error: 99, other: failed to delete address
2020-02-27 19:17:15.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] VipActions::stopIpV6 }
2020-02-27 19:17:16.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] VipActions::stopIpV6 {
2020-02-27 19:17:16.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] Deleting ipv6 address '2409:8760:1282:0001:0f11:0000:0000:0045', on the interface name 'bond0'
2020-02-27 19:17:16.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] sclsideladdrsv6 returned
2020-02-27 19:17:16.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] Failed to delete 2409:8760:1282:0001:0f11:0000:0000:0045 on bond0
2020-02-27 19:17:16.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] (null) category: -2, operation: ioctl, loc: SIOCDIFADDR, OS error: 99, other: failed to delete address
2020-02-27 19:17:16.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] VipActions::stopIpV6 }
2020-02-27 19:17:17.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] VipActions::stopIpV6 {
2020-02-27 19:17:17.903 :CLSDYNAM:1310668544: [ora.xydb5node2.vip]{0:19:2} [clean] Deleting ipv6 address '2409:8760:1282:0001:0f11:0000:0000:0045', on the interface name 'bond0'
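The key detail in the retry loop above is `operation: ioctl, loc: SIOCDIFADDR, OS error: 99`: the agent keeps asking the kernel to delete the VIP address from `bond0` and keeps getting errno 99 back. As a sketch (assuming a Linux errno table, where 99 maps to `EADDRNOTAVAIL`), you can decode that number with nothing but the Python standard library:

```python
import errno
import os

# The trace shows "operation: ioctl, loc: SIOCDIFADDR, OS error: 99".
# On Linux, errno 99 is EADDRNOTAVAIL ("Cannot assign requested address"):
# the kernel refuses the SIOCDIFADDR delete because the address/prefix
# pair being removed does not match what is configured on the interface.
name = errno.errorcode.get(99, "unknown")
print(name, "-", os.strerror(99))
```

That reading is consistent with the root cause found later: Clusterware plumbed the VIP with a prefix length derived from the misconfigured public interface, so its delete request no longer matched the address actually present on `bond0`.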
A pile of strange errors we could not immediately interpret, and a MOS search turned up no note on them. A colleague suggested the cause was a subnet prefix length greater than 64. Checking this cluster's public IP, the prefix length was indeed 128:
[root@xydb5node1 bin]# ./oifcfg iflist -n
bond0  192.168.122.0  255.255.255.0
bond1  1.1.4.64  255.255.255.248
bond0  2409:8760:1282:1:f11::42  /128
bond1  fd17:625c:f037:a801::  /64
bond1  fd2a:1a21:628e:1::  /64
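Why a /128 prefix is fatal here can be shown with a little subnet arithmetic. A minimal sketch using Python's `ipaddress` module, with the addresses taken from the logs above (purely illustrative, not how Clusterware itself computes this):

```python
import ipaddress

# Public interface address as reported by oifcfg, once with the broken
# /128 prefix and once with the conventional /64.
pub_128 = ipaddress.IPv6Interface("2409:8760:1282:1:f11::42/128")
pub_64 = ipaddress.IPv6Interface("2409:8760:1282:1:f11::42/64")

# The node1 VIP from the CRS-5005 message.
vip = ipaddress.IPv6Address("2409:8760:1282:1:f11::44")

# A /128 "subnet" contains only the host address itself, so the VIP is
# not on-link; with /64 the VIP falls inside the interface's network.
print(vip in pub_128.network)  # False
print(vip in pub_64.network)   # True
```

In other words, with /128 on the public interface there is no on-link subnet for the VIPs to live in, which matches the VIP stop/clean failures seen above.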
We edited the NIC configuration directly and restarted the network service:
[grid@xydb5node2 ~]$ oifcfg iflist -n
bond0  192.168.122.0  255.255.255.0
bond1  1.1.4.64  255.255.255.248
bond0  2409:8760:1282:1::  /64
bond1  fd17:625c:f037:a801::  /64
bond1  fd2a:1a21:628e:1::  /64
As shown above, bond0's prefix length is now 64 bits. But stopping the CRS cluster still failed, so with no other option we rebooted the server.
2020-02-28 13:41:23.331 [OHASD(54165)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 54165
2020-02-28 13:41:23.419 [OHASD(54165)]CRS-0714: Oracle Clusterware Release 19.0.0.0.0.
2020-02-28 13:41:23.432 [OHASD(54165)]CRS-2112: The OLR service started on node xydb6node2.
2020-02-28 13:41:23.457 [OHASD(60420)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 60420
2020-02-28 13:41:23.545 [OHASD(60420)]CRS-0714: Oracle Clusterware Release 19.0.0.0.0.
2020-02-28 13:41:23.697 [OHASD(54165)]CRS-1301: Oracle High Availability Service started on node xydb6node2.
2020-02-28 13:41:23.697 [OHASD(54165)]CRS-8017: location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2020-02-28 13:41:23.896 [OHASD(54336)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 54336
2020-02-28 13:41:24.003 [OHASD(54336)]CRS-2112: The OLR service started on node xydb6node2.
2020-02-28 13:41:24.018 [OHASD(54336)]CRS-1301: Oracle High Availability Service started on node xydb6node2.
2020-02-28 13:41:24.018 [OHASD(54336)]CRS-8017: location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2020-02-28 13:41:24.026 [OHASD(60420)]CRS-1301: Oracle High Availability Service started on node xydb6node2.
2020-02-28 13:41:24.026 [OHASD(60420)]CRS-8017: location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2020-02-28 13:41:24.025 [OHASD(62185)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 62185
2020-02-28 13:41:24.132 [OHASD(62185)]CRS-0714: Oracle Clusterware Release 19.0.0.0.0.
2020-02-28 13:41:24.142 [OHASD(62185)]CRS-2112: The OLR service started on node xydb6node2.
2020-02-28 13:41:24.632 [OHASD(62185)]CRS-1301: Oracle High Availability Service started on node xydb6node2.
2020-02-28 13:41:24.632 [OHASD(62185)]CRS-8017: location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2020-02-28 13:41:24.812 [CSSDAGENT(62356)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 62356
2020-02-28 13:41:24.835 [CSSDMONITOR(62358)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 62358
2020-02-28 13:41:24.880 [ORAROOTAGENT(62337)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 62337
2020-02-28 13:41:24.910 [ORAAGENT(62347)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 62347
2020-02-28 13:41:25.499 [OHASD(62185)]CRS-6015: Oracle Clusterware has experienced an internal error. Details at (:CLSGEN00100:) {0:0:2} in /u01/app/grid/diag/crs/xydb6node2/crs/trace/ohasd.trc.
2020-02-28T13:41:25.515580+08:00
Errors in file /u01/app/grid/diag/crs/xydb6node2/crs/trace/ohasd.trc (incident=1):
CRS-6015 [] [] [] [] [] [] [] [] [] [] [] []
Incident details in: /u01/app/grid/diag/crs/xydb6node2/crs/incident/incdir_1/ohasd_i1.trc
2020-02-28 13:41:25.531 [OHASD(62185)]CRS-8505: Oracle Clusterware OHASD process with operating system process ID 62185 encountered internal error CRS-06015
Continuing with the trace file:
2763925248: [CLSDIMT] 2020-02-28 13:41:25.533 :GIPCHGEN:3446142720: gipchaInternalGroupDestroy: Destroyed hagroup 0x7fd7c4021490 [00000000000092b0] { gipchaGroup : numDead 0, numEndp 0, numZombi+
2763925248: [CLSDIMT] 2020-02-28 13:41:25.533 :GIPCHGEN:3446142720: gipchaGroupFree: destroying ha group 0x7fd7c4021490 [00000000000092b0] { gipchaGroup : numDead 0, numEndp 0, numZombie 0, numP+
2763925248: [CLSDIMT] 2020-02-28 13:41:25.533 :GIPCGEN:3446142720: gipcEndpointFree: destroying the endp 0x7fd7c4021a70 endpId 00000000000092b4
2763925248: [CLSDIMT] 2020-02-28 13:41:25.533 :GIPCGEN:3446142720: gipcEndpointCheckFlush: GIPC_FLAG_CLOSE_IMMEDIATE set for endp 0x7fd7c4021a70
2763925248: [CLSDIMT] 2020-02-28 13:41:25.534 :CLSCEVT:3446142720: (:CLSCE0099:)clsce_publish_internal 0x55bc53750280 destroying connection (nil)
2763925248: [CLSDIMT] 2020-02-28 13:41:25.534 :CLSCEVT:3446142720: mx 0x55bc53750280 release {
2763925248: [CLSDIMT] 2020-02-28 13:41:25.534 :CLSCEVT:3446142720: mx 0x55bc53750280 release }
2763925248: [CLSDIMT] 2020-02-28 13:41:25.534 :CLSCEVT:3446142720: clsce_publish_internal 0x55bc53750280 }
2763925248: [CLSDIMT] 2020-02-28 13:41:25.534 :CLSCAL:3446142720: (:CLSCAL0811:)clscal_repository_write_publish_evt_new: clsce_publish() failed, ret [4], err [CRS-10203: (:CLSCE0047:) Could no+
2763925248: [CLSDIMT] 2020-02-28 13:41:25.534 :CLSCEVT:3446142720: clsce_event_serialize {
2763925248: [CLSDIMT] 2020-02-28 13:41:25.534 :CLSCEVT:3446142720: clsce_event_serialize }
2763925248: [CLSDIMT] 2020-02-28 13:41:25.534 :CLSCEVT:3446142720: clsce_event_destroy {
2763925248: [CLSDIMT] 2020-02-28 13:41:25.534 :CLSCEVT:3446142720: (:CLSCE0056:)clsce_event_destroy event 0x440d86a0 destroyed
2763925248: [CLSDIMT] 2020-02-28 13:41:25.534 :CLSCEVT:3446142720: clsce_event_destroy }
2020-02-28 13:41:25.716 :CLSDIMT:2763925248: Wraps: [16] Size: [10005,129]
2020-02-28 13:41:25.716 :CLSDIMT:2763925248: ===> CLSD In-memory buffer ends
----- END DDE Action: 'clsdAdrActions' (SUCCESS, 19 csec) ----- [TOC00018-END]
----- END DDE Actions Dump (total 20 csec) ----- [TOC00004-END]
End of Incident Dump [TOC00002-END]
TOC00000 - Table of contents
TOC00001 - Error Stack
TOC00002 - Dump for incident 1 (CRS 6015)
| TOC00003 - START Event Driven Actions Dump
| TOC00004 - START DDE Actions Dump
| | TOC00005 - START DDE Action: 'dumpFrameContext' (Sync)
| | | TOC00006 - START Frame Context DUMP
| | TOC00007 - START DDE Action: 'dumpDiagCtx' (Sync)
| | | TOC00008 - Diag Context Dump
| | TOC00009 - START DDE Action: 'dumpBuckets' (Sync)
| | | TOC00010 - Trace Bucket Dump Begin: CLSD_SHARED_BUCKET
| | TOC00011 - START DDE Action: 'dumpGeneralConfiguration' (Sync)
| | | TOC00012 - General Configuration
| | TOC00013 - START DDE Action: 'xdb_dump_buckets' (Sync)
| | TOC00014 - START DDE Action: 'dumpKGERing' (Sync)
| | TOC00015 - START DDE Action: 'dumpKGEIEParms' (Sync)
| | TOC00016 - START DDE Action: 'dumpKGEState' (Sync)
| | TOC00017 - START DDE Action: 'kpuActionDefault' (Sync)
| | TOC00018 - START DDE Action: 'clsdAdrActions' (Sync)
Still nothing conclusive: CRS would not start normally, and MOS again had no answer. For now the only option was to change the prefix length to 64 and reinstall the database. The NIC configuration before the change:
IPV6INIT=yes
IPV6_FAILURE_FATAL=no
IPV6ADDR=2409:8760:1282:0001:0F11:0000:0000:0047/120
IPV6_DEFAULTGW=2409:8760:1282:0001:0F11:0000:0000:00FF
Changed to a 64-bit prefix:
IPV6INIT=yes
IPV6_FAILURE_FATAL=no
IPV6ADDR=2409:8760:1282:0001:0F11:0000:0000:0047/64
IPV6_DEFAULTGW=2409:8760:1282:0001:0F11:0000:0000:00FF
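Before restarting anything, it is worth sanity-checking that, under the corrected /64 prefix, the other cluster addresses are actually on-link for this interface. A hypothetical check with Python's `ipaddress` module, using the IPV6ADDR and IPV6_DEFAULTGW values from the config above plus the node VIPs seen in the logs (the address roles in the comments are my reading of the logs, not something the config file states):

```python
import ipaddress

# Public interface address with the corrected /64 prefix (from IPV6ADDR).
iface = ipaddress.IPv6Interface("2409:8760:1282:0001:0f11:0000:0000:0047/64")

cluster_addrs = [
    "2409:8760:1282:0001:0f11:0000:0000:0044",  # node1 VIP (from CRS-5005)
    "2409:8760:1282:0001:0f11:0000:0000:0045",  # node2 VIP (from the trace)
    "2409:8760:1282:0001:0f11:0000:0000:00ff",  # IPV6_DEFAULTGW
]

# Every address should fall inside the interface's /64 network.
results = [ipaddress.IPv6Address(a) in iface.network for a in cluster_addrs]
print(results)  # [True, True, True]
```

Under the earlier /128 configuration the same check would report `False` for every entry, which is the situation the VIP agent was stuck in.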
Temporary workaround
- Change the public IP subnet prefix length to 64.
- Use a 64-bit prefix length for the private IP as well.
- After reinstalling GI, everything returned to normal.
Summary
I have not yet found an official statement on this 19c RAC IPv6 prefix-length issue. The workaround found by testing is to set the prefix length to 64; in my tests, anything longer or shorter than 64 fails — installation succeeds, but the cluster cannot be shut down, and after force-killing the processes it will not start again. I will update this post if I find a proper fix.
Appendix: IPv6 basics
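As a quick refresher on the notation used throughout this post: an IPv6 address is 128 bits written as eight 16-bit hex groups; leading zeros in a group may be dropped, and one run of all-zero groups may be compressed to `::`. The conventional split is a 64-bit routing prefix plus a 64-bit interface identifier, which is why /64 is the usual subnet size. A small illustration with Python's `ipaddress` module, using the node1 VIP from the logs:

```python
import ipaddress

# Same address written two ways: the logs use the fully-padded form,
# oifcfg shows the compressed form.
addr = ipaddress.IPv6Address("2409:8760:1282:0001:0f11:0000:0000:0044")
print(addr.compressed)  # 2409:8760:1282:1:f11::44
print(addr.exploded)    # 2409:8760:1282:0001:0f11:0000:0000:0044

# A /64 prefix leaves 64 bits of interface identifier, i.e. 2**64
# addresses per subnet.
net = ipaddress.IPv6Network("2409:8760:1282:1::/64")
print(net.num_addresses == 2 ** 64)  # True
```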
Last modified: 2020-02-28 20:02:38
[Copyright notice] This article is original content by a Modb (墨天轮) user. Reposts must credit the source (Modb), the article link, and the author; otherwise the author and Modb reserve the right to pursue liability. If you find content on Modb suspected of plagiarism or infringement, please report it with supporting evidence to contact@modb.pro; once verified, Modb will remove the content immediately.