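Environment overview:
This setup simulates a physical production environment: two Dell R730 rack servers and a Hitachi G350 storage array, with a Brocade Fibre Channel switch between the storage system and the servers. The public IP network runs over 1 Gb/s twisted-pair cabling to one switch, and the private IP network runs over 1 Gb/s twisted-pair cabling to a second switch.
Software: operating system RedHat 6.9, database Oracle Database 11.2.0.4.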

/etc/hosts file configuration
[root@host02 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
#public ip
192.168.146.233 host01
192.168.146.232 host02
#virtual ip
192.168.146.235 host01-vip
192.168.146.234 host02-vip
#private ip
10.10.10.1 host01-priv
10.10.10.2 host02-priv
#scan ip
192.168.146.237 scan-ip
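
Because name resolution here relies entirely on /etc/hosts, both nodes need the same complete set of entries. The following is a minimal sketch of a consistency check, not part of the original procedure; the function name missing_hosts is hypothetical, and the names passed to it are the ones from the file shown above.

```shell
#!/bin/bash
# Sketch: print every expected host name that has no entry in a hosts file.
# missing_hosts is a hypothetical helper, not an Oracle or RedHat tool.

missing_hosts() {
    local hosts_file=$1; shift
    local name
    for name in "$@"; do
        # Drop comments, then look for the name as a whole field after the IP.
        if ! awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

Run on each node, for example as `missing_hosts /etc/hosts host01 host02 host01-vip host02-vip host01-priv host02-priv scan-ip`; empty output means every name has an entry.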
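Brief notes on the environment configuration
Time management: the clocks on the two nodes were synchronized manually on the physical machines with the date command, and the NTP service was disabled, so the installer automatically enables CTSS, the time synchronization service bundled with the Oracle grid cluster software.
Network management (DNS): the GNS service that ships with grid was not enabled, and neither a DNS service nor a DHCP service was installed; the SCAN IP is configured directly in /etc/hosts.
Storage management: configured with udev, directly in /etc/udev/rules.d/99-oracle-asmdevices.rules: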
[root@host02 rules.d]# cat 99-oracle-asmdevices.rules
/etc/udev/rules.d/99-oracle-asmdevices.rules
#by gaoby 2025-02-09
KERNEL=="sd*",ENV{ID_SERIAL}=="360060e80225cda0050415cda00003a03",SYMLINK+="rac_OCR01%n",OWNER="grid",GROUP="oinstall",MODE="0660"
KERNEL=="sd*",ENV{ID_SERIAL}=="360060e80225cda0050415cda00003a04",SYMLINK+="rac_OCR02%n",OWNER="grid",GROUP="oinstall",MODE="0660"
KERNEL=="sd*",ENV{ID_SERIAL}=="360060e80225cda0050415cda00003a05",SYMLINK+="rac_OCR03%n",OWNER="grid",GROUP="oinstall",MODE="0660"
KERNEL=="sd*",ENV{ID_SERIAL}=="360060e80225cda0050415cda00003a00",SYMLINK+="rac_DATA01%n",OWNER="grid",GROUP="oinstall",MODE="0660"
KERNEL=="sd*",ENV{ID_SERIAL}=="360060e80225cda0050415cda00003a01",SYMLINK+="rac_DATA02%n",OWNER="grid",GROUP="oinstall",MODE="0660"
KERNEL=="sd*",ENV{ID_SERIAL}=="360060e80225cda0050415cda00003a02",SYMLINK+="rac_FRA01%n",OWNER="grid",GROUP="oinstall",MODE="0660"
Problem encountered
On the first node, running the /u01/app/11.2.0/grid/root.sh script as root failed with the following error:
Failed to start Oracle Grid Infrastructure stack.
Failed to start Cluster Synchorinisation Service in Clustered mode at /u01/app/11.2.0/grid/crs/install/crsconfig_lib.pm line 1278
The log file /u01/app/grid/cfgtoollogs/asmca/asmca-250107PM060905.log showed:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "CRSDG" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "CRSDG"
The detailed error messages are shown in the screenshot below.

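The various checks suggested online did not resolve the problem. The cause finally showed up in the operating system log /var/log/messages, in entries concerning disk sdd, which is not the operating system disk but a LUN on the Hitachi G350 storage array.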

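sdb, sdc, sdd, sde, sdf and sdg are all LUNs on the Hitachi G350, and during the grid installation their owner and group would sometimes abnormally change to root and disk. I therefore concluded that something in the Fibre Channel path from the storage array to the servers was faulty (the optical transceiver modules, the fibre cables, the servers' HBA cards, or the fibre switch). A fibre switch or an HBA card does not usually alternate between failing and working normally, so I narrowed the fault down to the fibre cables and the transceiver modules.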
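
The log check described above can be repeated with a small filter over /var/log/messages. This is only a sketch: the function name scan_disk_errors is hypothetical, and the two search phrases ("I/O error", "offline device") are assumptions about what a failing path typically logs, not the exact messages from this incident.

```shell
#!/bin/bash
# Sketch: print log lines that mention one of the given disks together with
# a typical I/O failure phrase. The phrases are assumptions; adjust them to
# whatever the kernel and HBA driver on the system actually log.

scan_disk_errors() {
    local log_file=$1; shift
    local dev
    for dev in "$@"; do
        # Match the device name as a whole word (sdd, but not sdd1 or sdda).
        grep -E "(^|[^[:alnum:]])${dev}([^[:alnum:]]|\$)" "$log_file" |
            grep -Ei 'I/O error|offline device' || true
    done
}
```

For the layout in this article the call would be `scan_disk_errors /var/log/messages sdb sdc sdd sde sdf sdg`. Hits that come and go on a SAN LUN, while the local disk sda stays clean, point toward the storage path rather than the operating system.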
[root@host02 rules.d]# ll /dev/sd*
brw-rw---- 1 root disk 8, 0 Feb 27 17:54 /dev/sda
brw-rw---- 1 root disk 8, 1 Feb 27 17:54 /dev/sda1
brw-rw---- 1 root disk 8, 2 Feb 27 17:54 /dev/sda2
brw-rw---- 1 root disk 8, 3 Feb 27 17:54 /dev/sda3
brw-rw---- 1 root disk 8, 4 Feb 27 17:54 /dev/sda4
brw-rw---- 1 grid oinstall 8, 16 Feb 27 17:54 /dev/sdb
brw-rw---- 1 grid oinstall 8, 32 Feb 27 17:54 /dev/sdc
brw-rw---- 1 grid oinstall 8, 48 Mar 6 14:29 /dev/sdd
brw-rw---- 1 grid oinstall 8, 64 Mar 6 14:29 /dev/sde
brw-rw---- 1 grid oinstall 8, 80 Mar 6 14:29 /dev/sdf
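brw-rw---- 1 grid oinstall 8, 96 Mar 6 14:29 /dev/sdg
After discussing the issue with the storage engineer and replacing the fibre cable, the grid software installed without any further problems.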
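
Since the symptom was the owner and group of the LUNs intermittently falling back to root and disk, a check that can be rerun during the installation makes the flip visible as soon as it happens. The sketch below assumes GNU stat (as shipped with RedHat 6); the function name check_asm_owner is hypothetical.

```shell
#!/bin/bash
# Sketch: report every device whose owner:group differs from the expected
# pair. check_asm_owner is a hypothetical helper; it relies on GNU stat.

check_asm_owner() {
    local want_owner=$1 want_group=$2; shift 2
    local dev actual rc=0
    for dev in "$@"; do
        # -L follows symlinks, so udev SYMLINK names can be checked as well.
        actual=$(stat -L -c '%U:%G' "$dev" 2>/dev/null) || actual="missing"
        if [ "$actual" != "${want_owner}:${want_group}" ]; then
            echo "$dev $actual"
            rc=1
        fi
    done
    return $rc
}
```

With the devices listed above this would be called as `check_asm_owner grid oinstall /dev/sd[b-g]`. It prints nothing and returns 0 when all devices are owned by grid:oinstall, and prints the offending device with its current owner and group otherwise.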




