Repairing simulated disk header corruption on the vote and data disks in a RAC environment

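Date: 2022-12-29 07:15:50
Topic: Repairing simulated disk header corruption on the vote and data disks in a RAC environment.
OS: CentOS 7.9, 64-bit
Database: Oracle 11.2.0.4, 64-bit
Environment: RAC (two nodes)
1. Disk group information
1.1 System information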
[root@hisdb1 ~]# cat /etc/*release
CentOS Linux release 7.9.2009 (Core)
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

CentOS Linux release 7.9.2009 (Core)
CentOS Linux release 7.9.2009 (Core)
1.2 Disk information
SQL> select group_number,name,path,state,total_mb,free_mb from v$asm_disk where name is not null order by path;

GROUP_NUMBER NAME            PATH                 STATE      TOTAL_MB    FREE_MB
------------ --------------- -------------------- -------- ---------- ----------
           2 DATA02          ORCL:DATA02          NORMAL        10239       6662
           1 DATA03          ORCL:DATA03          NORMAL        20479      13765
           3 DATA04          ORCL:DATA04          NORMAL        10239       9843
SQL> select group_number,name,type,total_mb,free_mb from v$asm_diskgroup;

GROUP_NUMBER NAME            TYPE     TOTAL_MB    FREE_MB
------------ --------------- ------ ---------- ----------
           1 DATA            EXTERN      20479      13765
           2 FRA             EXTERN      10239       6662
           3 OCRBK           EXTERN      10239       9843
[root@hisdb1 disks]# pwd
/dev/oracleasm/disks
[root@hisdb1 disks]# ll /dev/oracleasm/disks/*
brw-rw---- 1 grid asmadmin 8, 17 Dec 27 20:27 /dev/oracleasm/disks/DATA01
brw-rw---- 1 grid asmadmin 8, 33 Dec 27 20:27 /dev/oracleasm/disks/DATA02
brw-rw---- 1 grid asmadmin 8, 49 Dec 27 20:27 /dev/oracleasm/disks/DATA03
brw-rw---- 1 grid asmadmin 8, 65 Dec 27 20:27 /dev/oracleasm/disks/DATA04
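Note: in the listing above, DATA04 is the vote disk and DATA03 is the data disk.
2. Vote disk
Simulate corruption of the vote disk and its repair.
2.1 Copy the data
-- Copy one 8 KB block from /dev/oracleasm/disks/DATA04 to /home/grid/data04.dd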
[grid@hisdb1 disks]$ dd if=/dev/oracleasm/disks/DATA04 of=/home/grid/data04.dd bs=8192 count=1
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000340858 s, 24.0 MB/s
[grid@hisdb1 ~]$ ll data04.dd
-rw-r--r-- 1 grid oinstall 8192 Dec 27 21:32 data04.dd
-- Use kfed to read the disk header of /dev/oracleasm/disks/DATA04.
[grid@hisdb1 ~]$ kfed read /dev/oracleasm/disks/DATA04 text=data04.txt
[grid@hisdb1 ~]$ head data04.txt
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 3855329304 ; 0x00c: 0xe5cba818
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
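A backup is only useful if it matches what is on the disk. The sketch below repeats the dd copy from this step and then checks that the copy is byte-identical to the source. `backup_header` is a hypothetical helper name, and the demo runs against a scratch file rather than a real ASM disk, so treat it as an illustration under those assumptions.

```shell
#!/bin/sh
# Sketch: back up the first 8 KB of a disk and confirm the copy is byte-identical.
# backup_header is a hypothetical helper, not part of any Oracle tool.
backup_header() {
    src=$1
    dst=$2
    # Same dd parameters as in the walkthrough: one 8192-byte block.
    dd if="$src" of="$dst" bs=8192 count=1 2>/dev/null || return 1
    # Compare only the first 8192 bytes of the source with the backup.
    cmp -s -n 8192 "$src" "$dst"
}

# Demo on a scratch file standing in for /dev/oracleasm/disks/DATA04.
scratch=$(mktemp)
backup=$(mktemp)
# 16 KB of random filler so the copy has something to preserve.
head -c 16384 /dev/urandom > "$scratch"
if backup_header "$scratch" "$backup"; then
    echo "backup verified"
fi
rm -f "$scratch" "$backup"
```

Against a real disk the call would look like `backup_header /dev/oracleasm/disks/DATA04 /home/grid/data04.dd`, run as a user with read access to the device.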
2.2 Corrupt the disk
-- Corrupt the disk in the votedisk disk group
[grid@hisdb1 ~]$ dd if=/dev/zero of=/dev/oracleasm/disks/DATA04 bs=8192 count=1
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000128522 s, 63.7 MB/s
[grid@hisdb1 ~]$ kfed read /dev/oracleasm/disks/DATA04 | head
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
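The kfed output lists the byte offset of each header field, with kfbh.type at 0x002. Where kfed is not at hand, the same byte can be read with od. The sketch below assumes only the two type values seen in this walkthrough (1 for KFBTYP_DISKHEAD, 0 for KFBTYP_INVALID). `header_type` and `describe_header` are hypothetical helper names, and the demo uses a scratch file.

```shell
#!/bin/sh
# Sketch: read kfbh.type straight from the raw header bytes.
# header_type is a hypothetical helper; offset 0x002 comes from the kfed output above.
header_type() {
    # -j2 skips two bytes, -N1 reads one, -tu1 prints it as an unsigned decimal.
    od -An -tu1 -j2 -N1 "$1" | tr -d ' \n'
}

describe_header() {
    case "$(header_type "$1")" in
        1) echo "KFBTYP_DISKHEAD" ;;
        0) echo "KFBTYP_INVALID" ;;
        *) echo "other" ;;
    esac
}

# Demo on a scratch file standing in for the ASM disk.
disk=$(mktemp)
# First four header bytes as kfed reported them: endian=1, hard=0x82, type=1, datfmt=1.
printf '\001\202\001\001' > "$disk"
describe_header "$disk"

# Zero the first 8 KB, as in the corruption step, and look again.
dd if=/dev/zero of="$disk" bs=8192 count=1 conv=notrunc 2>/dev/null
describe_header "$disk"
rm -f "$disk"
```

On the scratch file this reports KFBTYP_DISKHEAD before the overwrite and KFBTYP_INVALID after it, mirroring the two kfed listings.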
2.3 Reproduce the failure
-- Restart the cluster
[root@hisdb1 ~]# /u01/app/11.2.0/grid/bin/crsctl stop cluster -all
CRS-2673: Attempting to stop 'ora.crsd' on 'hisdb1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'hisdb1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.OCRBK.dg' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.heal.db' on 'hisdb1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.hisdb1.vip' on 'hisdb1'
CRS-2677: Stop of 'ora.FRA.dg' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.heal.db' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'hisdb1'
CRS-2677: Stop of 'ora.hisdb1.vip' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.crsd' on 'hisdb2'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'hisdb2'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.cvu' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.OCRBK.dg' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.heal.db' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.oc4j' on 'hisdb2'
CRS-2677: Stop of 'ora.cvu' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.hisdb2.vip' on 'hisdb2'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'hisdb2'
CRS-2677: Stop of 'ora.FRA.dg' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.heal.db' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'hisdb2'
CRS-2677: Stop of 'ora.hisdb2.vip' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.scan1.vip' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.OCRBK.dg' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'hisdb1'
CRS-2677: Stop of 'ora.asm' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'hisdb1'
CRS-2677: Stop of 'ora.oc4j' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.ons' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'hisdb1'
CRS-2677: Stop of 'ora.net1.network' on 'hisdb1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'hisdb1' has completed
CRS-2677: Stop of 'ora.crsd' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.evmd' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.asm' on 'hisdb1'
CRS-2677: Stop of 'ora.evmd' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.OCRBK.dg' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'hisdb2'
CRS-2677: Stop of 'ora.asm' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'hisdb2'
CRS-2677: Stop of 'ora.ons' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'hisdb2'
CRS-2677: Stop of 'ora.net1.network' on 'hisdb2' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'hisdb2' has completed
CRS-2677: Stop of 'ora.crsd' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.evmd' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.asm' on 'hisdb2'
CRS-2677: Stop of 'ora.evmd' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.asm' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'hisdb1'
CRS-2677: Stop of 'ora.ctssd' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'hisdb1'
CRS-2677: Stop of 'ora.cssd' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.asm' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'hisdb2'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'hisdb2'
CRS-2677: Stop of 'ora.cssd' on 'hisdb2' succeeded
[root@hisdb1 ~]# /u01/app/11.2.0/grid/bin/crsctl start cluster -all
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'hisdb2'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'hisdb1'
CRS-2676: Start of 'ora.cssdmonitor' on 'hisdb2' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'hisdb2'
CRS-2676: Start of 'ora.cssdmonitor' on 'hisdb1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'hisdb1'
CRS-2672: Attempting to start 'ora.diskmon' on 'hisdb2'
CRS-2676: Start of 'ora.diskmon' on 'hisdb2' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'hisdb1'
CRS-2676: Start of 'ora.diskmon' on 'hisdb1' succeeded
……
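Note: the startup hangs here indefinitely. The voting disk is the one that was damaged, so the cluster cannot start.
2.4 Related alerts
-- ocssd.log keeps reporting the following errors: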
2022-12-27 22:17:15.937: [ CSSD][3821278976]clssnmvDiskVerify: Successful discovery of 0 disks
2022-12-27 22:17:15.937: [ CSSD][3821278976]clssnmCompleteInitVFDiscovery: Completing initial voting file discovery
2022-12-27 22:17:15.937: [ CSSD][3821278976]clssnmvFindInitialConfigs: No voting files found
2022-12-27 22:17:15.937: [ CSSD][3821278976](:CSSNM00070:)clssnmCompleteInitVFDiscovery: Voting file not found. Retrying discovery in 15 seconds
2022-12-27 22:17:15.996: [ CSSD][3823675136]clssscSelect: cookie accept request 0x7fa0d80845c0
2022-12-27 22:17:15.996: [ CSSD][3823675136]clssscevtypSHRCON: getting client with cmproc 0x7fa0d80845c0
2022-12-27 22:17:15.996: [ CSSD][3823675136]clssgmRegisterClient: proc(4/0x7fa0d80845c0), client(358/0x7fa0d8071230)
2022-12-27 22:17:15.996: [ CSSD][3823675136]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are allowed before lease acquisition proc(0x7fa0d80845c0) client(0x7fa0d8071230)
2022-12-27 22:17:15.996: [ CSSD][3823675136]clssgmDiscEndpcl: gipcDestroy 0x5976
2022-12-27 22:17:16.329: [ CSSD][3823675136]clssscSelect: cookie accept request 0x7fa0d8099e80
2022-12-27 22:17:16.329: [ CSSD][3823675136]clssscevtypSHRCON: getting client with cmproc 0x7fa0d8099e80
2022-12-27 22:17:16.329: [ CSSD][3823675136]clssgmRegisterClient: proc(5/0x7fa0d8099e80), client(357/0x7fa0d8071230)
2022-12-27 22:17:16.329: [ CSSD][3823675136]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are allowed before lease acquisition proc(0x7fa0d8099e80) client(0x7fa0d8071230)
2022-12-27 22:17:16.329: [ CSSD][3823675136]clssgmDiscEndpcl: gipcDestroy 0x598c
2022-12-27 22:17:16.998: [ CSSD][3823675136]clssscSelect: cookie accept request 0x7fa0d80845c0
2022-12-27 22:17:16.998: [ CSSD][3823675136]clssscevtypSHRCON: getting client with cmproc 0x7fa0d80845c0
2022-12-27 22:17:16.998: [ CSSD][3823675136]clssgmRegisterClient: proc(4/0x7fa0d80845c0), client(359/0x7fa0d8071230)
2022-12-27 22:17:16.998: [ CSSD][3823675136]clssgmExecuteClientRequest(): type(6) size(684) only connect and exit messages are allowed before lease acquisition proc(0x7fa0d80845c0) client(0x7fa0d8071230)
-- alerthisdb1.log reports the following error
[grid@hisdb1 hisdb1]$ tail -5000f alerthisdb1.log
The following error is logged every 15 seconds:
[cssd(7816)]CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds; Details at (:CSSNM00070:) in /u01/app/11.2.0/grid/log/hisdb1/cssd/ocssd.log
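To tell a repeating discovery failure from a one-off, it can help to count the matching lines. A minimal sketch follows; `count_vote_failures` is a hypothetical helper name, and the sample lines are copied from the ocssd.log excerpt above.

```shell
#!/bin/sh
# Sketch: count voting-file discovery failures in a CSSD log.
# count_vote_failures is a hypothetical helper name.
count_vote_failures() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'clssnmvFindInitialConfigs: No voting files found' "$1"
}

# Demo on sample lines copied from the ocssd.log excerpt above.
log=$(mktemp)
cat > "$log" <<'EOF'
2022-12-27 22:17:15.937: [ CSSD][3821278976]clssnmvDiskVerify: Successful discovery of 0 disks
2022-12-27 22:17:15.937: [ CSSD][3821278976]clssnmvFindInitialConfigs: No voting files found
2022-12-27 22:17:15.996: [ CSSD][3823675136]clssgmDiscEndpcl: gipcDestroy 0x5976
EOF
echo "discovery failures: $(count_vote_failures "$log")"
rm -f "$log"
```

On the real system the argument would be the ocssd.log path given in the CRS-1714 message.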
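2.5 Repair the vote disk
[grid@hisdb1 ~]$ kfed repair /dev/oracleasm/disks/DATA04
Note: once the repair succeeds, the cluster returns to normal. kfed repair restores the header block from the backup copy that ASM keeps elsewhere on the same disk, so the dd copy taken in 2.1 is not needed here.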
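The walkthrough repairs the header with kfed repair and never uses the dd copy taken in 2.1. If one wanted to restore from that copy instead, the sketch below shows the idea. This is an alternative the original does not test, and it assumes the backup is still current. `restore_header` is a hypothetical helper name, and the whole cycle (backup, damage, restore) runs on a scratch file.

```shell
#!/bin/sh
# Sketch: restore the first 8 KB of a disk from a dd backup, then verify it.
# restore_header is a hypothetical helper; the demo paths are scratch files.
restore_header() {
    backup=$1
    target=$2
    # conv=notrunc keeps the rest of a regular file intact; on a block device it is harmless.
    dd if="$backup" of="$target" bs=8192 count=1 conv=notrunc 2>/dev/null || return 1
    cmp -s -n 8192 "$backup" "$target"
}

disk=$(mktemp)
saved=$(mktemp)
head -c 16384 /dev/urandom > "$disk"
cp "$disk" "$disk.orig"                                   # reference copy for the final check
dd if="$disk" of="$saved" bs=8192 count=1 2>/dev/null     # backup, as in 2.1
dd if=/dev/zero of="$disk" bs=8192 count=1 conv=notrunc 2>/dev/null   # damage, as in 2.2
restore_header "$saved" "$disk" && cmp -s "$disk" "$disk.orig" && echo "restored"
rm -f "$disk" "$disk.orig" "$saved"
```

A header saved long before the damage may no longer match the disk group's current state, which is one reason to prefer kfed repair when it works.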
3. Data disk
Simulate corruption of the data disk and its repair.
3.1 Copy the data
[grid@hisdb1 ~]$ dd if=/dev/oracleasm/disks/DATA03 of=/home/grid/data03.dd bs=8192 count=1
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000373797 s, 21.9 MB/s
[grid@hisdb1 ~]$ kfed read /dev/oracleasm/disks/DATA03 | head
kfbh.endian: 1 ; 0x000: 0x01
kfbh.hard: 130 ; 0x001: 0x82
kfbh.type: 1 ; 0x002: KFBTYP_DISKHEAD
kfbh.datfmt: 1 ; 0x003: 0x01
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 2147483648 ; 0x008: disk=0
kfbh.check: 3875939376 ; 0x00c: 0xe7062430
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
3.2 Corrupt the disk
[grid@hisdb1 ~]$ dd if=/dev/zero of=/dev/oracleasm/disks/DATA03 bs=8192 count=1
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000199175 s, 41.1 MB/s
[grid@hisdb1 ~]$ kfed read /dev/oracleasm/disks/DATA03 | head
kfbh.endian: 0 ; 0x000: 0x00
kfbh.hard: 0 ; 0x001: 0x00
kfbh.type: 0 ; 0x002: KFBTYP_INVALID
kfbh.datfmt: 0 ; 0x003: 0x00
kfbh.block.blk: 0 ; 0x004: blk=0
kfbh.block.obj: 0 ; 0x008: file=0
kfbh.check: 0 ; 0x00c: 0x00000000
kfbh.fcn.base: 0 ; 0x010: 0x00000000
kfbh.fcn.wrap: 0 ; 0x014: 0x00000000
kfbh.spare1: 0 ; 0x018: 0x00000000
3.3 Reproduce the failure
[root@hisdb1 ~]# /u01/app/11.2.0/grid/bin/crsctl stop cluster -all
CRS-2673: Attempting to stop 'ora.crsd' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.crsd' on 'hisdb2'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'hisdb1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN1.lsnr' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.cvu' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.OCRBK.dg' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'hisdb1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'hisdb2'
CRS-2673: Attempting to stop 'ora.heal.db' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.OCRBK.dg' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.FRA.dg' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.heal.db' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'hisdb2'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.hisdb2.vip' on 'hisdb2'
CRS-2677: Stop of 'ora.LISTENER_SCAN1.lsnr' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.scan1.vip' on 'hisdb1'
CRS-2677: Stop of 'ora.cvu' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.hisdb1.vip' on 'hisdb1'
CRS-2677: Stop of 'ora.heal.db' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'hisdb2'
CRS-2677: Stop of 'ora.heal.db' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'hisdb1'
CRS-2677: Stop of 'ora.hisdb2.vip' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.scan1.vip' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.FRA.dg' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.hisdb1.vip' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.oc4j' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.OCRBK.dg' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'hisdb2'
CRS-2677: Stop of 'ora.OCRBK.dg' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'hisdb1'
CRS-2677: Stop of 'ora.asm' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'hisdb2'
CRS-2677: Stop of 'ora.asm' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.ons' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'hisdb2'
CRS-2677: Stop of 'ora.net1.network' on 'hisdb2' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'hisdb2' has completed
CRS-2673: Attempting to stop 'ora.ons' on 'hisdb1'
CRS-2677: Stop of 'ora.ons' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'hisdb1'
CRS-2677: Stop of 'ora.net1.network' on 'hisdb1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'hisdb1' has completed
CRS-2677: Stop of 'ora.crsd' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.evmd' on 'hisdb2'
CRS-2673: Attempting to stop 'ora.asm' on 'hisdb2'
CRS-2677: Stop of 'ora.crsd' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.evmd' on 'hisdb1'
CRS-2673: Attempting to stop 'ora.asm' on 'hisdb1'
CRS-2677: Stop of 'ora.evmd' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.evmd' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'hisdb2' succeeded
CRS-2677: Stop of 'ora.asm' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'hisdb2'
CRS-2677: Stop of 'ora.asm' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'hisdb1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'hisdb1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'hisdb1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'hisdb2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'hisdb2'
CRS-2677: Stop of 'ora.cssd' on 'hisdb1' succeeded
CRS-2677: Stop of 'ora.cssd' on 'hisdb2' succeeded
[root@hisdb1 ~]# /u01/app/11.2.0/grid/bin/crsctl start cluster -all
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'hisdb1'
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'hisdb2'
CRS-2676: Start of 'ora.cssdmonitor' on 'hisdb1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'hisdb1'
CRS-2676: Start of 'ora.cssdmonitor' on 'hisdb2' succeeded
CRS-2672: Attempting to start 'ora.diskmon' on 'hisdb1'
CRS-2672: Attempting to start 'ora.cssd' on 'hisdb2'
CRS-2672: Attempting to start 'ora.diskmon' on 'hisdb2'
CRS-2676: Start of 'ora.diskmon' on 'hisdb1' succeeded
CRS-2676: Start of 'ora.diskmon' on 'hisdb2' succeeded
CRS-2676: Start of 'ora.cssd' on 'hisdb1' succeeded
CRS-2672: Attempting to start 'ora.ctssd' on 'hisdb1'
CRS-2676: Start of 'ora.cssd' on 'hisdb2' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'hisdb1'
CRS-2672: Attempting to start 'ora.ctssd' on 'hisdb2'
CRS-2676: Start of 'ora.ctssd' on 'hisdb2' succeeded
CRS-2672: Attempting to start 'ora.evmd' on 'hisdb2'
CRS-2676: Start of 'ora.ctssd' on 'hisdb1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'hisdb2'
CRS-2672: Attempting to start 'ora.evmd' on 'hisdb1'
CRS-2676: Start of 'ora.evmd' on 'hisdb2' succeeded
CRS-2676: Start of 'ora.evmd' on 'hisdb1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'hisdb1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'hisdb1'
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'hisdb2' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'hisdb2'
CRS-2676: Start of 'ora.asm' on 'hisdb1' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'hisdb1'
CRS-2676: Start of 'ora.asm' on 'hisdb2' succeeded
CRS-2672: Attempting to start 'ora.crsd' on 'hisdb2'
CRS-2676: Start of 'ora.crsd' on 'hisdb1' succeeded
CRS-2676: Start of 'ora.crsd' on 'hisdb2' succeeded
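Note: the cluster starts successfully, but the database instances cannot be opened, because all of their data files are in the DATA disk group.
3.4 Related errors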
[grid@hisdb1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Tue Dec 27 22:46:21 2022

Copyright (c) 1982, 2013, Oracle. All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> col name for a20
SQL> col path for a40
SQL> set line 160
SQL> select name,total_mb,usable_file_mb,state from v$asm_diskgroup;

NAME                   TOTAL_MB USABLE_FILE_MB STATE
-------------------- ---------- -------------- -----------
FRA                       10239           6624 MOUNTED
OCRBK                     10239           9843 MOUNTED
SQL> alter diskgroup data mount;
alter diskgroup data mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"
Note: the DATA disk group cannot be mounted.
[grid@hisdb1 hisdb1]$ tail -5000f alerthisdb1.log
2022-12-27 22:46:18.033:
[crsd(10020)]CRS-2807:Resource 'ora.DATA.dg' failed to start automatically.
2022-12-27 22:46:18.033:
[crsd(10020)]CRS-2807:Resource 'ora.DATA.dg' failed to start automatically.
2022-12-27 22:46:18.033:
[crsd(10020)]CRS-2807:Resource 'ora.heal.db' failed to start automatically.
2022-12-27 22:46:18.033:
[crsd(10020)]CRS-2807:Resource 'ora.heal.db' failed to start automatically.
Note: the cluster alert log is shown above.
SQL> select group_number,name,path,state,total_mb,free_mb from v$asm_disk;

GROUP_NUMBER NAME                 PATH            STATE      TOTAL_MB    FREE_MB
------------ -------------------- --------------- -------- ---------- ----------
           0                      ORCL:DATA01     NORMAL            0          0
           0                      ORCL:DATA03     NORMAL            0          0
           2 DATA02               ORCL:DATA02     NORMAL        10239       6624
           3 DATA04               ORCL:DATA04     NORMAL        10239       9843
[grid@hisdb2 hisdb2]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE OFFLINE hisdb1
ONLINE OFFLINE hisdb2
ora.FRA.dg
ONLINE ONLINE hisdb1
ONLINE ONLINE hisdb2
ora.LISTENER.lsnr
ONLINE ONLINE hisdb1
ONLINE ONLINE hisdb2
ora.OCRBK.dg
ONLINE ONLINE hisdb1
ONLINE ONLINE hisdb2
ora.asm
ONLINE ONLINE hisdb1 Started
ONLINE ONLINE hisdb2 Started
ora.gsd
OFFLINE OFFLINE hisdb1
OFFLINE OFFLINE hisdb2
ora.net1.network
ONLINE ONLINE hisdb1
ONLINE ONLINE hisdb2
ora.ons
ONLINE ONLINE hisdb1
ONLINE ONLINE hisdb2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE hisdb2
ora.cvu
1 ONLINE ONLINE hisdb1
ora.heal.db
1 ONLINE OFFLINE Instance Shutdown
2 ONLINE OFFLINE Instance Shutdown
ora.hisdb1.vip
1 ONLINE ONLINE hisdb1
ora.hisdb2.vip
1 ONLINE ONLINE hisdb2
ora.oc4j
1 ONLINE ONLINE hisdb1
ora.orcl.db
1 OFFLINE OFFLINE Instance Shutdown
2 OFFLINE OFFLINE Instance Shutdown
ora.scan1.vip
1 ONLINE ONLINE hisdb2
Note: the cluster status is abnormal, and the heal database cannot start.
3.5 Repair the data disk
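In a long `crsctl stat res -t` listing, the rows that matter are those where TARGET is ONLINE but STATE is OFFLINE. The sketch below filters for them with awk. `offline_resources` is a hypothetical helper name, and the demo input is a shortened copy of the listing above.

```shell
#!/bin/sh
# Sketch: list resources whose TARGET is ONLINE but whose STATE is OFFLINE.
# offline_resources is a hypothetical helper that reads `crsctl stat res -t` output on stdin.
offline_resources() {
    awk '
        /^ora\./ { res = $1; next }    # remember the current resource name
        {
            for (i = 1; i < NF; i++) {
                if ($i == "ONLINE" && $(i + 1) == "OFFLINE" && !(res in seen)) {
                    seen[res] = 1      # report each resource only once
                    print res
                }
            }
        }
    '
}

# Demo on a shortened copy of the listing above.
offline_resources <<'EOF'
ora.DATA.dg
 ONLINE OFFLINE hisdb1
 ONLINE OFFLINE hisdb2
ora.FRA.dg
 ONLINE ONLINE hisdb1
ora.gsd
 OFFLINE OFFLINE hisdb1
ora.heal.db
 1 ONLINE OFFLINE Instance Shutdown
 2 ONLINE OFFLINE Instance Shutdown
ora.orcl.db
 1 OFFLINE OFFLINE Instance Shutdown
EOF
```

For the sample input it reports ora.DATA.dg and ora.heal.db. Resources that are intentionally offline (TARGET OFFLINE), such as ora.gsd and ora.orcl.db, are skipped. On a live node the call would be `crsctl stat res -t | offline_resources`.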
[grid@hisdb1 ~]$ kfed repair /dev/oracleasm/disks/DATA03
[grid@hisdb1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Tue Dec 27 22:54:47 2022

Copyright (c) 1982, 2013, Oracle. All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> alter diskgroup data mount;

Diskgroup altered.
SQL> select group_number,name,path,state,total_mb,free_mb from v$asm_disk;

GROUP_NUMBER NAME            PATH                      STATE      TOTAL_MB    FREE_MB
------------ --------------- ------------------------- -------- ---------- ----------
           0                 ORCL:DATA01               NORMAL            0          0
           2 DATA02          ORCL:DATA02               NORMAL        10239       6618
           1 DATA03          ORCL:DATA03               NORMAL        20479      13765
           3 DATA04          ORCL:DATA04               NORMAL        10239       9843
Note: once the data disk is repaired, the cluster returns to normal.

References:
https://www.modb.pro/db/22060
https://blog.csdn.net/jycjyc/article/details/106275991
https://blog.51cto.com/lhrbest/2699983