oracle11g asm standalone 单实例重建

时间:2023-06-01 19:48:56

原文地址:oracle11g asm单实例重建has 作者:datapeng

最近到客户那里处理故障,客户说,他们修改了一下hostname,导到has出现了问题,当然,他们的数据库也就无法再启动,把处理过程记录下来,供大家参考!
在有些时候,我们修改了hostname,以及其它一些配置后,导至has、crs均无法启动,这时候,我们就需要对has、crs进行重建,其具体步骤如下。
1、首先修改hostname

[root@mydb ~]# vi /etc/hosts

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               testdb localhost.localdomain localhost
::1             localhost6.localdomain6 localhost6
192.168.88.150          testdb

[root@mydb ~]# vi /etc/sysconfig/network

NETWORKING=yes
NETWORKING_IPV6=yes
HOSTNAME=testdb

[root@mydb network-scripts]# reboot

Broadcast message from root (pts/0) (Fri Dec 13 08:57:58 2013):

The system is going down for reboot NOW!

2、修改hostname后,重启出现问题

[root@testdb bin]# ./crsctl start has
CLSU-00100: Operating System function: opendir failed with error data: 2
CLSU-00101: Operating System error message: No such file or directory
CLSU-00103: error location: scrsearch1
CLSU-00104: additional error information: cant open scr home dir scls_scr_getval
CRS-4000: Command Start failed, or completed with errors.

可以看到,has无法启动,在这种情况下,我们必须重新配置has\crs

3、配置has,修复上面的问题存在

--首先,对has进行删除,然后再进行配置
[root@testdb bin]# cd /u01/app/grid/product/11.2.0/crs/crs/install
[root@testdb install]# ls *has.pl
roothas.pl
[root@testdb install]# ./roothas.pl -deconfig -force -verbose
Using configuration parameter file: ./crsconfig_params
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Stop failed, or completed with errors.
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Delete failed, or completed with errors.
CLSU-00100: Operating System function: opendir failed with error data: 2
CLSU-00101: Operating System error message: No such file or directory
CLSU-00103: error location: scrsearch1
CLSU-00104: additional error information: cant open scr home dir scls_scr_getval
CRS-4544: Unable to connect to OHAS
CRS-4000: Command Stop failed, or completed with errors.
Successfully deconfigured Oracle Restart stack

--重新配置has
[root@testdb install]# ./roothas.pl
Using configuration parameter file: ./crsconfig_params
LOCAL ADD MODE 
Creating OCR keys for user 'oracle', privgrp 'oinstall'..
Operation successful.
LOCAL ONLY MODE 
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4664: Node testdb successfully pinned.
Adding Clusterware entries to inittab

testdb     2013/12/13 09:55:56     /u01/app/grid/product/11.2.0/crs/cdata/testdb/backup_20131213_095556.olr
Successfully configured Oracle Grid Infrastructure for a Standalone Server

如果失败,在这里会提示你做这样一个操作:
/u01/app/grid/product/11.2.0/crs/perl/bin/perl -I/u01/app/grid/product/11.2.0/crs/perl/lib -I/u01/app/grid/product/11.2.0/crs/crs/install /u01/app/grid/product/11.2.0/crs/crs/install/roothas.pl

[root@testdb bin]# ./crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora.cssd       ora.cssd.type  OFFLINE   OFFLINE               
ora.diskmon    ora....on.type OFFLINE   OFFLINE               
ora.evmd       ora.evm.type   ONLINE    ONLINE    testdb      
ora.ons        ora.ons.type   OFFLINE   OFFLINE

[root@testdb bin]# ./crsctl config has
CRS-4622: Oracle High Availability Services autostart is enabled.

可以看到,这里has已经起来了,但是crs并没有起来

[root@testdb bin]# ./crsctl start crs
CRS-4013: This command is not supported in a single-node configuration.
CRS-4000: Command Start failed, or completed with errors.

--重新配置crs

[root@testdb ~]# cd /etc/oracle/scls_scr/
[root@testdb scls_scr]# ls -a
.  ..  testdb
[root@testdb scls_scr]# 
看到有新的名称,但crs并没有起来,所以,我们必须重新配置

--先卸载crs,然后再重新配置
[root@testdb bin]# cd /u01/app/grid/product/11.2.0/crs/crs/install
[root@testdb install]# ls *crs.pl
rootcrs.pl
[root@testdb install]# ./rootcrs.pl -deconfig -force -verbose
Using configuration parameter file: ./crsconfig_params
Usage: srvctl []
    commands: enable|disable|start|stop|status|add|remove|modify|getenv|setenv|unsetenv|config|upgrade
    objects: database|service|asm|diskgroup|listener|home|ons
For detailed help on each command and object and its options use:
  srvctl -h or
  srvctl -h
PRKO-2012 : nodeapps object is not supported in Oracle Restart
CRS-2613: Could not find resource 'ora.registry.acfs'.
CRS-4000: Command Stop failed, or completed with errors.
CRS-2613: Could not find resource 'ora.drivers.acfs'.
CRS-4000: Command Modify failed, or completed with errors.
CRS-2613: Could not find resource 'ora.drivers.acfs'.
CRS-4000: Command Delete failed, or completed with errors.
CRS-4013: This command is not supported in a single-node configuration.
CRS-4000: Command Stop failed, or completed with errors.
################################################################
# You must kill processes or reboot the system to properly #
# cleanup the processes started by Oracle clusterware          #
################################################################
Successfully deconfigured Oracle clusterware stack on this node
[root@testdb install]# ./rootcrs.pl
Using configuration parameter file: ./crsconfig_params
 
The oracle binary is currently linked with RAC disabled.
Please execute the following steps to relink oracle binary
and rerun the command with RAC enabled: 
   setenv ORACLE_HOME 
   cd /rdbms/lib 
   make -f ins_rdbms.mk rac_on ioracle 
[root@testdb install]# ./roothas.pl
Using configuration parameter file: ./crsconfig_params
LOCAL ADD MODE 
Creating OCR keys for user 'oracle', privgrp 'oinstall'..
Operation successful.
LOCAL ONLY MODE 
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
CRS-4664: Node testdb successfully pinned.
Adding Clusterware entries to inittab

testdb     2013/12/13 10:10:38     /u01/app/grid/product/11.2.0/crs/cdata/testdb/backup_20131213_101038.olr
Successfully configured Oracle Grid Infrastructure for a Standalone Server

到这里,crs配置已经完成

4、添加服务进去
[root@testdb bin]# ./srvctl add listener
[root@testdb bin]# ./srvctl add asm -l LISTENER
[root@testdb bin]# ./crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               OFFLINE OFFLINE      testdb                                       
ora.asm
               OFFLINE OFFLINE      testdb                                       
ora.ons
               OFFLINE OFFLINE      testdb                                       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        OFFLINE OFFLINE                                                   
ora.diskmon
      1        OFFLINE OFFLINE                                                   
ora.evmd
      1        ONLINE  ONLINE       testdb

[oracle@testdb bin]$ ./crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....ER.lsnr ora....er.type OFFLINE   OFFLINE               
ora.asm        ora.asm.type   OFFLINE   OFFLINE               
ora.cssd       ora.cssd.type  OFFLINE   OFFLINE               
ora.diskmon    ora....on.type OFFLINE   OFFLINE               
ora.evmd       ora.evm.type   ONLINE    ONLINE    testdb      
ora.ons        ora.ons.type   OFFLINE   OFFLINE               
ora.mydb.db  ora....se.type OFFLINE   OFFLINE

5、启动各服务
[root@testdb bin]# ./srvctl start asm
PRCR-1079 : Failed to start resource ora.asm
CRS-2674: Start of 'ora.asm' on 'testdb' failed
CRS-2678: 'ora.asm' on 'testdb' has experienced an unrecoverable failure
CRS-0267: Human intervention required to resume its availability.
CRS-5802: Unable to start the agent process

启动asm时候报错。由于,我是用root的用户加进去的,所以asm是无法启动的,查了好久,最后尝试用管理的用户去尝试一下

使用asm的管理用户oracle(我在安装是没有建grid,所以均是oracle)
[root@testdb bin]# su - oracle
[oracle@testdb ~]$ export ORACLE_HOME=/u01/app/grid/product/11.2.0/crs
[oracle@testdb ~]$ cd /u01/app/grid/product/11.2.0/crs/bin
[oracle@testdb bin]$ ./srvctl remove asm
[oracle@testdb bin]$ ./srvctl add asm -l LISTENER
[oracle@testdb bin]$ ./srvctl start asm

数据库也添加进去

[oracle@testdb bin]$ ./srvctl add database -d mydb -o /u01/app/oracle/product/11.2.0/db1 
[oracle@testdb bin]$ ./crs_stat -t
Name           Type           Target    State     Host        
------------------------------------------------------------
ora....ER.lsnr ora....er.type ONLINE    ONLINE    testdb      
ora.asm        ora.asm.type   ONLINE    ONLINE    testdb      
ora.cssd       ora.cssd.type  ONLINE    ONLINE    testdb      
ora.diskmon    ora....on.type OFFLINE   OFFLINE               
ora.evmd       ora.evm.type   ONLINE    ONLINE    testdb      
ora.ons        ora.ons.type   OFFLINE   OFFLINE               
ora.mydb.db  ora....se.type   ONLINE    OFFLINE

打开数据库
[oracle@testdb ~]$ sqlplus "/as sysdba"

SQL*Plus: Release 11.2.0.3.0 Production on Fri Dec 13 12:43:14 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup
ORA-01078: failure in processing system parameters
ORA-01565: error in identifying file '+SYSDG/wuyedb/spfilewuyedb.ora'
ORA-17503: ksfdopn:2 Failed to open file +SYSDG/wuyedb/spfilewuyedb.ora
ORA-15056: additional error message
ORA-17503: ksfdopn:2 Failed to open file +SYSDG/wuyedb/spfilewuyedb.ora
ORA-15001: diskgroup "SYSDG" does not exist or is not mounted
ORA-06512: at line 4

diskgroup 还没有启动,需要启动起来

[oracle@testdb ~]$ export ORACLE_SID=+ASM
[oracle@testdb ~]$ export ORACLE_HOME=/u01/app/grid/product/11.2.0/crs
[oracle@testdb ~]$ cd /u01/app/grid/product/11.2.0/crs/bin
[oracle@testdb bin]$ ./sqlplus / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Fri Dec 13 12:44:42 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Automatic Storage Management option

SQL> select name,state from v$asm_diskgroup;

NAME                           STATE
------------------------------ ----------------------
SYSDG                          DISMOUNTED
REDODG1                        DISMOUNTED
REDODG2                        DISMOUNTED
ARCHDG                         DISMOUNTED
DATADG1                        DISMOUNTED

SQL> alter diskgroup sysdg mount;

Diskgroup altered.

SQL> alter diskgroup REDODG1 mount;

Diskgroup altered.

SQL> alter diskgroup REDODG2 mount;

Diskgroup altered.

SQL> alter diskgroup ARCHDG mount;

Diskgroup altered.

SQL> alter diskgroup DATADG1 mount;

Diskgroup altered.

然后再去启动数据库!

其它说明:

1、默认情况下HAS(High Availability Service)是自动启动的.通过如下命令可以取消和启用自动启动  
    crsctl disable has  
    crsctl enable has  
2、HAS手动启动和停止  
    crsctl start has  
    crsctl stop has  
3、查看HAS的状态  
    crsctl check has  
4、ora.css和ora.diskmon服务随着HAS的启动而自动启动,那么你可以这两个服务的AUTO_START属性  
        crsctl modify resource "ora.cssd" -attr "AUTO_START=1"  
        crsctl modify resource "ora.diskmon" -attr "AUTO_START=1"  
        ora.css和ora.diskmon的Auto start   
        crsctl modify resource "ora.cssd" -attr "AUTO_START=never"  
        crsctl modify resource "ora.diskmon" -attr "AUTO_START=never"

最后说明,12c单实例重构has也是差不多的步骤