Oracle 11g RAC ASM磁盘全部丢失后的恢复


一、环境描述

(1)Oracle 11.2.0.3 RAC ON Oracle Linux 6 x86_64,只有一个ASM外部冗余磁盘组——DATA;
(2)OCR,VOTEDISK,DATAFILE,CONTROLFILE,SPFILE全部位于这个磁盘组上; 

二、故障描述

(1)存储故障导致ASM磁盘丢失。
(2)CRS因为OCR和VOTEDISK的丢失,除了OHAS还联机外,CLUSTERWARE服务都已经停止。

三、备份情况


(1)RMAN备份:包括controlfile,database,spfile,archivelog,
(2)OCR备份:没有进行过人工备份,在$CRS_HOME/cdata目录下有CRS自动备份文件。

四、操作步骤

说明:准使用CRS自动备份的文件恢复OCR,使用RMAN备份来恢复数据库;准备恢复数据的同时,调整ASM磁盘组,将OCR,VOTEDISK同数据库文件分开存放。

推荐阅读:

Oracle 11g从入门到精通 PDF+光盘源代码  

Ubuntu 12.04(amd64)安装完Oracle 11gR2后各种问题解决方法

4.1 恢复OCR和VOTEDISK

(1) 在所有RAC节点上停止CRS服务

 
  1. [root@rac1 ~]# crsctl stop has -f
  2. CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on'rac1'
  3. CRS-2673: Attempting to stop 'ora.mdnsd'on'rac1'
  4. CRS-2673: Attempting to stop 'ora.crf'on'rac1'
  5. CRS-2677: Stop of'ora.mdnsd'on'rac1' succeeded
  6. CRS-2677: Stop of'ora.crf'on'rac1' succeeded
  7. CRS-2673: Attempting to stop 'ora.gipcd'on'rac1'
  8. CRS-2677: Stop of'ora.gipcd'on'rac1' succeeded
  9. CRS-2673: Attempting to stop 'ora.gpnpd'on'rac1'
  10. CRS-2677: Stop of'ora.gpnpd'on'rac1' succeeded
  11. CRS-2793: Shutdown of Oracle High Availability Services-managed resources on'rac1' has completed
  12. CRS-4133: Oracle High Availability Services has been stopped.
[root@rac1 ~]# crsctl stop has -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
 
  1. [root@rac2 ~]# crsctl stop has -f
  2. CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on'rac2'
  3. CRS-2673: Attempting to stop 'ora.mdnsd'on'rac2'
  4. CRS-2673: Attempting to stop 'ora.crf'on'rac2'
  5. CRS-2677: Stop of'ora.mdnsd'on'rac2' succeeded
  6. CRS-2677: Stop of'ora.crf'on'rac2' succeeded
  7. CRS-2673: Attempting to stop 'ora.gipcd'on'rac2'
  8. CRS-2677: Stop of'ora.gipcd'on'rac2' succeeded
  9. CRS-2673: Attempting to stop 'ora.gpnpd'on'rac2'
  10. CRS-2677: Stop of'ora.gpnpd'on'rac2' succeeded
  11. CRS-2793: Shutdown of Oracle High Availability Services-managed resources on'rac2' has completed
  12. CRS-4133: Oracle High Availability Services has been stopped.
[root@rac2 ~]# crsctl stop has -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac2'
CRS-2673: Attempting to stop 'ora.crf' on 'rac2'
CRS-2677: Stop of 'ora.mdnsd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.crf' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac2'
CRS-2677: Stop of 'ora.gipcd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac2'
CRS-2677: Stop of 'ora.gpnpd' on 'rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.

 

(2) 在一个节点上以NOCRS方式启动CRS,此操作会启动ASM实例。

 
  1. [root@rac1 ~]# crsctl start crs -excl -nocrs
  2. CRS-4123: Oracle High Availability Services has been started.
  3. CRS-2672: Attempting to start 'ora.mdnsd'on'rac1'
  4. CRS-2676: Start of'ora.mdnsd'on'rac1' succeeded
  5. CRS-2672: Attempting to start 'ora.gpnpd'on'rac1'
  6. CRS-2676: Start of'ora.gpnpd'on'rac1' succeeded
  7. CRS-2672: Attempting to start 'ora.cssdmonitor'on'rac1'
  8. CRS-2672: Attempting to start 'ora.gipcd'on'rac1'
  9. CRS-2676: Start of'ora.cssdmonitor'on'rac1' succeeded
  10. CRS-2676: Start of'ora.gipcd'on'rac1' succeeded
  11. CRS-2672: Attempting to start 'ora.cssd'on'rac1'
  12. CRS-2672: Attempting to start 'ora.diskmon'on'rac1'
  13. CRS-2676: Start of'ora.diskmon'on'rac1' succeeded
  14. CRS-2676: Start of'ora.cssd'on'rac1' succeeded
  15. CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip'on'rac1'
  16. CRS-2672: Attempting to start 'ora.ctssd'on'rac1'
  17. CRS-2681: Clean of'ora.cluster_interconnect.haip'on'rac1' succeeded
  18. CRS-2672: Attempting to start 'ora.cluster_interconnect.haip'on'rac1'
  19. CRS-2676: Start of'ora.ctssd'on'rac1' succeeded
  20. CRS-2676: Start of'ora.cluster_interconnect.haip'on'rac1' succeeded
  21. CRS-2672: Attempting to start 'ora.asm'on'rac1'
  22. CRS-2676: Start of'ora.asm'on'rac1' succeeded
[root@rac1 ~]# crsctl start crs -excl -nocrs
CRS-4123: Oracle High Availability Services has been started.
CRS-2672: Attempting to start 'ora.mdnsd' on 'rac1'
CRS-2676: Start of 'ora.mdnsd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.gpnpd' on 'rac1'
CRS-2676: Start of 'ora.gpnpd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssdmonitor' on 'rac1'
CRS-2672: Attempting to start 'ora.gipcd' on 'rac1'
CRS-2676: Start of 'ora.cssdmonitor' on 'rac1' succeeded
CRS-2676: Start of 'ora.gipcd' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cssd' on 'rac1'
CRS-2672: Attempting to start 'ora.diskmon' on 'rac1'
CRS-2676: Start of 'ora.diskmon' on 'rac1' succeeded
CRS-2676: Start of 'ora.cssd' on 'rac1' succeeded
CRS-2679: Attempting to clean 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2672: Attempting to start 'ora.ctssd' on 'rac1'
CRS-2681: Clean of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2676: Start of 'ora.ctssd' on 'rac1' succeeded
CRS-2676: Start of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.asm' on 'rac1'
CRS-2676: Start of 'ora.asm' on 'rac1' succeeded

 

(3) 新添加了三块磁盘,已经使用UDEV进行了绑定,查看磁盘状态。

 
  1. [root@rac1 ~]# su - grid
  2. [grid@rac1 ~]$ sqlplus / as sysasm
  3. SQL*Plus: Release 11.2.0.3.0 Production on Fri Jul 5 17:41:49 2013
  4. Copyright (c) 1982, 2011, Oracle. All rights reserved.
  5. Connected to:
  6. Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
  7. With the Real Application Clusters and Automatic Storage Management options
  8. SQL> select group_number group#, disk_number disk#, OS_MB, state, path, header_status from v$asm_disk orderby 1,2;
  9. GROUP# DISK# OS_MB STATE PATH HEADER_STATUS
  10. ---------- ---------- ---------- ---------- -------------------- ----------------------
  11. 0 0 1024 NORMAL /dev/asm-diskc CANDIDATE
  12. 0 1 5120 NORMAL /dev/asm-diskd CANDIDATE
  13. 0 2 20480 NORMAL /dev/asm-diskb CANDIDATE
[root@rac1 ~]# su - grid
[grid@rac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Fri Jul 5 17:41:49 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> select group_number group#, disk_number disk#, OS_MB, state, path, header_status from v$asm_disk order by 1,2;

    GROUP#      DISK#      OS_MB STATE      PATH                 HEADER_STATUS
---------- ---------- ---------- ---------- -------------------- ----------------------
         0          0       1024 NORMAL     /dev/asm-diskc       CANDIDATE
         0          1       5120 NORMAL     /dev/asm-diskd       CANDIDATE
         0          2      20480 NORMAL     /dev/asm-diskb       CANDIDATE

 

(4) 创建三个磁盘组,SYSTEMDG给CRS使用,用于存放OCR,VOTEDISK和ASM实例的SPFILE。其余两个给ORACLE使用,DATADG用于存放datafile,controlfile,redolog,spfile;ARCLOGDG存放archivelog。

 
  1. SQL> create diskgroup SYSTEMDG external redundancy
  2. 2 disk '/dev/asm-diskc'
  3. 3 ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';
  4. Diskgroup created.
  5. SQL> create diskgroup DATADG external redundancy
  6. 2 disk '/dev/asm-diskb'
  7. 3 ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';
  8. Diskgroup created.
  9. SQL> create diskgroup ARCLOGDG external redundancy
  10. 2 disk '/dev/asm-diskd'
  11. 3 ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';
  12. Diskgroup created.
SQL> create diskgroup SYSTEMDG external redundancy
  2  disk '/dev/asm-diskc'
  3  ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';

Diskgroup created.

SQL> create diskgroup DATADG external redundancy
  2  disk '/dev/asm-diskb'
  3  ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';

Diskgroup created.

SQL> create diskgroup ARCLOGDG external redundancy
  2  disk '/dev/asm-diskd'
  3  ATTRIBUTE 'compatible.rdbms' = '11.2','compatible.asm' = '11.2';

Diskgroup created.

 

(5) 准备恢复OCR和VOTEDISK,/etc/oracle/ocr.loc中记录了OCR路径,修改ocrconfig_loc的值,以便将OCR恢复到新的磁盘组中。

 
  1. [root@rac1 ~]# more /etc/oracle/ocr.loc
  2. ocrconfig_loc=+DATA
  3. local_only=FALSE
  4. [root@rac1 ~]# vi /etc/oracle/ocr.loc
  5. ocrconfig_loc=+SYSTEMDG
  6. local_only=FALSE
[root@rac1 ~]# more /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE

[root@rac1 ~]# vi /etc/oracle/ocr.loc

ocrconfig_loc=+SYSTEMDG
local_only=FALSE

 

(6) 恢复OCR

 
  1. [root@rac1 ~]# ocrconfig -showbackup
  2. PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy
  3. rac1 2013/07/05 12:30:00 /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr
  4. rac1 2013/07/05 08:30:00 /u01/app/11.2.0/grid/cdata/rac-cluster/backup01.ocr
  5. rac1 2013/07/05 04:30:00 /u01/app/11.2.0/grid/cdata/rac-cluster/backup02.ocr
  6. rac1 2013/07/05 00:29:59 /u01/app/11.2.0/grid/cdata/rac-cluster/day.ocr
  7. rac1 2013/07/05 00:29:59 /u01/app/11.2.0/grid/cdata/rac-cluster/week.ocr
  8. PROT-25: Manual backups for the Oracle Cluster Registry are not available
  9. [root@rac1 ~]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr
  10. [root@rac1 ~]#
  11. [root@rac1 ~]# ocrcheck
  12. Status of Oracle Cluster Registry isas follows :
  13. Version : 3
  14. Total space (kbytes) : 262120
  15. Used space (kbytes) : 2840
  16. Available space (kbytes) : 259280
  17. ID : 59415097
  18. Device/File Name : +SYSTEMDG
  19. Device/File integrity check succeeded
  20. Device/File not configured
  21. Device/File not configured
  22. Device/File not configured
  23. Device/File not configured
  24. Cluster registry integrity check succeeded
  25. Logical corruption check succeeded
[root@rac1 ~]# ocrconfig -showbackup
PROT-26: Oracle Cluster Registry backup locations were retrieved from a local copy

rac1     2013/07/05 12:30:00     /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr

rac1     2013/07/05 08:30:00     /u01/app/11.2.0/grid/cdata/rac-cluster/backup01.ocr

rac1     2013/07/05 04:30:00     /u01/app/11.2.0/grid/cdata/rac-cluster/backup02.ocr

rac1     2013/07/05 00:29:59     /u01/app/11.2.0/grid/cdata/rac-cluster/day.ocr

rac1     2013/07/05 00:29:59     /u01/app/11.2.0/grid/cdata/rac-cluster/week.ocr
PROT-25: Manual backups for the Oracle Cluster Registry are not available

[root@rac1 ~]# ocrconfig -restore /u01/app/11.2.0/grid/cdata/rac-cluster/backup00.ocr
[root@rac1 ~]#
[root@rac1 ~]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          3
         Total space (kbytes)     :     262120
         Used space (kbytes)      :       2840
         Available space (kbytes) :     259280
         ID                       :   59415097
         Device/File Name         :  +SYSTEMDG
                                    Device/File integrity check succeeded

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

                                    Device/File not configured

         Cluster registry integrity check succeeded

         Logical corruption check succeeded

 

(7) 创建VOTEDISK

 
  1. [root@rac1 ~]# crsctl replace votedisk +SYSTEMDG
  2. CRS-4602: Failed 27 toadd voting file afb0ca0f35684f1abfd43d5ec2dc1123.
  3. Failed toreplace voting disk groupwith +SYSTEMDG.
  4. CRS-4000: Command Replace failed, or completed with errors.
[root@rac1 ~]#  crsctl replace votedisk +SYSTEMDG
CRS-4602: Failed 27 to add voting file afb0ca0f35684f1abfd43d5ec2dc1123.
Failed to replace voting disk group with +SYSTEMDG.
CRS-4000: Command Replace failed, or completed with errors.


以上报错是因为使用UDEV绑定ASM磁盘时需要更改默认磁盘搜索路径为/dev/asm*,修改ASM磁盘搜索路径

 
  1. [root@rac1 ~]# su - grid
  2. [grid@rac1 ~]$ sqlplus / as sysasm
  3. SQL*Plus: Release 11.2.0.3.0 Production on Fri Jul 5 19:03:25 2013
  4. Copyright (c) 1982, 2011, Oracle. All rights reserved.
  5. Connected to:
  6. Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
  7. With the Real Application Clusters and Automatic Storage Management options
  8. SQL> show parameter asm_diskstring
  9. NAME TYPE VALUE
  10. ------------------------------------ ----------- ------------------------------
  11. asm_diskstring string
  12. SQL>
  13. SQL>
  14. SQL> alter system set asm_diskstring = '/dev/asm*';
  15. System altered.
  16. SQL> create spfile from memory;
  17. create spfile from memory
  18. *
  19. ERROR at line 1:
  20. ORA-00349: failure obtaining block sizefor
  21. '+DATA/rac-cluster/asmparameterfile/registry.253.819922365'
  22. ORA-15001: diskgroup "DATA" does not exist orisnot mounted
  23. SQL> create spfile='+SYSTEMDG'from memory;
  24. File created.
  25. SQL> startup force mount;
  26. ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance
  27. ASM instance started
  28. Total System Global Area 283930624 bytes
  29. Fixed Size 2227664 bytes
  30. Variable Size 256537136 bytes
  31. ASM Cache 25165824 bytes
  32. ASM diskgroups mounted
[root@rac1 ~]# su - grid
[grid@rac1 ~]$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.3.0 Production on Fri Jul 5 19:03:25 2013

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> show parameter asm_diskstring

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
asm_diskstring                       string
SQL> 
SQL> 
SQL> alter system set asm_diskstring = '/dev/asm*';

System altered.

SQL> create spfile from memory;
create spfile from memory
*
ERROR at line 1:
ORA-00349: failure obtaining block size for
'+DATA/rac-cluster/asmparameterfile/registry.253.819922365'
ORA-15001: diskgroup "DATA" does not exist or is not mounted

SQL> create spfile='+SYSTEMDG' from memory;

File created.

SQL> startup force mount;
ORA-32004: obsolete or deprecated parameter(s) specified for ASM instance
ASM instance started

Total System Global Area  283930624 bytes
Fixed Size                  2227664 bytes
Variable Size             256537136 bytes
ASM Cache                  25165824 bytes
ASM diskgroups mounted

 

在次创建VOTEDISK,成功。

 
  1. [root@rac1 init]# crsctl replace votedisk +SYSTEMDG
  2. Successful addition of voting disk 8ebb7a63accb4fa8bfa7ab65df7a8c8a.
  3. Successfully replaced voting disk groupwith +SYSTEMDG.
  4. CRS-4266: Voting file(s) successfully replaced
[root@rac1 init]# crsctl replace votedisk +SYSTEMDG
Successful addition of voting disk 8ebb7a63accb4fa8bfa7ab65df7a8c8a.
Successfully replaced voting disk group with +SYSTEMDG.
CRS-4266: Voting file(s) successfully replaced

 

(8) OCR和VOTEDISK都恢复完成后,重启CRS到正常模式。

 
  1. [root@rac1 ~]# crsctl stop has -f
  2. CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on'rac1'
  3. CRS-2673: Attempting to stop 'ora.mdnsd'on'rac1'
  4. CRS-2673: Attempting to stop 'ora.ctssd'on'rac1'
  5. CRS-2673: Attempting to stop 'ora.asm'on'rac1'
  6. CRS-2677: Stop of'ora.mdnsd'on'rac1' succeeded
  7. CRS-2677: Stop of'ora.asm'on'rac1' succeeded
  8. CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip'on'rac1'
  9. CRS-2677: Stop of'ora.ctssd'on'rac1' succeeded
  10. CRS-2677: Stop of'ora.cluster_interconnect.haip'on'rac1' succeeded
  11. CRS-2673: Attempting to stop 'ora.cssd'on'rac1'
  12. CRS-2677: Stop of'ora.cssd'on'rac1' succeeded
  13. CRS-2673: Attempting to stop 'ora.gipcd'on'rac1'
  14. CRS-2677: Stop of'ora.gipcd'on'rac1' succeeded
  15. CRS-2673: Attempting to stop 'ora.gpnpd'on'rac1'
  16. CRS-2677: Stop of'ora.gpnpd'on'rac1' succeeded
  17. CRS-2793: Shutdown of Oracle High Availability Services-managed resources on'rac1' has completed
  18. CRS-4133: Oracle High Availability Services has been stopped.
  19. [root@rac1 ~]# crsctl start crs
  20. CRS-4123: Oracle High Availability Services has been started.
  21. [root@rac1 ~]# crsctl check crs
  22. CRS-4638: Oracle High Availability Services is online
  23. CRS-4537: Cluster Ready Services is online
  24. CRS-4529: Cluster Synchronization Services is online
  25. CRS-4533: Event Manager is online
  26. [root@rac1 ~]#
[root@rac1 ~]# crsctl stop has -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.

[root@rac1 ~]# crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

[root@rac1 ~]# crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
[root@rac1 ~]#

 

  • 1
  • 2
  • 下一页

相关内容