Recovering a Corrupt Block Without RMAN Backup Sets


The test environment has no usable RMAN backup sets, but it does have a hot backup of the datafile. The test proceeds as follows:


--Create the test user and test table
[oracle@ora10g ~]$ sqlplus / as sysdba


SQL*Plus: Release 10.2.0.1.0 - Production on 16 16:01:02 2014


Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> create user zlm identified by zlm;

User created.

SQL> alter user zlm default tablespace zlm;

User altered.


SQL> grant dba to zlm;


Grant succeeded.


SQL> conn zlm/zlm

Connected.
SQL> create table corrupt_test (id number(10),name varchar2(15));


Table created.


SQL> insert into corrupt_test values(1,'aaron8219');


1 row created.


SQL> commit;


Commit complete.


SQL> set lin 130

SQL> col segment_name for a20

SQL> col tablespace_name for a20
SQL> select segment_name,tablespace_name from dba_segments where segment_name='CORRUPT_TEST';

 

SEGMENT_NAME        TABLESPACE_NAME
-------------------- --------------------
CORRUPT_TEST        ZLM


SQL> col name for a45
SQL> select a.segment_name,a.tablespace_name,b.file#,b.name from dba_segments a,v$datafile b where a.header_file=b.file# and a.segment_name='CORRUPT_TEST';


SEGMENT_NAME        TABLESPACE_NAME          FILE# NAME
-------------------- -------------------- ---------- ---------------------------------------------
CORRUPT_TEST        ZLM                          6 /u01/app/oracle/oradata/ora10g/zlm01.dbf


RMAN backups were taken on this database earlier, so delete those backup sets first:


[oracle@ora10g ~]$ rman target /


Recovery Manager: Release 10.2.0.1.0 - Production on 16 16:06:47 2014


Copyright (c) 1982, 2005, Oracle.  All rights reserved.


connected to target database: ORA10G (DBID=4175411955)


RMAN> list backupset;

 

using target database control file instead of recovery catalog


List of Backup Sets
===================


BS Key  Type LV Size      Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
286    Full    880.50M    DISK        00:01:35    2014-11-12   
        BP Key: 286  Status: AVAILABLE  Compressed: NO  Tag: TAG20141112T141548
        Piece Name: /u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp
  List of Datafiles in backup set 286
  File LV Type Ckp SCN    Ckp Time  Name
  ---- -- ---- ---------- ---------- ----
  1      Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/system01.dbf
  2      Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/undotbs01.dbf
  3      Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/sysaux01.dbf
  4      Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/users01.dbf
  5      Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/example01.dbf
  6      Full 1202813    2014-11-12


BS Key  Size      Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
302    42.17M    DISK        00:00:27    2014-11-21   
        BP Key: 302  Status: AVAILABLE  Compressed: YES  Tag: ARC_BAK
        Piece Name: /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227317_351.arc


  List of Archived Logs in backup set 302
  Thrd Seq    Low SCN    Low Time  Next SCN  Next Time
  ---- ------- ---------- ---------- ---------- ---------
  1    39      1234835    2014-11-18 1247748    2014-11-21
  1    40      1247748    2014-11-21 1249682    2014-11-21
  1    41      1249682    2014-11-21 1250181    2014-11-21
  1    42      1250181    2014-11-21 1258063    2014-11-21
  1    43      1258063    2014-11-21 1260208    2014-11-21


BS Key  Type LV Size      Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
303    Full    164.91M    DISK        00:01:52    2014-11-21   
        BP Key: 303  Status: AVAILABLE  Compressed: YES  Tag: DB_BAK
        Piece Name: /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227354_352.db
  List of Datafiles in backup set 303
  File LV Type Ckp SCN    Ckp Time  Name
  ---- -- ---- ---------- ---------- ----
  1      Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/system01.dbf
  2      Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/undotbs01.dbf
  3      Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/sysaux01.dbf
  4      Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/users01.dbf
  5      Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/example01.dbf
  6      Full 1260226    2014-11-21 /u01/app/oracle/oradata/ora10g/zlm01.dbf


BS Key  Size      Device Type Elapsed Time Completion Time
------- ---------- ----------- ------------ ---------------
304    19.50K    DISK        00:00:01    2014-11-21   
        BP Key: 304  Status: AVAILABLE  Compressed: YES  Tag: ARC_BAK
        Piece Name: /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227471_353.arc


  List of Archived Logs in backup set 304
  Thrd Seq    Low SCN    Low Time  Next SCN  Next Time
  ---- ------- ---------- ---------- ---------- ---------
  1    44      1260208    2014-11-21 1260277    2014-11-21


BS Key  Type LV Size      Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
305    Full    7.23M      DISK        00:00:01    2014-11-21   
        BP Key: 305  Status: AVAILABLE  Compressed: NO  Tag: TAG20141121T151114
        Piece Name: /u01/orabackup/backupsets/ora10g-c-4175411955-20141121-04.ctl
  Control File Included: Ckp SCN: 1260283      Ckp time: 2014-11-21
  SPFILE Included: Modification time: 2014-11-21


RMAN> exit

 


Recovery Manager complete.
[oracle@ora10g ~]$ cd /u01/orabackup/backupsets/
[oracle@ora10g backupsets]$ ll -lrth
total 215M
-rw-r----- 1 oracle oinstall  43M Nov 21 15:09 ora10g-4175411955_20141121_864227317_351.arc
-rw-r----- 1 oracle oinstall 165M Nov 21 15:11 ora10g-4175411955_20141121_864227354_352.db
-rw-r----- 1 oracle oinstall  20K Nov 21 15:11 ora10g-4175411955_20141121_864227471_353.arc
-rw-r----- 1 oracle oinstall 7.3M Nov 21 15:11 ora10g-c-4175411955-20141121-04.ctl


--Delete the RMAN backup sets
[oracle@ora10g backupsets]$ rm -f *
[oracle@ora10g backupsets]$ rman target /


Recovery Manager: Release 10.2.0.1.0 - Production on 16 16:07:59 2014


Copyright (c) 1982, 2005, Oracle.  All rights reserved.


connected to target database: ORA10G (DBID=4175411955)


RMAN> crosscheck backup;

 

using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=154 devtype=DISK
crosschecked backup piece: found to be 'AVAILABLE'
backup piece handle=/u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp recid=286 stamp=863446548
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227317_351.arc recid=302 stamp=864227318
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227354_352.db recid=303 stamp=864227356
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227471_353.arc recid=304 stamp=864227472
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/u01/orabackup/backupsets/ora10g-c-4175411955-20141121-04.ctl recid=305 stamp=864227475
Crosschecked 5 objects

 


RMAN> delete noprompt expired backupset;


using channel ORA_DISK_1


List of Backup Pieces
BP Key  BS Key  Pc# Cp# Status      Device Type Piece Name
------- ------- --- --- ----------- ----------- ----------
302    302    1  1  EXPIRED    DISK        /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227317_351.arc
303    303    1  1  EXPIRED    DISK        /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227354_352.db
304    304    1  1  EXPIRED    DISK        /u01/orabackup/backupsets/ora10g-4175411955_20141121_864227471_353.arc
305    305    1  1  EXPIRED    DISK        /u01/orabackup/backupsets/ora10g-c-4175411955-20141121-04.ctl
deleted backup piece
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227317_351.arc recid=302 stamp=864227318
deleted backup piece
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227354_352.db recid=303 stamp=864227356
deleted backup piece
backup piece handle=/u01/orabackup/backupsets/ora10g-4175411955_20141121_864227471_353.arc recid=304 stamp=864227472
deleted backup piece
backup piece handle=/u01/orabackup/backupsets/ora10g-c-4175411955-20141121-04.ctl recid=305 stamp=864227475
Deleted 4 EXPIRED objects


The backup sets produced by the RMAN script are now deleted. List the backups again:


RMAN> list backup;

 


List of Backup Sets
===================


BS Key  Type LV Size      Device Type Elapsed Time Completion Time
------- ---- -- ---------- ----------- ------------ ---------------
286    Full    880.50M    DISK        00:01:35    2014-11-12   
        BP Key: 286  Status: AVAILABLE  Compressed: NO  Tag: TAG20141112T141548
        Piece Name: /u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp
  List of Datafiles in backup set 286
  File LV Type Ckp SCN    Ckp Time  Name
  ---- -- ---- ---------- ---------- ----
  1      Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/system01.dbf
  2      Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/undotbs01.dbf
  3      Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/sysaux01.dbf
  4      Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/users01.dbf
  5      Full 1202813    2014-11-12 /u01/app/oracle/oradata/ora10g/example01.dbf
  6      Full 1202813    2014-11-12


One backup set remains: a full database backup in the flash recovery area (FRA). Delete it as well:


RMAN> host;

 

[oracle@ora10g backupsets]$ cd /u01/app/oracle/flash_recovery_area/ORA10G/backupset/
[oracle@ora10g backupset]$ ll
total 4
drwxr-x--- 2 oracle oinstall 4096 Nov 12 14:15 2014_11_12
[oracle@ora10g backupset]$ rm -rf *
[oracle@ora10g backupset]$ exit
exit
host command complete


RMAN> crosscheck backup;

 

using channel ORA_DISK_1
crosschecked backup piece: found to be 'EXPIRED'
backup piece handle=/u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp recid=286 stamp=863446548
Crosschecked 1 objects

 


RMAN> delete noprompt expired backup;


using channel ORA_DISK_1


List of Backup Pieces
BP Key  BS Key  Pc# Cp# Status      Device Type Piece Name
------- ------- --- --- ----------- ----------- ----------
286    286    1  1  EXPIRED    DISK        /u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp
deleted backup piece
backup piece handle=/u01/app/oracle/flash_recovery_area/ORA10G/backupset/2014_11_12/o1_mf_nnndf_TAG20141112T141548_b65yrnkg_.bkp recid=286 stamp=863446548
Deleted 1 EXPIRED objects

 


RMAN> list backup summary;

 


RMAN> list backup;

 


RMAN> exit


The database now has no RMAN backups at all. On with the test:


--Put the test tablespace into hot backup mode
SQL> alter tablespace zlm begin backup;


Tablespace altered.


SQL> select * from v$backup;


FILE# STATUS                CHANGE# TIME
----- ------------------ ---------- ----------
    1 NOT ACTIVE                  0
    2 NOT ACTIVE                  0
    3 NOT ACTIVE                  0
    4 NOT ACTIVE                  0
    5 NOT ACTIVE                  0
    6 ACTIVE                1317685 2014-11-26


6 rows selected.


After hot backup mode is enabled, the status of file 6 changes from NOT ACTIVE to ACTIVE.


SQL> select name,checkpoint_change# from v$datafile;


NAME                                          CHECKPOINT_CHANGE#
--------------------------------------------- ------------------
/u01/app/oracle/oradata/ora10g/system01.dbf              1306748
/u01/app/oracle/oradata/ora10g/undotbs01.dbf            1306748
/u01/app/oracle/oradata/ora10g/sysaux01.dbf              1306748
/u01/app/oracle/oradata/ora10g/users01.dbf              1306748
/u01/app/oracle/oradata/ora10g/example01.dbf            1306748
/u01/app/oracle/oradata/ora10g/zlm01.dbf                1319387


6 rows selected.


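The checkpoint SCN of file 6 is also higher than that of the other files, because begin backup checkpoints the datafiles of this tablespace on their own. While the tablespace is in backup mode, the checkpoint SCN in the datafile header stays frozen even though the blocks keep changing, so a copy taken now is not consistent by itself; it can only be made consistent by applying redo from the archived logs.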


--Hot-backup datafile 6 at the OS level
SQL> !cp $ORACLE_BASE/oradata/zlm01.dbf /u01/zlm01_bak.dbf
cp: cannot stat `/u01/app/oracle/oradata/zlm01.dbf': No such file or directory


SQL> !cp $ORACLE_BASE/oradata/ora10g/zlm01.dbf /u01/zlm01_bak.dbf


--End hot backup mode
SQL> alter tablespace zlm end backup;


Tablespace altered.

 


SQL> select * from v$backup;


    FILE# STATUS                CHANGE# TIME
---------- ------------------ ---------- ----------
        1 NOT ACTIVE                  0
        2 NOT ACTIVE                  0
        3 NOT ACTIVE                  0
        4 NOT ACTIVE                  0
        5 NOT ACTIVE                  0
        6 NOT ACTIVE            1319387 2014-11-26


6 rows selected.


The status of file 6 is back to NOT ACTIVE, which means the hot backup has ended.


SQL> select name,checkpoint_change# from v$datafile;


NAME                                          CHECKPOINT_CHANGE#
--------------------------------------------- ------------------
/u01/app/oracle/oradata/ora10g/system01.dbf              1306748
/u01/app/oracle/oradata/ora10g/undotbs01.dbf            1306748
/u01/app/oracle/oradata/ora10g/sysaux01.dbf              1306748
/u01/app/oracle/oradata/ora10g/users01.dbf              1306748
/u01/app/oracle/oradata/ora10g/example01.dbf            1306748
/u01/app/oracle/oradata/ora10g/zlm01.dbf                1319387


6 rows selected.


The datafile's checkpoint SCN is still the earlier value; it has not changed yet.
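The hot-backup steps above (begin backup, OS-level copy, end backup) can be collected into one SQL*Plus script so that none of them is forgotten. The following is only a minimal sketch: it writes the script to a temporary file and prints it, and does not connect to any database. The tablespace name and both paths are the ones used in this test; the variable names are illustrative assumptions.

```shell
#!/bin/sh
# Sketch only: generates a SQL*Plus script, does not run it.
set -eu

TABLESPACE=zlm                                   # tablespace from this test
SRC=/u01/app/oracle/oradata/ora10g/zlm01.dbf     # live datafile
DEST=/u01/zlm01_bak.dbf                          # hot-backup copy

sqlfile=$(mktemp)
trap 'rm -f "$sqlfile"' EXIT

# "host" is the SQL*Plus command that runs an OS command (same as "!" on Unix).
cat > "$sqlfile" <<EOF
alter tablespace $TABLESPACE begin backup;
host cp $SRC $DEST
alter tablespace $TABLESPACE end backup;
select file#, status from v\$backup;
EOF

cat "$sqlfile"
```

Keeping the copy and the end backup in the same script reduces the chance of leaving the tablespace in backup mode after a failed copy, such as the mistyped cp path earlier in this test.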


SQL> select header_block from dba_segments where segment_name='CORRUPT_TEST';

 

HEADER_BLOCK
------------
          11


From the dba_segments view, the segment header of CORRUPT_TEST in file 6 is block 11.


--Simulate a corrupt block
SQL> !
[oracle@ora10g backupsets]$ dd of=/u01/app/oracle/oradata/ora10g/zlm01.dbf bs=8192 conv=notrunc seek=12 <<EOF
> corruption
> EOF
0+1 records in
0+1 records out
11 bytes (11 B) copied, 0.000168204 seconds, 65.4 kB/s


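seek=12 tells dd to skip the first 12 blocks (8192 bytes each) of the output file before it starts writing, and conv=notrunc keeps the rest of the file intact. I did not want to damage the segment header block (block 11), so the junk string "corruption" lands at the start of block 12, right after the header. Oracle will then treat this block as corrupt. Because the block's own header bytes are overwritten, this is a physical corruption rather than a logical one.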
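To see where such a write lands without touching a real datafile, the same dd invocation can be tried on a scratch file. This is a minimal sketch, assuming a zero-filled temporary file of 16 blocks (a made-up size); the block size and seek value are the ones used above, so the write should begin at byte 12 × 8192 = 98304.

```shell
#!/bin/sh
# Scratch demonstration only: works on a temporary file, never on a datafile.
set -eu

BLOCK_SIZE=8192     # same as bs=8192 above
TARGET_BLOCK=12     # same as seek=12 above
TOTAL_BLOCKS=16     # hypothetical scratch-file size, in blocks

scratch=$(mktemp)
trap 'rm -f "$scratch"' EXIT

# Build a zero-filled scratch file.
dd if=/dev/zero of="$scratch" bs="$BLOCK_SIZE" count="$TOTAL_BLOCKS" 2>/dev/null
size_before=$(wc -c < "$scratch" | tr -d ' ')

# Same shape as the command in the article: skip 12 output blocks, then write.
printf 'corruption\n' |
  dd of="$scratch" bs="$BLOCK_SIZE" conv=notrunc seek="$TARGET_BLOCK" 2>/dev/null
size_after=$(wc -c < "$scratch" | tr -d ' ')

# Byte offset at which the junk string starts.
offset=$((BLOCK_SIZE * TARGET_BLOCK))

# Read block 12 back and keep its first 10 bytes.
written=$(dd if="$scratch" bs="$BLOCK_SIZE" skip="$TARGET_BLOCK" count=1 2>/dev/null | head -c 10)

echo "offset=$offset written=$written size_before=$size_before size_after=$size_after"
```

The unchanged file size shows the effect of conv=notrunc: only the first bytes of block 12 are replaced, and everything before and after them is left as it was.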


[oracle@ora10g backupsets]$ sqlplus /nolog


SQL*Plus: Release 10.2.0.1.0 - Production on 16 15:52:41 2014


Copyright (c) 1982, 2005, Oracle.  All rights reserved.


SQL> conn zlm/zlm
Connected.
SQL> select * from corrupt_test;


        ID NAME
---------- ---------------
        1 aaron8219


At this point the data block holding the rows of corrupt_test is still in the buffer cache, so the row can still be queried.


SQL> alter system flush buffer_cache;


System altered.


SQL> select * from corrupt_test;
select * from corrupt_test
              *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 6, block # 12)
ORA-01110: data file 6: '/u01/app/oracle/oradata/ora10g/zlm01.dbf'


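Once the buffer cache has been flushed and the block has to be read from disk again, ORA-01578 is raised: block 12 of file 6, the block overwritten above, is corrupt.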


SQL> exit
Disconnected from Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options
[oracle@ora10g backupsets]$ rman target /


Recovery Manager: Release 10.2.0.1.0 - Production on 16 16:30:19 2014


Copyright (c) 1982, 2005, Oracle.  All rights reserved.


connected to target database: ORA10G (DBID=4175411955)


RMAN> blockrecover datafile 6 block 12;


Starting blockrecover at 2014-11-26
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=159 devtype=DISK


RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of blockrecover command at 11/26/2014 16:30:51
RMAN-06026: some targets not found - aborting restore
RMAN-06023: no backup or copy of datafile 6 found to restore


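blockrecover cannot repair the block directly at this point: there is no usable backup set, and the control file has no record of any backup to restore from. So the hot-backup copy taken earlier must first be cataloged into the control file: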


RMAN> catalog datafilecopy '/u01/zlm01_bak.dbf';


cataloged datafile copy
datafile copy filename=/u01/zlm01_bak.dbf recid=17 stamp=864664486


RMAN> blockrecover datafile 6 block 12;


Starting blockrecover at 2014-11-26
using channel ORA_DISK_1


channel ORA_DISK_1: restoring block(s) from datafile copy /u01/zlm01_bak.dbf


starting media recovery
media recovery complete, elapsed time: 00:00:01


Finished blockrecover at 2014-11-26


RMAN> exit

Recovery Manager complete.


With the copy cataloged, the second blockrecover completes media recovery without trouble.
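The two RMAN commands that made the recovery work, catalog datafilecopy followed by blockrecover, could also be kept in a command file for reuse. The sketch below only generates and prints such a file; it does not start RMAN. The copy path, file number and block number are the ones from this test, and the variable names are illustrative assumptions.

```shell
#!/bin/sh
# Sketch only: generates an RMAN command file, does not run RMAN.
set -eu

COPY_PATH=/u01/zlm01_bak.dbf   # hot-backup copy taken earlier
FILE_NO=6                      # datafile reported by ORA-01578
BLOCK_NO=12                    # block reported by ORA-01578

cmdfile=$(mktemp)
trap 'rm -f "$cmdfile"' EXIT

# Register the copy in the control file, then recover the single block.
cat > "$cmdfile" <<EOF
catalog datafilecopy '$COPY_PATH';
blockrecover datafile $FILE_NO block $BLOCK_NO;
EOF

cat "$cmdfile"
```

On a real system the file would be saved somewhere permanent and passed to RMAN, for example with its cmdfile command-line option; that step is left out here on purpose.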


[oracle@ora10g backupsets]$ sqlplus zlm/zlm


SQL*Plus: Release 10.2.0.1.0 - Production on 16 16:35:23 2014


Copyright (c) 1982, 2005, Oracle.  All rights reserved.

 


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options


SQL> select * from corrupt_test;


        ID NAME
---------- ---------------
        1 aaron8219


SQL>

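As shown, the data that was lost is back.

Summary:

Although the lost data can be recovered from a hot-backup copy when no RMAN backup set exists, this is not something to rely on. In production we are unlikely to take frequent hot backups of individual datafiles, and we cannot know in advance when, or in which file, a corrupt block will appear. Regular RMAN full backups are therefore essential: as long as the backup sets and the archived logs generated since the backup are intact, the block can be recovered. When blockrecover datafile xxx block xxx is run, RMAN restores the block directly from a backup set, with no extra catalog step and little manual intervention.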
