物理Data Guard中哪个进程处理Redo GAP


在Oracle Data Guard中,Redo Gap的产生是由于一些网络或者其他问题导致redo的传输中断。当故障消除后,这些没有传输过去的redo文件会由一些进程发现,并且将它们传输到备库。

术语:

ARC:归档进程
MRP:Media Recovery Process,在备库上负责应用redo
RFS:Remote File Server ,在备库上接收发送过来的redo文件
FAL:Fetch Archive Log

测试目的:由于网络问题发生了gap后,确定哪个进程负责处理gap。

测试环境:Oracle 11.2.0.2 on Linux 5.

测试过程:

1.确保当前主库和备库是同步的:
Primary:
MAX(SEQUENCE#)
--------------
           86

Standby:
MAX(SEQUENCE#)
--------------
           86

2. 模拟网络中断,导致gap:
在主库将网卡停掉: #ifconfig eth0 down

将主库执行数次switch logfile:
SQL>alter system switch logfile;
SQL>alter system switch logfile;
...

Primary:
MAX(SEQUENCE#)
--------------
           96

这时主库alert log报出了与备库连接不通的错误:
TNS-00513: Destination host unreachable
   nt secondary err code: 101
   nt OS err code: 0
Error 12543 received logging on to the standby
FAL[server, ARCp]: Error 12543 creating remote archivelog file 'STANDBY'
FAL[server, ARCp]: FAL archive failed, see trace file.
ARCH: FAL archive failed. Archiver continuing
ORACLE Instance orcl - Archival Error. Archiver continuing.



3.将主库的这些归档临时换个目录,保证这些归档无法传到备库:
mv *.arc ../

4. 启动主库的网卡:
#ifconfig eth0 up

5.这时,主库的ARC并没有把缺少的日志传到备库。最终备库的MRP发现了gap并把gap fetching.

备库alert log:
Thu Mar 29 19:58:49 2012
Media Recovery Waiting for thread 1 sequence 87 (in transit) <====  网络中断时,等待87
...
Thu Mar 29 20:08:45 2012
...
Media Recovery Waiting for thread 1 sequence 94
Thu Mar 29 20:11:01 2012
RFS[61]: Assigned to RFS process 13643
RFS[61]: Opened log for thread 1 sequence 97 dbid 1285401128 branch 757620395
Archived Log entry 80 added for thread 1 sequence 97 rlc 757620395 ID 0x4c9d8928 dest 2:
Thu Mar 29 20:11:02 2012
RFS[62]: Assigned to RFS process 13645
RFS[62]: Selected log 4 for thread 1 sequence 98 dbid 1285401128 branch 757620395
Thu Mar 29 20:11:02 2012
Primary database is in MAXIMUM PERFORMANCE mode
Re-archiving standby log 4 thread 1 sequence 98
Thu Mar 29 20:11:02 2012
Archived Log entry 81 added for thread 1 sequence 98 ID 0x4c9d8928 dest 1:
RFS[63]: Assigned to RFS process 13647
RFS[63]: Selected log 4 for thread 1 sequence 99 dbid 1285401128 branch 757620395
Thu Mar 29 20:11:05 2012
Fetching gap sequence in thread 1, gap sequence 94-96 <===========取gap
...

6.通过MRP的trace,可以确定是MRP 作了fetching gap:

MRP trace:

*** 2012-03-29 20:08:45.375 4265 krsh.c
Media Recovery Waiting for thread 1 sequence 94

*** 2012-03-29 20:11:05.543
*** 2012-03-29 20:11:05.543 4265 krsh.c
Fetching gap sequence in thread 1, gap sequence 94-96 <==========MRP取gap.
Redo shipping client performing standby login
*** 2012-03-29 20:11:05.593 4595 krsu.c
Logged on to standby successfully
Client logon and security negotiation successful!

7.将移走的归档日志移回之后,备库的RFS接收到了这些日志, MRP 将这些日志进行了apply.

Thu Mar 29 20:12:06 2012
RFS[64]: Assigned to RFS process 13649
RFS[64]: Opened log for thread 1 sequence 94 dbid 1285401128 branch 757620395
Archived Log entry 82 added for thread 1 sequence 94 rlc 757620395 ID 0x4c9d8928 dest 2:
Thu Mar 29 20:12:06 2012
RFS[65]: Assigned to RFS process 13651
RFS[65]: Opened log for thread 1 sequence 95 dbid 1285401128 branch 757620395
Thu Mar 29 20:12:06 2012
RFS[66]: Assigned to RFS process 13653
RFS[66]: Opened log for thread 1 sequence 96 dbid 1285401128 branch 757620395
Archived Log entry 83 added for thread 1 sequence 95 rlc 757620395 ID 0x4c9d8928 dest 2:
Archived Log entry 84 added for thread 1 sequence 96 rlc 757620395 ID 0x4c9d8928 dest 2:
Thu Mar 29 20:12:16 2012
Media Recovery Log /home/oracle/arch1/standby/1_94_757620395.arc
Media Recovery Log /home/oracle/arch1/standby/1_95_757620395.arc
Media Recovery Log /home/oracle/arch1/standby/1_96_757620395.arc
Media Recovery Log /home/oracle/arch1/standby/1_97_757620395.arc
Media Recovery Log /home/oracle/arch1/standby/1_98_757620395.arc


测试结论:
通过这个例子说明,对于这种gap的处理,主库的ARC进程对于以前产生的gap文件,并没有进行处理。是备库的MRP进程在apply log的时候发现了gap,将这些文件通过FAL进程取回。

注:在11g,理论上主库的ARC进程和备库的RFS、MRP进程在某些情况都有可能处理gap.


8. 为了进一步确定是MRP通过FAL取了gap文件,我将主库的密码修改了一下,结果MRP的trace中报错:FAL[client, MRP0],说明是通过FAL取的。

 *** 2012-03-29 21:18:15.964 4265 krsh.c
Error 1031 received logging on to the standby
*** 2012-03-29 21:18:15.964 4265 krsh.c
FAL[client, MRP0]: Error 1031 connecting to PRIMARY for fetching gap sequence

相关内容

    暂无相关文章