Global Enqueue Services Deadlock Leads to Node Restart


The alert log shows the following errors:
Sun Sep 02 01:29:47 2012
Global Enqueue Services Deadlock detected. More info in file
 /u01/app/11.1.0/diag/rdbms/zzbrac2/zzbrac21/trace/zzbrac21_lmd0_11802.trc.
Sun Sep 02 01:30:30 2012
Thread 1 advanced to log sequence 16189 (LGWR switch)
  Current log# 7 seq# 16189 mem# 0: +DATADG/zzbrac2/onlinelog/group_7.660.791891943
  Current log# 7 seq# 16189 mem# 1: +DATADG/zzbrac2/onlinelog/group_7.1693.791891943
Sun Sep 02 01:41:28 2012
ALTER SYSTEM SET service_names='SYS$SYS.KUPC$S_1_20120902010315.ZZBRAC2','zzbrac2' SCOPE=MEMORY SID='zzbrac21';
ALTER SYSTEM SET service_names='zzbrac2' SCOPE=MEMORY SID='zzbrac21';
Sun Sep 02 01:59:27 2012
Clearing Resource Manager plan via parameter
Sun Sep 02 01:59:35 2012
Thread 1 advanced to log sequence 16190 (LGWR switch)
  Current log# 8 seq# 16190 mem# 0: +DATADG/zzbrac2/onlinelog/group_8.3050.791891983
  Current log# 8 seq# 16190 mem# 1: +DATADG/zzbrac2/onlinelog/group_8.2893.791891985
Thread 1 advanced to log sequence 16191 (LGWR switch)
  Current log# 1 seq# 16191 mem# 0: +DATADG/zzbrac2/onlinelog/group_1.264.733863259
  Current log# 1 seq# 16191 mem# 1: +DATADG/zzbrac2/onlinelog/group_1.265.733863261
Sun Sep 02 02:04:12 2012
LMON (ospid: 11800) has not called a wait for 87 secs.
Sun Sep 02 02:04:43 2012
LMON (ospid: 11800) has not called a wait for 118 secs.
ERROR: LMON is not healthy and has no heartbeat.
ERROR: LMD0 (ospid: 11802) is terminating the instance.
LMD0 (ospid: 11802): terminating the instance due to error 482
Sun Sep 02 02:04:43 2012
System state dump is made for local instance
System State dumped to trace file /u01/app/11.1.0/diag/rdbms/zzbrac2/zzbrac21/trace/zzbrac21_diag_11788.trc
Instance terminated by LMD0, pid = 11802
 
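Root cause analysis:
01:29:47  A global enqueue deadlock was detected (caused by the application).
02:04:12  LMON (Global Enqueue Service Monitor) had not called a wait for 87 secs, i.e. it was stuck and never entered a wait state.
02:04:43  LMON was reported as not healthy, with no heartbeat.
          LMD0 (Global Enqueue Service Daemon) terminated the instance with error 482.
 
Preliminary conclusion: the deadlock caused by the application left LMON hung, and LMD0 then terminated the instance, which led to the restart. About 35 minutes separate the deadlock from the LMON hang, so this link should still be confirmed against the lmd0 and diag trace files named in the log.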
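
The timeline above was read from the alert log by hand. As a minimal sketch, the same key events can be extracted with a short Python script. The function name `scan_alert_log` and the three patterns are illustrative assumptions based only on the message texts in the excerpt above; they are not part of any Oracle tool, and the timestamp format is assumed to match the one shown in this log.

```python
import re
from datetime import datetime

# Timestamp format as it appears in this alert log, e.g. "Sun Sep 02 01:29:47 2012".
TS_FORMAT = "%a %b %d %H:%M:%S %Y"

# Hypothetical patterns, derived only from the messages in the excerpt above.
PATTERNS = {
    "deadlock": re.compile(r"Global Enqueue Services Deadlock detected"),
    "no_wait": re.compile(r"\w+ \(ospid: \d+\) has not called a wait for \d+ secs"),
    "terminate": re.compile(r"terminating the instance due to error \d+"),
}


def scan_alert_log(lines):
    """Return (timestamp, event_name, matched_text) for each matching line."""
    current_ts = None
    events = []
    for raw in lines:
        line = raw.strip()
        try:
            # A line that parses as a timestamp dates the messages after it.
            current_ts = datetime.strptime(line, TS_FORMAT)
            continue
        except ValueError:
            pass
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                events.append((current_ts, name, match.group(0)))
    return events


# A few lines copied from the log excerpt above.
sample = """Sun Sep 02 01:29:47 2012
Global Enqueue Services Deadlock detected. More info in file
Sun Sep 02 02:04:12 2012
LMON (ospid: 11800) has not called a wait for 87 secs.
Sun Sep 02 02:04:43 2012
LMON (ospid: 11800) has not called a wait for 118 secs.
LMD0 (ospid: 11802): terminating the instance due to error 482
""".splitlines()

events = scan_alert_log(sample)
for ts, name, text in events:
    print(ts, name, text)

# Time between the first event (deadlock) and the last (instance termination).
gap_seconds = (events[-1][0] - events[0][0]).total_seconds()
print("seconds from deadlock to termination:", gap_seconds)
```

On the sample lines this reports a gap of roughly 35 minutes between the deadlock and the instance termination. In practice the function could be given the lines of the real alert log file instead of the embedded sample.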
