Oracle wait event: DFS lock handle

Shared by LinuxBoy on 2019-03-31 01:03:58

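During a performance stress test the results did not meet the pass criteria. I pulled a one-hour AWR report from the site and found a large number of waits. The database is a RAC cluster running Oracle 11.2.0.4.0.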

	Snap Id	Snap Time	Sessions	Cursors/Session	Instances
Begin Snap:	1607	21-Oct-14 20:00:03	560	67.9	2
End Snap:	1608	21-Oct-14 21:00:11	573	12.4	2
Elapsed:		60.13 (mins)
DB Time:		2,090.75 (mins)

Event	Waits	Total Wait Time (sec)	Wait Avg(ms)	% DB time	Wait Class
rdbms ipc reply	32,876,281	44.9K	1	35.8	Other
DB CPU		21.3K		17.0
direct path read	435,808	18.8K	43	15.0	User I/O
DFS lock handle	4,204,866	7977.9	2	6.4	Other
log file sync	8,541	252.7	30	.2	Commit
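As a quick way to read these figures, DB Time divided by elapsed time gives the average number of active sessions, and each event's total wait time divided by DB Time gives its "% DB time". The sketch below simply redoes that arithmetic with the numbers from the report above; because 44.9K is already rounded in the report, the results are approximate.

```sql
-- Recompute the headline ratios from the AWR figures quoted above.
-- DB Time = 2,090.75 min, Elapsed = 60.13 min, wait times are in seconds.
select round(2090.75 / 60.13, 1)               as avg_active_sessions,  -- 34.8
       round(100 * 44900  / (2090.75 * 60), 1) as pct_rdbms_ipc_reply,  -- 35.8
       round(100 * 7977.9 / (2090.75 * 60), 1) as pct_dfs_lock_handle   -- 6.4
from   dual;
```

Roughly 35 sessions were active on average during the hour, which is consistent with a system that spends most of its time waiting rather than working.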

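1. The top wait event is rdbms ipc reply, which is described as follows: "The rdbms ipc reply Oracle metric event is used to wait for a reply from one of the background processes." In other words, foreground sessions have handed a request to a background process and are waiting for its answer; it does not mean that background processes such as LGWR or DBWR are idle. The DFS lock handle event also looks very suspicious. The official explanation is: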

The session waits for the lock handle of a global lock request. The lock handle identifies a global lock. With this lock handle, other operations can be performed on this global lock (to identify the global lock in future operations such as conversions or release). The global lock is maintained by the DLM.

Roughly speaking, this event is recorded while a session is waiting to obtain the handle of a global lock.
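The meaning of the three wait parameters can be read from the instance itself. A minimal sketch, assuming the session has select access to v$event_name; on the releases I am aware of, P1 is labelled "type|mode" and P2 and P3 are "id1" and "id2", but the output of your own instance is authoritative.

```sql
-- Show how Oracle labels P1, P2 and P3 for this wait event.
select name, parameter1, parameter2, parameter3, wait_class
from   v$event_name
where  name = 'DFS lock handle';
```

The "type|mode" label is what makes it possible to decode the lock type out of P1.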

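2. I looked at how others have handled this event online. The usual suspects are sequence caches that are too small and high CPU load on the database server. I adjusted and monitored both accordingly, but neither solved the problem. While the performance test was running, I ran the following query: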

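select chr(bitand(p1, -16777216) / 16777216) ||   -- bits 24-31: first character of the lock type
       chr(bitand(p1, 16711680) / 65536) "Lock",  -- bits 16-23: second character of the lock type
       to_char(bitand(p1, 65535)) "Mode",         -- low 16 bits: requested lock mode
       p2, p3, seconds_in_wait
from   v$session_wait
where  event = 'DFS lock handle';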

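The query returned BB locks. The documented description of BB is "2PC distributed transaction branch across RAC instances"; the DX lock, which also concerns distributed transactions, is described as "Serializes tightly coupled distributed transaction branches".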
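The decoding step can be checked by hand. The sketch below uses a hypothetical P1 value, 1111621637 (0x42420005), chosen only because it encodes type BB with mode 5; it is not taken from the original system. The second statement looks the types up in v$lock_type, assuming the session can read that view.

```sql
-- Worked check of the decode: 1111621637 = 0x42420005 (hypothetical P1 value).
-- High byte 0x42 = 'B', next byte 0x42 = 'B', low 16 bits = 5.
select chr(bitand(1111621637, -16777216) / 16777216) ||
       chr(bitand(1111621637, 16711680) / 65536) as lock_type,  -- BB
       bitand(1111621637, 65535)                 as lock_mode   -- 5
from   dual;

-- Ask the instance how it describes the lock types involved.
select type, name, description
from   v$lock_type
where  type in ('BB', 'DX');
```

Note that the mode sits in the low 16 bits of P1, so the mask for it is 65535, not 65536.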

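Roughly, this means that branches of a distributed transaction were spread across the two RAC instances. I therefore changed the WebLogic connection so that it connects to a single RAC node only, and ran the test again. The results: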

	Snap Id	Snap Time	Sessions	Cursors/Session	Instances
Begin Snap:	1680	24-Oct-14 12:00:13	864	9.5	2
End Snap:	1681	24-Oct-14 13:00:17	863	9.9	2
Elapsed:		60.07 (mins)
DB Time:		80.28 (mins)

Event	Waits	Total Wait Time (sec)	Wait Avg(ms)	% DB time	Wait Class
DB CPU		2335.6		48.5
rdbms ipc reply	5,326,201	645.6	0	13.4	Other
gc buffer busy acquire	39,052	226.7	6	4.7	Cluster
DFS lock handle	672,757	225.8	0	4.7	Other

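DFS lock handle waits fell sharply, from 4,204,866 to 672,757, but they did not disappear. The performance test results were nevertheless much better.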
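To see which lock types account for the waits that remain, the decode can be grouped per instance. This is only a sketch: gv$session_wait shows waits in progress at the moment the query runs, so it would need to be sampled repeatedly during the test, and the column aliases are my own.

```sql
-- Count sessions currently waiting on DFS lock handle,
-- per instance, lock type and requested mode.
select inst_id,
       chr(bitand(p1, -16777216) / 16777216) ||
       chr(bitand(p1, 16711680) / 65536)     as lock_type,
       bitand(p1, 65535)                     as lock_mode,
       count(*)                              as waiting_sessions
from   gv$session_wait
where  event = 'DFS lock handle'
group  by inst_id,
          chr(bitand(p1, -16777216) / 16777216) ||
          chr(bitand(p1, 16711680) / 65536),
          bitand(p1, 65535)
order  by waiting_sessions desc;
```

If types other than BB dominate after the change, the remaining waits probably have a different cause from the cross-instance transaction branches.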

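3. How can the problem be eliminated completely? First, a word on DFS lock handle. Put simply, it shows up when the same object is subject to DML from different instances; the aim is for each instance to work only on its own objects. This is a trade-off. If WebLogic connects to the instances dynamically, there is no guarantee that each instance always handles its own objects, but the setup is fault tolerant: the application keeps running if another instance goes down. If the connection is pinned to a single instance, the advantages and disadvantages are reversed. It is also said that the MetaLink notes on DFS lock handle all describe bugs; whether a database upgrade would improve things is not yet clear.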
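Whichever connection strategy is chosen, it is worth confirming where the application sessions actually land. A minimal sketch, assuming access to gv$session; filtering on type = 'USER' just excludes background processes, and you may want to add a filter on your own application user or service.

```sql
-- Distribution of user sessions across RAC instances and services.
select inst_id,
       service_name,
       count(*) as sessions
from   gv$session
where  type = 'USER'
group  by inst_id, service_name
order  by inst_id, service_name;
```

With the connection pinned to one node, nearly all application sessions should appear under a single inst_id; a spread across both instances would suggest that the pinning is not taking effect.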
