Oracle半连接学习总结
Oracle半连接学习总结
半连接(semi-join)
在我们查看一个数据集中某些字段存在于另一个数据集合中的记录时,常常会用到in 或者 exists。在执行计划中会看到join semi。
在这里给出以下测试用的数据表结构以及模拟数据
drop table table_1 purge;
drop table table_2 purge;
create table table_1
as select
cast(rownum as int) a,
cast(rownum+10 as int) b,
cast(dbms_random.string('i',10) as varchar2(10)) c
from dual connect by level<=500000
create table table_2
as select
cast(rownum*2 as int) a,
cast(rownum*2+10 as int) b,
cast(dbms_random.string('i',10) as varchar2(10)) c
from dual connect by level<=1000
在table_2增加一条已有的数据
insert into table_2 select * from table_2 where a=20
create index idx_tab1_a on table_1(a);
create index idx_tab2_a on table_2(a);
analyze table table_1 compute statistics
for table
for all indexes
for all indexed columns
analyze table table_2 compute statistics
for table
for all indexes
for all indexed columns
先看看下面的语句是什么业务含义
select * from table_1 ,table_2 wheretable_1.a=table_2.a
即使找到符合条件的数据,本次迭代也不会停止在table_2中继续往下寻找下一条符合条件的数据,如果再次找到,则迭代继续返回满足条件的数据。
… |
|
|
|
|
|
18 |
28 |
LYZDKBNGLN |
18 |
28 |
SICWAOITLK |
20 |
30 |
DQCETGYPWE |
20 |
30 |
HEFBMTNBQL |
20 |
30 |
DQCETGYPWE |
20 |
30 |
HEFBMTNBQL |
22 |
32 |
URPNGTEIBW |
22 |
32 |
TQNIVPFQUP |
… |
|
|
|
|
|
select /*d*/ table_1.*
from table_1 ,table_2 where table_1.a=table_2.a
in exists的含义为
可以理解为在主数据集作迭代时,如果在副数据集中找到第一个符合条件的数据,即完成本条迭代的操作,
在业务上可以理解为
即可以理解为为
select
a.a,a.b,b.a
from table_1a, (select distinct a from table_2) b
where a.a=b.a --and b.a=20
可以清楚看到上面语句与下面语句返回的结果不同
select
a.a,a.b,b.a
from table_1a, (select a from table_2) b
where a.a=b.a --and b.a=20
我们来看看上面写法的执行计划
EXPLAIN PLAN FOR
select
a.a,a.b,b.a
from table_1 a, (select distinct a from table_2) b
where a.a=b.a and b.a=20;
SELECT * FROM TABLE(DBMS_XPLAN.display());
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1001 | 39039 | 509 (2)| 00:00:07 |
| 1 | VIEW | VM_NWVW_1 | 1001 | 39039 | 509 (2)| 00:00:07 |
| 2 | HASH UNIQUE | | 1001 | 27027 | 509 (2)| 00:00:07 |
|* 3 | HASH JOIN | | 1001 | 27027 | 508 (2)| 00:00:07 |
| 4 | INDEX FAST FULL SCAN| IDX_TAB2_A | 1001 | 3003 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL |TABLE_1 | 500K| 11M| 502 (1)| 00:00:07 |
--------------------------------------------------------------------------------------
虽然有业务上明确意义的写法,但是这种写法并不是高效率的写法,Oracle提供了IN ESISTS的解法来提高效率
EXPLAIN PLAN FOR
select
a.a,a.b
from table_1 a, (select distinct a from table_2) b
where a.a=b.a --and b.a=20;
SELECT * FROM TABLE(DBMS_XPLAN.display());
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1001 | 26026 | 509 (2)| 00:00:07 |
| 1 | VIEW | VM_NWVW_1 | 1001 | 26026 | 509 (2)| 00:00:07 |
| 2 | HASH UNIQUE | | 1001 | 27027 | 509 (2)| 00:00:07 |
|* 3 | HASH JOIN | | 1001 | 27027 | 508 (2)| 00:00:07 |
| 4 | INDEX FAST FULL SCAN| IDX_TAB2_A | 1001 | 3003 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL |TABLE_1 | 500K| 11M| 502 (1)| 00:00:07 |
--------------------------------------------------------------------------------------
call count cpu elapsed disk query current rows
------- ------ -------- -------------------- ---------- ---------- ----------
Parse 1 0.00 0.01 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 18 0 100
------- ------ -------- -------------------- ---------- ---------- ----------
total 3 0.00 0.01 0 18 0 100
EXPLAIN PLAN FOR
select
a.a,a.b
from table_1 a
where a.a in (select a from table_2);
SELECT * FROM TABLE(DBMS_XPLAN.display());
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 20000 | 508 (2)| 00:00:07 |
|* 1 | HASH JOIN RIGHT SEMI | | 1000 | 20000 | 508 (2)| 00:00:07 |
| 2 | INDEX FAST FULL SCAN| IDX_TAB2_A | 1001 | 3003 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL |TABLE_1 | 500K| 8300K| 502 (1)| 00:00:07 |
------------------------------------------------------------------------------------
call count cpu elapsed disk query current rows
------- ------ -------- -------------------- ---------- ---------- ----------
Parse 1 0.00 0.12 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 10 0 100
------- ------ -------- -------------------- ---------- ---------- ----------
total 3 0.00 0.12 0 10 0 100
EXPLAIN PLAN FOR
select
a.a,a.b
from table_1 a
where exists (select null from table_2 b where a.a=b.a);
SELECT * FROM TABLE(DBMS_XPLAN.display());
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1000 | 20000 | 508 (2)| 00:00:07 |
|* 1 | HASH JOIN RIGHT SEMI | | 1000 | 20000 | 508 (2)| 00:00:07 |
| 2 | INDEX FAST FULL SCAN| IDX_TAB2_A | 1001 | 3003 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL |TABLE_1 | 500K| 8300K| 502 (1)| 00:00:07 |
------------------------------------------------------------------------------------
call count cpu elapsed disk query current rows
------- ------ -------- -------------------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 1 0.00 0.00 0 10 0 100
------- ------ -------- -------------------- ---------- ---------- ----------
total 3 0.00 0.00 0 10 0 100
将表table_1数据增加到50万条,执行计划与跟踪结果表明,in 与 exists的写法一样的效率。
|
评论暂时关闭