5.1、嵌套循环(NESTED LOOPS)
嵌套循环的算法:驱动表返回一行数据,通过连接列传值给被驱动表,驱动表返回多少行,被驱动表就要被扫描多少次。
驱动表应该返回少量数据;
嵌套循环被驱动表必须走索引;
嵌套循环被驱动表索引只能走INDEX UNIQUE SCAN或者INDEX RANGE SCAN;
嵌套循环被驱动表的连接列基数应该很高;
两表关联返回少量数据才能走嵌套循环;
嵌套循环不需要消耗PGA;
SCOTT@o11g>select /*+ gather_plan_statistics use_nl(e,d) leading(e) */ e.ename,e.job,d.dname from emp e,dept d where e.deptno = d.deptno ;
ENAME JOB DNAME
-------------------- ------------------ ----------------------------
SMITH CLERK RESEARCH
ALLEN SALESMAN SALES
WARD SALESMAN SALES
JONES MANAGER RESEARCH
MARTIN SALESMAN SALES
BLAKE MANAGER SALES
CLARK MANAGER ACCOUNTING
SCOTT ANALYST RESEARCH
KING PRESIDENT ACCOUNTING
TURNER SALESMAN SALES
ADAMS CLERK RESEARCH
JAMES CLERK SALES
FORD ANALYST RESEARCH
MILLER CLERK ACCOUNTING
14 rows selected.
SCOTT@o11g>select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST') );
PLAN_TABLE_OUTPUT
---------------------------------------------------------
SQL_ID bwna3dbbfqzs3, child number 0
-------------------------------------
select /*+ gather_plan_statistics use_nl(e,d) leading(e) */
e.ename,e.job,d.dname from emp e,dept d where e.deptno = d.deptno
Plan hash value: 3625962092
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 25 |
| 1 | NESTED LOOPS | | 1 |14| 14 |00:00:00.01 | 25 |
| 2 | NESTED LOOPS | | 1 |14| 14 |00:00:00.01 | 11 |
| 3 | TABLE ACCESS FULL | EMP | 1 |14| 14 |00:00:00.01 | 7 |
|* 4 | INDEX UNIQUE SCAN | PK_DEPT | 14 |1 | 14 |00:00:00.01 | 4 |
| 5 | TABLE ACCESS BY INDEX ROWID| DEPT| 14 |1 | 14 |00:00:00.01 | 14 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("E"."DEPTNO"="D"."DEPTNO")
23 rows selected.
复制
嵌套循环代码实现(PL/SQL)
declare
cursor cur_emp is
select ename, job, deptno from emp;
v_dname dept.dname%type;
begin
for x in cur_emp loop
select dname into v_dname from dept where deptno = x.deptno;
dbms_output.put_line(x.ename || ' ' || x.job || ' ' || v_dname);
end loop;
end;
复制
5.2、HASH连接(HASH JOIN)
两表关联返回大量数据应该走HASH连接;
HASH连接的算法:两表等值关联,返回大量数据,将较小的表选为驱动表,将驱动表的“select列和join列”读入PGA中的work area,然后对驱动表的连接列进行hash运算生成hash table,当驱动表的所有数据完全读入PAG中的work area之后,再读取被驱动表(被驱动表不需要读入PAG的work area),对被驱动表的连接列也进行hash运算,然后到PGA中的work area去探测hash table,找到数据就关联上,没找到数据就没关联上。
哈希连接支持支等值连接;
执行计划中离HASH连接关键字最近的表就是驱动表;
HASH连接的驱动表和被驱动表的连接列都不需要创建索引;
SCOTT@o11g>select /*+ gather_plan_statistics use_hash(e,d) */ e.ename,e.job,d.dname from emp e,dept d where e.deptno = d.deptno ;
ENAME JOB DNAME
-------------------- ------------------ ----------------------------
SMITH CLERK RESEARCH
ALLEN SALESMAN SALES
WARD SALESMAN SALES
JONES MANAGER RESEARCH
MARTIN SALESMAN SALES
BLAKE MANAGER SALES
CLARK MANAGER ACCOUNTING
SCOTT ANALYST RESEARCH
KING PRESIDENT ACCOUNTING
TURNER SALESMAN SALES
ADAMS CLERK RESEARCH
JAMES CLERK SALES
FORD ANALYST RESEARCH
MILLER CLERK ACCOUNTING
14 rows selected.
Elapsed: 00:00:00.00
SCOTT@o11g>select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')) ;
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
---------------SQL_ID azwrt6r51ddmr, child number 0
-------------------------------------
select /*+ gather_plan_statistics use_hash(e,d) */
e.ename,e.job,d.dname from emp e,dept d where e.deptno = d.deptno
Plan hash value: 615168685
-----------------------------------------------------------------------------------------
|Id | Operation|Name | Starts | E-Rows | A-Rows |A-Time | Buffers| OMem |1Mem| Used-Mem |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 13 | | | |
|* 1 | HASH JOIN | | 1 | 14 | 14 |00:00:00.01 | 13 | 1599K| 1599K| 1070K (0)|
| 2 | TABLE ACCESS FULL| DEPT | 1 | 4 | 4 |00:00:00.01 | 6 | | | |
| 3 | TABLE ACCESS FULL| EMP | 1 |14 | 14 |00:00:00.01 | 7 | | | |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("E"."DEPTNO"="D"."DEPTNO")
21 rows selected.
Elapsed: 00:00:00.02
复制
以上执行计划中:
DEPT是驱动表,EMP是被驱动表;
驱动表和被驱动表均只扫描了1次(Starts =1);
Used-Mem表示HASH连接消耗了多少PGA,当驱动表太大、PGA不能完全容纳驱动表时,驱动表就会溢出到临时表空间,进而产生磁盘HASH连接;
当两表使用外连接进行关联,如果执行计划走的是HASH连接,想要更改驱动表,我们需要使用swap_join_inputs,而不是leading,如下:
使用leading:
SCOTT@o11g>explain plan for select /*+ use_hash(d,e) leading(e)*/ * from dept d left join emp e on d.deptno = e.deptno ;
Explained.
Elapsed: 00:00:00.01
SCOTT@o11g>select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------Plan hash value: 3713469723
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 812 | 6 (0)| 00:00:01 |
|* 1 | HASH JOIN OUTER | | 14 | 812 | 6 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPT | 4 | 80 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| EMP | 14 | 532 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("D"."DEPTNO"="E"."DEPTNO"(+))
15 rows selected.
Elapsed: 00:00:00.03
使用swap_join_inputs:
SCOTT@o11g>explain plan for select /*+ use_hash(d,e) swap_join_inputs(e)*/ * from dept d left join emp e on d.deptno = e.deptno ;
Explained.
Elapsed: 00:00:00.00
SCOTT@o11g>select * from table(dbms_xplan.display) ;
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------Plan hash value: 3590956717
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 812 | 6 (0)| 00:00:01 |
|* 1 | HASH JOIN RIGHT OUTER| | 14 | 812 | 6 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL | EMP | 14 | 532 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL | DEPT | 4 | 80 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("D"."DEPTNO"="E"."DEPTNO"(+))
15 rows selected.
Elapsed: 00:00:00.02
复制
思考:怎么优化HASH连接?
**回答:**因为HASH连接需要将驱动表的select列和join列放入PGA中,所以,应该尽量避免书写select * from …语句,将需要的列放select list中,这样可以减少驱动表对PGA的占用,避免驱动表溢出到临时表空间,从而提升性能。如果无法避免驱动表被溢出到临时表空间,我们可以将临时表空间创建在SSD上或RAID 0上,加快临时数据的交换速度;
当PGA采用自动管理,单个进程的work area被限制在1G以内,如果PGA采用手动管理,单个进程work area不能超过2GB。如果驱动包比较大,可以开启并行查询至少parallel(4),将表拆分为至少4分,这样每个并行进程中的work area能够容纳1GB的数据,从而避免驱动表被溢出到临时表空间;
5.3、排序合并连接(SORT MERGE JOIN)
排序合并连接主要用于处理两表非等值关联,比如>,>=,<,<=,<>,但是不能用于instr、substr、like、regexp_like关联,instr、substr、like、regexp_like关联只能走嵌套循环。
排序合并连接需要将两个表都放入PGA中;
排序合并连接的算法:两表关联,先对两个表根据连接列进行排序,将较小的表作为驱动表(Oracle官方认为排序合并连接没有驱动表,笔者认为是有的),然后从驱动表中取出连接列的值,到已经排好序的被驱动表中匹配数据,如果匹配上数据,就关联成功。驱动表返回多少行,被驱动表就要被匹配多少次,这个匹配的过程类似嵌套循环,但是嵌套循环是从被驱动表的索引中匹配数据,而排序合并连接是在内存中(PGA中的work area)匹配数据。
SCOTT@o11g>select /*+ gather_plan_statistics */ e.ename,e.job,d.dname from emp e ,dept d where e.deptno > d.deptno ;
ENAME JOB DNAME
-------------------- ------------------ ----------------------------
此处省略输出......
17 rows selected.
Elapsed: 00:00:00.00
SCOTT@o11g>select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')) ;
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------
SQL_ID 3ac7qrztsbqpz, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ e.ename,e.job,d.dname from emp e
,dept d where e.deptno > d.deptno
Plan hash value: 844388907
-----------------------------------------------------------------------------------------
| Id| Operation| Name|Starts | E-Rows| A-Rows| A-Time| Buffers | OMem| 1Mem| Used-Mem |
-----------------------------------------------------------------------------------------
|0 | SELECT STATEMENT | | 1 | | 17 |00:00:00.01 | 12 | | | |
|1 | MERGE JOIN | | 1 | 14 | 17 |00:00:00.01 | 12 | | | |
|2 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 4 | 3 |00:00:00.01 | 6 | | | |
|3 | INDEX FULL SCAN | PK_DEPT | 1 | 4 | 3 |00:00:00.01 | 3 | | | |
|* 4 | SORT JOIN | | 3 | 14 | 17 |00:00:00.01 | 6 | 2048 | 2048 | 2048 (0)|
|5 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.01 | 6 | | | |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("E"."DEPTNO">"D"."DEPTNO")
filter("E"."DEPTNO">"D"."DEPTNO")
24 rows selected.
Elapsed: 00:00:00.02
复制
以上执行计划中:
离MERGE JOIN关键字最近的表就是驱动表,DEPT是驱动表;
驱动表只扫描1次
思考:怎么优化排序合并连接?
回答:如果两表关联是等值连接,走的是排序合并连接,我们可以将表连接方式改为HASH连接;
如果是非等值连接,可以考虑增加两表的显示条件,将两个表数据量缩小;
5.4、笛卡尔连接(CARTESIAN JOIN)
如果两个表连接没有关联条件就会产生笛卡尔积;
SCOTT@o11g>set autot trace SCOTT@o11g>select * from emp,dept ; 56 rows selected. Elapsed: 00:00:00.01 Execution Plan ---------------------------------------------------------- Plan hash value: 2034389985 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 56 | 3248 | 10 (0)| 00:00:01 | | 1 | MERGE JOIN CARTESIAN| | 56 | 3248 | 10 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL | DEPT | 4 | 80 | 3 (0)| 00:00:01 | | 3 | BUFFER SORT | | 14 | 532 | 7 (0)| 00:00:01 | | 4 | TABLE ACCESS FULL | EMP | 14 | 532 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 58 recursive calls 0 db block gets 80 consistent gets 0 physical reads 0 redo size 4265 bytes sent via SQL*Net to client 556 bytes received via SQL*Net from client 5 SQL*Net roundtrips to/from client 11 sorts (memory) 0 sorts (disk) 56 rows processed
复制
在多表关联的时候,两个表没有直接关联条件,但是优化器错误的把某个表返回的Rows算为1行(注意必须是1行),这个时候也可能发生笛卡尔连接;
如果两个表有直接关联条件,无法控制两个表进行笛卡尔连接;
如果两个表没有直接关联条件,我们在编写SQL的时候将两个表依次放在from后面并且添加HINT:ordered,就可以使两个表进行笛卡尔积关联;
不加HINT SCOTT@o11g>select a.ename,a.sal,a.deptno,b.dname,c.grade from dept b,salgrade c,emp a where a.deptno = b.deptno and a.sal between c.losal and c.hisal ; 14 rows selected. Elapsed: 00:00:00.02 Execution Plan ---------------------------------------------------------- Plan hash value: 2614604844 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 42 | 1512 | 11 (28)| 00:00:01 | | 1 | MERGE JOIN | | 42 | 1512 | 11 (28)| 00:00:01 | | 2 | SORT JOIN | | 14 | 364 | 7 (29)| 00:00:01 | | 3 | MERGE JOIN | | 14 | 364 | 6 (17)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 52 | 2 (0)| 00:00:01 | | 5 | INDEX FULL SCAN | PK_DEPT | 4 | | 1 (0)| 00:00:01 | |* 6 | SORT JOIN | | 14 | 182 | 4 (25)| 00:00:01 | | 7 | TABLE ACCESS FULL | EMP | 14 | 182 |3 (0)| 00:00:01 | |* 8 | FILTER | | | | | | |* 9 | SORT JOIN | | 5 | 50 | 4 (25)| 00:00:01 | | 10 | TABLE ACCESS FULL | SALGRADE | 5 | 50 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 6 - access("A"."DEPTNO"="B"."DEPTNO") filter("A"."DEPTNO"="B"."DEPTNO") 8 - filter("A"."SAL"<="C"."HISAL") 9 - access(INTERNAL_FUNCTION("A"."SAL")>=INTERNAL_FUNCTION("C"."LOSAL")) filter(INTERNAL_FUNCTION("A"."SAL")>=INTERNAL_FUNCTION("C"."LOSAL")) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 14 consistent gets 0 physical reads 0 redo size 1164 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 3 sorts (memory) 0 sorts (disk) 14 rows processed 加HINT:ordered SCOTT@o11g>select /*+ ordered */a.ename,a.sal,a.deptno,b.dname,c.grade from dept b,salgrade c,emp a where a.deptno = b.deptno and a.sal between c.losal and c .hisal ; 14 rows selected. Elapsed: 00:00:00.05 Execution Plan ---------------------------------------------------------- Plan hash value: 2197699399 ---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 42 | 1512 | 13 (0)| 00:00:01 | |* 1 | HASH JOIN | | 42 | 1512 | 13 (0)| 00:00:01 | | 2 | MERGE JOIN CARTESIAN| | 20 | 460 | 10 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | DEPT | 4 | 52 | 3 (0)| 00:00:01 | | 4 | BUFFER SORT | | 5 | 50 | 7 (0)| 00:00:01 | | 5 | TABLE ACCESS FULL | SALGRADE | 5 | 50 | 2 (0)| 00:00:01 | | 6 | TABLE ACCESS FULL | EMP | 14 | 182 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("A"."DEPTNO"="B"."DEPTNO") filter("A"."SAL">="C"."LOSAL" AND "A"."SAL"<="C"."HISAL") Statistics ---------------------------------------------------------- 28 recursive calls 0 db block gets 53 consistent gets 6 physical reads 0 redo size 1207 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 6 sorts (memory) 0 sorts (disk) 14 rows processed
复制
思考:当执行计划中有笛卡儿连接应该怎么优化?
回答:首先应该检查表是否有关联条件,如果没有关联条件,那么应该询问开发与业务人员为何表没有关联条件,是否为满足业务需求而故意不写关联条件;其次应该检查离笛卡尔连接最近的表是否真的返回1行数据,如果返回数据真的只有1行,那么走笛卡尔连接是没有问题的,如果返回行数据超过1行,那就需要检查为什么Rows会估算错误。纠正错误的Rows之后,优化器就不会走笛卡尔连接了;
可以使用/+ opt_param(’_optimizer_mjc_enable’,‘false’) / 禁止笛卡尔连接;
5.5、标量子查询(SCALAR SUBQUERY)
当一个子查询介于select与from之间,这种子查询就叫标量子查询;
标量子查询可以等价改写为外连接;
SCOTT@o11g>select /*+gather_plan_statistics */ e.ename,e.sal,(select d.dname from dept d where d.deptno = e.deptno) dname from emp e ;
ENAME SAL DNAME
-------------------- ---------- ----------------------------
省略输出......
14 rows selected.
Elapsed: 00:00:00.00
SCOTT@o11g>select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')) ;
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------SQL_ID 1xmbyqw4m967p, child number 0
-------------------------------------
select /*+gather_plan_statistics */ e.ename,e.sal,(select d.dname from
dept d where d.deptno = e.deptno) dname from emp e
Plan hash value: 2981343222
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 7 |
| 1 | TABLE ACCESS BY INDEX ROWID| DEPT | 3 | 1 | 3 |00:00:00.01 | 5 |
|* 2 | INDEX UNIQUE SCAN | PK_DEPT | 3 | 1 | 3 |00:00:00.01 | 2 |
| 3 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.01 | 7 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("D"."DEPTNO"=:B1)
21 rows selected.
Elapsed: 00:00:00.05
复制
标量子查询类似一个天然的嵌套循环;
等价改写为外连接:
select d.dname, d.loc, (select max(e.sal) from emp e where e.deptno = d.deptno) max_sal from dept d; 等价改写: select d.dname, d.loc, e.max_sal from dept d left join (select max(sal) max_sal, deptno from emp e group by deptno) e on d.deptno = e.deptno;
复制
5.6、半连接(SEMI JOIN)
两表关联只返回一个表的数据就叫半连接。半连接一般就是指的in和exists;
5.6.1、半连接等价改写
半连接in的写法如下:
select * from dept where deptno in (select deptno from emp);
复制
半连接exists的写法如下:
select *
from dept
where exists (select null from emp where dept.deptno = emp.deptno);
复制
5.6.2、控制半连接执行计划
原始执行计划:(排序合并连接) SCOTT@o11g>set autot trace SCOTT@o11g>select * from dept where deptno in ( select deptno from emp) ; Elapsed: 00:00:00.02 Execution Plan ---------------------------------------------------------- Plan hash value: 1090737117 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 69 | 6 (17)| 00:00:01 | | 1 | MERGE JOIN SEMI | | 3 | 69 | 6 (17)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 | 2 (0)| 00:00:01 | | 3 | INDEX FULL SCAN | PK_DEPT | 4 | | 1 (0)| 00:00:01 | |* 4 | SORT UNIQUE | | 14 | 42 | 4 (25)| 00:00:01 | | 5 | TABLE ACCESS FULL | EMP | 14 | 42 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("DEPTNO"="DEPTNO") filter("DEPTNO"="DEPTNO") Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 10 consistent gets 0 physical reads 0 redo size 768 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 3 rows processed 变更执行计划:(嵌套连接,驱动表为dept) SCOTT@o11g>set autot trace SCOTT@o11g>select /*+ use_nl(emp@a,dept) leading(dept) */ * from dept where deptno in (select /*+qb_name(a)*/ deptno from emp) ; Execution Plan ---------------------------------------------------------- Plan hash value: 2645846736 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 69 | 10 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 3 | 69 | 10 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| DEPT | 4 | 80 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL| EMP | 9 | 27 | 2 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("DEPTNO"="DEPTNO") Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 31 consistent gets 0 physical reads 0 redo size 768 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 3 rows processed
复制
5.7、反连接(ANTI JOIN)
两表关联只返回主表的数据,而且只返回主表与子表没有关联上的数据;
反连接一般就是指not in 和not exists;
注意,not in里面如果有null,整个查询会返回空,而in里面有null,查询不受null影响;
5.7.1、反连接等价改写
select * from dept where deptno not in (select deptno from emp where deptno is not null);
select *
from dept
where not exists (select null from emp where dept.deptno = emp.deptno);
select d.*
from dept d
left join emp e
on d.deptno = e.deptno
where e.deptno is null
复制
5.7.2、控制反连接执行计划
以下为排序合并连接: SCOTT@o11g>set autot trace SCOTT@o11g>select * from dept where deptno not in ( select deptno from emp) ; Execution Plan ---------------------------------------------------------- Plan hash value: 2230682264 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 23 | 6 (17)| 00:00:01 | | 1 | MERGE JOIN ANTI NA | | 1 | 23 | 6 (17)| 00:00:01 | | 2 | SORT JOIN | | 4 | 80 | 2 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 4 | 80 | 2 (0)| 00:00:01 | | 4 | INDEX FULL SCAN | PK_DEPT | 4 | | 1 (0)| 00:00:01 | |* 5 | SORT UNIQUE | | 14 | 42 | 4 (25)| 00:00:01 | | 6 | TABLE ACCESS FULL | EMP | 14 | 42 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - access("DEPTNO"="DEPTNO") filter("DEPTNO"="DEPTNO") Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 8 consistent gets 0 physical reads 0 redo size 674 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 1 rows processed 使用嵌套循环连接: SCOTT@o11g>select /*+ use_nl(dept,emp@a) */ * from dept where deptno not in (select /*+qb_name(a)*/ deptno from emp where deptno is not null) ; Execution Plan ---------------------------------------------------------- Plan hash value: 1522491139 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 23 | 10 (0)| 00:00:01 | | 1 | NESTED LOOPS ANTI | | 1 | 23 | 10 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| DEPT | 4 | 80 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL| EMP | 9 | 27 | 2 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("DEPTNO" IS NOT NULL AND "DEPTNO"="DEPTNO") Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 30 consistent gets 0 physical reads 0 redo size 674 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed SCOTT@o11g>select /*+ use_nl(dept,emp@a) */ * from dept where not exists (select /*+qb_name(a)*/ null from emp where emp.deptno = dept.deptno) ; Execution Plan ---------------------------------------------------------- Plan hash value: 1522491139 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 23 | 10 (0)| 00:00:01 | | 1 | NESTED LOOPS ANTI | | 1 | 23 | 10 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| DEPT | 4 | 80 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS FULL| EMP | 9 | 27 | 2 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("EMP"."DEPTNO"="DEPT"."DEPTNO") Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 30 consistent gets 0 physical reads 0 redo size 674 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed 以下无法改变驱动表: SCOTT@o11g>select /*+ use_hash(dept,emp@a) leading(emp@a)*/ * from dept where not exists (select /*+qb_name(a)*/ null from emp where emp.deptno = dept.deptno ) ; Execution Plan ---------------------------------------------------------- Plan hash value: 474461924 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 23 | 6 (0)| 00:00:01 | |* 1 | HASH JOIN ANTI | | 1 | 23 | 6 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| DEPT | 4 | 80 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL| EMP | 14 | 42 | 3 (0)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("EMP"."DEPTNO"="DEPT"."DEPTNO") Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 12 consistent gets 0 physical reads 0 redo size 674 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed 以下改变了驱动表: SCOTT@o11g>select /*+ use_hash(dept,emp@a) swap_join_inputs(emp@a)*/ * from dept where not exists (select /*+qb_name(a)*/ null from emp where emp.deptno = de pt.deptno) ; Execution Plan ---------------------------------------------------------- Plan hash value: 152508289 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 23 | 6 (0)| 00:00:01 | |* 1 | HASH JOIN RIGHT ANTI| | 1 | 23 | 6 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL | EMP | 14 | 42 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | DEPT | 4 | 80 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("EMP"."DEPTNO"="DEPT"."DEPTNO") Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 12 consistent gets 0 physical reads 0 redo size 674 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
复制
5.8、FILTER
如果子查询(in/exists/not in/not exists)没能展开(unnest),在执行计划中就会产生FILTER,FILTER类似嵌套循环,FILTER的算法与标量子查询一模一样;
SCOTT@o11g>set autot traceonly
SCOTT@o11g>select ename, deptno
from emp
where exists (select deptno
from dept
where emp.deptno = dept.deptno
and dname = 'RESEARCH'
7 and rownum = 1)
8 ;
Execution Plan
----------------------------------------------------------
Plan hash value: 3414630506
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 45 | 6 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL | EMP | 14 | 126 | 3 (0)| 00:00:01 |
|* 3 | COUNT STOPKEY | | | | | |
|* 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT 0 FROM "DEPT" "DEPT" WHERE ROWNUM=1 AND
"DEPT"."DEPTNO"=:B1 AND "DNAME"='RESEARCH'))
3 - filter(ROWNUM=1)
4 - filter("DNAME"='RESEARCH')
5 - access("DEPT"."DEPTNO"=:B1)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
13 consistent gets
0 physical reads
0 redo size
698 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
5 rows processed
SCOTT@o11g>alter session set statistics_level=all;
Session altered.
select ename, deptno
from emp
where exists (select deptno
from dept
where emp.deptno = dept.deptno
and dname = 'RESEARCH'
7 and rownum = 1);
ENAME DEPTNO
-------------------- ----------
SMITH 20
JONES 20
SCOTT 20
ADAMS 20
FORD 20
SCOTT@o11g>select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST')) ;
PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------------------------SQL_ID 6mq67by27udgm, child number 1
-------------------------------------
select ename, deptno from emp where exists (select deptno
from dept where emp.deptno = dept.deptno and dname
= 'RESEARCH' and rownum = 1)
Plan hash value: 3414630506
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 5 |00:00:00.01 | 13 |
|* 1 | FILTER | | 1 | | 5 |00:00:00.01 | 13 |
| 2 | TABLE ACCESS FULL | EMP | 1 | 14 | 14 |00:00:00.01 | 7 |
|* 3 | COUNT STOPKEY | | 3 | | 1 |00:00:00.01 | 6 |
|* 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 3 | 1 | 1 |00:00:00.01 | 6 |
|* 5 | INDEX UNIQUE SCAN | PK_DEPT | 3 | 1 | 3 |00:00:00.01 | 3 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( IS NOT NULL)
3 - filter(ROWNUM=1)
4 - filter("DNAME"='RESEARCH')
5 - access("DEPT"."DEPTNO"=:B1)
27 rows selected.
复制
5.9、IN和EXIST谁快
in和exists是半连接,半连接也属于表连接,那么既然是表连接,我们需要关心两表的大小以及两表之间究竟走什么连接方式,才能去优化SQL;
5.10、SQL语句的本质
标量子查询可以改写为外连接(需要注意表与表之间关系,去重);
半连接可以改写为内连接(需要注意表与表之间关系,去重);
反连接可以改写为外连接(不需要注意表与表之间的关系,也不需要去重);
SQL语句其本质就是表连接,以及表与表之间是几比几关系再加上group by;