测试目的:
测试数据库优化器是否会将特定left join语句改写成单表查询?
测试数据库类型和版本:
Oracle:11.20.4.0
MySQL:8.0.XX
达梦:8.1.3.XX
Oracle数据库
新增测试数据:
sqlplus cjc/******
create table t1 (id int,name varchar(10));
create table t2 (id int,name varchar(10));
insert into t1 values(1,'a');
insert into t1 values(2,'x');
insert into t1 values(3,'y');
insert into t2 values(2,'aa');
insert into t2 values(3,'xx');
insert into t2 values(6,'ff');
commit;
收集统计信息
EXEC DBMS_STATS.GATHER_TABLE_STATS('CJC','T1',estimate_percent=>100,method_opt=> 'FOR ALL INDEXED COLUMNS',CASCADE=> TRUE);
EXEC DBMS_STATS.GATHER_TABLE_STATS('CJC','T2',estimate_percent=>100,method_opt=> 'FOR ALL INDEXED COLUMNS',CASCADE=> TRUE);
问题:
如下SQL,查询结果显然是t1表id列的全部数据,那么Oracle优化器会将此语句改写成t1表单表查询吗,也就是去掉t2表的扫描?
select t1.id from t1 left join t2 on t1.id=t2.id;
查看执行计划,当前场景下,仍然扫描了t2表
SQL> set autotrace on
SQL> select t1.id from t1 left join t2 on t1.id=t2.id;
ID
----------
2
3
1
Execution Plan
----------------------------------------------------------
Plan hash value: 1823443478
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 33 | 6 (0)| 00:00:01 |
|* 1 | HASH JOIN OUTER | | 3 | 33 | 6 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| T1 | 3 | 15 | 3 (0)| 00:00:01 |
| 3 | TABLE ACCESS FULL| T2 | 3 | 18 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."ID"="T2"."ID"(+))
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
13 consistent gets
0 physical reads
0 redo size
582 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
通过10053查看SQL改写
alter system flush shared_pool;
alter session set tracefile_identifier='10053';
ALTER SESSION SET EVENTS='10053 trace name context forever, level 1';
select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
ALTER SESSION SET EVENTS '10053 trace name context off';
select value from v$diag_info where name='Default Trace File';
VALUE
--------------------------------------------------------------------------------
/db/oracle/app/oracle/diag/rdbms/cjc/cjc/trace/cjc_ora_24218_10053.trc
SQL改写结果:
vi cjc_ora_24218_10053.trc
搜索关键字:Final query
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "T1"."ID" "ID" FROM "CJC"."T1" "T1","CJC"."T2" "T2" WHERE "T1"."ID"="T2"."ID"(+)
原SQL:
select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
优化器自动改写后的SQL:并没有改写成单表查询
SELECT "T1"."ID" "ID" FROM "CJC"."T1" "T1","CJC"."T2" "T2" WHERE "T1"."ID"="T2"."ID"(+);
如何让优化器改写成t1表单表查询呢?
添加主键
SQL> alter table cjc.t1 add primary key (id);
SQL> alter table cjc.t2 add primary key (id);
再次收集统计信息
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('CJC','T1',estimate_percent=>100,method_opt=> 'FOR ALL INDEXED COLUMNS',CASCADE=> TRUE);
SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('CJC','T2',estimate_percent=>100,method_opt=> 'FOR ALL INDEXED COLUMNS',CASCADE=> TRUE);
查看执行计划,只扫描了t1表的主键索引,没有扫描t2表。
SQL> set line 100
SQL> select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
ID
----------
1
2
3
Execution Plan
----------------------------------------------------------
Plan hash value: 3198233734
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 9 | 1 (0)| 00:00:01 |
| 1 | INDEX FULL SCAN | SYS_C0011128 | 3 | 9 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
582 bytes sent via SQL*Net to client
520 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
执行计划里的SYS_C0011128是t1表的主键
SQL> col owner for a20
SQL> select OWNER,CONSTRAINT_NAME,TABLE_NAME,CONSTRAINT_TYPE from dba_constraints where CONSTRAINT_NAME='SYS_C0011128';
OWNER CONSTRAINT_NAME TABLE_NAME C
-------------------- ------------------------------ ------------------------------ -
CJC SYS_C0011128 T1 P
SQL> col COLUMN_NAME for a20
SQL> select TABLE_NAME,INDEX_NAME,COLUMN_NAME from DBA_IND_COLUMNS where TABLE_NAME='T1' and TABLE_OWNER='CJC';
TABLE_NAME INDEX_NAME COLUMN_NAME
------------------------------ ------------------------------ --------------------
T1 SYS_C0011128 ID
通过10053查看SQL改写
alter system flush shared_pool;
alter session set tracefile_identifier='10053';
ALTER SESSION SET EVENTS='10053 trace name context forever, level 1';
select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
ALTER SESSION SET EVENTS '10053 trace name context off';
select value from v$diag_info where name='Default Trace File';
VALUE
--------------------------------------------------------------------------------
/db/oracle/app/oracle/diag/rdbms/cjc/cjc/trace/cjc_ora_25374_10053.trc
可以看到,优化器最终将SQL改成了t1表单表查询,去掉了t2表的扫描
vi cjc_ora_25374_10053.trc
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "T1"."ID" "ID" FROM "CJC"."T1" "T1"
kkoqbc: optimizing query block SEL$ABDE525E (#0)
原SQL:
select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
优化器自动改写后的SQL:
SELECT "T1"."ID" "ID" FROM "CJC"."T1" "T1";
为什么t1,t2表添加主键后,优化器才能改写SQL为单表查询?
实际上只需要对t2表id列添加唯一约束,优化器就可以改写SQL为单表查询。
删除主键
alter table cjc.t1 drop primary key;
alter table cjc.t2 drop primary key;
添加唯一约束
alter table cjc.t2 add constraint uk_t2_id unique(id);
收集统计信息
EXEC DBMS_STATS.GATHER_TABLE_STATS('CJC','T1',estimate_percent=>100,method_opt=> 'FOR ALL INDEXED COLUMNS',CASCADE=> TRUE);
EXEC DBMS_STATS.GATHER_TABLE_STATS('CJC','T2',estimate_percent=>100,method_opt=> 'FOR ALL INDEXED COLUMNS',CASCADE=> TRUE);
再次查询
select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 9 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| T1 | 3 | 9 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
但实际上t2表id列是否唯一,是否有空值等,并不影响结果,Oracle虽然以优化器强大著称,但也有不靠谱的时候。
继续对比看看MySQL和达梦数据库优化器SQL改写的能力:
MySQL数据库
新增测试数据
mysql> use cjc;
create table t1 (id int,name varchar(10));
create table t2 (id int,name varchar(10));
insert into t1 values(1,'a');
insert into t1 values(2,'x');
insert into t1 values(3,'y');
insert into t2 values(2,'aa');
insert into t2 values(3,'xx');
insert into t2 values(6,'ff');
mysql> select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
+------+
| id |
+------+
| 1 |
| 2 |
| 3 |
+------+
3 rows in set (0.00 sec)
查看执行计划,扫描了t1,t2表
mysql> explain select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | NULL |
| 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+--------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
对t2表id列添加唯一约束,还是会扫描t2表?
mysql> alter table cjc.t2 add constraint uk_t2_id unique(id);
mysql> explain select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
+----+-------------+-------+------------+--------+---------------+----------+---------+-----------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+----------+---------+-----------+------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | NULL |
| 1 | SIMPLE | t2 | NULL | eq_ref | uk_t2_id | uk_t2_id | 5 | cjc.t1.id | 1 | 100.00 | Using index |
+----+-------------+-------+------------+--------+---------------+----------+---------+-----------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
在看看达梦数据库的表现:
达梦数据库
新增测试数据
dmdba@CJC-DB-01:/home/dmdba$disql cjc/******:5238
create table t1 (id int,name varchar(10));
create table t2 (id int,name varchar(10));
insert into t1 values(1,'a');
insert into t1 values(2,'x');
insert into t1 values(3,'y');
insert into t2 values(2,'aa');
insert into t2 values(3,'xx');
insert into t2 values(6,'ff');
commit;
SQL> select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
行号 id
---------- -----------
1 2
2 3
3 1
已用时间: 1.303(毫秒). 执行号:11401.
收集统计信息
SQL> DBMS_STATS.GATHER_SCHEMA_STATS('CJC',100,TRUE,'FOR ALL COLUMNS SIZE AUTO');
查看执行计划,达梦默认是索引组织表,扫描了t1,t2两个表的聚簇索引:
SQL> explain select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
1 #NSET2: [1, 3, 4]
2 #PRJT2: [1, 3, 4]; exp_num(1), is_atom(FALSE)
3 #HASH LEFT JOIN2: [1, 3, 4]; key_num(1), partition_keys_num(0), ret_null(0), mix(0) KEY(t1.id=t2.id)
4 #CSCN2: [1, 3, 4]; INDEX33573407(t1)
5 #CSCN2: [1, 3, 4]; INDEX33573408(t2)
已用时间: 0.767(毫秒). 执行号:0.
添加唯一约束,可以看到优化器将SQL改写成单表查询了,和Oracle类似。
SQL> alter table cjc.t2 add constraint uk_t2_id unique(id);
操作已执行
已用时间: 245.113(毫秒). 执行号:11504.
SQL> DBMS_STATS.GATHER_SCHEMA_STATS('CJC',100,TRUE,'FOR ALL COLUMNS SIZE AUTO');
DMSQL 过程已成功完成
已用时间: 115.322(毫秒). 执行号:11505.
SQL> explain select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
1 #NSET2: [1, 3, 16]
2 #PRJT2: [1, 3, 16]; exp_num(2), is_atom(FALSE)
3 #CSCN2: [1, 3, 16]; INDEX33573407(t1)
已用时间: 0.593(毫秒). 执行号:0.
SQL> set autotrace traceonly
SQL> select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
1 #NSET2: [1, 3->3, 16]
2 #PRJT2: [1, 3->3, 16]; exp_num(2), is_atom(FALSE)
3 #CSCN2: [1, 3->3, 16]; INDEX33573407(t1)
Statistics
-----------------------------------------------------------------
0 data pages changed
0 undo pages changed
1 logical reads
0 physical reads
0 redo size
176 bytes sent to client
122 bytes received from client
1 roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
0 io wait time(ms)
0 exec time(ms)
生成10053
alter session 0 set events '10053 trace name context forever,level 1';
select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
alter session set events '10053 trace name context off';
dmdba@CJC-DB-01:/db/dm8/data/cjc/trace$ls -lrth CJC01_0314_1504_140727623426088.trc
-rw-r--r-- 1 dmdba dinstall 61K Mar 14 15:12 CJC01_0314_1504_140727623426088.trc
查看SQL改写:
vi CJC01_0314_1504_140727623426088.trc
DM Database Server x64 V8[1-3-62-2023.12.23-213044-20067-ENT ], Dec 26 2023 20:08:39 built.
*** 2025-03-14 15:12:18.439000000
*** Start trace 10053 event [level 1]
Current SQL Statement:
select t1.id from cjc.t1 left join cjc.t2 on t1.id=t2.id;
......
*** Plan before optimized:
project[0x7ffdb404e0c8] n_exp(2)
base table[0x7ffdb404da98] (t1, FULL SEARCH)
---------------- single table access path probe for t1 ----------------
*** path 1: INDEX33573407 (FULL search), cost: 0.04483
>>> best access path: INDEX33573407 (FULL search), cost: 0.04483
*** BEST PLAN FOR THIS STATEMENT ***
project[0x7ffdb405fa68] n_exp(2) (cost: 0.04483, rows: 3)
base table[0x7ffdb40603e8] (t1, INDEX33573407, FULL SEARCH) (cost: 0.04483, rows: 3)
可以看到,达梦数据库也没有扫描t2表。
###chenjuchao 20250315###
欢迎关注我的公众号《IT小Chen》
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。