暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

为什么我查询中的一些表从执行计划中消失了?

原创 赵勇 2022-09-12
1114

原文:http://optimizermagic.blogspot.com/2008/06/why-are-some-of-tables-in-my-query.html

Oracle 10gR2时,我们引入了一个新的转换技术,表消除(也称之为“连接消除”),它会消除查询中多余的表。如果表中的一个列只在连接谓词中使用,而且保证这些连接既不会减少结果集,也不会扩展结果集中的记录,那么这个表就是多余的。有若干种情况下,Oracle会消除多余的表,接下来我们会逐一讨论。

主键 – 外键表消除

从10gR2开始,优化器消除因主键–外键约束而多余的表。考虑以下示例中的表:

create table jobs 
(
job_id NUMBER PRIMARY KEY,
job_title VARCHAR2(35) NOT NULL,
min_salary NUMBER,
max_salary NUMBER
);
create table departments 
(
department_id NUMBER PRIMARY KEY,
department_name VARCHAR2(50)
);
create table employees
(
employee_id NUMBER PRIMARY KEY,
employee_name VARCHAR2(50),
department_id NUMBER REFERENCES departments(department_id),
job_id NUMBER REFERENCES jobs(job_id)
);


以及查询:

select e.employee_name
from employees e, departments d
where e.department_id = d.department_id; 


在这个查询中,对departments的连接就是多余的。departments表中的列只出现在连接谓词中,并且主键-外键约束保证了employees中的每一行,最多只能在departments中匹配到一行。因此,查询是等价于:

select e.employee_name
from employees e
where e.department_id is not null; 


优化器会为该查询生成如下的执行计划:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter("E"."DEPARTMENT_ID" IS NOT NULL)


注:如果列上有非空约束,那么谓词IS NOT NULL不是必须的。

自11gR1起,优化器还可以消除半连接或反连接中的表。考虑如下查询:

select e.employee_id, e.employee_name
from employees e
where not exists (select 1
                  from jobs j
                  where j.job_id = e.job_id);


由于employees.job_id是jobs.job_id的外键,任何employees.job_id中的非空值,必须在jobs表中有一个对应。因此,只有employees表中的job_id列上存在NULL值的,才会出现在结果中。因此,该查询等价于:

select e.employee_id, e.employee_name
from employees e
where job_id is null;

且优化器可以选择这样的执行计划:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("E"."JOB_ID" IS NULL)


假定employees.job_id有一个非空约束

alter table employees modify job_id not null;

在这种情况下,EMPLOYEES中不可能有任何满足条件的行,优化器会选择下面的执行计划:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   FILTER                         
   2    TABLE ACCESS FULL   EMPLOYEES 
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter(NULL IS NOT NULL)



过滤条件"NULL IS NOT NULL" 是一个常为false的谓词,这最终会阻止表扫描的发生。

还是在11gR1中,优化器支持对符合ANSI的连接做表消除,比如下面的查询:

select employee_name
from employees e inner join jobs j 
on e.job_id = j.job_id;


优化器会消除JOBS表,并产生这样的执行计划:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
   1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------


外连接表消除

11gR1中,对外连接引入了一种新风格的表消除。它并不需要主键-外键约束。为了演示,我们需要一个新表并在EMPLOYEES中附加新列。

create table projects
(
project_id NUMBER UNIQUE,
deadline DATE,
priority NUMBER
);

alter table employees add project_id number;


现在,考虑下面外连接employees和projects表的查询:

select e.employee_name, e.project_id
from employees e, projects p
where e.project_id = p.project_id (+);


外连接保证了employees中的每一行至少在结果中出现一次。在projects.project_id列上在的唯一约束,保证了employees中的每一行最多在projects表中匹配到一行。综上,这两个条件保证了employees表中的每一行,只会在结果集中出现一次。由于没有projects表上的其它列被引用,projects表可以被消除。所以,优化器会选择下面的执行计划:


-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
   1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------


为什么我会写下这样的查询?

本文中的所有样例查询都是非常简单的,一个人不太可能写出明显不需要连接的查询。在现实中,有许多场景下,表消除是有帮助的,包括机器生成的SQL和消除视图中的表。比如,由一堆表构成的视图,是有连接的。为了检索视图中出现的所有列,连接也许是需要的。但是,一些视图的使用者,可能只是访问视图列的一个子集。在这种情况下,连接的表可以被消除。

比如,考虑该视图:

create view employee_directory_v as
select e.employee_name, d.department_name, j.job_title
from employees e, departments d, jobs j
where e.department_id = d.department_id
and e.job_id = j.job_id;


该视图可以供一个简单的员工花名册的应用程序,用于通过job_title查找员工姓名。应用发出如下查询:

select employee_name
from employee_directory_v
where department = 'ACCOUNTING';


由于job_title未被引用,jobs表可以从查询中被消除,优化器会选择如下的执行计划:

--------------------------------------------
 Id   Operation             Name       
--------------------------------------------
   0  SELECT STATEMENT                 
*  1   HASH JOIN                       
   2    TABLE ACCESS FULL   EMPLOYEES  
*  3    TABLE ACCESS FULL   DEPARTMENTS
--------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
 3 - filter("D"."DEPARTMENT_NAME"='ACCOUNTING')



已知限制

目前有以下几个表消除的限制:

多列主键-外键约束不支持
在查询的其它位置引用了连接键,将阻止表消除。对于一个内连接,连接两侧的列是等价的,如果在查询的其它位置引用了表的连接列,则该表不能被消除。变通方法是改写查询,引用另一个表的连接列。(但我们确认这不总是有效的)

原文链接:http://optimizermagic.blogspot.com/2008/06/why-are-some-of-tables-in-my-query.html

原文内容(注,因原文链接已不可访问,故以下内容转载自其它网址中的内容,故不能保证仍是“原文”):

In 10gR2, we introduced a new transformation, table elimination (alternately called “join elimination”), which removes redundant tables from a query. A table is redundant if its columns are only referenced to in join predicates, and it is guaranteed that those joins neither filter nor expand the resulting rows. There are several cases where Oracle will eliminate a redundant table. We will discuss each case in turn.

Primary Key-Foreign Key Table Elimination

Starting in 10gR2, the optimizer eliminates tables that are redundant due to primary key-foreign key constraints. Consider the following example tables:

create table jobs 
(
job_id NUMBER PRIMARY KEY,
job_title VARCHAR2(35) NOT NULL,
min_salary NUMBER,
max_salary NUMBER
);
create table departments 
(
department_id NUMBER PRIMARY KEY,
department_name VARCHAR2(50)
);
create table employees
(
employee_id NUMBER PRIMARY KEY,
employee_name VARCHAR2(50),
department_id NUMBER REFERENCES departments(department_id),
job_id NUMBER REFERENCES jobs(job_id)
);


and the query:

from employees e, departments d
where e.department_id = d.department_id; 


In this query, the join to departments is redundant. The only column referenced in the query appears in the join predicate, and the primary key-foreign key constraint guarantees that there is at most one match in departments for each row in employees. Hence, the query is equivalent to:

select e.employee_name
from employees e
where e.department_id is not null; 


The optimizer will generate this plan for the query:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter("E"."DEPARTMENT_ID" IS NOT NULL)



Note that the IS NOT NULL predicate is not necessary if the column has a NOT NULL constraint on it.

Starting in 11gR1, the optimizer will also eliminate tables that are semi-joined or anti-joined. Consider the following query:

select e.employee_id, e.employee_name
from employees e
where not exists (select 1
                  from jobs j
                  where j.job_id = e.job_id);



Since employees.job_id is a foreign key to jobs.job_id, any non-null value inemployees.job_id must have a match in jobs. So only employees with null values foremployees.job_id will appear in the result. Hence, this query is equivalent to:

select e.employee_id, e.employee_name
from employees e
where job_id is null;

and the optimizer can choose this plan:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("E"."JOB_ID" IS NULL)



Suppose employees.job_id has a NOT NULL constraint:

alter table employees modify job_id not null;

In this case, there could not possibly be any rows in EMPLOYEES, and the optimizer could choose this plan:

 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   FILTER                         
   2    TABLE ACCESS FULL   EMPLOYEES 
-------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter(NULL IS NOT NULL)


The “NULL IS NOT NULL” filter is a false constant predicate, that will prevent the table scan from even taking place.

“NULL IS NOT NULL”

Also in 11gR1, the optimization became available for ANSI compliant joins. For this query:

select employee_name
from employees e inner join jobs j
on e.job_id = j.job_id;

the optimizer can eliminate JOBS and produce this plan:

 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
   1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------



Outer Join Table Elimination

In 11gR1, a new form of table elimination was introduced for outer joins, which does not require PK-FK constraints. For the example, we require a new table and an addition to EMPLOYEES:

create table projects
(
project_id NUMBER UNIQUE,
deadline DATE,
priority NUMBER
);

alter table employees add project_id number;

Now consider a query that outer joins employees and projects:

select e.employee_name, e.project_id
from employees e, projects p
where e.project_id = p.project_id (+);

The outer join guarantees that every row in employees will appear at least once in the result. The unique constraint on projects.project_id guarantees that every row in employees will match at most one row in projects. Together, these two properties guarantee that every row inemployees will appear in the result exactly once. Since no other columns from projects are referenced, projects can be eliminated, and the optimizer can choose this plan:

 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
   1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------


Why Would I Ever Write Such a Query?

All of the example queries in this post are very simple, and one would be unlikely to write a query where the join is so obviously unnecessary. There are many real world scenarios where table elimination may be helpful, including machine-generated queries and elimination of tables in views. For example, a set of tables might be exposed as a view, which contains a join. The join may be necessary to retrieve all of the columns exposed by the view. But some users of the view may only access a subset of the columns, and in this case, the joined table can be eliminated.

For example, consider the view:

create view employee_directory_v as
select e.employee_name, d.department_name, j.job_title
from employees e, departments d, jobs j
where e.department_id = d.department_id
and e.job_id = j.job_id;

This view might be exposed to a simple employee directory application. To lookup employee names by job title, the application issues a query:

select employee_name
from employee_directory_v
where department = ‘ACCOUNTING’;

Since the job_title column is not referenced, jobs can be eliminated from the query, and the optimizer can choose this plan:

--------------------------------------------
 Id   Operation             Name       
--------------------------------------------
   0  SELECT STATEMENT                 
*  1   HASH JOIN                       
   2    TABLE ACCESS FULL   EMPLOYEES  
*  3    TABLE ACCESS FULL   DEPARTMENTS
--------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
 3 - filter("D"."DEPARTMENT_NAME"='ACCOUNTING')


Known Limitations

There are currently a few limitations of table elimination:

Multi-column primary key-foreign key constraints are not supported. 
Referring to the join key elsewhere in the query will prevent table elimination. For an inner join, the join keys on each side of the join are equivalent, but if the query contains other references to the join key from the table that could otherwise be eliminated, this prevents elimination. A workaround is to rewrite the query to refer to the join key from the other table (we realize this is not always possible).
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论