暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

Oracle优化器转换:子查询展开之二

原创 赵勇 2022-09-12
1093

在优化器转换系列博客的第一部分,我们讨论了子查询展开。我们解释了子查询如何被评估,以及EXISTS和ANY子查询的展开。
这是该系列的第二篇,我们将会讨论NOT EXISTS和出现在WHERE子句中的单行子查询的展开。我要再一次感谢Rafi ---- 优化器的高级开发人员之一,他为本博文提供了内容。

快速回顾子查询展开的意义

子查询展开是一种优化,它转换子查询为外连接查询,从而允许优化器在访问路径,连接方法和连接顺序的选择时考虑子查询表。展开要么合并子查询为外部查询,要么转换为内联视图(译者注:出现在FROM子句中的子查询)

NOT EXISTS 子查询

反连接总是用来展开NOT EXISTS和ALL子查询。我们使用如下的非标准语法来表示反连接:T1.x A= T2.y,T1是反连接中的左表,T2是右表。反连接的语义是:只要T1.x在T2.y中的任一个值中找到匹配,则拒绝T1表中的这一行。只有当T1.x未匹配到T2.y中的任一个值时,才会返回T1表中的这一行。

考虑查询F,它是一个包含两个表的NOT EXISTS子查询。

F.

SELECT C.cust_last_name, C.country_id  
FROM customers C  
WHERE NOT EXISTS (SELECT 1  
FROM sales S, products P  
WHERE P.prod_id = S.prod_id and  
P.prod_min_price > 90 and  
S.cust_id = C.cust_id);

F中的子查询可以使用反连接展开;然而,子查询中的表是内连接,SALES和PRODUCTS表必须在反连接前被执行。为了强制连接的顺序,产生了内联视图。该展开产生了查询G;这时,内联视图成为了反连接中的右表。

G.

SELECT C.cust_last_name, C.country_id  
FROM customers C,  
(SELECT S.cust_id AS s_cust_id  
FROM sales S, products P  
WHERE P.prod_id = S.prod_id and P.prod_min_price > 90) VW  
WHERE C.cust_id A= VW.s_cust_id;

G的执行计划如XG所示。在三种连接方法中(即,嵌套循环连接,HASH连接和排序合并连接),HASH连接被优化器选中来完成反连接。

XG.
explain_plan_for_G.png

图片.png

单行聚合子查询

考虑查询H,它包含一个聚合的,相关的和单行的子查询。

H.

SELECT C.cust_last_name, C.cust_income_level  
FROM customers C  
WHERE C.cust_credit_limit < (SELECT SUM(S.amount_sold)  
FROM sales S  
WHERE S.cust_id = C.cust_id);

对多个等值相关的谓词做聚合,类似于在子查询列上做GROUP-BY的聚合,然后与外层表在GROUP-BY列上做连接。H中的子查询是以消除相关并转换为GROUP-BY视图的方式来展开的,其与外层表CUSTOMER是内连接。相关条件和连接条件都被转换为了连接谓词。该转换如I所示:

I.

SELECT C.cust_last_name, C.cust_income_level  
FROM customers C,  
SELECT SUM(S.amount_sold) AS sum_amt, S.cust_id  
FROM sales S  
GROUP BY S.cust_id) VW  
WHERE C.cust_credit_limit < VW.sum_amt and  
C.cust_id = VW.cust_id;

I的执行计划如XI所示:

XI.
explain_plan_for_I.png

图片.png

展开的有效性

每一个子查询,在被展开前,要通过一系列的有效性检查。优化器决定展开或者决定不展开一个子查询,可以通过指定适当的HINT来跳过,但是有效性不可以被跳过。因为这将不能保证展开在语义上是等价的。
下面,我们列举了一些目前会使子查询展开无效的检查。注意,这个检查清单并不代表全部。

  • 与非父表发生相关的子查询;例如,子查询SQ3被SQ2所包含(SQ2是SQ3的父亲),SQ2又被SQ1所包含,而SQ3与SQ1中涉及的列是相关的。
  • GROUP-BY子查询是相关的;这种情况下,展开隐含是在GROUP-BY之后做连接,但特定两个操作次序的改变,并不总是对的。
  • 连接和相关条件格式不佳(例如,它们的本地列与相关列出现在谓词的同一侧)并且子查询需要生成内联视图,当这样的谓词出现时,是不允许分离视图列和外层表列的。
  • 对于反意连接(译者注:即OR逻辑)子查询,出现在连接条件或相关条件的外层表列是不同的。

使用视图合并转换,Oracle可以合并展开时生成的Group-by或distinct内联视图,因此,当预期会有一个视图出现时,但在执行计划中可能却并没有出现任何视图。

总结

在这些博文中,我们尝试通过简单的示例查询,来演示不同类型子查询展开背后的基础理论。Oracle可以处理更复杂的查询–在一层或多层中有多个子查询的语句,有多个表,相关或连接条件包含非等谓词和表达式,子查询包含集合操作符,带有GROUP-BY和COUNT聚合函数的子查询,ALL子查询在连接条件列上含有可为空的列,以及反意连接子查询。

如果子查询的展开,并不需要生成内联视图,那么子查询总是展开的。因为这种展开通过允许更多的连接方法和连接次序,从而提供更有效率的执行计划。如果子查询中相关谓词中的本地列(例如查询A中的S.cust_id)上有索引,那么子查询会评估为类似于基于索引的嵌套循环连接。因此,在某些情况下,不展开也许会比生成内联视图,仅允许与外层表做排序合并和HASH连接的展开更有效率。故而,Oracle中,对于生成内联视图的子查询展开,是在基于成本的查询转换框架下,基于成本进行的。
再次感谢Rafi在本博文中的贡献。请关注我们在该系列中的下一篇博文,是有关视图合并的。

原文链接:https://blogs.oracle.com/optimizer/post/optimizer-transformations-subquery-unesting-part-2

原文内容:
Optimizer Transformations: Subquery Unesting part 2
January 1, 2020 | 4 minute read
Maria Colgan
Distinguished Product Manager

In Part one of our blog series on Optimizer transformations we discussed Subquery unesting. We explained how subqueries are evaluated, and the unnesting of EXISTS and ANY subqueries.

Here in part two of our series we will discuss the unnesting of NOT EXISTS and single-row subqueries, which appear in the WHERE clause. Once again I need to give credit to Rafi one of the senior Optimizer developers for the content of this blog!

Quick recap on what we mean by Subquery Unesting

Subquery unnesting is an optimization that converts a subquery into a join in the outer query and allows the optimizer to consider subquery table(s) during access path, join method, and join order selection. Unnesting either merges the subquery into the body of the outer query or turns it into an inline view.

NOT EXISTS Subqueries

Anti-join is always used for unnesting NOT EXISTS and ALL subqueries. We represent anti-join by the following non-standard syntax: T1.x A= T2.y, where T1 is the left table and T2 is the right table of the anti-join. The semantics of anti-join is the following: A row of T1 is rejected as soon as T1.x finds a match with any value of T2.y. A row of T1 is returned, only if T1.x does not match with any value of T2.y.

Consider query F, which has a NOT EXISTS subquery containing two tables.

F.

SELECT C.cust_last_name, C.country_id
FROM customers C
WHERE NOT EXISTS (SELECT 1
FROM sales S, products P
WHERE P.prod_id = S.prod_id and
P.prod_min_price > 90 and
S.cust_id = C.cust_id);

The subquery in F can be unnested by using an anti-join; however, the inner join of the tables in the subquery, sales and products must take place before the anti-join is performed. An inline view is generated in order to enforce the join order. This unnesting produces query G; here the inline view becomes the right table of anti-join.

G.

SELECT C.cust_last_name, C.country_id
FROM customers C,
(SELECT S.cust_id AS s_cust_id
FROM sales S, products P
WHERE P.prod_id = S.prod_id and P.prod_min_price > 90) VW
WHERE C.cust_id A= VW.s_cust_id;

The execution plan of G is shown as XG. Of the three join methods (i.e., nested-loop, hash and sort-merge), the hash method was selected by the optimizer to do the anti-join.

XG.
explain_plan_for_G.png

图片.png

Single-Row Aggregated Subqueries

Consider query H that contains an aggregated, correlated, single-row subquery.

H.

SELECT C.cust_last_name, C.cust_income_level
FROM customers C
WHERE C.cust_credit_limit < (SELECT SUM(S.amount_sold)
FROM sales S
WHERE S.cust_id = C.cust_id);

Doing aggregation for multiple values of equality correlation predicate is like doing aggregation and group-by on the local column, and then joining the view with the outer table on the group-by columns. The subquery in H is unnested by decorrelating it and converting it into a group-by view, which is inner joined with the outer table, customer; here both the correlating and connecting conditions have been turned into join predicates. The transformed query is shown as I.

I.

SELECT C.cust_last_name, C.cust_income_level
FROM customers C,
SELECT SUM(S.amount_sold) AS sum_amt, S.cust_id
FROM sales S
GROUP BY S.cust_id) VW
WHERE C.cust_credit_limit < VW.sum_amt and
C.cust_id = VW.cust_id;

XI shows the execution plan of I.

XI.
explain_plan_for_I.png

图片.png

Validity of Unnesting

Every subquery, before it can be unnested, goes through a set of validity checks. The optimizer decisions to unnest or not to unnest a subquery can be overridden by specifying an appropriate hint, but the validity requirements cannot be, since unnesting in such cases would not guarantee a semantically equivalent query.

In the following, we enumerate some important checks that currently invalidate subquery unnesting. Note that this list of checks is by no means exhaustive.

  • Subqueries that are correlated to non-parent; for example, subquery SQ3 is contained by SQ2 (parent of SQ3) and SQ2 in turn is contained by SQ1 and SQ3 is correlated to tables defined in SQ1.

  • A group-by subquery is correlated; in this case, unnesting implies doing join after group-by. Changing the given order of the two operations may not be always legal.

  • Connecting or correlating conditions are not well-formed (e.g., they contains a mix of local and correlated columns on either side of the predicate) and the subquery requires inline view generation, as predicates of this kind do not allow separating out view columns and outer table columns.

  • For disjunctive subqueries, the outer columns in the connecting or correlating conditions are not the same.

Using view-merging transformation, Oracle may merge the group-by or distinct inline view generated during unnesting, and therefore the execution plan may not show any view even when a view is expected.

Summary

In these posts we have tried to illustrate the basic ideas behind unnesting of different types of subquery by showing simple example queries. Oracle can handle far more complex queries - query statements with multiple subqueries at one or more levels, multiple tables, correlated and connecting conditions containing inequality predicates and expressions, subqueries that contain set operators, subqueries with group-by and COUNT aggregate function, ALL subqueries containing nullable columns in its connecting condition, and subqueries in disjunction.

If unnesting of a subquery does not require generation of an inline view, then the subquery is always unnested, because this unnesting provides a more efficient execution plan by allowing more join methods and join orders. If the local column (e.g., S.cust_id in A) in the correlating predicate of the subquery has an index on it, then the subquery evaluation becomes akin to doing index-based nested-loop join; and thus, in some cases, not unnesting may be more efficient than unnesting that generates an inline view and enables only sort-merge and hash join of the view with outer tables. Therefore, in Oracle, subquery unnesting which generates inline views, is done based on cost under the cost-based query transformation framework.

Once again many thanks to Rafi for all his work on this blog post. Watch out for our next post in this series on view merging.

最后修改时间:2022-09-13 09:29:18
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
1人已赞赏
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论