暂无图片
暂无图片
2
暂无图片
暂无图片
1
暂无图片

磐维数据库执行计划及hint测试

原创 _ 2025-04-07
38

一、查看执行计划

1.基本EXPLAIN命令
mydb=> explain SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(>     SELECT deptno, MAX(sal) AS max_sal 
mydb(>     FROM emp 
mydb(>     GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
                                  QUERY PLAN                                   
-------------------------------------------------------------------------------
 Hash Left Join  (cost=23.46..43.97 rows=619 width=78)
   Hash Cond: (d.deptno = e.deptno)
   ->  Seq Scan on dept d  (cost=0.00..16.19 rows=619 width=58)
   ->  Hash  (cost=20.96..20.96 rows=200 width=44)
         ->  Subquery Scan on e  (cost=16.96..20.96 rows=200 width=44)
               ->  HashAggregate  (cost=16.96..18.96 rows=200 width=58)
                     Group By Key: emp.deptno
                     ->  Seq Scan on emp  (cost=0.00..14.64 rows=464 width=26)
(8 rows)
复制
  • 仅生成执行计划不实际执行
  • 显示优化器预估的代价(cost)
  • 采用树状结构展示查询计划
2.EXPLAIN ANALYZE
mydb=> explain analyze SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(>     SELECT deptno, MAX(sal) AS max_sal 
mydb(>     FROM emp 
mydb(>     GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
                                                        QUERY PLAN                                                        
--------------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=23.46..43.97 rows=619 width=78) (actual time=0.154..0.157 rows=4 loops=1)
   Hash Cond: (d.deptno = e.deptno)
   ->  Seq Scan on dept d  (cost=0.00..16.19 rows=619 width=58) (actual time=0.007..0.008 rows=4 loops=1)
   ->  Hash  (cost=20.96..20.96 rows=200 width=44) (actual time=0.033..0.033 rows=3 loops=1)
          Buckets: 32768  Batches: 1  Memory Usage: 257kB
         ->  Subquery Scan on e  (cost=16.96..20.96 rows=200 width=44) (actual time=0.026..0.027 rows=3 loops=1)
               ->  HashAggregate  (cost=16.96..18.96 rows=200 width=58) (actual time=0.026..0.027 rows=3 loops=1)
                     Group By Key: emp.deptno
                     ->  Seq Scan on emp  (cost=0.00..14.64 rows=464 width=26) (actual time=0.004..0.008 rows=12 loops=1)
 Total runtime: 0.313 ms
(10 rows)
复制

增加功能:

  • 实际执行SQL并收集运行时统计信息
  • 显示实际耗时(milliseconds级精度)
  • 展示内存使用情况
  • 输出格式包含:

(cost=0.00..20.88 rows=7 width=58) -- 预算代价

(actual time=1.943..2.544 rows=3 loops=1) -- 实际耗时和数据量

3.EXPLAIN PERFORMANCE
mydb=> EXPLAIN PERFORMANCE SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(>     SELECT deptno, MAX(sal) AS max_sal 
mydb(>     FROM emp 
mydb(>     GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
                                                           QUERY PLAN                                                            
---------------------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=23.46..43.97 rows=619 distinct=[200, 200] width=78) (actual time=0.144..0.147 rows=4 loops=1)
   Output: d.dname, e.max_sal
   Hash Cond: (d.deptno = e.deptno)
   (Buffers: shared hit=2)
   (CPU: ex c/r=-1393948902190, ex row=7, ex cyc=-9757642315335, inc cyc=48788213375151)
   ->  Seq Scan on public.dept d  (cost=0.00..16.19 rows=619 width=58) (actual time=0.005..0.007 rows=4 loops=1)
         Output: d.deptno, d.dname, d.loc
         (Buffers: shared hit=1)
         (CPU: ex c/r=12197053259986, ex row=4, ex cyc=48788213039945, inc cyc=48788213039945)
   ->  Hash  (cost=20.96..20.96 rows=200 width=44) (actual time=0.023..0.023 rows=3 loops=1)
         Output: e.max_sal, e.deptno
          Buckets: 32768  Batches: 1  Memory Usage: 257kB
         (Buffers: shared hit=1)
         (CPU: ex c/r=-9757642645637, ex row=3, ex cyc=-29272927936911, inc cyc=9757642650541)
         ->  Subquery Scan on e  (cost=16.96..20.96 rows=200 width=44) (actual time=0.019..0.021 rows=3 loops=1)
               Output: e.max_sal, e.deptno
               (CPU: ex c/r=0, ex row=3, ex cyc=0, inc cyc=39030570587452)
               ->  HashAggregate  (cost=16.96..18.96 rows=200 width=58) (actual time=0.018..0.020 rows=3 loops=1)
                     Output: emp.deptno, max(emp.sal)
                     Group By Key: emp.deptno
                     (Buffers: shared hit=1)
                     (CPU: ex c/r=-7318231964543, ex row=12, ex cyc=-87818783574519, inc cyc=39030570586034)
                     ->  Seq Scan on public.emp  (cost=0.00..14.64 rows=464 width=26) (actual time=0.004..0.005 rows=12 loops=1)
                           Output: emp.deptno, emp.sal
                           (Buffers: shared hit=1)
                           (CPU: ex c/r=10570779513379, ex row=12, ex cyc=126849354160553, inc cyc=126849354160553)
 Total runtime: 0.265 ms
(27 rows)
复制

增强功能:

  • 显示全量执行信息
  • 额外包含:各节点内存消耗、网络流量等详细信息
  • 提供整体查询的资源消耗总结
4.诊断级详细信息

显示列级信息

mydb=> EXPLAIN (VERBOSE, COSTS) SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(>     SELECT deptno, MAX(sal) AS max_sal 
mydb(>     FROM emp 
mydb(>     GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
                                      QUERY PLAN                                      
--------------------------------------------------------------------------------------
 Hash Left Join  (cost=23.46..43.97 rows=619 distinct=[200, 200] width=78)
   Output: d.dname, e.max_sal
   Hash Cond: (d.deptno = e.deptno)
   ->  Seq Scan on public.dept d  (cost=0.00..16.19 rows=619 width=58)
         Output: d.deptno, d.dname, d.loc
   ->  Hash  (cost=20.96..20.96 rows=200 width=44)
         Output: e.max_sal, e.deptno
         ->  Subquery Scan on e  (cost=16.96..20.96 rows=200 width=44)
               Output: e.max_sal, e.deptno
               ->  HashAggregate  (cost=16.96..18.96 rows=200 width=58)
                     Output: emp.deptno, max(emp.sal)
                     Group By Key: emp.deptno
                     ->  Seq Scan on public.emp  (cost=0.00..14.64 rows=464 width=26)
                           Output: emp.deptno, emp.sal
(14 rows)
复制
5.计划结构解析

执行计划节点类型:

  • 表扫描节点(Seq Scan/Index Scan)
  • 连接节点(Hash Join/Nested Loop)
  • 物化节点(Sort/Aggregate)
  • 控制节点(Limit/Recursive UNION)

二、执行计划分析

1.重点关注高成本算子:
  • Nested Loop Join(小表驱动大表时需要搭配索引)
  • 带0有spill groups/memory wanted的Hash算子
    示例:49765 groups total in 32 batches; 1 overflows
2.提升Hash性能:
  • 增大work_mem解决Hash溢出:
    SET work_mem=‘128MB’;

  • 强制使用两阶段聚合:
    SET optimizer_force_three_stage_agg=off;

3.索引优化(当发现SeqScan时可采取):

CREATE INDEX idx_emp ON emp;

4.消除冗余操作:
  • 多级分区表改单级分区
  • 1亿级大表避免使用多列索引
  • 用UNION ALL代替UNION消除去重
5.统计信息更新(当rows估值偏差大时):
  1. 手工收集统计信息
ANALYZE tablename;       -- 收集单个表的统计信息
ANALYZE;                 -- 收集全库的统计信息
ALTER TABLE tablename ADD STATISTICS ((column1, column2)); -- 声明多列统计信息
ANALYZE tablename;                                         -- 收集已声明的多列统计信息

复制
  1. 自动化统计信息收集
mydb=> show autovacuum_max_workers ;
 autovacuum_max_workers 
------------------------
 3
(1 row)

mydb=> show track_counts ;
 track_counts 
--------------
 on
(1 row)

mydb=> show default_statistics_target ;--表示收集比例
 default_statistics_target 
---------------------------
 100
(1 row)

复制

空表时数据超过50行触发自动收集
非空表50 + 10% * reltuples(reltuples为表总行数)时触发
当检测到统计信息缺失或过期时,通过系统视图(如GS_WLM_SESSION_STATISTICS)反馈警告信息,需手动执行ANALYZE。
DDL操作后需更新统计信息

6.关联子查询改写(当发现SubPlan时):
  • EXISTS改写为SEMI JOIN
  • IN改写为HASH JOIN

具体效率分析示例:

-- 原始低效执行计划片段:
mydb=> EXPLAIN analyze  SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(>     SELECT deptno, MAX(sal) AS max_sal 
mydb(>     FROM emp 
mydb(>     GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
                                                        QUERY PLAN                                                        
--------------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=23.46..43.97 rows=619 width=78) (actual time=0.138..0.141 rows=4 loops=1)
   Hash Cond: (d.deptno = e.deptno)
   ->  Seq Scan on dept d  (cost=0.00..16.19 rows=619 width=58) (actual time=0.003..0.004 rows=4 loops=1)
   ->  Hash  (cost=20.96..20.96 rows=200 width=44) (actual time=0.031..0.031 rows=3 loops=1)
          Buckets: 32768  Batches: 1  Memory Usage: 257kB
         ->  Subquery Scan on e  (cost=16.96..20.96 rows=200 width=44) (actual time=0.025..0.027 rows=3 loops=1)
               ->  HashAggregate  (cost=16.96..18.96 rows=200 width=58) (actual time=0.025..0.027 rows=3 loops=1)
                     Group By Key: emp.deptno
                     ->  Seq Scan on emp  (cost=0.00..14.64 rows=464 width=26) (actual time=0.003..0.007 rows=12 loops=1)
 Total runtime: 0.261 ms
(10 rows)
复制

创建索引

mydb=> create index indx_emp_depton on emp(deptno);
CREATE INDEX
复制

优化后效果

mydb=> EXPLAIN analyze  SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(>     SELECT deptno, MAX(sal) AS max_sal 
mydb(>     FROM emp 
mydb(>     GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
                                                       QUERY PLAN                                                       
------------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=1.57..19.47 rows=619 width=78) (actual time=0.138..0.140 rows=4 loops=1)
   Hash Cond: (d.deptno = e.deptno)
   ->  Seq Scan on dept d  (cost=0.00..16.19 rows=619 width=58) (actual time=0.004..0.005 rows=4 loops=1)
   ->  Hash  (cost=1.42..1.42 rows=12 width=44) (actual time=0.029..0.029 rows=3 loops=1)
          Buckets: 32768  Batches: 1  Memory Usage: 257kB
         ->  Subquery Scan on e  (cost=1.18..1.42 rows=12 width=44) (actual time=0.022..0.024 rows=3 loops=1)
               ->  HashAggregate  (cost=1.18..1.30 rows=12 width=58) (actual time=0.022..0.023 rows=3 loops=1)
                     Group By Key: emp.deptno
                     ->  Seq Scan on emp  (cost=0.00..1.12 rows=12 width=26) (actual time=0.003..0.004 rows=12 loops=1)
 Total runtime: 0.258 ms
(10 rows)
复制

三、hint

1、JOIN顺序指定(Leading Hint)
mydb=> EXPLAIN (COSTS OFF) 
mydb-> SELECT e.ename, d.dname 
mydb-> FROM emp e, dept d 
mydb-> WHERE e.deptno = d.deptno;
             QUERY PLAN             
------------------------------------
 Hash Join
   Hash Cond: (e.deptno = d.deptno)
   ->  Seq Scan on emp e
   ->  Hash
         ->  Seq Scan on dept d
(5 rows)
复制
mydb=> EXPLAIN (COSTS OFF) 
mydb-> SELECT  /*+ Leading(emp dept) */ e.ename, d.dname 
mydb-> FROM emp e, dept d 
mydb-> WHERE e.deptno = d.deptno;
WARNING:  Error hint: Leading(emp dept), relation name "emp" is not found.
             QUERY PLAN             
------------------------------------
 Hash Join
   Hash Cond: (e.deptno = d.deptno)
   ->  Seq Scan on emp e
   ->  Hash
         ->  Seq Scan on dept d
(5 rows)
复制

hint未生效,因为要使用别名

mydb=> EXPLAIN (COSTS OFF) 
mydb-> SELECT  /*+ Leading(d e) */ e.ename, d.dname 
mydb-> FROM emp e, dept d 
mydb-> WHERE e.deptno = d.deptno;
             QUERY PLAN             
------------------------------------
 Hash Join
   Hash Cond: (e.deptno = d.deptno)
   ->  Seq Scan on emp e
   ->  Hash
         ->  Seq Scan on dept d
(5 rows)

复制

依然未生效

mydb=> EXPLAIN (COSTS OFF) 
mydb-> SELECT  /*+ Leading((d e)) */ e.ename, d.dname 
mydb-> FROM emp e, dept d 
mydb-> WHERE e.deptno = d.deptno;
             QUERY PLAN             
------------------------------------
 Hash Join
   Hash Cond: (d.deptno = e.deptno)
   ->  Seq Scan on dept d
   ->  Hash
         ->  Seq Scan on emp e
(5 rows)
复制

leading一个括号仅指定join顺序,不指定内外表顺序,当需要指定内外表顺序(即驱动表和被驱动表)需要使用两层括号,内层表示驱动顺序,外层表示使用需要join的表。

leading(t1 t2 t3 t4 t5)表示:t1、t2、t3、t4、t5先join,五表join顺序及内外表不限。

leading((t1 t2 t3 t4 t5))表示:t1和t2先join,t2做内表;再和t3 join,t3做内表;再和t4 join,t4做内表;再和t5 join,t5做内表。

leading(t1 (t2 t3 t4) t5)表示:t2、t3、t4先join,内外表不限;再和t1、t5 join,内外表不限。

leading((t1 (t2 t3 t4) t5))表示:t2、t3、t4先join,内外表不限;在最外层,t1再和t2、t3、t4的join表join,t1为外表,再和t5 join,t5为内表。

leading((t1 (t2 t3) t4 t5)) leading((t3 t2))表示:t2、t3先join,t2做内表;然后再和t1 join,t2、t3的join表做内表;然后再依次跟t4、t5做join,t4、t5做内表。
复制
2、JOIN方法指定(Join方法)

语法
[no] nestloop|hashjoin|mergejoin(table_list)

mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT * 
mydb-> FROM emp e, dept d 
mydb-> WHERE e.deptno = d.deptno;
             QUERY PLAN             
------------------------------------
 Hash Join
   Hash Cond: (e.deptno = d.deptno)
   ->  Seq Scan on emp e
   ->  Hash
         ->  Seq Scan on dept d
(5 rows)
复制

/*+no hashjoin(d e) */

mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+no hashjoin(d e) */ * 
mydb-> FROM emp e, dept d 
mydb-> WHERE e.deptno = d.deptno;
             QUERY PLAN              
-------------------------------------
 Merge Join
   Merge Cond: (d.deptno = e.deptno)
   ->  Sort
         Sort Key: d.deptno
         ->  Seq Scan on dept d
   ->  Sort
         Sort Key: e.deptno
         ->  Seq Scan on emp e
(8 rows)
复制

/*+ nestloop(e d) */

mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ nestloop(e d) */ * 
mydb-> FROM emp e, dept d 
mydb-> WHERE e.deptno = d.deptno;
              QUERY PLAN              
--------------------------------------
 Nested Loop
   Join Filter: (e.deptno = d.deptno)
   ->  Seq Scan on emp e
   ->  Materialize
         ->  Seq Scan on dept d
(5 rows)
复制
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ mergejoin(d e) */ * 
mydb-> FROM emp e, dept d 
mydb-> WHERE e.deptno = d.deptno;
             QUERY PLAN              
-------------------------------------
 Merge Join
   Merge Cond: (d.deptno = e.deptno)
   ->  Sort
         Sort Key: d.deptno
         ->  Seq Scan on dept d
   ->  Sort
         Sort Key: e.deptno
         ->  Seq Scan on emp e
(8 rows)

复制
3、Scan方法指定(IndexScan/SeqScan)

[no] tablescan|indexscan|indexonlyscan(table [index])

mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT   * 
mydb-> FROM emp e 
mydb-> WHERE e.sal > 3000;
           QUERY PLAN            
---------------------------------
 Seq Scan on emp e
   Filter: (sal > 3000::numeric)
(2 rows)

mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ indexscan(e idx_emp_empno) */ * 
mydb-> FROM emp e 
mydb-> WHERE e.sal > 3000;
WARNING:  unused hint: IndexScan(e idx_emp_empno)
           QUERY PLAN            
---------------------------------
 Seq Scan on emp e
   Filter: (sal > 3000::numeric)
(2 rows)
复制
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ indexscan(e idx_emp_empno) */ * 
mydb-> FROM emp e 
mydb-> WHERE e.empno > 7000;
               QUERY PLAN                
-----------------------------------------
 Index Scan using idx_emp_empno on emp e
   Index Cond: (empno > 7000::numeric)
(2 rows)

mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT  * 
mydb-> FROM emp e 
mydb-> WHERE e.empno > 7000;
            QUERY PLAN             
-----------------------------------
 Seq Scan on emp e
   Filter: (empno > 7000::numeric)
(2 rows)

复制
4、rows

rows(table_list #|+|-|* const)
#、+、-、*,进行行数估算hint的四种操作符号。#表示直接使用后面的行数进行hint。+、-、*表示对原来估算的行数进行加、减、乘操作,运算后的行数最小值为1行。table_list为hint对应的单表或多表join结果集,与Join方式的Hint中table_list相同。


mydb=> EXPLAIN  
mydb-> SELECT e.ename, d.loc 
mydb-> FROM emp e, dept d 
mydb-> WHERE e.deptno = d.deptno;
                            QUERY PLAN                             
-------------------------------------------------------------------
 Hash Join  (cost=1.09..2.36 rows=12 width=13)
   Hash Cond: (e.deptno = d.deptno)
   ->  Seq Scan on emp e  (cost=0.00..1.12 rows=12 width=11)
   ->  Hash  (cost=1.04..1.04 rows=4 width=12)
         ->  Seq Scan on dept d  (cost=0.00..1.04 rows=4 width=12)
(5 rows)

mydb=> EXPLAIN 
mydb-> SELECT  /*+ Rows(e d *1000) */ e.ename, d.loc 
mydb-> FROM emp e, dept d 
mydb-> WHERE e.deptno = d.deptno;
                            QUERY PLAN                             
-------------------------------------------------------------------
 Hash Join  (cost=1.09..2.36 rows=12000 width=13)
   Hash Cond: (e.deptno = d.deptno)
   ->  Seq Scan on emp e  (cost=0.00..1.12 rows=12 width=11)
   ->  Hash  (cost=1.04..1.04 rows=4 width=12)
         ->  Seq Scan on dept d  (cost=0.00..1.04 rows=4 width=12)
(5 rows)
复制
5、子链接块名
mydb=> explain select /*+nestloop(emp tt) */ * from emp  where empno  in (select /*+blockname(tt)*/ empno from dept group by 1);
WARNING:  Error hint: NestLoop(emp tt), relation name "tt" is not found.
WARNING:  unused hint: BlockName(tt)
                            QUERY PLAN                            
------------------------------------------------------------------
 Seq Scan on emp  (cost=0.00..7.46 rows=6 width=46)
   Filter: (SubPlan 1)
   SubPlan 1
     ->  Group  (cost=0.00..1.05 rows=1 width=0)
           Group By Key: emp.empno
           ->  Seq Scan on dept  (cost=0.00..1.04 rows=4 width=0)
(6 rows)
复制
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

筱悦星辰
暂无图片
12天前
评论
暂无图片 0
世界上有两个词最可贵:一个是认真,一个是坚持。没有人生来就能做到很多事,那些总是能成事的人,并不是天赋异禀,而是他们在每一件应该做的事情上,都付出了持续不断的努力。
12天前
暂无图片 点赞
评论
目录
  • 一、查看执行计划
    • 1.基本EXPLAIN命令
    • 2.EXPLAIN ANALYZE
    • 3.EXPLAIN PERFORMANCE
    • 4.诊断级详细信息
    • 5.计划结构解析
  • 二、执行计划分析
    • 1.重点关注高成本算子:
    • 2.提升Hash性能:
    • 3.索引优化(当发现SeqScan时可采取):
    • 4.消除冗余操作:
    • 5.统计信息更新(当rows估值偏差大时):
    • 6.关联子查询改写(当发现SubPlan时):
  • 三、hint
    • 1、JOIN顺序指定(Leading Hint)
    • 2、JOIN方法指定(Join方法)
    • 3、Scan方法指定(IndexScan/SeqScan)
    • 4、rows
    • 5、子链接块名