一、查看执行计划
1.基本EXPLAIN命令
mydb=> explain SELECT d.dname, e.max_sal mydb-> FROM dept d mydb-> LEFT JOIN ( mydb(> SELECT deptno, MAX(sal) AS max_sal mydb(> FROM emp mydb(> GROUP BY deptno mydb(> ) e ON d.deptno = e.deptno; QUERY PLAN ------------------------------------------------------------------------------- Hash Left Join (cost=23.46..43.97 rows=619 width=78) Hash Cond: (d.deptno = e.deptno) -> Seq Scan on dept d (cost=0.00..16.19 rows=619 width=58) -> Hash (cost=20.96..20.96 rows=200 width=44) -> Subquery Scan on e (cost=16.96..20.96 rows=200 width=44) -> HashAggregate (cost=16.96..18.96 rows=200 width=58) Group By Key: emp.deptno -> Seq Scan on emp (cost=0.00..14.64 rows=464 width=26) (8 rows)
复制
- 仅生成执行计划不实际执行
- 显示优化器预估的代价(cost)
- 采用树状结构展示查询计划
2.EXPLAIN ANALYZE
mydb=> explain analyze SELECT d.dname, e.max_sal mydb-> FROM dept d mydb-> LEFT JOIN ( mydb(> SELECT deptno, MAX(sal) AS max_sal mydb(> FROM emp mydb(> GROUP BY deptno mydb(> ) e ON d.deptno = e.deptno; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- Hash Left Join (cost=23.46..43.97 rows=619 width=78) (actual time=0.154..0.157 rows=4 loops=1) Hash Cond: (d.deptno = e.deptno) -> Seq Scan on dept d (cost=0.00..16.19 rows=619 width=58) (actual time=0.007..0.008 rows=4 loops=1) -> Hash (cost=20.96..20.96 rows=200 width=44) (actual time=0.033..0.033 rows=3 loops=1) Buckets: 32768 Batches: 1 Memory Usage: 257kB -> Subquery Scan on e (cost=16.96..20.96 rows=200 width=44) (actual time=0.026..0.027 rows=3 loops=1) -> HashAggregate (cost=16.96..18.96 rows=200 width=58) (actual time=0.026..0.027 rows=3 loops=1) Group By Key: emp.deptno -> Seq Scan on emp (cost=0.00..14.64 rows=464 width=26) (actual time=0.004..0.008 rows=12 loops=1) Total runtime: 0.313 ms (10 rows)
复制
增加功能:
- 实际执行SQL并收集运行时统计信息
- 显示实际耗时(milliseconds级精度)
- 展示内存使用情况
- 输出格式包含:
(cost=0.00..20.88 rows=7 width=58) -- 预算代价
(actual time=1.943..2.544 rows=3 loops=1) -- 实际耗时和数据量
3.EXPLAIN PERFORMANCE
mydb=> EXPLAIN PERFORMANCE SELECT d.dname, e.max_sal mydb-> FROM dept d mydb-> LEFT JOIN ( mydb(> SELECT deptno, MAX(sal) AS max_sal mydb(> FROM emp mydb(> GROUP BY deptno mydb(> ) e ON d.deptno = e.deptno; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- Hash Left Join (cost=23.46..43.97 rows=619 distinct=[200, 200] width=78) (actual time=0.144..0.147 rows=4 loops=1) Output: d.dname, e.max_sal Hash Cond: (d.deptno = e.deptno) (Buffers: shared hit=2) (CPU: ex c/r=-1393948902190, ex row=7, ex cyc=-9757642315335, inc cyc=48788213375151) -> Seq Scan on public.dept d (cost=0.00..16.19 rows=619 width=58) (actual time=0.005..0.007 rows=4 loops=1) Output: d.deptno, d.dname, d.loc (Buffers: shared hit=1) (CPU: ex c/r=12197053259986, ex row=4, ex cyc=48788213039945, inc cyc=48788213039945) -> Hash (cost=20.96..20.96 rows=200 width=44) (actual time=0.023..0.023 rows=3 loops=1) Output: e.max_sal, e.deptno Buckets: 32768 Batches: 1 Memory Usage: 257kB (Buffers: shared hit=1) (CPU: ex c/r=-9757642645637, ex row=3, ex cyc=-29272927936911, inc cyc=9757642650541) -> Subquery Scan on e (cost=16.96..20.96 rows=200 width=44) (actual time=0.019..0.021 rows=3 loops=1) Output: e.max_sal, e.deptno (CPU: ex c/r=0, ex row=3, ex cyc=0, inc cyc=39030570587452) -> HashAggregate (cost=16.96..18.96 rows=200 width=58) (actual time=0.018..0.020 rows=3 loops=1) Output: emp.deptno, max(emp.sal) Group By Key: emp.deptno (Buffers: shared hit=1) (CPU: ex c/r=-7318231964543, ex row=12, ex cyc=-87818783574519, inc cyc=39030570586034) -> Seq Scan on public.emp (cost=0.00..14.64 rows=464 width=26) (actual time=0.004..0.005 rows=12 loops=1) Output: emp.deptno, emp.sal (Buffers: shared hit=1) (CPU: ex c/r=10570779513379, ex row=12, ex cyc=126849354160553, inc cyc=126849354160553) Total runtime: 0.265 ms (27 rows)
复制
增强功能:
- 显示全量执行信息
- 额外包含:各节点内存消耗、网络流量等详细信息
- 提供整体查询的资源消耗总结
4.诊断级详细信息
显示列级信息
mydb=> EXPLAIN (VERBOSE, COSTS) SELECT d.dname, e.max_sal mydb-> FROM dept d mydb-> LEFT JOIN ( mydb(> SELECT deptno, MAX(sal) AS max_sal mydb(> FROM emp mydb(> GROUP BY deptno mydb(> ) e ON d.deptno = e.deptno; QUERY PLAN -------------------------------------------------------------------------------------- Hash Left Join (cost=23.46..43.97 rows=619 distinct=[200, 200] width=78) Output: d.dname, e.max_sal Hash Cond: (d.deptno = e.deptno) -> Seq Scan on public.dept d (cost=0.00..16.19 rows=619 width=58) Output: d.deptno, d.dname, d.loc -> Hash (cost=20.96..20.96 rows=200 width=44) Output: e.max_sal, e.deptno -> Subquery Scan on e (cost=16.96..20.96 rows=200 width=44) Output: e.max_sal, e.deptno -> HashAggregate (cost=16.96..18.96 rows=200 width=58) Output: emp.deptno, max(emp.sal) Group By Key: emp.deptno -> Seq Scan on public.emp (cost=0.00..14.64 rows=464 width=26) Output: emp.deptno, emp.sal (14 rows)
复制
5.计划结构解析
执行计划节点类型:
- 表扫描节点(Seq Scan/Index Scan)
- 连接节点(Hash Join/Nested Loop)
- 物化节点(Sort/Aggregate)
- 控制节点(Limit/Recursive UNION)
二、执行计划分析
1.重点关注高成本算子:
- Nested Loop Join(小表驱动大表时需要搭配索引)
- 带0有spill groups/memory wanted的Hash算子
示例:49765 groups total in 32 batches; 1 overflows
2.提升Hash性能:
-
增大work_mem解决Hash溢出:
SET work_mem=‘128MB’; -
强制使用两阶段聚合:
SET optimizer_force_three_stage_agg=off;
3.索引优化(当发现SeqScan时可采取):
CREATE INDEX idx_emp ON emp;
4.消除冗余操作:
- 多级分区表改单级分区
- 1亿级大表避免使用多列索引
- 用UNION ALL代替UNION消除去重
5.统计信息更新(当rows估值偏差大时):
- 手工收集统计信息
ANALYZE tablename; -- 收集单个表的统计信息 ANALYZE; -- 收集全库的统计信息 ALTER TABLE tablename ADD STATISTICS ((column1, column2)); -- 声明多列统计信息 ANALYZE tablename; -- 收集已声明的多列统计信息
复制
- 自动化统计信息收集
mydb=> show autovacuum_max_workers ; autovacuum_max_workers ------------------------ 3 (1 row) mydb=> show track_counts ; track_counts -------------- on (1 row) mydb=> show default_statistics_target ;--表示收集比例 default_statistics_target --------------------------- 100 (1 row)
复制
空表时数据超过50行触发自动收集
非空表50 + 10% * reltuples(reltuples为表总行数)时触发
当检测到统计信息缺失或过期时,通过系统视图(如GS_WLM_SESSION_STATISTICS)反馈警告信息,需手动执行ANALYZE。
DDL操作后需更新统计信息
6.关联子查询改写(当发现SubPlan时):
- EXISTS改写为SEMI JOIN
- IN改写为HASH JOIN
具体效率分析示例:
-- 原始低效执行计划片段: mydb=> EXPLAIN analyze SELECT d.dname, e.max_sal mydb-> FROM dept d mydb-> LEFT JOIN ( mydb(> SELECT deptno, MAX(sal) AS max_sal mydb(> FROM emp mydb(> GROUP BY deptno mydb(> ) e ON d.deptno = e.deptno; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- Hash Left Join (cost=23.46..43.97 rows=619 width=78) (actual time=0.138..0.141 rows=4 loops=1) Hash Cond: (d.deptno = e.deptno) -> Seq Scan on dept d (cost=0.00..16.19 rows=619 width=58) (actual time=0.003..0.004 rows=4 loops=1) -> Hash (cost=20.96..20.96 rows=200 width=44) (actual time=0.031..0.031 rows=3 loops=1) Buckets: 32768 Batches: 1 Memory Usage: 257kB -> Subquery Scan on e (cost=16.96..20.96 rows=200 width=44) (actual time=0.025..0.027 rows=3 loops=1) -> HashAggregate (cost=16.96..18.96 rows=200 width=58) (actual time=0.025..0.027 rows=3 loops=1) Group By Key: emp.deptno -> Seq Scan on emp (cost=0.00..14.64 rows=464 width=26) (actual time=0.003..0.007 rows=12 loops=1) Total runtime: 0.261 ms (10 rows)
复制
创建索引
mydb=> create index indx_emp_depton on emp(deptno); CREATE INDEX
复制
优化后效果
mydb=> EXPLAIN analyze SELECT d.dname, e.max_sal mydb-> FROM dept d mydb-> LEFT JOIN ( mydb(> SELECT deptno, MAX(sal) AS max_sal mydb(> FROM emp mydb(> GROUP BY deptno mydb(> ) e ON d.deptno = e.deptno; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Hash Left Join (cost=1.57..19.47 rows=619 width=78) (actual time=0.138..0.140 rows=4 loops=1) Hash Cond: (d.deptno = e.deptno) -> Seq Scan on dept d (cost=0.00..16.19 rows=619 width=58) (actual time=0.004..0.005 rows=4 loops=1) -> Hash (cost=1.42..1.42 rows=12 width=44) (actual time=0.029..0.029 rows=3 loops=1) Buckets: 32768 Batches: 1 Memory Usage: 257kB -> Subquery Scan on e (cost=1.18..1.42 rows=12 width=44) (actual time=0.022..0.024 rows=3 loops=1) -> HashAggregate (cost=1.18..1.30 rows=12 width=58) (actual time=0.022..0.023 rows=3 loops=1) Group By Key: emp.deptno -> Seq Scan on emp (cost=0.00..1.12 rows=12 width=26) (actual time=0.003..0.004 rows=12 loops=1) Total runtime: 0.258 ms (10 rows)
复制
三、hint
1、JOIN顺序指定(Leading Hint)
mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT e.ename, d.dname mydb-> FROM emp e, dept d mydb-> WHERE e.deptno = d.deptno; QUERY PLAN ------------------------------------ Hash Join Hash Cond: (e.deptno = d.deptno) -> Seq Scan on emp e -> Hash -> Seq Scan on dept d (5 rows)
复制
mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT /*+ Leading(emp dept) */ e.ename, d.dname mydb-> FROM emp e, dept d mydb-> WHERE e.deptno = d.deptno; WARNING: Error hint: Leading(emp dept), relation name "emp" is not found. QUERY PLAN ------------------------------------ Hash Join Hash Cond: (e.deptno = d.deptno) -> Seq Scan on emp e -> Hash -> Seq Scan on dept d (5 rows)
复制
hint未生效,因为要使用别名
mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT /*+ Leading(d e) */ e.ename, d.dname mydb-> FROM emp e, dept d mydb-> WHERE e.deptno = d.deptno; QUERY PLAN ------------------------------------ Hash Join Hash Cond: (e.deptno = d.deptno) -> Seq Scan on emp e -> Hash -> Seq Scan on dept d (5 rows)
复制
依然未生效
mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT /*+ Leading((d e)) */ e.ename, d.dname mydb-> FROM emp e, dept d mydb-> WHERE e.deptno = d.deptno; QUERY PLAN ------------------------------------ Hash Join Hash Cond: (d.deptno = e.deptno) -> Seq Scan on dept d -> Hash -> Seq Scan on emp e (5 rows)
复制
leading一个括号仅指定join顺序,不指定内外表顺序,当需要指定内外表顺序(即驱动表和被驱动表)需要使用两层括号,内层表示驱动顺序,外层表示使用需要join的表。
leading(t1 t2 t3 t4 t5)表示:t1、t2、t3、t4、t5先join,五表join顺序及内外表不限。 leading((t1 t2 t3 t4 t5))表示:t1和t2先join,t2做内表;再和t3 join,t3做内表;再和t4 join,t4做内表;再和t5 join,t5做内表。 leading(t1 (t2 t3 t4) t5)表示:t2、t3、t4先join,内外表不限;再和t1、t5 join,内外表不限。 leading((t1 (t2 t3 t4) t5))表示:t2、t3、t4先join,内外表不限;在最外层,t1再和t2、t3、t4的join表join,t1为外表,再和t5 join,t5为内表。 leading((t1 (t2 t3) t4 t5)) leading((t3 t2))表示:t2、t3先join,t2做内表;然后再和t1 join,t2、t3的join表做内表;然后再依次跟t4、t5做join,t4、t5做内表。
复制
2、JOIN方法指定(Join方法)
语法
[no] nestloop|hashjoin|mergejoin(table_list)
mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT * mydb-> FROM emp e, dept d mydb-> WHERE e.deptno = d.deptno; QUERY PLAN ------------------------------------ Hash Join Hash Cond: (e.deptno = d.deptno) -> Seq Scan on emp e -> Hash -> Seq Scan on dept d (5 rows)
复制
/*+no hashjoin(d e) */
mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT /*+no hashjoin(d e) */ * mydb-> FROM emp e, dept d mydb-> WHERE e.deptno = d.deptno; QUERY PLAN ------------------------------------- Merge Join Merge Cond: (d.deptno = e.deptno) -> Sort Sort Key: d.deptno -> Seq Scan on dept d -> Sort Sort Key: e.deptno -> Seq Scan on emp e (8 rows)
复制
/*+ nestloop(e d) */
mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT /*+ nestloop(e d) */ * mydb-> FROM emp e, dept d mydb-> WHERE e.deptno = d.deptno; QUERY PLAN -------------------------------------- Nested Loop Join Filter: (e.deptno = d.deptno) -> Seq Scan on emp e -> Materialize -> Seq Scan on dept d (5 rows)
复制
mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT /*+ mergejoin(d e) */ * mydb-> FROM emp e, dept d mydb-> WHERE e.deptno = d.deptno; QUERY PLAN ------------------------------------- Merge Join Merge Cond: (d.deptno = e.deptno) -> Sort Sort Key: d.deptno -> Seq Scan on dept d -> Sort Sort Key: e.deptno -> Seq Scan on emp e (8 rows)
复制
3、Scan方法指定(IndexScan/SeqScan)
[no] tablescan|indexscan|indexonlyscan(table [index])
mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT * mydb-> FROM emp e mydb-> WHERE e.sal > 3000; QUERY PLAN --------------------------------- Seq Scan on emp e Filter: (sal > 3000::numeric) (2 rows) mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT /*+ indexscan(e idx_emp_empno) */ * mydb-> FROM emp e mydb-> WHERE e.sal > 3000; WARNING: unused hint: IndexScan(e idx_emp_empno) QUERY PLAN --------------------------------- Seq Scan on emp e Filter: (sal > 3000::numeric) (2 rows)
复制
mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT /*+ indexscan(e idx_emp_empno) */ * mydb-> FROM emp e mydb-> WHERE e.empno > 7000; QUERY PLAN ----------------------------------------- Index Scan using idx_emp_empno on emp e Index Cond: (empno > 7000::numeric) (2 rows) mydb=> EXPLAIN (COSTS OFF) mydb-> SELECT * mydb-> FROM emp e mydb-> WHERE e.empno > 7000; QUERY PLAN ----------------------------------- Seq Scan on emp e Filter: (empno > 7000::numeric) (2 rows)
复制
4、rows
rows(table_list #|+|-|* const)
#、+、-、*,进行行数估算hint的四种操作符号。#表示直接使用后面的行数进行hint。+、-、*表示对原来估算的行数进行加、减、乘操作,运算后的行数最小值为1行。table_list为hint对应的单表或多表join结果集,与Join方式的Hint中table_list相同。
mydb=> EXPLAIN mydb-> SELECT e.ename, d.loc mydb-> FROM emp e, dept d mydb-> WHERE e.deptno = d.deptno; QUERY PLAN ------------------------------------------------------------------- Hash Join (cost=1.09..2.36 rows=12 width=13) Hash Cond: (e.deptno = d.deptno) -> Seq Scan on emp e (cost=0.00..1.12 rows=12 width=11) -> Hash (cost=1.04..1.04 rows=4 width=12) -> Seq Scan on dept d (cost=0.00..1.04 rows=4 width=12) (5 rows) mydb=> EXPLAIN mydb-> SELECT /*+ Rows(e d *1000) */ e.ename, d.loc mydb-> FROM emp e, dept d mydb-> WHERE e.deptno = d.deptno; QUERY PLAN ------------------------------------------------------------------- Hash Join (cost=1.09..2.36 rows=12000 width=13) Hash Cond: (e.deptno = d.deptno) -> Seq Scan on emp e (cost=0.00..1.12 rows=12 width=11) -> Hash (cost=1.04..1.04 rows=4 width=12) -> Seq Scan on dept d (cost=0.00..1.04 rows=4 width=12) (5 rows)
复制
5、子链接块名
mydb=> explain select /*+nestloop(emp tt) */ * from emp where empno in (select /*+blockname(tt)*/ empno from dept group by 1); WARNING: Error hint: NestLoop(emp tt), relation name "tt" is not found. WARNING: unused hint: BlockName(tt) QUERY PLAN ------------------------------------------------------------------ Seq Scan on emp (cost=0.00..7.46 rows=6 width=46) Filter: (SubPlan 1) SubPlan 1 -> Group (cost=0.00..1.05 rows=1 width=0) Group By Key: emp.empno -> Seq Scan on dept (cost=0.00..1.04 rows=4 width=0) (6 rows)
复制
评论
