GBase 8c 学习笔记 011 —— GBase 8c Plan Hint

心有阳光 2023-03-31

799

GBase 8c 学习笔记 011 —— GBase 8c Plan Hint

Plan Hint介绍

Plan Hint 为用户提供了直接影响执行计划的手段，用户可以通过指定 join 顺序，join、scan 方法，指定行数等手段来进行执行计划调优，以提升查询性能；
指定形式：/*+ <plan hint>*/
支持范围：
- 指定 join 方式；
- 指定 join 顺序；
- 指定行数；
- 指定 scan 方式；
- 指定链接块名；
- custom plan 和 generic plan 选择的 hint；
- 指定子查询不扩展；

使用Plan Hint

指定Join方式

语法

[no] nestloop | hashjoin | mergejoin (table_list)
-- no 表示不使用hint的join方式。
-- 支持常用的join，包括Hash Join、Nested Join 、Merge Join。

--- 1. 创建示例表
-- 创建表t1
drop table if exists t1;
create table t1 (
    id1 int, 
    id2 int
) ;

postgres=# create table t1 (
postgres(#     id1 int,
postgres(#     id2 int
postgres(# ) ;
CREATE TABLE

-- 创建表t2
drop table if exists t2;
create table t2 (
    id1 int, 
    id2 int
);

postgres=# create table t2 (
postgres(#     id1 int,
postgres(#     id2 int
postgres(# );
CREATE TABLE

-- 创建表t3
drop table if exists t3;
create table t3 (
    id1 int, 
    id2 int
);

postgres=# create table t3 (
postgres(#     id1 int,
postgres(#     id2 int
postgres(# );
CREATE TABLE


--- 2. 插入数据
postgres=# insert into t1 select generate_series(1,100),generate_series(1,100);
INSERT 0 100
postgres=# insert into t2 select generate_series(1,100),generate_series(1,100);
INSERT 0 100
postgres=# insert into t3 select generate_series(1,100),generate_series(1,100);
INSERT 0 100

--- 3. 查看默认执行计划的 join 方式
postgres=# explain select * from t1,t2 where t1.id1 = t2.id2;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=13.29..27.80 rows=10 width=16)
   Spawn on: All datanodes
   ->  Hash Join  (cost=13.29..27.75 rows=20 width=16)
         Hash Cond: (t2.id2 = t1.id1)
         ->  Streaming(type: REDISTRIBUTE)  (cost=0.00..14.31 rows=20 width=8)
               Spawn on: All datanodes
               ->  Seq Scan on t2  (cost=0.00..13.13 rows=20 width=8)
         ->  Hash  (cost=13.13..13.13 rows=21 width=8)
               ->  Seq Scan on t1  (cost=0.00..13.13 rows=20 width=8)
(9 rows)

--- 4. 开启 nestloop，使用 hint 指定执行计划的 join 方式为 nestloop
postgres=# set enable_nestloop = on ;
SET
postgres=# explain select /*+ nestloop(t1 t2)*/ * from t1,t2 where t1.id1 = t2.id2;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=0.00..30.06 rows=10 width=16)
   Spawn on: All datanodes
   ->  Nested Loop  (cost=0.00..30.01 rows=20 width=16)
         Join Filter: (t1.id1 = t2.id2)
         ->  Seq Scan on t1  (cost=0.00..13.13 rows=20 width=8)
         ->  Materialize  (cost=0.00..14.37 rows=20 width=8)
               ->  Streaming(type: REDISTRIBUTE)  (cost=0.00..14.31 rows=20 width=8)
                     Spawn on: All datanodes
                     ->  Seq Scan on t2  (cost=0.00..13.13 rows=20 width=8)
(9 rows)

指定Join顺序

语法

-- 不指定内外表顺序
leading(join_table_list)

-- 同时指定join顺序和内外表顺序
leading((join_table_list))

--- 1. 使用 3 表查询，查看默认的执行计划
postgres=# explain select * from t1,t2,t3;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=0.00..127.29 rows=4000 width=24)
   Spawn on: All datanodes
   ->  Nested Loop  (cost=0.00..96.82 rows=8000 width=24)
         ->  Nested Loop  (cost=0.00..31.59 rows=400 width=16)
               ->  Streaming(type: BROADCAST)  (cost=0.00..15.18 rows=40 width=8)
                     Spawn on: All datanodes
                     ->  Seq Scan on t1  (cost=0.00..13.13 rows=20 width=8)
               ->  Materialize  (cost=0.00..13.20 rows=20 width=8)
                     ->  Seq Scan on t2  (cost=0.00..13.13 rows=20 width=8)
         ->  Materialize  (cost=0.00..15.28 rows=40 width=8)
               ->  Streaming(type: BROADCAST)  (cost=0.00..15.18 rows=40 width=8)
                     Spawn on: All datanodes
                     ->  Seq Scan on t3  (cost=0.00..13.13 rows=20 width=8)
(13 rows)

--- 2. 使用Hint指定join顺序
postgres=# explain select /*+ leading(t1 (t2 t3)) */ * from t1,t2,t3;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=0.00..127.29 rows=4000 width=24)
   Spawn on: All datanodes
   ->  Nested Loop  (cost=0.00..96.82 rows=8000 width=24)
         ->  Nested Loop  (cost=0.00..31.59 rows=400 width=16)
               ->  Streaming(type: BROADCAST)  (cost=0.00..15.18 rows=40 width=8)
                     Spawn on: All datanodes
                     ->  Seq Scan on t2  (cost=0.00..13.13 rows=20 width=8)
               ->  Materialize  (cost=0.00..13.20 rows=20 width=8)
                     ->  Seq Scan on t3  (cost=0.00..13.13 rows=20 width=8)
         ->  Materialize  (cost=0.00..15.28 rows=40 width=8)
               ->  Streaming(type: BROADCAST)  (cost=0.00..15.18 rows=40 width=8)
                     Spawn on: All datanodes
                     ->  Seq Scan on t1  (cost=0.00..13.13 rows=20 width=8)
(13 rows)

指定行数

语法

rows(table_list #|+|-|* const)
-- # 表示直接使用后面的行数进行hint。
-- +，-，* 表示对原来估算的行数进行加、减、乘操作
-- const常量可以是任意非负数，支持科学计数法。

--- 1. 使用 2 表关联，查询默认的中间结果集行数
postgres=# explain select * from t1,t2;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=0.00..100.89 rows=5000 width=16)
   Spawn on: All datanodes
   ->  Nested Loop  (cost=0.00..75.50 rows=10000 width=16)
         ->  Streaming(type: BROADCAST)  (cost=0.00..10.37 rows=200 width=8)
               Spawn on: All datanodes
               ->  Seq Scan on t1  (cost=0.00..2.50 rows=100 width=8)
         ->  Materialize  (cost=0.00..2.75 rows=100 width=8)
               ->  Seq Scan on t2  (cost=0.00..2.50 rows=100 width=8)
(8 rows)

--- 2. 使用Hint指定行数为5
postgres=# explain select /*+ rows(t1 t2 #5) */ * from t1,t2;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=0.00..75.51 rows=2 width=16)
   Spawn on: All datanodes
   ->  Nested Loop  (cost=0.00..75.50 rows=5 width=16)
         ->  Streaming(type: BROADCAST)  (cost=0.00..10.37 rows=200 width=8)
               Spawn on: All datanodes
               ->  Seq Scan on t1  (cost=0.00..2.50 rows=100 width=8)
         ->  Materialize  (cost=0.00..2.75 rows=100 width=8)
               ->  Seq Scan on t2  (cost=0.00..2.50 rows=100 width=8)
(8 rows)

指定扫描方式

语法

[no] <scan_type> (table [index])
-- no表示不使用hint的scan方式。
-- <scan_type> 支持常用的 tablescan、indexscan、indexonlyscan。
-- table表示hint指定的表，只能指定一个表，如果表存在别名应优先使用别名进行hint。

--- 1. 使用2表关联，查看默认的数据扫描方式
postgres=# explain select * from t1,t2 where t1.id1 = t2.id2;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=3.12..10.69 rows=50 width=16)
   Spawn on: All datanodes
   ->  Hash Join  (cost=3.12..10.44 rows=100 width=16)
         Hash Cond: (t2.id2 = t1.id1)
         ->  Streaming(type: REDISTRIBUTE)  (cost=0.00..6.62 rows=100 width=8)
               Spawn on: All datanodes
               ->  Seq Scan on t2  (cost=0.00..2.50 rows=100 width=8)
         ->  Hash  (cost=2.50..2.50 rows=100 width=8)
               ->  Seq Scan on t1  (cost=0.00..2.50 rows=100 width=8)
(9 rows)

--- 2. 为 t1 表创建索引
postgres=# create index t1_id1_inx on t1 using btree (id1);
CREATE INDEX
postgres=# \d+ t1
                          Table "public.t1"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 id1    | integer |           | plain   |              |
 id2    | integer |           | plain   |              |
Indexes:
    "t1_id1_inx" btree (id1) TABLESPACE pg_default
Has OIDs: no
Distribute By: HASH(id1)
Location Nodes: ALL DATANODES
Options: orientation=row, compression=no

--- 3. 使用Hint指定使用索引扫描
postgres=# explain select /*+ no indexscan(t1 t1_id1_inx) */ *
postgres-#             from t1,t2 where t1.id1 = t2.id2;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=3.12..10.69 rows=50 width=16)
   Spawn on: All datanodes
   ->  Hash Join  (cost=3.12..10.44 rows=100 width=16)
         Hash Cond: (t2.id2 = t1.id1)
         ->  Streaming(type: REDISTRIBUTE)  (cost=0.00..6.62 rows=100 width=8)
               Spawn on: All datanodes
               ->  Seq Scan on t2  (cost=0.00..2.50 rows=100 width=8)
         ->  Hash  (cost=2.50..2.50 rows=100 width=8)
               ->  Seq Scan on t1  (cost=0.00..2.50 rows=100 width=8)
(9 rows)

--- 4. 使用 no 指定不走索引扫描
postgres=# explain select /*+ no indexscan(t1 t1_id1_inx) */ *
postgres-#             from t1,t2 where t1.id1 = t2.id2;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=3.12..10.69 rows=50 width=16)
   Spawn on: All datanodes
   ->  Hash Join  (cost=3.12..10.44 rows=100 width=16)
         Hash Cond: (t2.id2 = t1.id1)
         ->  Streaming(type: REDISTRIBUTE)  (cost=0.00..6.62 rows=100 width=8)
               Spawn on: All datanodes
               ->  Seq Scan on t2  (cost=0.00..2.50 rows=100 width=8)
         ->  Hash  (cost=2.50..2.50 rows=100 width=8)
               ->  Seq Scan on t1  (cost=0.00..2.50 rows=100 width=8)
(9 rows)

指定链接块名

语法

blockname (table)
-- table表示为该子链接块hint的别名的名称。

custom plan与generic plan

如果同样的一条 SQL，可能只是查询条件入参不同，要执行很多遍，每次都是同样的执行计划、每次都发生硬解析，则会消耗大量时间。
- 针对这种查询，可以使用 PBE 的方式执行，来减少硬解析流程，提升效率。
  - P (Parse)： 解析SQL语句，生成解析树、查询树并保存起来。
  - B (Bind)： 第一次调用则生成并保存计划，若已有计划则直接使用，并将必要的入参补充完整。
  - E (Execute)： 根据计划执行
对于以PBE方式执行的查询语句和DML语句，优化器会基于规则、代价、参数等因素选择生成 Custom Plan 或 Generic Plan 执行。
- 前 n 次执行时，每次都是硬解析，每一次都产生新的执行计划，叫做 custom plan；
- 当第 n+1 次开始执行时，会生成一个通用的执行计划（generic plan），同时与之前的 customplan进行比较，如果 generic plan 效率高，则会把执行计划固定下来，此后即使传入的值发生变化，执行计划也不再变化。
使用 hint 可以使此类语句强制选择 Custom Plan 或 Generic Plan 。

--- 1. 使用Hint为链接名指定名称
postgres=# explain select /*+nestloop(t2 block1) */ * from t2
postgres-#         where id1 in
postgres-#             (select /*+blockname(block1)*/ id1 from t3 group by 1);
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=2.62..48.00 rows=50 width=8)
   Spawn on: All datanodes
   ->  Nested Loop Semi Join  (cost=2.62..47.88 rows=100 width=8)
         Join Filter: (t2.id1 = t3.id1)
         ->  Streaming(type: REDISTRIBUTE)  (cost=0.00..6.62 rows=100 width=8)
               Spawn on: All datanodes
               ->  Seq Scan on t2  (cost=0.00..2.50 rows=100 width=8)
         ->  Materialize  (cost=2.62..3.88 rows=100 width=4)
               ->  HashAggregate  (cost=2.62..3.12 rows=100 width=4)
                     Group By Key: t3.id1
                     ->  Seq Scan on t3  (cost=0.00..2.50 rows=100 width=4)
(11 rows)

指定custom plan

语法

use_cplan
-- 对于非PBE方式执行的SQL语句，设置本hint不会影响执行方式。

--- 1. 使用 prepare 创建一个带入参的等值查询语句，并使用Hint指定使用 custom plan
postgres=# prepare p1 as select /*+ use_cplan */ * from t1,t2 where t1.id1 = $1;
PREPARE

--- 2. 查看该语句的执行计划
postgres=# explain execute p1(1);
                                QUERY PLAN
--------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=0.00..6.04 rows=50 width=16)
   Spawn on: All datanodes
   ->  Nested Loop  (cost=0.00..5.78 rows=100 width=16)
         ->  Streaming(type: BROADCAST)  (cost=0.00..2.78 rows=2 width=8)
               Spawn on: dn1
               ->  Seq Scan on t1  (cost=0.00..2.62 rows=1 width=8)
                     Filter: (id1 = 1)
         ->  Seq Scan on t2  (cost=0.00..2.50 rows=100 width=8)
(8 rows)

指定generic plan

语法

use_gplan
-- 对于非PBE方式执行的SQL语句，设置本hint不会影响执行方式

--- 1. 使用 prepare 创建一个带入参的等值查询语句，并使用Hint指定使用generic plan
postgres=# prepare p2 as select /*+ use_gplan */ * from t1,t2 where t1.id1 = $1;
PREPARE

--- 2. 查看该语句的执行计划
postgres=# explain execute p2(1);
                                QUERY PLAN
--------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=0.00..6.04 rows=50 width=16)
   Spawn on: All datanodes
   ->  Nested Loop  (cost=0.00..5.78 rows=100 width=16)
         ->  Streaming(type: BROADCAST)  (cost=0.00..2.78 rows=2 width=8)
               Spawn on: All datanodes
               ->  Seq Scan on t1  (cost=0.00..2.62 rows=1 width=8)
                     Filter: (id1 = $1)
         ->  Seq Scan on t2  (cost=0.00..2.50 rows=100 width=8)
(8 rows)

指定子查询不扩展

语法

no_expand
-- 数据库在对查询进行逻辑优化时通常会将可以提升的子查询提升到上层来避免嵌套执行，通过此 hint 可以使子查询不扩展。
-- 大多数情况下不建议使用此hint。

--- 1. 使用子查询，查看默认的执行计划
postgres=# explain select * from t1 where t1.id1 in (select id2 from t2);
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=3.12..10.58 rows=50 width=8)
   Spawn on: All datanodes
   ->  Hash Right Semi Join  (cost=3.12..10.45 rows=100 width=8)
         Hash Cond: (t2.id2 = t1.id1)
         ->  Streaming(type: REDISTRIBUTE)  (cost=0.00..6.62 rows=100 width=4)
               Spawn on: All datanodes
               ->  Seq Scan on t2  (cost=0.00..2.50 rows=100 width=4)
         ->  Hash  (cost=2.50..2.50 rows=100 width=8)
               ->  Seq Scan on t1  (cost=0.00..2.50 rows=100 width=8)
(9 rows)

--- 2. 使用Hint指定子查询不扩展
postgres=# explain select * from t1 where t1.id1 in (select /*+ no_expand */ id2 from t2);
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Streaming(type: GATHER)  (cost=3.70..6.38 rows=25 width=8)
   Spawn on: All datanodes
   ->  Seq Scan on t1  (cost=3.70..6.32 rows=50 width=8)
         Filter: (hashed SubPlan 1)
         SubPlan 1
           ->  Materialize  (cost=0.00..3.20 rows=400 width=4)
                 ->  Streaming(type: BROADCAST)  (cost=0.00..2.70 rows=200 width=4)
                       Spawn on: All datanodes
                       ->  Seq Scan on t2  (cost=0.00..2.50 rows=100 width=4)
(9 rows)

gbase 墨力计划

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

文章被以下合辑收录

GBase 8c 学习笔记（共17篇）

GBase 8c 学习笔记

GBase 8c 学习笔记 011 —— GBase 8c Plan Hint

GBase 8c 学习笔记 011 —— GBase 8c Plan Hint

Plan Hint介绍

使用Plan Hint

指定Join方式

指定Join顺序

指定行数

指定扫描方式

指定链接块名

custom plan与generic plan

指定custom plan

指定generic plan

指定子查询不扩展

文章被以下合辑收录

评论