大家好,这次大表哥带来的是 mysql 8.0 hint的中篇。
本文将会介绍这2种类型的 hint:
Table-Level Optimizer Hints --表级别的
Subquery Optimizer Hints --子查询级别的
我们大家先看表级别的hint: Table-Level Optimizer Hints --表级别的
表级别的hint 都是关于表与表的连接方式 ,在MYSQL中表与表的连接方式大致分为如下几种:
1) Index Nested loop join (简称 NLJ)
2) Batched Key Accessed join (简称BKA)
3) Block Nested loop join (before 8.0.18) (简称BNL)
4) Hash join (from 8.0.18)
简单的来说 :
BKA (Batched Key Accessed join )是对 NLJ(Index Nested loop join)的优化, 批量的扫描数据放入 join buffer 按照一定的顺序后,再去顺序IO读取数据,节省了昂贵的IO随机读取的成本。本身也是MRR的一种实现方式。
Hash join 是对 BNL(Block Nested loop join)的算法改进的优化,熟悉oracle 或者 大数据 spark 的同学应该都很熟悉。 在OLAP或者大数据分析中进场用到HASH JOIN 作为大表之间的连接。
具体的介绍可以参考大表哥之前的文章: https://www.modb.pro/db/365827
本次我们主要看一下 表连接相关的hint:
BKA, NO_BKA: 是否开启 Batched Key Accessed join
BNL, NO_BNL: 是否开启Block Nested loop join,注意的是在mysql 8.0.18和以后的版本中, 这个hint也是 打开和关闭 hash join 的方式。
在mysql的 8.0.20版本之后,将不再支持 Block Nested loop join, 但是 仍将这个hint 作为 打开和关闭 hash join 的方式。
HASH_JOIN, NO_HASH_JOIN: 仅仅在mysql 8.0.18版本中, 作为 打开和关闭 hash join 的方式。 之后的版本,请用 BNL, NO_BNL。
从hint的角度来说,HASH_JOIN, NO_HASH_JOIN 只是mysql 版本8.0.18 中的一个过客。。。。。。
DERIVED_CONDITION_PUSHDOWN, NO_DERIVED_CONDITION_PUSHDOWN: 派生表的谓词下推,这个实在版本8.0.22中出现的
MERGE, NO_MERGE: 是否进行 merge 操作,不进行merge的话 类似于生成 物化表
下面大表哥带你逐一的看一下:
预先准备2张表:
Table: t1
Create Table: CREATE TABLE `t1` (
`id` int NOT NULL,
`a` int DEFAULT NULL,
`b` int DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_cs
1 row in set (0.00 sec)
Create Table: CREATE TABLE `t2` (
`id` int NOT NULL,
`a` int DEFAULT NULL,
`b` int DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `a` (`a`),
KEY `idx_b` (`b`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_cs
1 row in set (0.00 sec)
复制
BKA, NO_BKA: 是否开启 Batched Key Accessed join
我们可以看到加入 hint /*+BKA(t1, t2) */ 之后,执行计划里面出现了 Using join buffer (Batched Key Access)
explain SELECT * FROM t1, t2 where t1.a =t2.a
--------------
+----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+-------------+
| 1 | SIMPLE | t1 | NULL | ALL | a | NULL | NULL | NULL | 1000 | 100.00 | Using where |
| 1 | SIMPLE | t2 | NULL | ref | a | a | 5 | testdb.t1.a | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
explain SELECT/*+BKA(t1, t2) */ * FROM t1, t2 where t1.a =t2.a
--------------
+----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+----------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+----------------------------------------+
| 1 | SIMPLE | t1 | NULL | ALL | a | NULL | NULL | NULL | 1000 | 100.00 | Using where |
| 1 | SIMPLE | t2 | NULL | ref | a | a | 5 | testdb.t1.a | 1 | 100.00 | Using join buffer (Batched Key Access) |
+----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+----------------------------------------+
2 rows in set, 1 warning (0.01 sec)
复制
BNL, NO_BNL: 是否开启Block Nested loop join
大表哥测试的mysql 版本是 mysql version: 8.0.20-11
这个版本开始mysql是 移除了 BNL的连接方式,只有HASH 的连接方式。
explain SELECT/*+BNL(t1,t2)*/ * FROM t1, t2 where t1.b =t2.b
--------------
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | NULL |
| 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 998386 | 10.00 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
复制
我们尝试一下 NO_BNL: 我们看到变成了 NLJ
explain format=tree SELECT/*+NO_BNL(t1,t2)*/ * FROM t1, t2 where t1.b =t2.b
--------------
*************************** 1. row ***************************
EXPLAIN: -> Nested loop inner join (cost=100327701.00 rows=99838601)
-> Table scan on t1 (cost=101.00 rows=1000)
-> Filter: (t2.b = t1.b) (cost=498.98 rows=99839)
-> Table scan on t2 (cost=498.98 rows=998386)
1 row in set (0.00 sec)
复制
HASH_JOIN, NO_HASH_JOIN: 仅仅在mysql 8.0.18版本中生效,我们就不测试这个了。
DERIVED_CONDITION_PUSHDOWN, NO_DERIVED_CONDITION_PUSHDOWN: 派生表的谓词下推
这个功能是在mysql 8.0.22中才有的,mysql大叔在官方文档上是这样描述的:
在mysql 版本8.0.22之前,派生表如果被物化的话,MYSQL会将整个表进行物化,然后再去进行where条件中的过滤。
我们测试一下 mysql 版本 8.0.20 的版本:此版本不是支持 DERIVED_CONDITION_PUSHDOWN的。
我们可以看到执行计划中的步骤是 1. 先物化 Materialize 2.在过滤 Filter: (t2.a > 20)
root@localhost:mysql_uatDB.sock [testdb]> select @@version;
--------------
select @@version
--------------
+-----------+
| @@version |
+-----------+
| 8.0.20-11 |
+-----------+
1 row in set (0.00 sec)
explain format=tree SELECT/*+NO_MERGE(t2)*/ * FROM t1, (select * from t2) t2 where t1.b =t2.b and t2.a > 20
--------------
*************************** 1. row ***************************
EXPLAIN: -> Nested loop inner join
-> Filter: (t1.b is not null) (cost=101.00 rows=1000)
-> Table scan on t1 (cost=101.00 rows=1000)
-> Filter: (t2.a > 20)
-> Index lookup on t2 using <auto_key1> (b=t1.b)
-> Materialize
-> Table scan on t2 (cost=100327.60 rows=998386)
复制
我们再次测试一下mysql 版本:8.0.27
我们可以看到从执行计划中可以看到的步骤是 1.索引过滤 with index condition: (t2.a > 20) 2. 过滤后的结果集进行物化 Materialize
mysql> select @@version;
+-----------+
| @@version |
+-----------+
| 8.0.27-18 |
+-----------+
1 row in set (0.00 sec)
mysql> explain format=tree SELECT/*+NO_MERGE(t2)*/ * FROM t1, (select * from t2) t2 where t1.b =t2.b and t2.a > 20;
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Nested loop inner join (cost=0.95 rows=1)
-> Filter: (t1.b is not null) (cost=0.35 rows=1)
-> Table scan on t1 (cost=0.35 rows=1)
-> Index lookup on t2 using <auto_key0> (b=t1.b)
-> Materialize (cost=0.81..0.81 rows=1)
-> Index range scan on t2 using a, with index condition: (t2.a > 20) (cost=0.71 rows=1)
|
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
复制
MERGE, NO_MERGE: 是否进行 merge 操作
我们可以看到添加完hint /+NO_MERGE(t2)/之后, (select * from t2) 物化成了派生表
root@localhost:mysql_uatDB.sock [testdb]> explain SELECT * FROM t1, (select * from t2) t2 where t1.b =t2.b;
--------------
explain SELECT * FROM t1, (select * from t2) t2 where t1.b =t2.b
--------------
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+
| 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | NULL |
| 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 998386 | 10.00 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
explain SELECT/*+NO_MERGE(t2)*/ * FROM t1, (select * from t2) t2 where t1.b =t2.b
--------------
+----+-------------+------------+------------+------+---------------+-------------+---------+-------------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+-------------+---------+-------------+--------+----------+-------------+
| 1 | PRIMARY | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using where |
| 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 5 | testdb.t1.b | 998 | 100.00 | NULL |
| 2 | DERIVED | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 998386 | 100.00 | NULL |
+----+-------------+------------+------------+------+---------------+-------------+---------+-------------+--------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
复制
我们再来看一下: Subquery Optimizer Hints --子查询级别的
这种类型的hint 主要指针对于 mysql 表连接的semi join 的连接策略的。
关于mysql 的semi join 的详细介绍 可以大表哥之前的文章:https://www.modb.pro/db/375572
在本篇文章中,我们就简单的回顾一下 关于 semi join 的几种连接策略的hint:
DuplicateWeedout strategy:
hint /*+ SEMIJOIN(@subq1 DUPSWEEDOUT) */ 对应的执行计划是 Start temporary; End temporary
explain select/*+ SEMIJOIN(@subq1 DUPSWEEDOUT) */ * from tab1 where id in (select /*+ QB_NAME(subq1) */ pid from tab2 )
--------------
+----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+---------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+---------------------------------------------+
| 1 | SIMPLE | tab1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 2 | 100.00 | NULL |
| 1 | SIMPLE | tab2 | NULL | ref | idx_pid | idx_pid | 5 | testdb.tab1.id | 2 | 100.00 | Using index; Start temporary; End temporary |
+----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+---------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
复制
Firstmatch Strategy:
hint /*+ SEMIJOIN(@subq1 FIRSTMATCH) */ 对应的执行计划是 FirstMatch(tab1)
explain select/*+ SEMIJOIN(@subq1 FIRSTMATCH) */ * from tab1 where id in (select /*+ QB_NAME(subq1) */ pid from tab2)
--------------
+----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+-------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+-------------------------------+
| 1 | SIMPLE | tab1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 2 | 100.00 | NULL |
| 1 | SIMPLE | tab2 | NULL | ref | idx_pid | idx_pid | 5 | testdb.tab1.id | 2 | 100.00 | Using index; FirstMatch(tab1) |
+----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+-------------------------------+
2 rows in set, 1 warning (0.00 sec)
复制
Loosescan Strategy
hint /*+ SEMIJOIN(@subq1 LOOSESCAN) */ 对应的执行计划是 LooseScan
root@localhost:mysql_uatDB.sock [testdb]> explain select/*+ SEMIJOIN(@subq1 LOOSESCAN) */ * from tab1 where id in (select /*+ QB_NAME(subq1) */ pid from tab2);
--------------
explain select/*+ SEMIJOIN(@subq1 LOOSESCAN) */ * from tab1 where id in (select /*+ QB_NAME(subq1) */ pid from tab2)
--------------
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------+
| 1 | SIMPLE | tab2 | NULL | index | idx_pid | idx_pid | 5 | NULL | 4 | 50.00 | Using index; LooseScan |
| 1 | SIMPLE | tab1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 2 | 50.00 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
复制
Materialize scan/Materialize lookup Strategy
hint /*+ SEMIJOIN(@subq1 MATERIALIZATION) */ 对应的执行计划是 MATERIALIZED
explain select/*+ SEMIJOIN(@subq1 MATERIALIZATION) */ * from tab1 where id in (select /*+ QB_NAME(subq1) */ pid from tab2)
--------------
+----+--------------+-------------+------------+--------+---------------------+---------------------+---------+----------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+--------+---------------------+---------------------+---------+----------------+------+----------+-------------+
| 1 | SIMPLE | tab1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 2 | 100.00 | Using where |
| 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 5 | testdb.tab1.id | 1 | 100.00 | NULL |
| 2 | MATERIALIZED | tab2 | NULL | index | idx_pid | idx_pid | 5 | NULL | 4 | 100.00 | Using index |
+----+--------------+-------------+------------+--------+---------------------+---------------------+---------+----------------+------+----------+-------------+
3 rows in set, 1 warning (0.00 sec)
复制
最后大表哥总结一下今天介绍的2种hint:
1)Table-Level Optimizer Hints --表级别的hint
BKA, NO_BKA: 是否开启 Batched Key Accessed join
BNL, NO_BNL: 是否开启Block Nested loop join
HASH_JOIN, NO_HASH_JOIN 是否开启hash join 尽在mysql 版本8.0.18中有效
DERIVED_CONDITION_PUSHDOWN, NO_DERIVED_CONDITION_PUSHDOWN: 派生表的谓词下推,这个实在版本8.0.22中出现的
MERGE, NO_MERGE: 是否进行 merge 操作
2)Subquery Optimizer Hints --子查询级别的hit
主要是针对mysql semi join的 4种表的连接策略:
DuplicateWeedout strategy
Firstmatch Strategy
Loosescan Strategy
Materialize scan/Materialize lookup Strategy
评论
