暂无图片
暂无图片
4
暂无图片
暂无图片
1
暂无图片

聊聊 mysql 8.0 hint 之中篇

原创 大表哥 2022-03-17
2103

image.png
大家好,这次大表哥带来的是 mysql 8.0 hint的中篇。

本文将会介绍这2种类型的 hint:

Table-Level Optimizer Hints --表级别的

Subquery Optimizer Hints --子查询级别的

我们大家先看表级别的hint: Table-Level Optimizer Hints --表级别的

表级别的hint 都是关于表与表的连接方式 ,在MYSQL中表与表的连接方式大致分为如下几种:

1) Index Nested loop join (简称 NLJ)
2) Batched Key Accessed join (简称BKA)
3) Block Nested loop join (before 8.0.18) (简称BNL)
4) Hash join (from 8.0.18)

简单的来说 :
BKA (Batched Key Accessed join )是对 NLJ(Index Nested loop join)的优化, 批量的扫描数据放入 join buffer 按照一定的顺序后,再去顺序IO读取数据,节省了昂贵的IO随机读取的成本。本身也是MRR的一种实现方式。

Hash join 是对 BNL(Block Nested loop join)的算法改进的优化,熟悉oracle 或者 大数据 spark 的同学应该都很熟悉。 在OLAP或者大数据分析中进场用到HASH JOIN 作为大表之间的连接。

具体的介绍可以参考大表哥之前的文章: https://www.modb.pro/db/365827

本次我们主要看一下 表连接相关的hint:

BKA, NO_BKA: 是否开启 Batched Key Accessed join

BNL, NO_BNL: 是否开启Block Nested loop join,注意的是在mysql 8.0.18和以后的版本中, 这个hint也是 打开和关闭 hash join 的方式。
在mysql的 8.0.20版本之后,将不再支持 Block Nested loop join, 但是 仍将这个hint 作为 打开和关闭 hash join 的方式。

HASH_JOIN, NO_HASH_JOIN: 仅仅在mysql 8.0.18版本中, 作为 打开和关闭 hash join 的方式。 之后的版本,请用 BNL, NO_BNL。

从hint的角度来说,HASH_JOIN, NO_HASH_JOIN 只是mysql 版本8.0.18 中的一个过客。。。。。。

DERIVED_CONDITION_PUSHDOWN, NO_DERIVED_CONDITION_PUSHDOWN: 派生表的谓词下推,这个实在版本8.0.22中出现的

MERGE, NO_MERGE: 是否进行 merge 操作,不进行merge的话 类似于生成 物化表

下面大表哥带你逐一的看一下:

预先准备2张表:

Table: t1 Create Table: CREATE TABLE `t1` ( `id` int NOT NULL, `a` int DEFAULT NULL, `b` int DEFAULT NULL, PRIMARY KEY (`id`), KEY `a` (`a`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_cs 1 row in set (0.00 sec) Create Table: CREATE TABLE `t2` ( `id` int NOT NULL, `a` int DEFAULT NULL, `b` int DEFAULT NULL, PRIMARY KEY (`id`), KEY `a` (`a`), KEY `idx_b` (`b`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_cs 1 row in set (0.00 sec)
复制

BKA, NO_BKA: 是否开启 Batched Key Accessed join

我们可以看到加入 hint /*+BKA(t1, t2) */ 之后,执行计划里面出现了 Using join buffer (Batched Key Access)

explain SELECT * FROM t1, t2 where t1.a =t2.a -------------- +----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+-------------+ | 1 | SIMPLE | t1 | NULL | ALL | a | NULL | NULL | NULL | 1000 | 100.00 | Using where | | 1 | SIMPLE | t2 | NULL | ref | a | a | 5 | testdb.t1.a | 1 | 100.00 | NULL | +----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+-------------+ 2 rows in set, 1 warning (0.00 sec) explain SELECT/*+BKA(t1, t2) */ * FROM t1, t2 where t1.a =t2.a -------------- +----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+----------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+----------------------------------------+ | 1 | SIMPLE | t1 | NULL | ALL | a | NULL | NULL | NULL | 1000 | 100.00 | Using where | | 1 | SIMPLE | t2 | NULL | ref | a | a | 5 | testdb.t1.a | 1 | 100.00 | Using join buffer (Batched Key Access) | +----+-------------+-------+------------+------+---------------+------+---------+-------------+------+----------+----------------------------------------+ 2 rows in set, 1 warning (0.01 sec)
复制

BNL, NO_BNL: 是否开启Block Nested loop join

大表哥测试的mysql 版本是 mysql version: 8.0.20-11

这个版本开始mysql是 移除了 BNL的连接方式,只有HASH 的连接方式。

explain SELECT/*+BNL(t1,t2)*/ * FROM t1, t2 where t1.b =t2.b -------------- +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+ | 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | NULL | | 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 998386 | 10.00 | Using where; Using join buffer (hash join) | +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+ 2 rows in set, 1 warning (0.00 sec)
复制

我们尝试一下 NO_BNL: 我们看到变成了 NLJ

explain format=tree SELECT/*+NO_BNL(t1,t2)*/ * FROM t1, t2 where t1.b =t2.b -------------- *************************** 1. row *************************** EXPLAIN: -> Nested loop inner join (cost=100327701.00 rows=99838601) -> Table scan on t1 (cost=101.00 rows=1000) -> Filter: (t2.b = t1.b) (cost=498.98 rows=99839) -> Table scan on t2 (cost=498.98 rows=998386) 1 row in set (0.00 sec)
复制

HASH_JOIN, NO_HASH_JOIN: 仅仅在mysql 8.0.18版本中生效,我们就不测试这个了。

DERIVED_CONDITION_PUSHDOWN, NO_DERIVED_CONDITION_PUSHDOWN: 派生表的谓词下推

这个功能是在mysql 8.0.22中才有的,mysql大叔在官方文档上是这样描述的:
在mysql 版本8.0.22之前,派生表如果被物化的话,MYSQL会将整个表进行物化,然后再去进行where条件中的过滤。

Image.png

我们测试一下 mysql 版本 8.0.20 的版本:此版本不是支持 DERIVED_CONDITION_PUSHDOWN的。

我们可以看到执行计划中的步骤是 1. 先物化 Materialize 2.在过滤 Filter: (t2.a > 20)

root@localhost:mysql_uatDB.sock [testdb]> select @@version; -------------- select @@version -------------- +-----------+ | @@version | +-----------+ | 8.0.20-11 | +-----------+ 1 row in set (0.00 sec) explain format=tree SELECT/*+NO_MERGE(t2)*/ * FROM t1, (select * from t2) t2 where t1.b =t2.b and t2.a > 20 -------------- *************************** 1. row *************************** EXPLAIN: -> Nested loop inner join -> Filter: (t1.b is not null) (cost=101.00 rows=1000) -> Table scan on t1 (cost=101.00 rows=1000) -> Filter: (t2.a > 20) -> Index lookup on t2 using <auto_key1> (b=t1.b) -> Materialize -> Table scan on t2 (cost=100327.60 rows=998386)
复制

我们再次测试一下mysql 版本:8.0.27
我们可以看到从执行计划中可以看到的步骤是 1.索引过滤 with index condition: (t2.a > 20) 2. 过滤后的结果集进行物化 Materialize

mysql> select @@version; +-----------+ | @@version | +-----------+ | 8.0.27-18 | +-----------+ 1 row in set (0.00 sec) mysql> explain format=tree SELECT/*+NO_MERGE(t2)*/ * FROM t1, (select * from t2) t2 where t1.b =t2.b and t2.a > 20; +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | EXPLAIN | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | -> Nested loop inner join (cost=0.95 rows=1) -> Filter: (t1.b is not null) (cost=0.35 rows=1) -> Table scan on t1 (cost=0.35 rows=1) -> Index lookup on t2 using <auto_key0> (b=t1.b) -> Materialize (cost=0.81..0.81 rows=1) -> Index range scan on t2 using a, with index condition: (t2.a > 20) (cost=0.71 rows=1) | +-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
复制

MERGE, NO_MERGE: 是否进行 merge 操作

我们可以看到添加完hint /+NO_MERGE(t2)/之后, (select * from t2) 物化成了派生表

root@localhost:mysql_uatDB.sock [testdb]> explain SELECT * FROM t1, (select * from t2) t2 where t1.b =t2.b; -------------- explain SELECT * FROM t1, (select * from t2) t2 where t1.b =t2.b -------------- +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+ | 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | NULL | | 1 | SIMPLE | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 998386 | 10.00 | Using where; Using join buffer (hash join) | +----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+--------------------------------------------+ 2 rows in set, 1 warning (0.00 sec) explain SELECT/*+NO_MERGE(t2)*/ * FROM t1, (select * from t2) t2 where t1.b =t2.b -------------- +----+-------------+------------+------------+------+---------------+-------------+---------+-------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+------------+------------+------+---------------+-------------+---------+-------------+--------+----------+-------------+ | 1 | PRIMARY | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | Using where | | 1 | PRIMARY | <derived2> | NULL | ref | <auto_key0> | <auto_key0> | 5 | testdb.t1.b | 998 | 100.00 | NULL | | 2 | DERIVED | t2 | NULL | ALL | NULL | NULL | NULL | NULL | 998386 | 100.00 | NULL | +----+-------------+------------+------------+------+---------------+-------------+---------+-------------+--------+----------+-------------+ 3 rows in set, 1 warning (0.00 sec)
复制

我们再来看一下: Subquery Optimizer Hints --子查询级别的

这种类型的hint 主要指针对于 mysql 表连接的semi join 的连接策略的。

关于mysql 的semi join 的详细介绍 可以大表哥之前的文章:https://www.modb.pro/db/375572

在本篇文章中,我们就简单的回顾一下 关于 semi join 的几种连接策略的hint:

DuplicateWeedout strategy:
hint /*+ SEMIJOIN(@subq1 DUPSWEEDOUT) */ 对应的执行计划是 Start temporary; End temporary

explain select/*+ SEMIJOIN(@subq1 DUPSWEEDOUT) */ * from tab1 where id in (select /*+ QB_NAME(subq1) */ pid from tab2 ) -------------- +----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+---------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+---------------------------------------------+ | 1 | SIMPLE | tab1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 2 | 100.00 | NULL | | 1 | SIMPLE | tab2 | NULL | ref | idx_pid | idx_pid | 5 | testdb.tab1.id | 2 | 100.00 | Using index; Start temporary; End temporary | +----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+---------------------------------------------+ 2 rows in set, 1 warning (0.00 sec)
复制

Firstmatch Strategy:
hint /*+ SEMIJOIN(@subq1 FIRSTMATCH) */ 对应的执行计划是 FirstMatch(tab1)

explain select/*+ SEMIJOIN(@subq1 FIRSTMATCH) */ * from tab1 where id in (select /*+ QB_NAME(subq1) */ pid from tab2) -------------- +----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+-------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+-------------------------------+ | 1 | SIMPLE | tab1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 2 | 100.00 | NULL | | 1 | SIMPLE | tab2 | NULL | ref | idx_pid | idx_pid | 5 | testdb.tab1.id | 2 | 100.00 | Using index; FirstMatch(tab1) | +----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+-------------------------------+ 2 rows in set, 1 warning (0.00 sec)
复制

Loosescan Strategy
hint /*+ SEMIJOIN(@subq1 LOOSESCAN) */ 对应的执行计划是 LooseScan

root@localhost:mysql_uatDB.sock [testdb]> explain select/*+ SEMIJOIN(@subq1 LOOSESCAN) */ * from tab1 where id in (select /*+ QB_NAME(subq1) */ pid from tab2); -------------- explain select/*+ SEMIJOIN(@subq1 LOOSESCAN) */ * from tab1 where id in (select /*+ QB_NAME(subq1) */ pid from tab2) -------------- +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------+ | 1 | SIMPLE | tab2 | NULL | index | idx_pid | idx_pid | 5 | NULL | 4 | 50.00 | Using index; LooseScan | | 1 | SIMPLE | tab1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 2 | 50.00 | Using where; Using join buffer (hash join) | +----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------------------------+ 2 rows in set, 1 warning (0.00 sec)
复制

Materialize scan/Materialize lookup Strategy
hint /*+ SEMIJOIN(@subq1 MATERIALIZATION) */ 对应的执行计划是 MATERIALIZED

explain select/*+ SEMIJOIN(@subq1 MATERIALIZATION) */ * from tab1 where id in (select /*+ QB_NAME(subq1) */ pid from tab2) -------------- +----+--------------+-------------+------------+--------+---------------------+---------------------+---------+----------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------+-------------+------------+--------+---------------------+---------------------+---------+----------------+------+----------+-------------+ | 1 | SIMPLE | tab1 | NULL | ALL | PRIMARY | NULL | NULL | NULL | 2 | 100.00 | Using where | | 1 | SIMPLE | <subquery2> | NULL | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 5 | testdb.tab1.id | 1 | 100.00 | NULL | | 2 | MATERIALIZED | tab2 | NULL | index | idx_pid | idx_pid | 5 | NULL | 4 | 100.00 | Using index | +----+--------------+-------------+------------+--------+---------------------+---------------------+---------+----------------+------+----------+-------------+ 3 rows in set, 1 warning (0.00 sec)
复制

最后大表哥总结一下今天介绍的2种hint:
1)Table-Level Optimizer Hints --表级别的hint
BKA, NO_BKA: 是否开启 Batched Key Accessed join
BNL, NO_BNL: 是否开启Block Nested loop join
HASH_JOIN, NO_HASH_JOIN 是否开启hash join 尽在mysql 版本8.0.18中有效
DERIVED_CONDITION_PUSHDOWN, NO_DERIVED_CONDITION_PUSHDOWN: 派生表的谓词下推,这个实在版本8.0.22中出现的
MERGE, NO_MERGE: 是否进行 merge 操作

2)Subquery Optimizer Hints --子查询级别的hit
主要是针对mysql semi join的 4种表的连接策略:
DuplicateWeedout strategy
Firstmatch Strategy
Loosescan Strategy
Materialize scan/Materialize lookup Strategy

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论

墨天轮福利君
暂无图片
3年前
评论
暂无图片 1
您好,您的文章已入选墨力原创作者计划合格奖,10墨值奖励已经到账请查收! ❤️我们还会实时派发您的流量收益。
3年前
暂无图片 1
评论