MYSQL之执行计划的入门理解--下篇

原创大表哥 2022-03-07

1906

大家好！本次继续带来MYSQL之执行计划的入门理解–下篇：

大表哥会带来执行计划中剩下的几个指标理解： possible_keys，key，key_len，ref，rows，filtered，Extra

1）possible_keys 和 key

possible_keys：表示索引访问表的所有可能性，这个一般出现的越多，代表可能索引的索引就越多，就有可能存在重复索引的情况。
负面影响有2点：
1）增加了DML的索引维护，降低了该表的DML的效率
2）迷惑了优化器（特别是在统计信息不准的情况下）可能会选择不好的索引作为执行计划

key: 如果是索引访问表的话，最终在possible_keys中选择的一个索引名字

我们来看一个重复索引的例子：很明显索引 idx_name 和 idx_name_age 是重复的关系。（索引idx_name_age 包含了 idx_name ）

mysql> create table t4 (id int not null primary key, name varchar(20),age int);
Query OK, 0 rows affected (0.01 sec)


mysql> alter table t4 add index idx_name (name);
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0


mysql> alter table t4 add index idx_name_age (name,age);
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

复制

我们看一下简单的 select 查询的执行计划：这里的possible_keys 是idx_name,idx_name_age ，实际优化器选择的索引是idx_name

mysql> explain select * from t4 where name = 'BB';
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys         | key      | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t4    | NULL       | ref  | idx_name,idx_name_age | idx_name | 83      | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

复制

那么如何查询数据库重复的索引呢？目前市面上有3种方式：
利用第三方工具：
1)pt: pt-duplicate-key-checker
2)mysql untility: mysqlindexcheck
（上面这2中工具的方法使用都十分简单，大表哥就不展开介绍了，感兴趣的同学可以自行查阅）

3）或者查询sys数据库下面的试图 sys.schema_redundant_indexes

我们可以看到
贴心的mysql大叔建议我们使用语句 sql_drop_index: ALTER TABLE testdb.t4 DROP INDEX idx_name 来删除这个重复的索引。

mysql> select * from sys.schema_redundant_indexes where table_name = 't4'\G
*************************** 1. row ***************************
              table_schema: testdb
                table_name: t4
      redundant_index_name: idx_name
   redundant_index_columns: name
redundant_index_non_unique: 1
       dominant_index_name: idx_name_age
    dominant_index_columns: name,age
dominant_index_non_unique: 1
            subpart_exists: 0
            sql_drop_index: ALTER TABLE `testdb`.`t4` DROP INDEX `idx_name`
1 row in set (0.00 sec)

复制

2）我们再来看一下 key_len 这个属性

key_len 代表了索引的长度：我们看 idx_name 这个索引的长度为83。

mysql> explain select * from t4 where name = 'BB';
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys         | key      | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t4    | NULL       | ref  | idx_name,idx_name_age | idx_name | 83      | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

复制

这个83是如何计算的呢？我们知道 idx_name 只有单键值是 name varchar(20) , 每个字符根据数据库编码不一样，像是utf8mb3是占3个字节，
像是utf8mb4是占4个字节，为了支持一些EMJO的表情符号。大表哥的这个表就是 utf8mb4 的。

所以目前是每个字符占有4个字节 * 20的长度 = 80 目前这个索引的len 是80

并且还可以看到这个字段是允许为空的，这个默认是 len +1 ，
并且name这个字段是变长字段，这个默认是 len +2

综上所述： 20 * 4 + 1 + 2 = 83 与执行计划中的 key len 的长度是相符合的

mysql> show create table t4\G
*************************** 1. row ***************************
       Table: t4
Create Table: CREATE TABLE `t4` (
  `id` int(11) NOT NULL,
  `name` varchar(20) COLLATE utf8mb4_0900_as_cs DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_name` (`name`),
  KEY `idx_name_age` (`name`,`age`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_cs
1 row in set (0.00 sec)

复制

我们可以以此类推看一下了另外一个索引的长度：idx_name_age

name: 20* 4 + 2 +1 = 83
age: (int 默认占4个字节 ) 4 + (可以为NULL) 1 = 5

name+age = 83+ 5 =88

mysql> explain select * from t4 use index(idx_name_age) where name = 'BB' and age =1;
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key          | key_len | ref         | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
|  1 | SIMPLE      | t4    | NULL       | ref  | idx_name_age  | idx_name_age | 88      | const,const |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

复制

3）我们再来看一下 ref 这个属性

执行计划中ref这个属性就是表示 where 后面所触发的连接条件：可以是表与表的连接，也可以是表与常量值的连接。

我们举例来看一下： where name = ‘BB’ and age =1; 其中 ‘BB’ 和 1 都是常量，索引对应的ref是 const,const

mysql> explain select * from t4 use index(idx_name_age) where name = 'BB' and age =1;
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key          | key_len | ref         | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
|  1 | SIMPLE      | t4    | NULL       | ref  | idx_name_age  | idx_name_age | 88      | const,const |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

复制

我们再来看一下：表与表的连接：我们可以看到 ref 是 testdb.a.id

mysql> explain select * from t4 a, t4 b where a.id = b.id;
+----+-------------+-------+------------+--------+---------------+--------------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key          | key_len | ref         | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+--------------+---------+-------------+------+----------+-------------+
|  1 | SIMPLE      | a     | NULL       | index  | PRIMARY       | idx_name_age | 88      | NULL        |    1 |   100.00 | Using index |
|  1 | SIMPLE      | b     | NULL       | eq_ref | PRIMARY       | PRIMARY      | 4       | testdb.a.id |    1 |   100.00 | NULL        |
+----+-------------+-------+------------+--------+---------------+--------------+---------+-------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

复制

也可以是函数的连接条件：ref 是 func

mysql> explain select * from t4 a, t4 b where a.id = lower(b.id);
+----+-------------+-------+------------+--------+---------------+--------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys | key          | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+--------+---------------+--------------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | b     | NULL       | index  | NULL          | idx_name_age | 88      | NULL |    1 |   100.00 | Using index |
|  1 | SIMPLE      | a     | NULL       | eq_ref | PRIMARY       | PRIMARY      | 4       | func |    1 |   100.00 | Using where |
+----+-------------+-------+------------+--------+---------------+--------------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)

复制

4）我们再来看一下 rows 和 filtered 这个属性

rows ：这个表示通过全表或者索引扫描的预估的行数
filtered: 表示某列通过条件过滤的后所能筛选的比率。

值得注意的是，这2个属性的值都是预估的值，精确程度取决于表的统计信息和列上直方图的采样信息。

关于表和索引的统计信息可以参考之前的文章： https://www.modb.pro/db/336054

关于列的直方图的信息可以参考之前的文章： https://www.modb.pro/db/336688

5）我们最后再来看一下 Extra 这个属性

Extra列表示的是补充的额外的一些关于这个SQL执行计划的一些信息。这个Extra 的内容有好几十种。

我们就来简单的看一下，常见的一些Extra。

No tables used：没有实际的去访问任何一张表

mysql>  explain select  now();
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | No tables used |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)

复制

Impossible WHERE： 1 = 2 是个伪命题， mysql大叔的优化器会直接 : say No!!

mysql> explain select * from t4 where 1 = 2;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra            |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
|  1 | SIMPLE      | NULL  | NULL       | NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | Impossible WHERE |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
1 row in set, 1 warning (0.00 sec)

复制

Using index: 覆盖索引，不回表，mysql 大叔喜欢的类型

mysql> explain select name from t4 where name = 'B';
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys         | key      | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | t4    | NULL       | ref  | idx_name,idx_name_age | idx_name | 83      | const |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

复制

Using index condition：索引下推， MYSQL大叔优化器的宗旨是能在innoDB 存储引擎层解决问题的，就不会放到mysql server 层来解决。

我们看一下这个例子： where name like ‘jason%’ and age = 2;

索引 key(name,age) 这个中 name like ‘jason%’ 不是一个等值的查询，是一个边界前闭后开的扫描，这个时候mysql 大叔的优化器不会把 name like 'jason%'扫描到所有数据拿到
server 层与进行与 age = 2的过滤
而是在innoDB存储引擎的存储引擎层直接进行 age = 2的过滤，再把最终的结果返回给 server 层，这个优化就叫做索引下推

mysql> create table tab1 (id int not null primary key, name varchar(20),age int, job varchar(20), key(name,age));
Query OK, 0 rows affected (0.02 sec)


mysql> insert into tab1  (id,name,age,job) values (1,'jason',20,'dev');
Query OK, 1 row affected (0.00 sec)


mysql> insert into tab1  (id,name,age,job) values (2,'jason',30,'dba');
Query OK, 1 row affected (0.01 sec)


mysql> insert into tab1  (id,name,age,job) values (3,'jason',40,'pm');
Query OK, 1 row affected (0.00 sec)


mysql> explain select * from tab1 where name like 'jason%' and age = 2;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | tab1  | NULL       | range | name          | name | 88      | NULL |    3 |    33.33 | Using index condition |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

复制

我们也可以通过优化器的参数来打开和关闭这个索引下推的功能
mysql 默认是 index_condition_pushdown=on

mysql> set optimizer_switch='index_condition_pushdown=off';
Query OK, 0 rows affected (0.00 sec)




mysql> desc select * from tab1 where name like 'jason%' and age = 2;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | tab1  | NULL       | range | name          | name | 88      | NULL |    3 |    33.33 | Using where |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

复制

using where: 表示SQL 语句需要进行 server 层的过滤

where 条件中：job 列上不存在索引

mysql> explain select * from tab1 where name = 'jason' and job = 'dev';
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | tab1  | NULL       | ref  | name          | name | 83      | const |    3 |    33.33 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec

复制

Using filesort：需要在server层进行内存或者文件排序的SQL

MYSQL大叔的优化器不是很喜欢这个，应该在表设计中规避

mysql> explain select * from tab1 order by job desc;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | tab1  | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |   100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)

复制

Using temporary：需要在server层进行分组group by ,去重distinct 等操作的SQL。

MYSQL大叔的优化器不是很喜欢这个，应该在表设计中规避

mysql> explain select job,count(1) from tab1 group by job;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra           |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------+
|  1 | SIMPLE      | tab1  | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    3 |   100.00 | Using temporary |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------+
1 row in set, 1 warning (0.00 sec)

复制

最后到这里, mysql 执行计划入门理解上篇和下篇全部介绍完毕了。开发和运维的同学，觉得有感兴趣的话，可以转发给你们周围的小伙伴！

墨力计划 mysql

「喜欢这篇文章，您的关注和赞赏是给作者最好的鼓励」

关注作者

1人已赞赏

筱悦星辰

1年前

敢于突破、勇于承担、无畏前行，这样的人才能走出更为开阔的天地。

1年前

大表哥

1年前

Keep moving, Stay hungry !

1年前

3年前

可惜没有执行时间. 意思是说这个执行计划真实执行后执行计划每个步骤时间消耗,扫描行数,扫描次数

3年前

大表哥

3年前

可以尝试： SET OPTIMIZER_TRACE = "enabled=on"; SET OPTIMIZER_TRACE_MAX_MEM_SIZE = 1000000; explain your SQL; select * from FROM INFORMATION_SCHEMA.OPTIMIZER_TRACE; 这个表INFORMATION_SCHEMA.OPTIMIZER_TRACE的信息很详细。

3年前

@大表哥 OK ! 我试试看谢了大表哥

3年前

墨天轮福利君

3年前

您好，您的文章已入选合格奖，10墨值奖励已经到账请查收！ ❤️我们还会实时派发您的流量收益。

3年前