大家好!本次继续带来MYSQL之执行计划的入门理解–下篇:
大表哥会带来执行计划中剩下的几个指标理解: possible_keys,key,key_len,ref,rows,filtered,Extra
1)possible_keys 和 key
possible_keys: 表示索引访问表的所有可能性,这个一般出现的越多,代表可能索引的索引就越多,就有可能存在重复索引的情况。
负面影响有2点:
1)增加了DML的索引维护,降低了该表的DML的效率
2)迷惑了优化器(特别是在统计信息不准的情况下)可能会选择不好的索引作为执行计划
key: 如果是索引访问表的话,最终在possible_keys中选择的一个索引名字
我们来看一个重复索引的例子: 很明显 索引 idx_name 和 idx_name_age 是重复的关系。(索引idx_name_age 包含了 idx_name )
mysql> create table t4 (id int not null primary key, name varchar(20),age int);
Query OK, 0 rows affected (0.01 sec)
mysql> alter table t4 add index idx_name (name);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> alter table t4 add index idx_name_age (name,age);
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
我们看一下 简单的 select 查询的执行计划: 这里的possible_keys 是idx_name,idx_name_age ,实际优化器选择的索引是idx_name
mysql> explain select * from t4 where name = 'BB';
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
| 1 | SIMPLE | t4 | NULL | ref | idx_name,idx_name_age | idx_name | 83 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
那么如何查询数据库重复的索引呢? 目前市面上有3种方式:
利用第三方工具:
1)pt: pt-duplicate-key-checker
2)mysql untility: mysqlindexcheck
(上面这2中工具的方法使用都十分简单,大表哥就不展开介绍了,感兴趣的同学可以自行查阅)
3)或者查询sys数据库下面的试图 sys.schema_redundant_indexes
我们可以看到
贴心的mysql大叔建议我们 使用语句 sql_drop_index: ALTER TABLE testdb
.t4
DROP INDEX idx_name
来删除这个重复的索引。
mysql> select * from sys.schema_redundant_indexes where table_name = 't4'\G
*************************** 1. row ***************************
table_schema: testdb
table_name: t4
redundant_index_name: idx_name
redundant_index_columns: name
redundant_index_non_unique: 1
dominant_index_name: idx_name_age
dominant_index_columns: name,age
dominant_index_non_unique: 1
subpart_exists: 0
sql_drop_index: ALTER TABLE `testdb`.`t4` DROP INDEX `idx_name`
1 row in set (0.00 sec)
2)我们再来看一下 key_len 这个属性
key_len 代表了索引的长度: 我们看 idx_name 这个索引的长度为83。
mysql> explain select * from t4 where name = 'BB';
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
| 1 | SIMPLE | t4 | NULL | ref | idx_name,idx_name_age | idx_name | 83 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
这个83是如何计算的呢? 我们知道 idx_name 只有单键值是 name varchar(20) , 每个字符根据数据库编码不一样,像是utf8mb3是占3个字节,
像是utf8mb4是占4个字节,为了支持一些EMJO的表情符号。 大表哥的这个表就是 utf8mb4 的。
所以目前 是 每个字符占有4个字节 * 20的长度 = 80 目前这个索引的len 是80
并且还可以看到这个字段是允许为空的 , 这个默认是 len +1 ,
并且name这个字段是变长字段 , 这个默认是 len +2
综上所述: 20 * 4 + 1 + 2 = 83 与执行计划中的 key len 的长度是相符合的
mysql> show create table t4\G
*************************** 1. row ***************************
Table: t4
Create Table: CREATE TABLE `t4` (
`id` int(11) NOT NULL,
`name` varchar(20) COLLATE utf8mb4_0900_as_cs DEFAULT NULL,
`age` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_name` (`name`),
KEY `idx_name_age` (`name`,`age`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_as_cs
1 row in set (0.00 sec)
我们可以以此类推 看一下了另外一个索引的长度:idx_name_age
name: 20* 4 + 2 +1 = 83
age: (int 默认占4个字节 ) 4 + (可以为NULL) 1 = 5
name+age = 83+ 5 =88
mysql> explain select * from t4 use index(idx_name_age) where name = 'BB' and age =1;
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
| 1 | SIMPLE | t4 | NULL | ref | idx_name_age | idx_name_age | 88 | const,const | 1 | 100.00 | Using index |
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
3)我们再来看一下 ref 这个属性
执行计划中ref这个属性就是表示 where 后面所触发的连接条件: 可以是表与表的连接,也可以是表与常量值的连接。
我们举例来看一下: where name = ‘BB’ and age =1; 其中 ‘BB’ 和 1 都是常量,索引对应的ref是 const,const
mysql> explain select * from t4 use index(idx_name_age) where name = 'BB' and age =1;
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
| 1 | SIMPLE | t4 | NULL | ref | idx_name_age | idx_name_age | 88 | const,const | 1 | 100.00 | Using index |
+----+-------------+-------+------------+------+---------------+--------------+---------+-------------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
我们再来看一下: 表与表的连接: 我们可以看到 ref 是 testdb.a.id
mysql> explain select * from t4 a, t4 b where a.id = b.id;
+----+-------------+-------+------------+--------+---------------+--------------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+--------------+---------+-------------+------+----------+-------------+
| 1 | SIMPLE | a | NULL | index | PRIMARY | idx_name_age | 88 | NULL | 1 | 100.00 | Using index |
| 1 | SIMPLE | b | NULL | eq_ref | PRIMARY | PRIMARY | 4 | testdb.a.id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+---------------+--------------+---------+-------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
也可以是函数的连接条件:ref 是 func
mysql> explain select * from t4 a, t4 b where a.id = lower(b.id);
+----+-------------+-------+------------+--------+---------------+--------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+---------------+--------------+---------+------+------+----------+-------------+
| 1 | SIMPLE | b | NULL | index | NULL | idx_name_age | 88 | NULL | 1 | 100.00 | Using index |
| 1 | SIMPLE | a | NULL | eq_ref | PRIMARY | PRIMARY | 4 | func | 1 | 100.00 | Using where |
+----+-------------+-------+------------+--------+---------------+--------------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
4)我们再来看一下 rows 和 filtered 这个属性
rows : 这个表示通过全表或者索引扫描的预估的行数
filtered: 表示某列通过条件过滤的后所能筛选的比率。
值得注意的是,这2个属性的值都是预估的值,精确程度取决于表的统计信息和列上直方图的采样信息。
关于表和索引的统计信息 可以参考之前的文章: https://www.modb.pro/db/336054
关于列的直方图的信息 可以参考之前的文章: https://www.modb.pro/db/336688
5)我们最后再来看一下 Extra 这个属性
Extra列表示的是补充的额外的一些关于这个SQL执行计划的一些信息。 这个Extra 的内容有好几十种。
我们就来简单的看一下,常见的一些Extra。
No tables used: 没有实际的去访问任何一张表
mysql> explain select now();
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
Impossible WHERE: 1 = 2 是个伪命题, mysql大叔的优化器会直接 : say No!!
mysql> explain select * from t4 where 1 = 2;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+------------------+
1 row in set, 1 warning (0.00 sec)
Using index: 覆盖索引, 不回表,mysql 大叔喜欢的类型
mysql> explain select name from t4 where name = 'B';
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | t4 | NULL | ref | idx_name,idx_name_age | idx_name | 83 | const | 1 | 100.00 | Using index |
+----+-------------+-------+------------+------+-----------------------+----------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Using index condition: 索引下推, MYSQL大叔优化器的宗旨是 能在innoDB 存储引擎层解决问题的,就不会放到mysql server 层来解决。
我们看一下这个例子: where name like ‘jason%’ and age = 2;
索引 key(name,age) 这个中 name like ‘jason%’ 不是一个等值的查询,是一个边界前闭后开的扫描, 这个时候mysql 大叔的优化器 不会把 name like 'jason%'扫描到所有数据拿到
server 层与 进行 与 age = 2的过滤
而是 在innoDB存储引擎 的存储引擎层直接进行 age = 2的过滤, 再把最终的结果返回给 server 层, 这个优化就叫做 索引下推
mysql> create table tab1 (id int not null primary key, name varchar(20),age int, job varchar(20), key(name,age));
Query OK, 0 rows affected (0.02 sec)
mysql> insert into tab1 (id,name,age,job) values (1,'jason',20,'dev');
Query OK, 1 row affected (0.00 sec)
mysql> insert into tab1 (id,name,age,job) values (2,'jason',30,'dba');
Query OK, 1 row affected (0.01 sec)
mysql> insert into tab1 (id,name,age,job) values (3,'jason',40,'pm');
Query OK, 1 row affected (0.00 sec)
mysql> explain select * from tab1 where name like 'jason%' and age = 2;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | tab1 | NULL | range | name | name | 88 | NULL | 3 | 33.33 | Using index condition |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)
我们也可以通过优化器的参数来打开和关闭这个 索引下推的功能
mysql 默认是 index_condition_pushdown=on
mysql> set optimizer_switch='index_condition_pushdown=off';
Query OK, 0 rows affected (0.00 sec)
mysql> desc select * from tab1 where name like 'jason%' and age = 2;
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
| 1 | SIMPLE | tab1 | NULL | range | name | name | 88 | NULL | 3 | 33.33 | Using where |
+----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
using where: 表示SQL 语句需要进行 server 层的过滤
where 条件中 :job 列上不存在索引
mysql> explain select * from tab1 where name = 'jason' and job = 'dev';
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | tab1 | NULL | ref | name | name | 83 | const | 3 | 33.33 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec
Using filesort: 需要在server层进行内存或者文件排序的SQL
MYSQL大叔的优化器 不是很喜欢这个,应该在表设计中规避
mysql> explain select * from tab1 order by job desc;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| 1 | SIMPLE | tab1 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.00 sec)
Using temporary: 需要在server层进行分组group by ,去重distinct 等操作的SQL。
MYSQL大叔的优化器 不是很喜欢这个,应该在表设计中规避
mysql> explain select job,count(1) from tab1 group by job;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------+
| 1 | SIMPLE | tab1 | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using temporary |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-----------------+
1 row in set, 1 warning (0.00 sec)
最后到这里, mysql 执行计划入门理解 上篇和下篇全部介绍完毕了。 开发和运维的同学,觉得有感兴趣的话,可以转发给你们周围的小伙伴!