MySQL之explain extra字段解析

PostgreSQL运维技术 2022-05-28

5397

这里对mysql进行SQL调优的时候肯定是会用到explain来看select语句的执行情况。但是之前我对其中的extra的认识一直是模棱两可的，甚至是错误的。

此篇呢，就主要对extra可能出现的几种情况做个梳理，并测试一把，分别模拟其常见的几种情况。

先上总结：

extra	where条件	select的字段
null	where筛选条件是索引的前导列	查询的列未被索引覆盖
Using index	where筛选条件是索引的前导列	查询的列被索引覆盖
Using where; Using index	where筛选条件是索引列之一但不是前导列或者where筛选条件是索引列前导列的一个范围	查询的列被索引覆盖
Using where;	where筛选条件不是索引列	-
Using where;	where筛选条件不是索引前导列、是索引列前导列的一个范围(>)	查询列未被索引覆盖
Using index condition	where索引列前导列的一个范围（<、between）	查询列未被索引覆盖

两种排序的情况。
这里是参考：http://mysql.taobao.org/monthly/2015/03/04/

extra	出现场景
Using filesort	filesort主要用于查询数据结果集的排序操作，首先MySQL会使用sort_buffer_size大小的内存进行排序，如果结果集超过了sort_buffer_size大小，会把这一个排序后的chunk转移到file上，最后使用多路归并排序完成所有数据的排序操作。
Using temporary	MySQL使用临时表保存临时的结构，以用于后续的处理，MySQL首先创建heap引擎的临时表，如果临时的数据过多，超过max_heap_table_size的大小，会自动把临时表转换成MyISAM引擎的表来使用。

两者的不同：
filesort只能应用在单个表上，如果有多个表的数据需要排序，那么MySQL会先使用using temporary保存临时数据，然后再在临时表上使用filesort进行排序，最后输出结果。

什么是索引前导列

所谓前导列，就是在创建复合索引语句的第一列或者连续的多列。比如通过：CREATE INDEX idx_combine ON table_a(x, y, z)创建索引，那么x,xy,xyz都是前导列，而yz，y，z这样的就不是。

什么是索引覆盖

创建一个索引，该索引包含查询中用到的所有字段，称为“覆盖索引”。使用覆盖索引，数据库只需要通过索引就可以查找和返回查询所需要的数据，而不必在使用索引处理数据之后再进行回表操作。

下面呢，我就mock一些数据，来分别验证下这几种情况。

测试数据

创建测试表：

CREATE TABLE `test_order` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` int(11) DEFAULT NULL,
  `order_id` int(11) DEFAULT NULL,
  `order_status` tinyint(4) DEFAULT NULL,
  `create_date` datetime DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_userid_order_id_createdate` (`user_id`,`order_id`,`create_date`)
) ENGINE=InnoDB AUTO_INCREMENT=100002 DEFAULT CHARSET=utf8mb4;


复制

插入数据函数

DELIMITER $$
DROP FUNCTION IF EXISTS `bulk_insert_fun`$$
CREATE  FUNCTION `bulk_insert_fun`() RETURNS INT(11)
BEGIN
 DECLARE num INT DEFAULT 100000;
 DECLARE i INT DEFAULT 0;
 WHILE i<num DO
       insert into test_order (user_id,order_id,order_status,create_date) values (rand()*1000,id,rand()*10,DATE_ADD(NOW(), INTERVAL - RAND()*20000 HOUR));
  SET i = i+1;
 END WHILE;
 RETURN i;
END$$
DELIMITER ;

复制

批量插入数据：

select bulk_insert_fun();

复制

extra是null

表中的索引字段为： idx_userid_order_id_createdate
(user_id
,order_id
,create_date
)

如下SQL中，指定的where条件为user_id=1, 而user_id是索引前导列。select * 表明查询的列未被索引覆盖（表中还有order_status字段未在索引中）。

mysql> explain select * from test_order where user_id = 1;
+----+-------------+------------+------------+------+--------------------------------+--------------------------------+---------+-------+------+----------+-------+
| id | select_type | table      | partitions | type | possible_keys                  | key                            | key_len | ref   | rows | filtered | Extra |
+----+-------------+------------+------------+------+--------------------------------+--------------------------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | test_order | NULL       | ref  | idx_userid_order_id_createdate | idx_userid_order_id_createdate | 5       | const |  105 |   100.00 | NULL  |
+----+-------------+------------+------------+------+--------------------------------+--------------------------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

复制

可以看到，当where条件是索引的前导列，但是查询的列未被索引覆盖时，此时extra列为null。

这意味着用到了索引，但是部分字段未被索引覆盖，必须通过“回表”来实现，不是纯粹地用到了索引，也不是完全没用到索引，Extra中为NULL。

extra是using index

如下SQL中，指定的where条件为user_id=1, 而user_id是索引前导列。select查询的几个字段user_id
,order_id
,create_date
都在索引idx_userid_order_id_createdate
中，表明查询的列被索引覆盖。

mysql> explain select `user_id`,`order_id`,`create_date` from test_order where user_id = 1;
+----+-------------+------------+------------+------+--------------------------------+--------------------------------+---------+-------+------+----------+-------------+
| id | select_type | table      | partitions | type | possible_keys                  | key                            | key_len | ref   | rows | filtered | Extra       |
+----+-------------+------------+------------+------+--------------------------------+--------------------------------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | test_order | NULL       | ref  | idx_userid_order_id_createdate | idx_userid_order_id_createdate | 5       | const |  105 |   100.00 | Using index |
+----+-------------+------------+------------+------+--------------------------------+--------------------------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

复制

可以看到，当where条件既是索引的前导列，查询的列同时也被索引覆盖时，此时extra列为Using index。

extra是Using where; Using index

1.where筛选条件是索引列之一但不是前导列，查询的列被索引覆盖

如下SQL中，指定的where条件为order_id=1, 而order_id不是索引前导列。select查询的几个字段user_id
,order_id
,create_date
都在索引idx_userid_order_id_createdate
中，表明查询的列被索引覆盖。

mysql> explain select `user_id`,`order_id`,`create_date` from test_order where order_id = 1;
+----+-------------+------------+------------+-------+---------------+--------------------------------+---------+------+--------+----------+--------------------------+
| id | select_type | table      | partitions | type  | possible_keys | key                            | key_len | ref  | rows   | filtered | Extra                    |
+----+-------------+------------+------------+-------+---------------+--------------------------------+---------+------+--------+----------+--------------------------+
|  1 | SIMPLE      | test_order | NULL       | index | NULL          | idx_userid_order_id_createdate | 16      | NULL | 100035 |    10.00 | Using where; Using index |

复制

可见，当查询的列被索引覆盖，但是where筛选条件是索引列之一并不是索引的不是前导列，Extra中为Using where; Using index。此时意味着无法直接通过索引查找来查询到符合条件的数据。

2.where筛选条件是索引列前导列的一个范围，查询的列被索引覆盖

如下SQL中，指定的where条件为user_id>1, user_id是索引前导列，但user_id指定的是一个范围。select查询的几个字段user_id
,order_id
,create_date
都在索引idx_userid_order_id_createdate
中，表明查询的列被索引覆盖。

mysql> explain select `user_id`,`order_id`,`create_date` from test_order where user_id > 1;
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+-------+----------+--------------------------+
| id | select_type | table      | partitions | type  | possible_keys                  | key                            | key_len | ref  | rows  | filtered | Extra                    |
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+-------+----------+--------------------------+
|  1 | SIMPLE      | test_order | NULL       | range | idx_userid_order_id_createdate | idx_userid_order_id_createdate | 5       | NULL | 50017 |   100.00 | Using where; Using index |
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+-------+----------+--------------------------+
1 row in set, 1 warning (0.01 sec)

复制

我认为，此处'Using where'表示进行查询可能会是表扫描（比如第一种情况），但是如果where条件中有可以用到索引，它就会使用Using index（比如第二种情况）。

extra是Using where

1.where筛选条件不是索引列

如下SQL中，指定的where条件为order_status =0, order_status不是索引前列。此时无论查询的列无论是否被索引覆盖，都是一样的情况。

mysql> explain select * from test_order where order_status =0;
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | test_order | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 100035 |    10.00 | Using where |
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select `user_id`,`order_id`,`create_date` from test_order where order_status =0;
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | test_order | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 100035 |    10.00 | Using where |
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)


复制

1.where筛选条件不是索引前导列、是索引列前导列的一个范围，查询的列未被索引覆盖

mysql> explain select * from test_order where order_id =0;
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | test_order | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 100035 |    10.00 | Using where |
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from test_order where user_id >1;
+----+-------------+------------+------------+------+--------------------------------+------+---------+------+--------+----------+-------------+
| id | select_type | table      | partitions | type | possible_keys                  | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+------------+------------+------+--------------------------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | test_order | NULL       | ALL  | idx_userid_order_id_createdate | NULL | NULL    | NULL | 100035 |    50.00 | Using where |
+----+-------------+------------+------------+------+--------------------------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

复制

using where意味着数据库按表扫描的方式进行where条件的过滤，此时type字段一般是all。这种方式没有用到任何的index，mysql就是从头到尾把整个表遍历一边，找到所需要的数据行。效率是最差的。

extra是Using index condition

如下SQL中，指定的where条件为user_id<5和(user_id<5 and user_id>1), user_id是索引前导列，但user_id指定的是一个范围(注意这里不包含小于，如果小于，那么结果就是上节里的Using where)。select * 表明查询的列未被索引覆盖。

mysql>  explain select * from test_order where user_id<5;
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-----------------------+
| id | select_type | table      | partitions | type  | possible_keys                  | key                            | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | test_order | NULL       | range | idx_userid_order_id_createdate | idx_userid_order_id_createdate | 5       | NULL |  454 |   100.00 | Using index condition |
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

mysql>  explain select * from test_order where user_id<5 and user_id>1;
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-----------------------+
| id | select_type | table      | partitions | type  | possible_keys                  | key                            | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | test_order | NULL       | range | idx_userid_order_id_createdate | idx_userid_order_id_createdate | 5       | NULL |  302 |   100.00 | Using index condition |
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)


复制

Where condition表示查询包含索引列和非索引列，优化器将首先解析索引列，并在表中查找其他条件(索引下推)。

此处如果我通过set optimizer_switch='index_condition_pushdown=off';
关闭索引下推，相同的条件extra的信息就会变成Using where。

mysql> set optimizer_switch='index_condition_pushdown=off';
Query OK, 0 rows affected (0.00 sec)

mysql> explain select * from test_order where user_id<5 and user_id>1;
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys                  | key                            | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | test_order | NULL       | range | idx_userid_order_id_createdate | idx_userid_order_id_createdate | 5       | NULL |  302 |   100.00 | Using where |
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain select * from test_order where user_id<5;
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys                  | key                            | key_len | ref  | rows | filtered | Extra       |
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | test_order | NULL       | range | idx_userid_order_id_createdate | idx_userid_order_id_createdate | 5       | NULL |  454 |   100.00 | Using where |
+----+-------------+------------+------------+-------+--------------------------------+--------------------------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

复制

Using filesort

如下的SQL，order by的字段是order_status，它是非索引字段，无法使用index的有序性，所以这里是使用了filesort。

mysql>  explain select * from test_order force index(id)  order by test_order.order_status;
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+----------------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra          |
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+----------------+
|  1 | SIMPLE      | test_order | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 100035 |   100.00 | Using filesort |
+----+-------------+------------+------------+------+---------------+------+---------+------+--------+----------+----------------+
1 row in set, 1 warning (0.01 sec)

复制

注：sort_buffer_size这个参数，是个connect级别的参数，MySQL5.7，默认值是1048576字节，也就是1MB，这个例子里需要的排序数据大小肯定是超过这个值了。

extra是Using temporary

如下SQL中，order by的字段在t2表上，所以需要把t1，t2表join的结果保存到temporary表上，然后对临时表进行filesort，最后输出结果。如果把order by的字段替换成t1.id，那么t1的extra信息里会只有Using filesort。

mysql>  explain select * from test_order t1 force index(id), test_order t2 where t1.id = t2.id order by t2.id;
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+--------+----------+---------------------------------+
| id | select_type | table | partitions | type   | possible_keys | key     | key_len | ref               | rows   | filtered | Extra                           |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+--------+----------+---------------------------------+
|  1 | SIMPLE      | t1    | NULL       | ALL    | NULL          | NULL    | NULL    | NULL              | 100035 |   100.00 | Using temporary; Using filesort |
|  1 | SIMPLE      | t2    | NULL       | eq_ref | PRIMARY       | PRIMARY | 4       | zhenxi_test.t1.id |      1 |   100.00 | NULL                            |
+----+-------------+-------+------------+--------+---------------+---------+---------+-------------------+--------+----------+---------------------------------+
2 rows in set, 1 warning (0.00 sec)

复制

总结

extra字段通常是综合where子句和select的字段来一起判断的。通过上面的示例，我们可以看出其中Using index其实是性能最高的。其他的情况都可以酌情优化。尤其是临时表和文件排序的，更要重点进行优化。

参考：
https://blog.csdn.net/jiangxiulilinux/article/details/105863440
https://www.cnblogs.com/wy123/p/7366486.html http://mysql.taobao.org/monthly/2015/03/04/ https://stackoverflow.com/questions/28759576/mysql-using-index-condition-vs-using-where-using-index