上篇文章讲了数据库中最主要的几种访问路径,不同的访问路径,在执行性能上可能会存在巨大的差别。但是我们怎么知道某一个具体的 SQL 语句在执行时,采用了什么样的访问路径呢?这就涉及到 SQL 的执行计划了。
执行计划描述了 SQL 语句的访问路径,通过执行计划,我们可以知道:
-
表上是否有可用的索引,SQL 执行时是否使用了索引,使用了哪些索引?
-
表连接的顺序是怎样的?
-
使用了哪种表连接算法,是用了 Nest Loop 还是 Hash Join?
-
查询是否用到了临时表,是否进行了文件排序?
在 MySQL 中,我们使用 Explain 命令查看语句的执行计划。这一讲中我会使用四十多个演示 SQL,来解释 EXPLAIN 输出信息的具体含义。至于为什么一个 SQL 使用了这个执行计划,而不是别的执行计划,我会在接下来的几讲中慢慢展开。
为了便于演示各种不同的执行计划,我使用了下面这些测试表和测试数据,你也可以在自己的环境中进行测试。
CREATE TABLE `digit` (
`a` tinyint NOT NULL,
PRIMARY KEY (`a`)
) ENGINE=InnoDB;
insert into digit values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
CREATE VIEW numbers AS
select a.a*1000 + b.a*100 + c.a*10 + d.a as n
from digit a, digit b, digit c, digit d;
-- 本章中大部分案例都使用tab表
CREATE TABLE `tab` (
`id` int NOT NULL AUTO_INCREMENT,
`a` int NOT NULL,
`b` int NOT NULL,
`c` int NOT NULL,
`padding` varchar(7000) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_abc` (`a`,`b`,`c`)
) ENGINE=InnoDB;
insert into tab (a,b,c,padding)
select n%3, n, n%100, rpad('x', 100, 'x')
from numbers where n < 10000;
-- t_merge主要用于演示index_merge的几种情况
create table t_merge(
id int not null auto_increment,
a int not null,
b int not null,
c int not null,
d int not null,
padding varchar(4000),
primary key(id),
key idx_ad(a,d),
key idx_bd(b,d),
key idx_cd(c,d)
) ENGINE=InnoDB;
insert into t_merge(a,b,c,d,padding)
select n % 3 + 1, n % 17 + 1, n % 19 + 1, n % 10 + 1, rpad('y', 100, 'y')
from numbers
where n between 1 and 3*17*19*10;
使用 EXPLAIN 命令
使用 EXPLAIN 命令可以查看 SELECT、INSERT、UPDATE、DELETE、REPLACE 等语句的执行计划。EXPLAIN 命令的基本语法如下:
explain [format=traditional|json|tree] SQL Statement
FORMAT 指定了执行计划的输出格式,不指定时,默认以表格形式输出执行计划。
mysql> explain select * from tab;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | SIMPLE | tab | NULL | ALL | NULL | NULL | NULL | NULL | 9755 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
FORMAT 指定为 TREE 时,以树的形式显示执行计划,这种显示格式和 Oracle 执行计划的显示方式比较像。
下面这个测试 SQL 中,使用了 NO_SEMIJOIN 提示,阻止优化器将子查询改写成表连接。你可以试一下,将提示去掉后,会使用怎样的执行计划。
explain format=tree
select /*+ NO_SEMIJOIN(@subq1) */ *
from tab a
where id in (
select /*+ QB_NAME(subq1) */ id from tab b
)
EXPLAIN: -> Filter: <in_optimizer>(a.id,<exists>(select #2)) (cost=1031.55 rows=9913)
-> Table scan on a (cost=1031.55 rows=9913)
-> Select #2 (subquery in condition; dependent)
-> Limit: 1 row(s) (cost=0.35 rows=1)
-> Single-row covering index lookup on b using PRIMARY (id=<cache>(a.id)) (cost=0.35 rows=1)
FORMAT 指定为 JSON 时,以 JSON 格式显示执行计划。JSON 格式的执行计划中,可以看到一些成本的信息。
explain format=json
select /*+ NO_SEMIJOIN(@subq1) */ *
from tab a
where id in (
select /*+ QB_NAME(subq1) */ id from tab b
)
{
"query_block": {
"select_id": 1,
"cost_info": {
"query_cost": "1031.55"
},
"table": {
"table_name": "a",
"access_type": "ALL",
"rows_examined_per_scan": 9913,
"rows_produced_per_join": 9913,
"filtered": "100.00",
"cost_info": {
"read_cost": "40.25",
"eval_cost": "991.30",
"prefix_cost": "1031.55",
"data_read_per_join": "264M"
},
"used_columns": [
"id",
"a",
"b",
"c",
"padding"
],
"attached_condition": "<in_optimizer>(`rep`.`a`.`id`,<exists>(<primary_index_lookup>(<cache>(`rep`.`a`.`id`) in tab on PRIMARY)))",
"attached_subqueries": [
{
"dependent": true,
"cacheable": false,
"query_block": {
"select_id": 2,
"cost_info": {
"query_cost": "0.35"
},
"table": {
"table_name": "b",
"access_type": "unique_subquery",
"possible_keys": [
"PRIMARY"
],
"key": "PRIMARY",
"used_key_parts": [
"id"
],
"key_length": "4",
"ref": [
"func"
],
"rows_examined_per_scan": 1,
"rows_produced_per_join": 1,
"filtered": "100.00",
"using_index": true,
"cost_info": {
"read_cost": "0.25",
"eval_cost": "0.10",
"prefix_cost": "0.35",
"data_read_per_join": "27K"
},
"used_columns": [
"id"
]
}
}
}
]
}
}
}
EXPLAIN 输出详解
接下来,我将以默认的表格输出形式为准,来进行介绍。在 EXPLAIN 的输出中,每一行表示一个查询单元。查询单元有几种情况:
-
关联查询中,每一个关联的表是一个查询单元;
-
每个子查询都会对应一个查询单元;
-
组成 UNION 语句的每个子句都会对应一个查询单元;
-
每个派生表对应一个查询单元。
EXPLAIN 的输出中包含以下字段:
ID
ID 为查询单元的编号。主查询(顶层查询)的 ID 为 1。同一层级内,如果有表连接,则它们的查询单元 ID 一样。子查询的嵌套层级越深,ID 越大。下面的例子中,有 2 个子查询,子查询 2 嵌套在子查询 1 中。
mysql> explain
select /*+ NO_SEMIJOIN(@subq1 ) */ *
from tab ta, tab tx
where ta.id in (
select /*+ QB_NAME(subq1) NO_SEMIJOIN(@subq2) */ id from tab tb
where c in (select /*+ QB_NAME(subq2) */ c from tab tc)
)
and ta.id = tx.id
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------+--------------------------+
| 1 | PRIMARY | ta | ALL | PRIMARY | NULL | NULL | NULL | 9913 | 100.00 | Using where |
| 1 | PRIMARY | tx | eq_ref | PRIMARY | PRIMARY | 4 | rep.ta.id | 1 | 100.00 | NULL |
| 2 | SUBQUERY | tb | index | PRIMARY | idx_abc | 12 | NULL | 9913 | 100.00 | Using where; Using index |
| 3 | SUBQUERY | tc | index | NULL | idx_abc | 12 | NULL | 9913 | 100.00 | Using index |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------+--------------------------+
查询中使用提示 QB_NAME、NO_SEMIJOIN,是为了避免优化器使用半连接查询优化,因为使用半连接优化后,这个示例中的子查询就被优化掉了。ID 为 1 的 2 个查询单元,按从上到下的顺序关联,驱动表为 ta。这里 Extra 中没有额外的信息,说明使用了嵌套循环连接。
SELECT_TYPE
SELECT_TYPE 为查询单元的类型,有以下几种可能的类型。
- SIMPLE
如果语句没有使用 UNION 或子查询,那么 SELECT_TYPE 为 SIMPLE。
- PRIMARY
如果语句用到了子查询,那么最外层查询的 select_type 为 PRIMARY,表示这是主语句。如果语句使用了 UNION,则第一个查询单元的 select_type 为 PRIMARY。
- UNION
使用 UNION 或 UNION ALL 的查询中,第二个和后续查询的 select_type 为 UNION
- UNION RESULT
UNION RESULT 表示 UNION 去重后的结果。下面这个例子中,ID 为 3 的这一行就是 ID 为 1 和 2 这两个查询单元结果集 Union 之后得到的。
mysql> explain select a from tab ta union select c from tab tb;
+----+--------------+------------+-------+---------------+---------+---------+------+------+----------+-----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+------------+-------+---------------+---------+---------+------+------+----------+-----------------+
| 1 | PRIMARY | ta | index | NULL | idx_abc | 12 | NULL | 9913 | 100.00 | Using index |
| 2 | UNION | tb | index | NULL | idx_abc | 12 | NULL | 9913 | 100.00 | Using index |
| 3 | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------+------------+-------+---------------+---------+---------+------+------+----------+-----------------+
- DEPENDENT UNION
如果 UNION 子句位于子查询中,并且子查询依赖了外部查询中字段,则 select_type 为 DEPENDENT UNION。
mysql> explain select * from tab ta where id in (
select a from tab tb union select c from tab tc);
+----+--------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| 1 | PRIMARY | ta | ALL | NULL | NULL | NULL | NULL | 9913 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | tb | ref | idx_abc | idx_abc | 4 | func | 3304 | 100.00 | Using index |
| 3 | DEPENDENT UNION | tc | index | NULL | idx_abc | 12 | NULL | 9913 | 10.00 | Using where; Using index |
| 4 | UNION RESULT | <union2,3> | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
- UNCACHEABLE UNION
union 查询出现在子查询中,并且 union 语句使用了非确定性函数(如 rand)或使用了变量,那么 union 查询单元的 select_type 为 UNCACHEABLE UNION。
mysql> explain select distinct a from (
select a from tab tb
union all
select c from tab tc where c > rand()
) tx;
+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 13217 | 100.00 | Using temporary |
| 2 | DERIVED | tb | index | NULL | idx_abc | 12 | NULL | 9913 | 100.00 | Using index |
| 3 | UNCACHEABLE UNION | tc | index | NULL | idx_abc | 12 | NULL | 9913 | 33.33 | Using where; Using index |
+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
- SUBQUERY
子查询可以出现在 select 列表中、where 条件中,也可以出现在 from 列表中。出现在 from 列表中的子查询为派生表(DERIVED)。如果子查询不依赖主查询中的字段,则称为不相关子查询,select_type 为 SUBQUERY。SQL 执行时,会缓存子查询的查询结果。下面的例子中,子查询中的 SQL 只需要执行 1 次,不需要每访问 1 行数据就重复执行 1 次子查询。
mysql> explain select a, (select avg(b) from tab ) from tab t1;
+----+-------------+-------+-------+---------------+---------+---------+------+-------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+----------+-------------+
| 1 | PRIMARY | t1 | index | NULL | idx_abc | 12 | NULL | 10010 | 100.00 | Using index |
| 2 | SUBQUERY | tab | index | NULL | idx_abc | 12 | NULL | 10010 | 100.00 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+----------+-------------+
使用 EXPLAN ANALYZE 命令,可以看到子查询只执行了 1 次(run only once)。
mysql> explain analyze select a, (select avg(b) from tab ) from tab t1\G
*************************** 1. row ***************************
EXPLAIN:
-> Covering index scan on t1 using idx_abc (cost=1025 rows=10010) (actual time=0.108..2.54 rows=10000 loops=1)
-> Select #2 (subquery in projection; run only once)
-> Aggregate: avg(tab.b) (cost=2026 rows=1) (actual time=3.79..3.79 rows=1 loops=1)
-> Covering index scan on tab using idx_abc (cost=1025 rows=10010) (actual time=0.0235..2.61 rows=10000 loops=1)
- DEPENDENT SUBQUERY
子查询引用了主查询中的字段,则子查询称为相关子查询。MySQL 中,相关子查询无法被缓存。对主查询中的每 1 行数据,都需要执行 1 次子查询。
mysql> explain select a, (select avg(b) from tab where a=t1.a) from tab t1;
+----+--------------------+-------+-------+---------------+---------+---------+----------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+-------+---------------+---------+---------+----------+------+----------+-------------+
| 1 | PRIMARY | t1 | index | NULL | idx_abc | 12 | NULL | 9913 | 100.00 | Using index |
| 2 | DEPENDENT SUBQUERY | tab | ref | idx_abc | idx_abc | 4 | rep.t1.a | 3304 | 100.00 | Using index |
+----+--------------------+-------+-------+---------------+---------+---------+----------+------+----------+-------------+
使用 EXPLAN ANLYZE 命令,可以看到子查询执行了 10000 次(loops=10000)。
mysql> explain analyze select a, (select avg(b) from tab where a=t1.a) from tab t1\G
*************************** 1. row ***************************
EXPLAIN:
-> Covering index scan on t1 using idx_abc (cost=1025 rows=10010) (actual time=0.0478..3.09 rows=10000 loops=1)
-> Select #2 (subquery in projection; dependent)
-> Aggregate: avg(tab.b) (cost=669 rows=1) (actual time=1.39..1.39 rows=1 loops=10000)
-> Covering index lookup on tab using idx_abc (a=t1.a) (cost=336 rows=3337) (actual time=0.023..1.03 rows=3333 loops=10000)
1 row in set, 1 warning (13.94 sec)
MariaDB 中,相关子查询也可以 cache,对于本例中的 SQL,子查询依赖主查询中 t1 表的字段 a,每次 t1.a 出现新值时,执行 1 次子查询,并缓存 t1.a 的值和查询结果。子查询执行的次数取决于关联的外部查询中字段的唯一值数量。
下面这个例子是在 MariaDB 10.0 的环境中执行得到的。explain extended 的 warning 信息中,
--这个例子在mariadb 10.0 中执行
mysql> explain extended select a, (select avg(b) from tab where a=t1.a) from tab t1;
+------+--------------------+-------+-------+---------------+---------+---------+------------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+------+--------------------+-------+-------+---------------+---------+---------+------------+------+----------+-------------+
| 1 | PRIMARY | t1 | index | NULL | idx_abc | 12 | NULL | 9919 | 100.00 | Using index |
| 2 | DEPENDENT SUBQUERY | tab | ref | idx_abc | idx_abc | 4 | mysql.t1.a | 1668 | 100.00 | Using index |
+------+--------------------+-------+-------+---------------+---------+---------+------------+------+----------+-------------+
2 rows in set, 2 warnings (0.00 sec)
mysql> show warnings;
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1276 | Field or reference 'mysql.t1.a' of SELECT #2 was resolved in SELECT #1 |
| Note | 1003 | /* select#1 */ select `mysql`.`t1`.`a` AS `a`,<expr_cache><`mysql`.`t1`.`a`>((/* select#2 */ select avg(`mysql`.`tab`.`b`) from `mysql`.`tab` where `mysql`.`tab`.`a` = `mysql`.`t1`.`a`)) AS `(select avg(b) from tab where a=t1.a)` from `mysql`.`tab` `t1` |
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
- DRIVED
子查询出现在主查询的 FROM 子句的位置时,select_type 为 DERIVED。
mysql> explain select t1.*
from tab t1, (select a, avg(b) as avgb from tab group by a) t2
where t1.a = t2.a;
+----+-------------+------------+-------+---------------+-------------+---------+----------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+-------+---------------+-------------+---------+----------+------+----------+-------------+
| 1 | PRIMARY | t1 | ALL | idx_abc | NULL | NULL | NULL | 9913 | 100.00 | NULL |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 4 | rep.t1.a | 10 | 100.00 | NULL |
| 2 | DERIVED | tab | index | idx_abc | idx_abc | 12 | NULL | 9913 | 100.00 | Using index |
+----+-------------+------------+-------+---------------+-------------+---------+----------+------+----------+-------------+
- DEPENDENT DRIVED
如果派生查询引用了主查询中的字段,则 select_type 为 DEPENDENT DRIVED。
mysql> explain select a, (select * from (select avg(b) as avgb from tab where a=t0.a) td) from tab t0;
+----+--------------------+------------+-------+---------------+---------+---------+----------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+------------+-------+---------------+---------+---------+----------+------+----------+-------------+
| 1 | PRIMARY | t0 | index | NULL | idx_abc | 12 | NULL | 9913 | 100.00 | Using index |
| 2 | DEPENDENT SUBQUERY | <derived3> | ALL | NULL | NULL | NULL | NULL | 2 | 100.00 | NULL |
| 3 | DEPENDENT DERIVED | tab | ref | idx_abc | idx_abc | 4 | rep.t0.a | 3304 | 100.00 | Using index |
+----+--------------------+------------+-------+---------------+---------+---------+----------+------+----------+-------------+
- MATERIALIZED
子查询的结果先存储到临时表,查询的其他部分和生成的临时表再关联。
mysql> explain select * from tab where a in( select b from tab);
+----+--------------+-------------+--------+---------------------+---------------------+---------+-----------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+--------+---------------------+---------------------+---------+-----------+------+----------+-------------+
| 1 | SIMPLE | tab | ALL | idx_abc | NULL | NULL | NULL | 9913 | 100.00 | NULL |
| 1 | SIMPLE | <subquery2> | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 4 | rep.tab.a | 1 | 100.00 | NULL |
| 2 | MATERIALIZED | tab | index | NULL | idx_abc | 12 | NULL | 9913 | 100.00 | Using index |
+----+--------------+-------------+--------+---------------------+---------------------+---------+-----------+------+----------+-------------+
- UNCACHEABLE SUBQUERY
如果(不相关)子查询结果无法被缓存,则 select_type 为 UNCACHEABLE SUBQUERY。
mysql> explain select * from tab where b > (select avg(b) from tab where b > rand());
+----+----------------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+----------------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| 1 | PRIMARY | tab | ALL | NULL | NULL | NULL | NULL | 9913 | 100.00 | Using where |
| 2 | UNCACHEABLE SUBQUERY | tab | index | NULL | idx_abc | 12 | NULL | 9913 | 33.33 | Using where; Using index |
+----+----------------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
导致子查询无法 Cache 的一些情况:子查询使用了结果不确定的函数(如 rand),子查询使用了变量。
- UNCACHEABLE UNION
mysql> explain select distinct a from (select a from tab tb union all select c from tab tc where c > rand() ) tx;
+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 13346 | 100.00 | Using temporary |
| 2 | DERIVED | tb | index | NULL | idx_abc | 12 | NULL | 10010 | 100.00 | Using index |
| 3 | UNCACHEABLE UNION | tc | index | NULL | idx_c | 4 | NULL | 10010 | 33.33 | Using where; Using index |
+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
TABLE
TABLE 这一列说明了一个查询单元的数据是从哪个表获取的。这里分几种情况。
-
table 是一个真实存在的物理表,这里会显示为表的别名。
-
,说明是从派生表获取的数据,真实的数据来自 ID 为 N 的查询单元 -
<unionM,N>,说明是将 ID 为 M 和 N 的查询单元 Union 后得到的数据。
-
,说明是 ID 为 N 的子查询物化后得到的数据。
这里的几种情况,在前面的例子中都出现过,你可以回头再看一下。
TYPE
TYPE 列显示查询单元的访问路径,你可以根据 TYPE 列来判断查询单元是否使用了索引。
- system
如果使用了 MyISAM 存储引擎,并且表中只有 1 行,则 Type 为 system。
mysql> create table t1_myisam(a int) engine=myisam;
mysql> insert into t1_myisam values(100);
mysql> explain select * from t1_myisam;
+----+-------------+-----------+--------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+--------+---------------+------+---------+------+------+----------+-------+
| 1 | SIMPLE | t1_myisam | system | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
+----+-------------+-----------+--------+---------------+------+---------+------+------+----------+-------+
- const
const 表示查询最多返回 1 行记录。对主键或唯一索引的所有字段都使用常量等值匹配时,type 为 const。优化器会将 type 为 const 的查询单元直接替换为常量表。
mysql> explain select * from tab where id = 10\G
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | tab | const | PRIMARY | PRIMARY | 4 | const | 1 | 100.00 | NULL |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------+
- eq_ref
使用主键或唯一索引等值匹配时,type 为 eq_ref。对于组合主键、组合唯一索引,索引中的每一个字段都需要以等值匹配时,type 才为 eq_ref。eq_ref 访问路径每次最多只返回 1 行记录。eq_ref 和 const 的区别在于,const 使用常量匹配,而 eq_ref 中,匹配索引字段的值来自驱动表,不是固定的常量。
下面这个例子中,t2 表 id 字段的匹配条件来自驱动表 t1 的字段 a。
mysql> explain select * from tab t1 , tab t2 where t1.a = t2.id;
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------+
| 1 | SIMPLE | t1 | ALL | idx_abc | NULL | NULL | NULL | 9913 | 100.00 | NULL |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | rep.t1.a | 1 | 100.00 | NULL |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------+
- ref
普通索引字段的等值匹配,或主键和唯一索引前缀字段上的等值匹配,类型为 ref。对于 ref 类型的访问路径,执行计划输出中的 key 列显示了实际使用的索引,key_len 显示使用的索引字段的长度,ref 字段显示用于匹配的值。
mysql> explain select * from tab where a=1;
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | tab | ref | idx_abc | idx_abc | 4 | const | 3333 | 100.00 | NULL |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+
mysql> explain select * from tab t1, tab t2 where t1.a = t2.a;
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+-------+
| 1 | SIMPLE | t1 | ALL | idx_abc | NULL | NULL | NULL | 9913 | 100.00 | NULL |
| 1 | SIMPLE | t2 | ref | idx_abc | idx_abc | 4 | rep.t1.a | 3304 | 100.00 | NULL |
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+-------+
如果索引字段的条件使用了 or 或 in,那么 type 就不再是 ref 了。
- ref_or_null
ref_or_null 和 ref 类似,只是额外加了字段为空的条件。对比下下面这两个语句,字段 A 有 not null 约束时,type 为 ref。
mysql> create table tab3 like tab;
mysql> alter table tab3 modify a int;
mysql> insert into tab3 select * from tab;
mysql> explain select * from tab where a is null or a = 1;
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | tab | ref | idx_abc | idx_abc | 4 | const | 3333 | 100.00 | NULL |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+
mysql> explain select * from tab3 where a is null or a = 1;
+----+-------------+-------+-------------+---------------+---------+---------+-------+------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------------+---------------+---------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | tab3 | ref_or_null | idx_abc | idx_abc | 5 | const | 3334 | 100.00 | Using index condition |
+----+-------------+-------+-------------+---------------+---------+---------+-------+------+----------+-----------------------+
- range
使用索引字段上的范围条件查询数据。范围条件包括 <, <=, >, >=, BETWEEN 等,索引字段使用 OR 或 IN 多个值时,类型也为 range。对于 range 访问路径,执行计划中 key 列显示实际使用的索引名。key_len 列显示使用到的索引字段的长度,ken_len 取决于使用到的索引字段的数据类型、字段是否可以为空。rows 字段显示优化器评估得到的需要在索引中访问的记录数量。
mysql> explain select * from tab force index(idx_abc) where a = 1 or a = 2;
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | tab | range | idx_abc | idx_abc | 4 | NULL | 6666 | 100.00 | Using index condition |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
mysql> explain select * from tab where a = 1 and b between 100 and 200;
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
| 1 | SIMPLE | tab | range | idx_abc | idx_abc | 8 | NULL | 34 | 100.00 | Using index condition |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
- index_merge
index_merge 会使用多个索引来查询数据,并将通过多个索引获取到的数据取并集或交集,得到最终的结果。执行计划输出中,key 字段显示实际参与 index_merge 的索引。Extra 列中的信息显示了实际的索引合并方法,包括 sort_union, union 和 intersect。
如果在执行计划中看到 index_merge 访问路径,一般要考虑是否可以创建联合索引,将访问路径改成 range 或 ref,或者将 SQL 改写为 union 查询。
mysql> explain select *
from t_merge
where (b=1 and d=1) or (c=1 and d between 3 and 5);
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
| 1 | SIMPLE | t_merge | index_merge | idx_bd,idx_cd | idx_bd,idx_cd | 8,8 | NULL | 210 | 100.00 | Using sort_union(idx_bd,idx_cd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
mysql> explain select * from t_merge where a=1 and b=1 and d=1 and c=1;
+----+-------------+---------+-------------+----------------------+----------------------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+-------------+----------------------+----------------------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | t_merge | index_merge | idx_ad,idx_bd,idx_cd | idx_cd,idx_bd,idx_ad | 8,8,8 | NULL | 1 | 100.00 | Using intersect(idx_cd,idx_bd,idx_ad); Using where |
+----+-------------+---------+-------------+----------------------+----------------------+---------+------+------+----------+----------------------------------------------------+
- index_subquery
index_subquery 是执行下面这种类型的子查询的一种方式:
where value in (select col from tab where ...)
如果子查询中的表 tab 有合适的索引可以用来检索,则可以使用 index_subquery 执行路径,对主查询中的每 1 行记录,执行类似下面的 SQL:
select * from tab where col = outer.value and ...
mysql> explain select /*+ NO_SEMIJOIN(@qb1) */ * from tab
where a = 1
and b between 100 and 200
and c in (select /*+ QB_NAME(qb1) */ a from tab where b=1 );
+----+--------------------+-------+----------------+---------------+---------+---------+------------+------+----------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+----------------+---------------+---------+---------+------------+------+----------+------------------------------------+
| 1 | PRIMARY | tab | range | idx_abc | idx_abc | 8 | NULL | 34 | 100.00 | Using index condition; Using where |
| 2 | DEPENDENT SUBQUERY | tab | index_subquery | idx_abc | idx_abc | 8 | func,const | 1 | 100.00 | Using index |
+----+--------------------+-------+----------------+---------------+---------+---------+------------+------+----------+------------------------------------+
mysql> show warnings\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select /*+ NO_SEMIJOIN(@`qb1`) */
`rep`.`tab`.`id` AS `id`,`rep`.`tab`.`a` AS `a`,`rep`.`tab`.`b` AS `b`,`rep`.`tab`.`c` AS `c`,`rep`.`tab`.`padding` AS `padding`
from `rep`.`tab`
where ((`rep`.`tab`.`a` = 1)
and (`rep`.`tab`.`b` between 100 and 200)
and <in_optimizer>(`rep`.`tab`.`c`,<exists>(<index_lookup>(<cache>(`rep`.`tab`.`c`) in tab on idx_abc))))
1 row in set (0.00 sec)
- unique_subquery
unique_subquery 和 index_subquery 类似,区别在于 type 为 unique_subquery 时,子查询中的表使用主键或唯一索引来关联查询。
mysql> explain select /*+ NO_SEMIJOIN(@qb1) */ *
from tab
where a=1
and id in (select /*+ QB_NAME(qb1) */ id from tab);
+----+--------------------+-------+-----------------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+-----------------+---------------+---------+---------+-------+------+----------+-------------+
| 1 | PRIMARY | tab | ref | idx_abc | idx_abc | 4 | const | 3333 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | tab | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | 100.00 | Using index |
+----+--------------------+-------+-----------------+---------------+---------+---------+-------+------+----------+-------------+
mysql> show warnings\G
*************************** 1. row ***************************
Level: Note
Code: 1003
Message: /* select#1 */ select /*+ NO_SEMIJOIN(@`qb1`) */
`rep`.`tab`.`id` AS `id`,`rep`.`tab`.`a` AS `a`,`rep`.`tab`.`b` AS `b`,`rep`.`tab`.`c` AS `c`,`rep`.`tab`.`padding` AS `padding`
from `rep`.`tab`
where ((`rep`.`tab`.`a` = 1)
and <in_optimizer>(`rep`.`tab`.`id`,<exists>(<primary_index_lookup>(<cache>(`rep`.`tab`.`id`) in tab on PRIMARY))))
上面例子中 NO_SEMIJOIN 提示是为了阻止优化器使用半连接优化,因为使用半连接优化后,子查询被转换为半连接,就不会出现 index_subquery 和 unique_subquery 访问路径了。
- index
index 访问路径和 ALL 类似,主要区别是 ALL 访问路径需要访问表的每一行数据,而 index 访问路径是访问索引中的每一行数据。访问路径为 index 时,key_len 字段显示的是所有索引字段的长度总和。一般 type 为 index 时,possible_keys 总是 NULL。
以下 2 种情况可以使用 index 访问路径:
情况 1:语句使用了覆盖索引,但是又缺少索引字段的查询条件。这种情况下 Extra 列会显示 Using index。
情况 2:按索引的顺序读取表的记录,但是无法使用覆盖索引。这种情况下 Extra 列不显示 Using index。
下面的 2 个例子分别展示了这两种情况。
mysql> explain select a,b,c from tab force index(idx_abc) order by a,b,c;
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | tab | index | NULL | idx_abc | 12 | NULL | 9913 | 100.00 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------------+
mysql> explain select * from tab force index(idx_abc) order by a,b,c;
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------+
| 1 | SIMPLE | tab | index | NULL | idx_abc | 12 | NULL | 9913 | 100.00 | NULL |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------+
- ALL
全表扫描,需要访问表中的每一行记录。出现 ALL 访问路径,可能是因为表上缺少合适的索引,或者是因为 SQL 语句的写法问题导致无法使用索引,也可能是因为查询中对表缺少合适的过滤条件,或者是索引字段的过滤性不好,需要根据查询语句的情况具体分析。
POSSIBLE_KEYS
possible_keys 列显示查询单元能使用 range、ref 等访问路径访问的索引。执行计划最终选择的索引在 keys 列显示。是否使用索引,以及使用哪个索引,取决于优化器对各种访问方式的成本评估,还跟表连接的顺序和连接算法也有关系。
下面这个例子中,t1 表的 possible_keys 有索引 idx_abc,但是当 t1 表作为驱动表时,就无法使用索引 idx_abc。
mysql> explain select * from tab t1, tab t2 where t1.a = t2.a and t1.b = t2.b;
+----+-------------+-------+------+---------------+---------+---------+-------------------+------+----------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+-------------------+------+----------+-------+
| 1 | SIMPLE | t1 | ALL | idx_abc | NULL | NULL | NULL | 9913 | 100.00 | NULL |
| 1 | SIMPLE | t2 | ref | idx_abc | idx_abc | 8 | rep.t1.a,rep.t1.b | 1 | 100.00 | NULL |
+----+-------------+-------+------+---------------+---------+---------+-------------------+------+----------+-------+
KEY
key 列显示执行计划实际使用的索引。如果 key 列为 NULL,则说明查询单元没有使用索引。对于 index_merge 访问路径,key 列中会显示多个索引。
KEY_LEN
key_len 列显示执行计划使用到的索引列的总长度。根据 key_len 可以推算出执行计划使用到了索引中的哪几个字段。key_len 根据索引字段的类型和字段是否为空计算得到。对于字符类型如 varchar、char,key_len 为字符数乘以单个字符的最大可能字节数。对于每个可变长类型如 varchar,key_len 额外加 2。对于每个可以为空的字段,key_len 额外加 1。
create table t_k(
a varchar(20),
b char(20),
key idx_a(a,b)
) engine=innodb charset=utf8mb4;
mysql> explain select * from t_k where a='x';
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | t_k | ref | idx_a | idx_a | 83 | const | 1 | 100.00 | Using index |
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------+-------------+
mysql> explain select * from t_k where a='x' and b='x';
+----+-------------+-------+------+---------------+-------+---------+-------------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+----------+--------------------------+
| 1 | SIMPLE | t_k | ref | idx_a | idx_a | 164 | const,const | 1 | 100.00 | Using where; Using index |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+----------+--------------------------+
上面这个例子中,SQL 1 使用索引中的字段 a,字段 A 的类型为 varchar(20),字符集为 utf8mb4,key_len 为 20 * 4 + 2 + 1 = 83。
SQL 2 使用到了字段 A 和 B,字段 A 的 key_len 为 83,字段 B 的 key_len 为 20 * 4 + 1 = 81,整体 Key_len 为字段 A 和字段 B 相加 164。
REF
ref 列显示用来进行索引查找的值,ref 的取值可能是以下几种情况:
-
const:使用常量匹配
-
db.tab.c:使用驱动表的某个字段匹配
-
func:使用某个函数的计算结果匹配。可以在执行 explain 后使用 show warnings 命令查看转换后的 SQL。
ROWS
查询单元需要访问的记录数。对于 InnoDB 引擎,这里的记录数是一个预估的行数,跟实际执行过程中真实访问的记录数可能会有一些差异。对于全表扫描和全索引扫描,这里的行数从统计信息中获取。对于索引访问(type 为 ref 或 range),rows 通过访问索引评估得到,或通过索引的统计信息计算得到。对应派生表,rows 通过一些规则评估得到。
一般来说,rows 越大,说明查询单元需要访问的记录数越多,执行时间越长。
FILTERED
filtered 字段单位为百分比,取值范围为 0-100,表示经过 where 子句中的条件过滤后,满足条件的记录数相对于 rows 列中显示的行数所占的百分比。使用公示 rows * filtered / 100 可以得到优化器预估的查询单元返回的记录数。如果当前的查询单元作为驱动表,那么这里的记录数还决定了被驱动的查询单元需要执行多次。
优化器中有一系列固定的规则来计算 filtered 的取值。你可以在分析表(analyze table)的时候给字段添加直方图,使优化器能更精确地计算 filtered。参数 optimizer_switch 中的选项 condition_fanout_filter 用来控制是否开启条件过滤。
EXTRA
Extra 列中显示了执行计划额外的一些重要信息。
- using where
如果访问路径为 ALL 或 index,Extra 中没用 using where,说明查询需要读取整个表或索引的数据。
- Range checked for each record (index map: 0x n)
如果 Extra 中出现了“Range checked for each record”,那么查询的性能很可能不太好。这里 index map 是索引编号的位图信息。
mysql> explain select * from tab a, tab b where a.id > b.id and a.c > b.c;
+----+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------------------+
| 1 | SIMPLE | a | ALL | PRIMARY | NULL | NULL | NULL | 9913 | 100.00 | NULL |
| 1 | SIMPLE | b | ALL | PRIMARY | NULL | NULL | NULL | 9913 | 11.11 | Range checked for each record (index map: 0x1) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------------------+
- Using index; Using temporary
Extra 中出现 Using temporary,说明用到了临时表。
mysql> explain select b,a,count(*) from tab group by b,a;
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+------------------------------+
| 1 | SIMPLE | tab | index | idx_abc | idx_abc | 12 | NULL | 9913 | 100.00 | Using index; Using temporary |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+------------------------------+
- Using index for skip scan
查询条件没有传入索引的前缀字段,又用到了覆盖索引时,MySQL 可能会使用 skip scan。如果前缀列的唯一值很低,skip scan 也可能会有不错的性能。
mysql> explain select c from tab where b=1;
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
| 1 | SIMPLE | tab | range | idx_abc | idx_abc | 8 | NULL | 991 | 100.00 | Using where; Using index for skip scan |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
- Using index
使用了覆盖索引,也就是查询中所有列都包含在索引中。
- no matching row in const table
说明表里面不存在满足条件的记录。
mysql> explain select * from tab where id = 12345;
+----+-------------+-------------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------------+---------------+------+---------+------+------+----------+--------------------------------+
| 1 | SIMPLE | NULL NULL | NULL | NULL | NULL | NULL | NULL | NULL | no matching row in const table |
+----+-------------+-------------+---------------+------+---------+------+------+----------+--------------------------------+
- Using index for group-by
mysql> explain select a, min(b) from tab group by a;
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | tab | range | idx_abc | idx_abc | 4 | NULL | 4 | 100.00 | Using index for group-by |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
- LooseScan
LooseScan,以及 FirstMatch、Start temporary、End temporary,都是子查询自动改写为表连接后的执行方式,我们在后续的子查询这一讲中再具体介绍。
mysql> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.01 sec)
mysql> explain select * from tab ta
where a=1 and c in (select a from tab tb);
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+------------------------+
| 1 | SIMPLE | tb | index | idx_abc | idx_abc | 12 | NULL | 9913 | 0.03 | Using index; LooseScan |
| 1 | SIMPLE | ta | ref | idx_abc | idx_abc | 4 | const | 3333 | 10.00 | Using index condition |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+------------------------+
- FirstMatch(ta)
mysql> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.01 sec)
mysql> explain select * from tab ta where a=1 and c in (select c from tab tb);
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------------------------------------------------------------------------+
| 1 | SIMPLE | ta | ref | idx_abc | idx_abc | 4 | const | 3333 | 100.00 | NULL |
| 1 | SIMPLE | tb | index | NULL | idx_abc | 12 | NULL | 9913 | 10.00 | Using where; Using index; FirstMatch(ta); Using join buffer (hash join) |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------------------------------------------------------------------------+
- Start temporary, End temporary
mysql> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.01 sec)
mysql> explain select * from tab ta where id in (select b from tab tb where b < 10);
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------------------------------------------+
| 1 | SIMPLE | tb | index | NULL | idx_abc | 12 | NULL | 9913 | 33.33 | Using where; Using index; Start temporary |
| 1 | SIMPLE | ta | eq_ref | PRIMARY | PRIMARY | 4 | rep.tb.b | 1 | 100.00 | End temporary |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------------------------------------------+
- Using index condition
使用到了索引下推条件。
mysql> explain select * from tab where a=1 and c=1;
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-----------------------+
| 1 | SIMPLE | tab | ref | idx_abc | idx_abc | 4 | const | 3333 | 10.00 | Using index condition |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-----------------------+
- Using filesort
说明查询需要排序。
mysql> explain select * from tab order by b;
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------+
| 1 | SIMPLE | tab | ALL | NULL | NULL | NULL | NULL | 9913 | 100.00 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------+
- Using join buffer (hash join)
被驱动表缺少合适的索引时,MySQL 会考虑使用 Hash 连接算法。
mysql> explain select * from tab t1, tab t2 where t1.a = 1 and t1.c=t2.c;
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+--------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+--------------------------------------------+
| 1 | SIMPLE | t1 | ref | idx_abc | idx_abc | 4 | const | 3333 | 100.00 | NULL |
| 1 | SIMPLE | t2 | ALL | NULL | NULL | NULL | NULL | 9913 | 10.00 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+--------------------------------------------+
- Using join buffer (Batched Key Access)
表关联时,使用了 BKA 优化。和 MRR 类似,BKA 也是为了减少查询的随机 IO 的数量。
mysql> explain select /*+ BKA(tb) */ *
from tab ta, tab tb where ta.a = tb.a;
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+----------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+----------------------------------------+
| 1 | SIMPLE | ta | ALL | idx_abc | NULL | NULL | NULL | 9913 | 100.00 | NULL |
| 1 | SIMPLE | tb | ref | idx_abc | idx_abc | 4 | rep.ta.a | 3304 | 100.00 | Using join buffer (Batched Key Access) |
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+----------------------------------------+
- Using MRR
查询使用了 MRR(Multi-Range Read),MRR 主要是为了减少回表查询数据时随机 IO 的数量。下面这个例子中使用了 BKA 提示,强制优化器使用 MRR。
mysql> explain select /*+ BKA(tab) */ *
from tab
where a=1 and b in (1,2,3);
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
| 1 | SIMPLE | tab | range | idx_abc | idx_abc | 8 | NULL | 3 | 100.00 | Using index condition; Using MRR |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
- Using sort_union(…), Using union(…), Using intersect(…)
mysql> explain select *
from t_merge
where b=2 and c=2 and d=1;
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+---------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+---------------------------------------------+
| 1 | SIMPLE | t_merge | index_merge | idx_bd,idx_cd | idx_cd,idx_bd | 8,8 | NULL | 2 | 66.99 | Using intersect(idx_cd,idx_bd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+---------------------------------------------+
mysql> explain select *
from t_merge
where (b=2 and d=1) or (c=2 and d=1);
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+-----------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+-----------------------------------------+
| 1 | SIMPLE | t_merge | index_merge | idx_bd,idx_cd | idx_bd,idx_cd | 8,8 | NULL | 108 | 100.00 | Using union(idx_bd,idx_cd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+-----------------------------------------+
mysql> explain select *
from t_merge
where (b=2 and d between 1 and 2) or (d=1 and c=2);
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
| 1 | SIMPLE | t_merge | index_merge | idx_bd,idx_cd | idx_bd,idx_cd | 8,8 | NULL | 165 | 100.00 | Using sort_union(idx_bd,idx_cd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
总结
出于完整性的考虑,这一讲中我们使用了四十多个 SQL 语句,演示并解释了 MySQL 执行计划的各种输出。在实际工作中,一般也不会遇到这里的每一种情况。
执行计划有几个信息要重点关注:
-
通过 ID、SELECT_TYPE、TABLE 这几列可以了解语句整体的连接、嵌套结构。
-
TYPE 列为 ref、range 时,才是我们平时说的用到了索引。type 为 index 时,实际上是使用了全索引扫描。
-
ROWS 列是优化器评估的需要从到存储引擎里访问的记录的数量,这个数量对性能有直接的影响。
-
EXTRA 列里面提供了执行计划的额外信息,对这里出现的内容要有大致的了解。
你也可以使用 FORMAT=TREE,以树的形式显示执行计划,有时候这样可能会更直观。
思考题
这一讲中有这么一个 SQL:
select id, a, (select avg(b) from tab where a=t1.a) from tab t1;
在我们的测试表中,字段 A 的唯一值有三个,所以理论上,最优的情况下只需要将 3 个不同的 A 的值分别传入子查询中(select avg(b) from tab where a = t1.a),并将计算结果缓存起来,这样子查询只需要执行 3 次。
MariaDB 实际上就有这样的处理,因此在执行上面这个 SQL 时,速度比较快。
mariadb> select id, a, (select avg(b) from tab where a=t1.a) from tab t1 order by id;
+------+---+---------------------------------------+
| id | a | (select avg(b) from tab where a=t1.a) |
+------+---+---------------------------------------+
| 0 | 0 | 4999.5000 |
| 1 | 1 | 4999.0000 |
| 2 | 2 | 5000.0000 |
......
| 9999 | 0 | 4999.5000 |
+------+---+---------------------------------------+
10000 rows in set (0.01 sec)
但是 MySQL 中,同样的表结构和数据,配置一样的服务器,执行这个 SQL 时,执行时间是好几个数量级。
mysql> select id, a, (select avg(b) from tab where a=t1.a) from tab t1 order by id;
+------+------+---------------------------------------+
| id | a | (select avg(b) from tab where a=t1.a) |
+------+------+---------------------------------------+
| 0 | 0 | 4999.5000 |
| 1 | 1 | 4999.0000 |
| 2 | 2 | 5000.0000 |
......
| 9999 | 0 | 4999.5000 |
+------+------+---------------------------------------+
10000 rows in set (4 min 17.90 sec)
对于这种情况,你会怎么解决呢?
要点总结
- 执行计划描述了SQL语句的访问路径,包括表上是否有可用的索引、表连接的顺序、使用的表连接算法等。
- 使用EXPLAIN命令可以查看SQL语句的执行计划,可以指定输出格式为traditional、json或tree。
- 了解如何解读EXPLAIN输出信息,可以帮助优化SQL语句的执行性能,选择合适的访问路径和连接算法。
- TYPE列显示查询单元的访问路径,可以根据TYPE列来判断查询单元是否使用了索引。
- const表示查询最多返回1行记录,对主键或唯一索引的所有字段都使用常量等值匹配时,type为const。
- eq_ref使用主键或唯一索引等值匹配时,对于组合主键、组合唯一索引,索引中的每一个字段都需要以等值匹配时,type才为eq_ref。
- ref表示普通索引字段的等值匹配,或主键和唯一索引前缀字段上的等值匹配。
- range使用索引字段上的范围条件查询数据,包括
<, <=, >, >=, BETWEEN
等条件。 - index_merge会使用多个索引来查询数据,并将通过多个索引获取到的数据取并集或交集,得到最终的结果.
- index_subquery是执行特定类型的子查询的一种方式,如果子查询中的表有合适的索引可以用来检索,则可以使用index_subquery执行路径.
- 索引路径和实际使用的索引对于优化器选择索引和执行计划的影响
- 索引列的总长度和使用情况,以及如何推算执行计划使用到的索引字段
- 用于进行索引查找的值,包括常量匹配、字段匹配或函数计算结果匹配
- 查询单元需要访问的记录数的预估值,对执行时间的影响
- 经过where子句条件过滤后的记录数百分比,优化器计算和优化规则的影响
- 索引的实际使用情况,包括索引路径、索引列长度、查找值、访问记录数和过滤后的记录数百分比
- 优化器的成本评估和规则,对索引路径选择和使用的影响
- 索引路径选择和成本评估,受优化器成本评估、表连接顺序和连接算法的影响
- 条件过滤的优化器规则和参数,包括通过直方图精确计算filtered值和控制条件过滤的参数
- 执行计划中额外的重要信息,如使用临时表、skip scan、覆盖索引、排序需求、连接算法等的影响