MySQL基础运维-18 读懂MySQL中的执行计划(上)

内容纲要

上篇文章讲了数据库中最主要的几种访问路径,不同的访问路径,在执行性能上可能会存在巨大的差别。但是我们怎么知道某一个具体的 SQL 语句在执行时,采用了什么样的访问路径呢?这就涉及到 SQL 的执行计划了。

执行计划描述了 SQL 语句的访问路径,通过执行计划,我们可以知道:

  • 表上是否有可用的索引,SQL 执行时是否使用了索引,使用了哪些索引?

  • 表连接的顺序是怎样的?

  • 使用了哪种表连接算法,是用了 Nest Loop 还是 Hash Join?

  • 查询是否用到了临时表,是否进行了文件排序?

在 MySQL 中,我们使用 Explain 命令查看语句的执行计划。这一讲中我会使用四十多个演示 SQL,来解释 EXPLAIN 输出信息的具体含义。至于为什么一个 SQL 使用了这个执行计划,而不是别的执行计划,我会在接下来的几讲中慢慢展开。

为了便于演示各种不同的执行计划,我使用了下面这些测试表和测试数据,你也可以在自己的环境中进行测试。

CREATE TABLE `digit` (
  `a` tinyint NOT NULL,
  PRIMARY KEY (`a`)
) ENGINE=InnoDB;

insert into digit values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

CREATE VIEW numbers AS 
select a.a*1000 + b.a*100 + c.a*10 + d.a as n 
from digit a, digit b, digit c, digit d;

-- 本章中大部分案例都使用tab表
CREATE TABLE `tab` (
  `id` int NOT NULL AUTO_INCREMENT,
  `a` int NOT NULL,
  `b` int NOT NULL,
  `c` int NOT NULL,
  `padding` varchar(7000) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_abc` (`a`,`b`,`c`)
) ENGINE=InnoDB;

insert into tab (a,b,c,padding) 
select n%3, n, n%100, rpad('x', 100, 'x')  
from numbers where n < 10000;

-- t_merge主要用于演示index_merge的几种情况
create table t_merge(
    id int not null auto_increment, 
    a int not null,
    b int not null,
    c int not null,
    d int not null,
    padding varchar(4000),
    primary key(id),
    key idx_ad(a,d),
    key idx_bd(b,d),
    key idx_cd(c,d)
) ENGINE=InnoDB;

insert into t_merge(a,b,c,d,padding)
select n % 3 + 1, n % 17 + 1, n % 19 + 1, n % 10 + 1, rpad('y', 100, 'y')
from numbers
where n between 1 and 3*17*19*10;

使用 EXPLAIN 命令

使用 EXPLAIN 命令可以查看 SELECT、INSERT、UPDATE、DELETE、REPLACE 等语句的执行计划。EXPLAIN 命令的基本语法如下:

explain [format=traditional|json|tree] SQL Statement

FORMAT 指定了执行计划的输出格式,不指定时,默认以表格形式输出执行计划。

mysql> explain select * from tab;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | tab   | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 9755 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+

FORMAT 指定为 TREE 时,以树的形式显示执行计划,这种显示格式和 Oracle 执行计划的显示方式比较像。

下面这个测试 SQL 中,使用了 NO_SEMIJOIN 提示,阻止优化器将子查询改写成表连接。你可以试一下,将提示去掉后,会使用怎样的执行计划。

explain format=tree 
select  /*+ NO_SEMIJOIN(@subq1)  */ * 
from tab a 
where id in ( 
    select /*+ QB_NAME(subq1) */ id from tab b 
)
EXPLAIN: -> Filter: <in_optimizer>(a.id,<exists>(select #2))  (cost=1031.55 rows=9913)
    -> Table scan on a  (cost=1031.55 rows=9913)
    -> Select #2 (subquery in condition; dependent)
        -> Limit: 1 row(s)  (cost=0.35 rows=1)
            -> Single-row covering index lookup on b using PRIMARY (id=<cache>(a.id))  (cost=0.35 rows=1)

FORMAT 指定为 JSON 时,以 JSON 格式显示执行计划。JSON 格式的执行计划中,可以看到一些成本的信息。

explain format=json 
select  /*+ NO_SEMIJOIN(@subq1)  */ * 
from tab a 
where id in ( 
    select /*+ QB_NAME(subq1) */ id from tab b 
)

{
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "1031.55"
    },
    "table": {
      "table_name": "a",
      "access_type": "ALL",
      "rows_examined_per_scan": 9913,
      "rows_produced_per_join": 9913,
      "filtered": "100.00",
      "cost_info": {
        "read_cost": "40.25",
        "eval_cost": "991.30",
        "prefix_cost": "1031.55",
        "data_read_per_join": "264M"
      },
      "used_columns": [
        "id",
        "a",
        "b",
        "c",
        "padding"
      ],
      "attached_condition": "<in_optimizer>(`rep`.`a`.`id`,<exists>(<primary_index_lookup>(<cache>(`rep`.`a`.`id`) in tab on PRIMARY)))",
      "attached_subqueries": [
        {
          "dependent": true,
          "cacheable": false,
          "query_block": {
            "select_id": 2,
            "cost_info": {
              "query_cost": "0.35"
            },
            "table": {
              "table_name": "b",
              "access_type": "unique_subquery",
              "possible_keys": [
                "PRIMARY"
              ],
              "key": "PRIMARY",
              "used_key_parts": [
                "id"
              ],
              "key_length": "4",
              "ref": [
                "func"
              ],
              "rows_examined_per_scan": 1,
              "rows_produced_per_join": 1,
              "filtered": "100.00",
              "using_index": true,
              "cost_info": {
                "read_cost": "0.25",
                "eval_cost": "0.10",
                "prefix_cost": "0.35",
                "data_read_per_join": "27K"
              },
              "used_columns": [
                "id"
              ]
            }
          }
        }
      ]
    }
  }
}

EXPLAIN 输出详解

接下来,我将以默认的表格输出形式为准,来进行介绍。在 EXPLAIN 的输出中,每一行表示一个查询单元。查询单元有几种情况:

  • 关联查询中,每一个关联的表是一个查询单元;

  • 每个子查询都会对应一个查询单元;

  • 组成 UNION 语句的每个子句都会对应一个查询单元;

  • 每个派生表对应一个查询单元。

EXPLAIN 的输出中包含以下字段:

image-20241027220235293

ID

ID 为查询单元的编号。主查询(顶层查询)的 ID 为 1。同一层级内,如果有表连接,则它们的查询单元 ID 一样。子查询的嵌套层级越深,ID 越大。下面的例子中,有 2 个子查询,子查询 2 嵌套在子查询 1 中。

mysql> explain 
select  /*+ NO_SEMIJOIN(@subq1 )  */ * 
from tab ta, tab tx
where ta.id in ( 
    select /*+ QB_NAME(subq1) NO_SEMIJOIN(@subq2) */ id from tab tb 
      where c in (select /*+ QB_NAME(subq2) */ c from tab tc)
)
and ta.id = tx.id

+----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------+--------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref       | rows | filtered | Extra                    |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------+--------------------------+
|  1 | PRIMARY     | ta    | ALL    | PRIMARY       | NULL    | NULL    | NULL      | 9913 |   100.00 | Using where              |
|  1 | PRIMARY     | tx    | eq_ref | PRIMARY       | PRIMARY | 4       | rep.ta.id |    1 |   100.00 | NULL                     |
|  2 | SUBQUERY    | tb    | index  | PRIMARY       | idx_abc | 12      | NULL      | 9913 |   100.00 | Using where; Using index |
|  3 | SUBQUERY    | tc    | index  | NULL          | idx_abc | 12      | NULL      | 9913 |   100.00 | Using index              |
+----+-------------+-------+--------+---------------+---------+---------+-----------+------+----------+--------------------------+

查询中使用提示 QB_NAME、NO_SEMIJOIN,是为了避免优化器使用半连接查询优化,因为使用半连接优化后,这个示例中的子查询就被优化掉了。ID 为 1 的 2 个查询单元,按从上到下的顺序关联,驱动表为 ta。这里 Extra 中没有额外的信息,说明使用了嵌套循环连接。

SELECT_TYPE

SELECT_TYPE 为查询单元的类型,有以下几种可能的类型。

  • SIMPLE

如果语句没有使用 UNION 或子查询,那么 SELECT_TYPE 为 SIMPLE。

  • PRIMARY

如果语句用到了子查询,那么最外层查询的 select_type 为 PRIMARY,表示这是主语句。如果语句使用了 UNION,则第一个查询单元的 select_type 为 PRIMARY。

  • UNION

使用 UNION 或 UNION ALL 的查询中,第二个和后续查询的 select_type 为 UNION

  • UNION RESULT

UNION RESULT 表示 UNION 去重后的结果。下面这个例子中,ID 为 3 的这一行就是 ID 为 1 和 2 这两个查询单元结果集 Union 之后得到的。

mysql> explain select a from tab ta union select c from tab tb;
+----+--------------+------------+-------+---------------+---------+---------+------+------+----------+-----------------+
| id | select_type  | table      | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra           |
+----+--------------+------------+-------+---------------+---------+---------+------+------+----------+-----------------+
|  1 | PRIMARY      | ta         | index | NULL          | idx_abc | 12      | NULL | 9913 |   100.00 | Using index     |
|  2 | UNION        | tb         | index | NULL          | idx_abc | 12      | NULL | 9913 |   100.00 | Using index     |
|  3 | UNION RESULT | <union1,2> | ALL   | NULL          | NULL    | NULL    | NULL | NULL |     NULL | Using temporary |
+----+--------------+------------+-------+---------------+---------+---------+------+------+----------+-----------------+
  • DEPENDENT UNION

如果 UNION 子句位于子查询中,并且子查询依赖了外部查询中字段,则 select_type 为 DEPENDENT UNION。

mysql> explain select * from tab ta where id in (
    select a from tab tb union select c from tab tc);

+----+--------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type        | table      | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
+----+--------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY            | ta         | ALL   | NULL          | NULL    | NULL    | NULL | 9913 |   100.00 | Using where              |
|  2 | DEPENDENT SUBQUERY | tb         | ref   | idx_abc       | idx_abc | 4       | func | 3304 |   100.00 | Using index              |
|  3 | DEPENDENT UNION    | tc         | index | NULL          | idx_abc | 12      | NULL | 9913 |    10.00 | Using where; Using index |
|  4 | UNION RESULT       | <union2,3> | ALL   | NULL          | NULL    | NULL    | NULL | NULL |     NULL | Using temporary          |
+----+--------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
  • UNCACHEABLE UNION

union 查询出现在子查询中,并且 union 语句使用了非确定性函数(如 rand)或使用了变量,那么 union 查询单元的 select_type 为 UNCACHEABLE UNION。

mysql> explain select distinct a from  (
    select a from tab tb  
    union all 
    select c from tab tc where c > rand() 
) tx;

+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
| id | select_type       | table      | type  | possible_keys | key     | key_len | ref  | rows  | filtered | Extra                    |
+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
|  1 | PRIMARY           | <derived2> | ALL   | NULL          | NULL    | NULL    | NULL | 13217 |   100.00 | Using temporary          |
|  2 | DERIVED           | tb         | index | NULL          | idx_abc | 12      | NULL |  9913 |   100.00 | Using index              |
|  3 | UNCACHEABLE UNION | tc         | index | NULL          | idx_abc | 12      | NULL |  9913 |    33.33 | Using where; Using index |
+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
  • SUBQUERY

子查询可以出现在 select 列表中、where 条件中,也可以出现在 from 列表中。出现在 from 列表中的子查询为派生表(DERIVED)。如果子查询不依赖主查询中的字段,则称为不相关子查询,select_type 为 SUBQUERY。SQL 执行时,会缓存子查询的查询结果。下面的例子中,子查询中的 SQL 只需要执行 1 次,不需要每访问 1 行数据就重复执行 1 次子查询。

mysql> explain select a, (select avg(b) from tab )  from tab t1;

+----+-------------+-------+-------+---------------+---------+---------+------+-------+----------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows  | filtered | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+----------+-------------+
|  1 | PRIMARY     | t1    | index | NULL          | idx_abc | 12      | NULL | 10010 |   100.00 | Using index |
|  2 | SUBQUERY    | tab   | index | NULL          | idx_abc | 12      | NULL | 10010 |   100.00 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+-------+----------+-------------+

使用 EXPLAN ANALYZE 命令,可以看到子查询只执行了 1 次(run only once)。

mysql> explain analyze select a, (select avg(b) from tab )  from tab t1\G
*************************** 1. row ***************************
EXPLAIN: 
-> Covering index scan on t1 using idx_abc  (cost=1025 rows=10010) (actual time=0.108..2.54 rows=10000 loops=1)
-> Select #2 (subquery in projection; run only once)
    -> Aggregate: avg(tab.b)  (cost=2026 rows=1) (actual time=3.79..3.79 rows=1 loops=1)
        -> Covering index scan on tab using idx_abc  (cost=1025 rows=10010) (actual time=0.0235..2.61 rows=10000 loops=1)
  • DEPENDENT SUBQUERY

子查询引用了主查询中的字段,则子查询称为相关子查询。MySQL 中,相关子查询无法被缓存。对主查询中的每 1 行数据,都需要执行 1 次子查询。

mysql> explain select a, (select avg(b) from tab where a=t1.a) from tab t1;

+----+--------------------+-------+-------+---------------+---------+---------+----------+------+----------+-------------+
| id | select_type        | table | type  | possible_keys | key     | key_len | ref      | rows | filtered | Extra       |
+----+--------------------+-------+-------+---------------+---------+---------+----------+------+----------+-------------+
|  1 | PRIMARY            | t1    | index | NULL          | idx_abc | 12      | NULL     | 9913 |   100.00 | Using index |
|  2 | DEPENDENT SUBQUERY | tab   | ref   | idx_abc       | idx_abc | 4       | rep.t1.a | 3304 |   100.00 | Using index |
+----+--------------------+-------+-------+---------------+---------+---------+----------+------+----------+-------------+

使用 EXPLAN ANLYZE 命令,可以看到子查询执行了 10000 次(loops=10000)。

mysql> explain  analyze select a, (select avg(b) from tab where a=t1.a) from tab t1\G
*************************** 1. row ***************************
EXPLAIN: 
-> Covering index scan on t1 using idx_abc  (cost=1025 rows=10010) (actual time=0.0478..3.09 rows=10000 loops=1)
-> Select #2 (subquery in projection; dependent)
    -> Aggregate: avg(tab.b)  (cost=669 rows=1) (actual time=1.39..1.39 rows=1 loops=10000)
        -> Covering index lookup on tab using idx_abc (a=t1.a)  (cost=336 rows=3337) (actual time=0.023..1.03 rows=3333 loops=10000)

1 row in set, 1 warning (13.94 sec)

MariaDB 中,相关子查询也可以 cache,对于本例中的 SQL,子查询依赖主查询中 t1 表的字段 a,每次 t1.a 出现新值时,执行 1 次子查询,并缓存 t1.a 的值和查询结果。子查询执行的次数取决于关联的外部查询中字段的唯一值数量。

下面这个例子是在 MariaDB 10.0 的环境中执行得到的。explain extended 的 warning 信息中, 表示使用了子查询 cache。

--这个例子在mariadb 10.0 中执行
mysql> explain extended select a, (select avg(b) from tab where a=t1.a) from tab t1;
+------+--------------------+-------+-------+---------------+---------+---------+------------+------+----------+-------------+
| id   | select_type        | table | type  | possible_keys | key     | key_len | ref        | rows | filtered | Extra       |
+------+--------------------+-------+-------+---------------+---------+---------+------------+------+----------+-------------+
|    1 | PRIMARY            | t1    | index | NULL          | idx_abc | 12      | NULL       | 9919 |   100.00 | Using index |
|    2 | DEPENDENT SUBQUERY | tab   | ref   | idx_abc       | idx_abc | 4       | mysql.t1.a | 1668 |   100.00 | Using index |
+------+--------------------+-------+-------+---------------+---------+---------+------------+------+----------+-------------+
2 rows in set, 2 warnings (0.00 sec)

mysql> show warnings;
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                       |
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1276 | Field or reference 'mysql.t1.a' of SELECT #2 was resolved in SELECT #1                                                                                                                                                                                        |
| Note  | 1003 | /* select#1 */ select `mysql`.`t1`.`a` AS `a`,<expr_cache><`mysql`.`t1`.`a`>((/* select#2 */ select avg(`mysql`.`tab`.`b`) from `mysql`.`tab` where `mysql`.`tab`.`a` = `mysql`.`t1`.`a`)) AS `(select avg(b) from tab where a=t1.a)` from `mysql`.`tab` `t1` |
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  • DRIVED

子查询出现在主查询的 FROM 子句的位置时,select_type 为 DERIVED。

mysql> explain select t1.* 
    from tab t1, (select a, avg(b) as avgb from tab group by a) t2  
    where t1.a = t2.a;

+----+-------------+------------+-------+---------------+-------------+---------+----------+------+----------+-------------+
| id | select_type | table      | type  | possible_keys | key         | key_len | ref      | rows | filtered | Extra       |
+----+-------------+------------+-------+---------------+-------------+---------+----------+------+----------+-------------+
|  1 | PRIMARY     | t1         | ALL   | idx_abc       | NULL        | NULL    | NULL     | 9913 |   100.00 | NULL        |
|  1 | PRIMARY     | <derived2> | ref   | <auto_key0>   | <auto_key0> | 4       | rep.t1.a |   10 |   100.00 | NULL        |
|  2 | DERIVED     | tab        | index | idx_abc       | idx_abc     | 12      | NULL     | 9913 |   100.00 | Using index |
+----+-------------+------------+-------+---------------+-------------+---------+----------+------+----------+-------------+
  • DEPENDENT DRIVED

如果派生查询引用了主查询中的字段,则 select_type 为 DEPENDENT DRIVED。

mysql> explain select a, (select * from (select avg(b) as avgb from tab where a=t0.a) td) from tab t0;

+----+--------------------+------------+-------+---------------+---------+---------+----------+------+----------+-------------+
| id | select_type        | table      | type  | possible_keys | key     | key_len | ref      | rows | filtered | Extra       |
+----+--------------------+------------+-------+---------------+---------+---------+----------+------+----------+-------------+
|  1 | PRIMARY            | t0         | index | NULL          | idx_abc | 12      | NULL     | 9913 |   100.00 | Using index |
|  2 | DEPENDENT SUBQUERY | <derived3> | ALL   | NULL          | NULL    | NULL    | NULL     |    2 |   100.00 | NULL        |
|  3 | DEPENDENT DERIVED  | tab        | ref   | idx_abc       | idx_abc | 4       | rep.t0.a | 3304 |   100.00 | Using index |
+----+--------------------+------------+-------+---------------+---------+---------+----------+------+----------+-------------+
  • MATERIALIZED

子查询的结果先存储到临时表,查询的其他部分和生成的临时表再关联。

mysql> explain select * from tab where a in( select b from tab);

+----+--------------+-------------+--------+---------------------+---------------------+---------+-----------+------+----------+-------------+
| id | select_type  | table       | type   | possible_keys       | key                 | key_len | ref       | rows | filtered | Extra       |
+----+--------------+-------------+--------+---------------------+---------------------+---------+-----------+------+----------+-------------+
|  1 | SIMPLE       | tab         | ALL    | idx_abc             | NULL                | NULL    | NULL      | 9913 |   100.00 | NULL        |
|  1 | SIMPLE       | <subquery2> | eq_ref | <auto_distinct_key> | <auto_distinct_key> | 4       | rep.tab.a |    1 |   100.00 | NULL        |
|  2 | MATERIALIZED | tab         | index  | NULL                | idx_abc             | 12      | NULL      | 9913 |   100.00 | Using index |
+----+--------------+-------------+--------+---------------------+---------------------+---------+-----------+------+----------+-------------+
  • UNCACHEABLE SUBQUERY

如果(不相关)子查询结果无法被缓存,则 select_type 为 UNCACHEABLE SUBQUERY。

mysql> explain select * from tab where b > (select avg(b) from tab where b > rand());

+----+----------------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type          | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
+----+----------------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
|  1 | PRIMARY              | tab   | ALL   | NULL          | NULL    | NULL    | NULL | 9913 |   100.00 | Using where              |
|  2 | UNCACHEABLE SUBQUERY | tab   | index | NULL          | idx_abc | 12      | NULL | 9913 |    33.33 | Using where; Using index |
+----+----------------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+

导致子查询无法 Cache 的一些情况:子查询使用了结果不确定的函数(如 rand),子查询使用了变量。

  • UNCACHEABLE UNION
mysql> explain select distinct a from  (select a from tab tb  union all select c from tab tc where c > rand() ) tx;

+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
| id | select_type       | table      | type  | possible_keys | key     | key_len | ref  | rows  | filtered | Extra                    |
+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+
|  1 | PRIMARY           | <derived2> | ALL   | NULL          | NULL    | NULL    | NULL | 13346 |   100.00 | Using temporary          |
|  2 | DERIVED           | tb         | index | NULL          | idx_abc | 12      | NULL | 10010 |   100.00 | Using index              |
|  3 | UNCACHEABLE UNION | tc         | index | NULL          | idx_c   | 4       | NULL | 10010 |    33.33 | Using where; Using index |
+----+-------------------+------------+-------+---------------+---------+---------+------+-------+----------+--------------------------+

TABLE

TABLE 这一列说明了一个查询单元的数据是从哪个表获取的。这里分几种情况。

  1. table 是一个真实存在的物理表,这里会显示为表的别名。

  2. ,说明是从派生表获取的数据,真实的数据来自 ID 为 N 的查询单元
  3. <unionM,N>,说明是将 ID 为 M 和 N 的查询单元 Union 后得到的数据。

  4. ,说明是 ID 为 N 的子查询物化后得到的数据。

这里的几种情况,在前面的例子中都出现过,你可以回头再看一下。

TYPE

TYPE 列显示查询单元的访问路径,你可以根据 TYPE 列来判断查询单元是否使用了索引。

  • system

如果使用了 MyISAM 存储引擎,并且表中只有 1 行,则 Type 为 system。

mysql> create table t1_myisam(a int) engine=myisam;
mysql> insert into t1_myisam values(100);

mysql> explain select * from t1_myisam;
+----+-------------+-----------+--------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table     | type   | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+-----------+--------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | t1_myisam | system | NULL          | NULL | NULL    | NULL |    1 |   100.00 | NULL  |
+----+-------------+-----------+--------+---------------+------+---------+------+------+----------+-------+
  • const

const 表示查询最多返回 1 行记录。对主键或唯一索引的所有字段都使用常量等值匹配时,type 为 const。优化器会将 type 为 const 的查询单元直接替换为常量表。

mysql> explain select * from tab where id = 10\G
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | tab   | const | PRIMARY       | PRIMARY | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------+
  • eq_ref

使用主键或唯一索引等值匹配时,type 为 eq_ref。对于组合主键、组合唯一索引,索引中的每一个字段都需要以等值匹配时,type 才为 eq_ref。eq_ref 访问路径每次最多只返回 1 行记录。eq_ref 和 const 的区别在于,const 使用常量匹配,而 eq_ref 中,匹配索引字段的值来自驱动表,不是固定的常量。

下面这个例子中,t2 表 id 字段的匹配条件来自驱动表 t1 的字段 a。

mysql> explain select * from tab t1 , tab t2 where t1.a = t2.id;

+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref      | rows | filtered | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------+
|  1 | SIMPLE      | t1    | ALL    | idx_abc       | NULL    | NULL    | NULL     | 9913 |   100.00 | NULL  |
|  1 | SIMPLE      | t2    | eq_ref | PRIMARY       | PRIMARY | 4       | rep.t1.a |    1 |   100.00 | NULL  |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------+
  • ref

普通索引字段的等值匹配,或主键和唯一索引前缀字段上的等值匹配,类型为 ref。对于 ref 类型的访问路径,执行计划输出中的 key 列显示了实际使用的索引,key_len 显示使用的索引字段的长度,ref 字段显示用于匹配的值。

mysql> explain select * from tab where a=1;

+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | tab   | ref  | idx_abc       | idx_abc | 4       | const | 3333 |   100.00 | NULL  |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+

mysql> explain select * from tab t1, tab t2 where t1.a = t2.a;

+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+-------+
| id | select_type | table | type | possible_keys | key     | key_len | ref      | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+-------+
|  1 | SIMPLE      | t1    | ALL  | idx_abc       | NULL    | NULL    | NULL     | 9913 |   100.00 | NULL  |
|  1 | SIMPLE      | t2    | ref  | idx_abc       | idx_abc | 4       | rep.t1.a | 3304 |   100.00 | NULL  |
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+-------+

如果索引字段的条件使用了 or 或 in,那么 type 就不再是 ref 了。

  • ref_or_null

ref_or_null 和 ref 类似,只是额外加了字段为空的条件。对比下下面这两个语句,字段 A 有 not null 约束时,type 为 ref。

mysql> create table tab3 like tab;
mysql> alter table tab3 modify a int;
mysql> insert into tab3 select * from tab;

mysql> explain select * from tab where a is null or a = 1;
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | tab   | ref  | idx_abc       | idx_abc | 4       | const | 3333 |   100.00 | NULL  |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-------+

mysql> explain select * from tab3 where a is null or a = 1;
+----+-------------+-------+-------------+---------------+---------+---------+-------+------+----------+-----------------------+
| id | select_type | table | type        | possible_keys | key     | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-------+-------------+---------------+---------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | tab3  | ref_or_null | idx_abc       | idx_abc | 5       | const | 3334 |   100.00 | Using index condition |
+----+-------------+-------+-------------+---------------+---------+---------+-------+------+----------+-----------------------+
  • range

使用索引字段上的范围条件查询数据。范围条件包括 <, <=, >, >=, BETWEEN 等,索引字段使用 OR 或 IN 多个值时,类型也为 range。对于 range 访问路径,执行计划中 key 列显示实际使用的索引名。key_len 列显示使用到的索引字段的长度,ken_len 取决于使用到的索引字段的数据类型、字段是否可以为空。rows 字段显示优化器评估得到的需要在索引中访问的记录数量。

mysql> explain select * from tab force index(idx_abc) where   a = 1 or a = 2;

+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | tab   | range | idx_abc       | idx_abc | 4       | NULL | 6666 |   100.00 | Using index condition |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+

mysql> explain select * from tab where a = 1 and b between 100 and 200;
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | tab   | range | idx_abc       | idx_abc | 8       | NULL |   34 |   100.00 | Using index condition |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-----------------------+
  • index_merge

index_merge 会使用多个索引来查询数据,并将通过多个索引获取到的数据取并集或交集,得到最终的结果。执行计划输出中,key 字段显示实际参与 index_merge 的索引。Extra 列中的信息显示了实际的索引合并方法,包括 sort_union, union 和 intersect。

如果在执行计划中看到 index_merge 访问路径,一般要考虑是否可以创建联合索引,将访问路径改成 range 或 ref,或者将 SQL 改写为 union 查询。

mysql> explain select * 
  from t_merge 
  where (b=1 and d=1) or (c=1 and d between 3 and 5);
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
| id | select_type | table   | type        | possible_keys | key           | key_len | ref  | rows | filtered | Extra                                        |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | t_merge | index_merge | idx_bd,idx_cd | idx_bd,idx_cd | 8,8     | NULL |  210 |   100.00 | Using sort_union(idx_bd,idx_cd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+

mysql> explain select * from t_merge where a=1 and b=1 and d=1 and c=1;
+----+-------------+---------+-------------+----------------------+----------------------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table   | type        | possible_keys        | key                  | key_len | ref  | rows | filtered | Extra                                              |
+----+-------------+---------+-------------+----------------------+----------------------+---------+------+------+----------+----------------------------------------------------+
|  1 | SIMPLE      | t_merge | index_merge | idx_ad,idx_bd,idx_cd | idx_cd,idx_bd,idx_ad | 8,8,8   | NULL |    1 |   100.00 | Using intersect(idx_cd,idx_bd,idx_ad); Using where |
+----+-------------+---------+-------------+----------------------+----------------------+---------+------+------+----------+----------------------------------------------------+
  • index_subquery

index_subquery 是执行下面这种类型的子查询的一种方式:

where value in (select col from tab where ...)

如果子查询中的表 tab 有合适的索引可以用来检索,则可以使用 index_subquery 执行路径,对主查询中的每 1 行记录,执行类似下面的 SQL:

select * from tab where col = outer.value and ...
mysql> explain select /*+ NO_SEMIJOIN(@qb1) */ * from tab 
  where a = 1 
  and  b between 100 and 200  
  and c in (select /*+ QB_NAME(qb1) */  a from tab where b=1  );

+----+--------------------+-------+----------------+---------------+---------+---------+------------+------+----------+------------------------------------+
| id | select_type        | table | type           | possible_keys | key     | key_len | ref        | rows | filtered | Extra                              |
+----+--------------------+-------+----------------+---------------+---------+---------+------------+------+----------+------------------------------------+
|  1 | PRIMARY            | tab   | range          | idx_abc       | idx_abc | 8       | NULL       |   34 |   100.00 | Using index condition; Using where |
|  2 | DEPENDENT SUBQUERY | tab   | index_subquery | idx_abc       | idx_abc | 8       | func,const |    1 |   100.00 | Using index                        |
+----+--------------------+-------+----------------+---------------+---------+---------+------------+------+----------+------------------------------------+

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select /*+ NO_SEMIJOIN(@`qb1`) */ 
    `rep`.`tab`.`id` AS `id`,`rep`.`tab`.`a` AS `a`,`rep`.`tab`.`b` AS `b`,`rep`.`tab`.`c` AS `c`,`rep`.`tab`.`padding` AS `padding` 
from `rep`.`tab` 
where ((`rep`.`tab`.`a` = 1) 
and (`rep`.`tab`.`b` between 100 and 200) 
and <in_optimizer>(`rep`.`tab`.`c`,<exists>(<index_lookup>(<cache>(`rep`.`tab`.`c`) in tab on idx_abc))))
1 row in set (0.00 sec)
  • unique_subquery

unique_subquery 和 index_subquery 类似,区别在于 type 为 unique_subquery 时,子查询中的表使用主键或唯一索引来关联查询。

mysql> explain select /*+ NO_SEMIJOIN(@qb1) */ * 
  from tab 
  where a=1 
  and id in (select /*+ QB_NAME(qb1) */  id from tab);

+----+--------------------+-------+-----------------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type        | table | type            | possible_keys | key     | key_len | ref   | rows | filtered | Extra       |
+----+--------------------+-------+-----------------+---------------+---------+---------+-------+------+----------+-------------+
|  1 | PRIMARY            | tab   | ref             | idx_abc       | idx_abc | 4       | const | 3333 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | tab   | unique_subquery | PRIMARY       | PRIMARY | 4       | func  |    1 |   100.00 | Using index |
+----+--------------------+-------+-----------------+---------------+---------+---------+-------+------+----------+-------------+

mysql> show warnings\G
*************************** 1. row ***************************
  Level: Note
   Code: 1003
Message: /* select#1 */ select /*+ NO_SEMIJOIN(@`qb1`) */ 
    `rep`.`tab`.`id` AS `id`,`rep`.`tab`.`a` AS `a`,`rep`.`tab`.`b` AS `b`,`rep`.`tab`.`c` AS `c`,`rep`.`tab`.`padding` AS `padding` 
  from `rep`.`tab` 
  where ((`rep`.`tab`.`a` = 1) 
  and <in_optimizer>(`rep`.`tab`.`id`,<exists>(<primary_index_lookup>(<cache>(`rep`.`tab`.`id`) in tab on PRIMARY))))

上面例子中 NO_SEMIJOIN 提示是为了阻止优化器使用半连接优化,因为使用半连接优化后,子查询被转换为半连接,就不会出现 index_subquery 和 unique_subquery 访问路径了。

  • index

index 访问路径和 ALL 类似,主要区别是 ALL 访问路径需要访问表的每一行数据,而 index 访问路径是访问索引中的每一行数据。访问路径为 index 时,key_len 字段显示的是所有索引字段的长度总和。一般 type 为 index 时,possible_keys 总是 NULL。

以下 2 种情况可以使用 index 访问路径:

情况 1:语句使用了覆盖索引,但是又缺少索引字段的查询条件。这种情况下 Extra 列会显示 Using index。

情况 2:按索引的顺序读取表的记录,但是无法使用覆盖索引。这种情况下 Extra 列不显示 Using index。

下面的 2 个例子分别展示了这两种情况。

mysql> explain select a,b,c from tab force index(idx_abc) order by a,b,c;

+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | tab   | index | NULL          | idx_abc | 12      | NULL | 9913 |   100.00 | Using index |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------------+

mysql> explain select * from tab force index(idx_abc) order by a,b,c;

+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------+
|  1 | SIMPLE      | tab   | index | NULL          | idx_abc | 12      | NULL | 9913 |   100.00 | NULL  |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+-------+
  • ALL

全表扫描,需要访问表中的每一行记录。出现 ALL 访问路径,可能是因为表上缺少合适的索引,或者是因为 SQL 语句的写法问题导致无法使用索引,也可能是因为查询中对表缺少合适的过滤条件,或者是索引字段的过滤性不好,需要根据查询语句的情况具体分析。

POSSIBLE_KEYS

possible_keys 列显示查询单元能使用 range、ref 等访问路径访问的索引。执行计划最终选择的索引在 keys 列显示。是否使用索引,以及使用哪个索引,取决于优化器对各种访问方式的成本评估,还跟表连接的顺序和连接算法也有关系。

下面这个例子中,t1 表的 possible_keys 有索引 idx_abc,但是当 t1 表作为驱动表时,就无法使用索引 idx_abc。

mysql> explain select * from tab t1, tab t2 where t1.a = t2.a and t1.b = t2.b;

+----+-------------+-------+------+---------------+---------+---------+-------------------+------+----------+-------+
| id | select_type | table | type | possible_keys | key     | key_len | ref               | rows | filtered | Extra |
+----+-------------+-------+------+---------------+---------+---------+-------------------+------+----------+-------+
|  1 | SIMPLE      | t1    | ALL  | idx_abc       | NULL    | NULL    | NULL              | 9913 |   100.00 | NULL  |
|  1 | SIMPLE      | t2    | ref  | idx_abc       | idx_abc | 8       | rep.t1.a,rep.t1.b |    1 |   100.00 | NULL  |
+----+-------------+-------+------+---------------+---------+---------+-------------------+------+----------+-------+

KEY

key 列显示执行计划实际使用的索引。如果 key 列为 NULL,则说明查询单元没有使用索引。对于 index_merge 访问路径,key 列中会显示多个索引。

KEY_LEN

key_len 列显示执行计划使用到的索引列的总长度。根据 key_len 可以推算出执行计划使用到了索引中的哪几个字段。key_len 根据索引字段的类型和字段是否为空计算得到。对于字符类型如 varchar、char,key_len 为字符数乘以单个字符的最大可能字节数。对于每个可变长类型如 varchar,key_len 额外加 2。对于每个可以为空的字段,key_len 额外加 1。

 create table t_k(
     a varchar(20), 
     b char(20), 
     key idx_a(a,b)
) engine=innodb charset=utf8mb4;

mysql> explain select * from t_k where  a='x';
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------+-------------+
|  1 | SIMPLE      | t_k   | ref  | idx_a         | idx_a | 83      | const |    1 |   100.00 | Using index |
+----+-------------+-------+------+---------------+-------+---------+-------+------+----------+-------------+

mysql> explain select * from t_k where  a='x' and b='x';

+----+-------------+-------+------+---------------+-------+---------+-------------+------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key   | key_len | ref         | rows | filtered | Extra                    |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+----------+--------------------------+
|  1 | SIMPLE      | t_k   | ref  | idx_a         | idx_a | 164     | const,const |    1 |   100.00 | Using where; Using index |
+----+-------------+-------+------+---------------+-------+---------+-------------+------+----------+--------------------------+

上面这个例子中,SQL 1 使用索引中的字段 a,字段 A 的类型为 varchar(20),字符集为 utf8mb4,key_len 为 20 * 4 + 2 + 1 = 83。

SQL 2 使用到了字段 A 和 B,字段 A 的 key_len 为 83,字段 B 的 key_len 为 20 * 4 + 1 = 81,整体 Key_len 为字段 A 和字段 B 相加 164。

REF

ref 列显示用来进行索引查找的值,ref 的取值可能是以下几种情况:

  • const:使用常量匹配

  • db.tab.c:使用驱动表的某个字段匹配

  • func:使用某个函数的计算结果匹配。可以在执行 explain 后使用 show warnings 命令查看转换后的 SQL。

ROWS

查询单元需要访问的记录数。对于 InnoDB 引擎,这里的记录数是一个预估的行数,跟实际执行过程中真实访问的记录数可能会有一些差异。对于全表扫描和全索引扫描,这里的行数从统计信息中获取。对于索引访问(type 为 ref 或 range),rows 通过访问索引评估得到,或通过索引的统计信息计算得到。对应派生表,rows 通过一些规则评估得到。

一般来说,rows 越大,说明查询单元需要访问的记录数越多,执行时间越长。

FILTERED

filtered 字段单位为百分比,取值范围为 0-100,表示经过 where 子句中的条件过滤后,满足条件的记录数相对于 rows 列中显示的行数所占的百分比。使用公示 rows * filtered / 100 可以得到优化器预估的查询单元返回的记录数。如果当前的查询单元作为驱动表,那么这里的记录数还决定了被驱动的查询单元需要执行多次。

优化器中有一系列固定的规则来计算 filtered 的取值。你可以在分析表(analyze table)的时候给字段添加直方图,使优化器能更精确地计算 filtered。参数 optimizer_switch 中的选项 condition_fanout_filter 用来控制是否开启条件过滤。

EXTRA

Extra 列中显示了执行计划额外的一些重要信息。

  • using where

如果访问路径为 ALL 或 index,Extra 中没用 using where,说明查询需要读取整个表或索引的数据。

  • Range checked for each record (index map: 0x n)

如果 Extra 中出现了“Range checked for each record”,那么查询的性能很可能不太好。这里 index map 是索引编号的位图信息。

mysql> explain select * from tab a, tab b where  a.id > b.id and a.c > b.c;

+----+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                                          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------------------+
|  1 | SIMPLE      | a     | ALL  | PRIMARY       | NULL | NULL    | NULL | 9913 |   100.00 | NULL                                           |
|  1 | SIMPLE      | b     | ALL  | PRIMARY       | NULL | NULL    | NULL | 9913 |    11.11 | Range checked for each record (index map: 0x1) |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+------------------------------------------------+
  • Using index; Using temporary

Extra 中出现 Using temporary,说明用到了临时表。

mysql> explain select b,a,count(*) from  tab group by b,a;

+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                        |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+------------------------------+
|  1 | SIMPLE      | tab   | index | idx_abc       | idx_abc | 12      | NULL | 9913 |   100.00 | Using index; Using temporary |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+------------------------------+
  • Using index for skip scan

查询条件没有传入索引的前缀字段,又用到了覆盖索引时,MySQL 可能会使用 skip scan。如果前缀列的唯一值很低,skip scan 也可能会有不错的性能。

mysql> explain select c from tab where b=1;

+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                  |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
|  1 | SIMPLE      | tab   | range | idx_abc       | idx_abc | 8       | NULL |  991 |   100.00 | Using where; Using index for skip scan |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
  • Using index

使用了覆盖索引,也就是查询中所有列都包含在索引中。

  • no matching row in const table

说明表里面不存在满足条件的记录。

mysql> explain select * from tab where id = 12345;

+----+-------------+-------------+---------------+------+---------+------+------+----------+--------------------------------+
| id | select_type | table  type | possible_keys | key  | key_len | ref  | rows | filtered | Extra                          |
+----+-------------+-------------+---------------+------+---------+------+------+----------+--------------------------------+
|  1 | SIMPLE      | NULL   NULL | NULL          | NULL | NULL    | NULL | NULL |     NULL | no matching row in const table |
+----+-------------+-------------+---------------+------+---------+------+------+----------+--------------------------------+
  • Using index for group-by
mysql> explain select a, min(b) from tab group by a;

+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | tab   | range | idx_abc       | idx_abc | 4       | NULL |    4 |   100.00 | Using index for group-by |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+--------------------------+
  • LooseScan

LooseScan,以及 FirstMatch、Start temporary、End temporary,都是子查询自动改写为表连接后的执行方式,我们在后续的子查询这一讲中再具体介绍。

mysql> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.01 sec)

mysql> explain select * from tab ta 
  where a=1 and c in (select a from tab tb);

+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra                  |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+------------------------+
|  1 | SIMPLE      | tb    | index | idx_abc       | idx_abc | 12      | NULL  | 9913 |     0.03 | Using index; LooseScan |
|  1 | SIMPLE      | ta    | ref   | idx_abc       | idx_abc | 4       | const | 3333 |    10.00 | Using index condition  |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+------------------------+
  • FirstMatch(ta)
mysql> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.01 sec)

mysql> explain select * from tab ta where a=1 and c in (select c from tab tb);
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------------------------------------------------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref   | rows | filtered | Extra                                                                   |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------------------------------------------------------------------------+
|  1 | SIMPLE      | ta    | ref   | idx_abc       | idx_abc | 4       | const | 3333 |   100.00 | NULL                                                                    |
|  1 | SIMPLE      | tb    | index | NULL          | idx_abc | 12      | NULL  | 9913 |    10.00 | Using where; Using index; FirstMatch(ta); Using join buffer (hash join) |
+----+-------------+-------+-------+---------------+---------+---------+-------+------+----------+-------------------------------------------------------------------------+
  • Start temporary, End temporary
mysql> set optimizer_switch='materialization=off';
Query OK, 0 rows affected (0.01 sec)

mysql> explain select * from tab ta where id in (select b from tab tb where b < 10);

+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------------------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref      | rows | filtered | Extra                                     |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------------------------------------------+
|  1 | SIMPLE      | tb    | index  | NULL          | idx_abc | 12      | NULL     | 9913 |    33.33 | Using where; Using index; Start temporary |
|  1 | SIMPLE      | ta    | eq_ref | PRIMARY       | PRIMARY | 4       | rep.tb.b |    1 |   100.00 | End temporary                             |
+----+-------------+-------+--------+---------------+---------+---------+----------+------+----------+-------------------------------------------+
  • Using index condition

使用到了索引下推条件。

mysql> explain select * from tab where a=1 and c=1;

+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-----------------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | tab   | ref  | idx_abc       | idx_abc | 4       | const | 3333 |    10.00 | Using index condition |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+-----------------------+
  • Using filesort

说明查询需要排序。

mysql> explain select * from tab order by b;

+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra          |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------+
|  1 | SIMPLE      | tab   | ALL  | NULL          | NULL | NULL    | NULL | 9913 |   100.00 | Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+----------+----------------+
  • Using join buffer (hash join)

被驱动表缺少合适的索引时,MySQL 会考虑使用 Hash 连接算法。

mysql> explain select * from tab t1, tab t2 where t1.a = 1 and t1.c=t2.c;

+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+--------------------------------------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra                                      |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+--------------------------------------------+
|  1 | SIMPLE      | t1    | ref  | idx_abc       | idx_abc | 4       | const | 3333 |   100.00 | NULL                                       |
|  1 | SIMPLE      | t2    | ALL  | NULL          | NULL    | NULL    | NULL  | 9913 |    10.00 | Using where; Using join buffer (hash join) |
+----+-------------+-------+------+---------------+---------+---------+-------+------+----------+--------------------------------------------+
  • Using join buffer (Batched Key Access)

表关联时,使用了 BKA 优化。和 MRR 类似,BKA 也是为了减少查询的随机 IO 的数量。

mysql> explain select /*+ BKA(tb) */ * 
  from tab ta, tab tb where ta.a = tb.a;

+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+----------------------------------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref      | rows | filtered | Extra                                  |
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+----------------------------------------+
|  1 | SIMPLE      | ta    | ALL  | idx_abc       | NULL    | NULL    | NULL     | 9913 |   100.00 | NULL                                   |
|  1 | SIMPLE      | tb    | ref  | idx_abc       | idx_abc | 4       | rep.ta.a | 3304 |   100.00 | Using join buffer (Batched Key Access) |
+----+-------------+-------+------+---------------+---------+---------+----------+------+----------+----------------------------------------+
  • Using MRR

查询使用了 MRR(Multi-Range Read),MRR 主要是为了减少回表查询数据时随机 IO 的数量。下面这个例子中使用了 BKA 提示,强制优化器使用 MRR。

mysql> explain select /*+ BKA(tab) */ * 
  from tab 
  where a=1 and b in (1,2,3);

+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
| id | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                            |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
|  1 | SIMPLE      | tab   | range | idx_abc       | idx_abc | 8       | NULL |    3 |   100.00 | Using index condition; Using MRR |
+----+-------------+-------+-------+---------------+---------+---------+------+------+----------+----------------------------------+
  • Using sort_union(…), Using union(…), Using intersect(…)
mysql> explain select * 
  from t_merge 
  where b=2 and c=2 and d=1;
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+---------------------------------------------+
| id | select_type | table   | type        | possible_keys | key           | key_len | ref  | rows | filtered | Extra                                       |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+---------------------------------------------+
|  1 | SIMPLE      | t_merge | index_merge | idx_bd,idx_cd | idx_cd,idx_bd | 8,8     | NULL |    2 |    66.99 | Using intersect(idx_cd,idx_bd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+---------------------------------------------+

mysql> explain select * 
  from t_merge 
  where (b=2 and d=1) or (c=2 and d=1);
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+-----------------------------------------+
| id | select_type | table   | type        | possible_keys | key           | key_len | ref  | rows | filtered | Extra                                   |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+-----------------------------------------+
|  1 | SIMPLE      | t_merge | index_merge | idx_bd,idx_cd | idx_bd,idx_cd | 8,8     | NULL |  108 |   100.00 | Using union(idx_bd,idx_cd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+-----------------------------------------+

mysql> explain select * 
  from t_merge 
  where (b=2 and d between 1 and 2) or (d=1 and c=2);
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
| id | select_type | table   | type        | possible_keys | key           | key_len | ref  | rows | filtered | Extra                                        |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+
|  1 | SIMPLE      | t_merge | index_merge | idx_bd,idx_cd | idx_bd,idx_cd | 8,8     | NULL |  165 |   100.00 | Using sort_union(idx_bd,idx_cd); Using where |
+----+-------------+---------+-------------+---------------+---------------+---------+------+------+----------+----------------------------------------------+

总结

出于完整性的考虑,这一讲中我们使用了四十多个 SQL 语句,演示并解释了 MySQL 执行计划的各种输出。在实际工作中,一般也不会遇到这里的每一种情况。

执行计划有几个信息要重点关注:

  • 通过 ID、SELECT_TYPE、TABLE 这几列可以了解语句整体的连接、嵌套结构。

  • TYPE 列为 ref、range 时,才是我们平时说的用到了索引。type 为 index 时,实际上是使用了全索引扫描。

  • ROWS 列是优化器评估的需要从到存储引擎里访问的记录的数量,这个数量对性能有直接的影响。

  • EXTRA 列里面提供了执行计划的额外信息,对这里出现的内容要有大致的了解。

你也可以使用 FORMAT=TREE,以树的形式显示执行计划,有时候这样可能会更直观。

思考题

这一讲中有这么一个 SQL:

select id, a, (select avg(b) from tab where a=t1.a) from tab t1;

在我们的测试表中,字段 A 的唯一值有三个,所以理论上,最优的情况下只需要将 3 个不同的 A 的值分别传入子查询中(select avg(b) from tab where a = t1.a),并将计算结果缓存起来,这样子查询只需要执行 3 次。

MariaDB 实际上就有这样的处理,因此在执行上面这个 SQL 时,速度比较快。

mariadb> select id, a, (select avg(b) from tab where a=t1.a) from tab t1 order by id;
+------+---+---------------------------------------+
| id   | a | (select avg(b) from tab where a=t1.a) |
+------+---+---------------------------------------+
|    0 | 0 |                             4999.5000 |
|    1 | 1 |                             4999.0000 |
|    2 | 2 |                             5000.0000 |
......
| 9999 | 0 |                             4999.5000 |
+------+---+---------------------------------------+
10000 rows in set (0.01 sec)

但是 MySQL 中,同样的表结构和数据,配置一样的服务器,执行这个 SQL 时,执行时间是好几个数量级。

mysql> select id, a, (select avg(b) from tab where a=t1.a) from tab t1 order by id;
+------+------+---------------------------------------+
| id   | a    | (select avg(b) from tab where a=t1.a) |
+------+------+---------------------------------------+
|    0 |    0 |                             4999.5000 |
|    1 |    1 |                             4999.0000 |
|    2 |    2 |                             5000.0000 |
......
| 9999 |    0 |                             4999.5000 |
+------+------+---------------------------------------+
10000 rows in set (4 min 17.90 sec)

对于这种情况,你会怎么解决呢?

要点总结

  1. 执行计划描述了SQL语句的访问路径,包括表上是否有可用的索引、表连接的顺序、使用的表连接算法等。
  2. 使用EXPLAIN命令可以查看SQL语句的执行计划,可以指定输出格式为traditional、json或tree。
  3. 了解如何解读EXPLAIN输出信息,可以帮助优化SQL语句的执行性能,选择合适的访问路径和连接算法。
  4. TYPE列显示查询单元的访问路径,可以根据TYPE列来判断查询单元是否使用了索引。
  5. const表示查询最多返回1行记录,对主键或唯一索引的所有字段都使用常量等值匹配时,type为const。
  6. eq_ref使用主键或唯一索引等值匹配时,对于组合主键、组合唯一索引,索引中的每一个字段都需要以等值匹配时,type才为eq_ref。
  7. ref表示普通索引字段的等值匹配,或主键和唯一索引前缀字段上的等值匹配。
  8. range使用索引字段上的范围条件查询数据,包括 <, <=, >, >=, BETWEEN 等条件。
  9. index_merge会使用多个索引来查询数据,并将通过多个索引获取到的数据取并集或交集,得到最终的结果.
  10. index_subquery是执行特定类型的子查询的一种方式,如果子查询中的表有合适的索引可以用来检索,则可以使用index_subquery执行路径.
  11. 索引路径和实际使用的索引对于优化器选择索引和执行计划的影响
  12. 索引列的总长度和使用情况,以及如何推算执行计划使用到的索引字段
  13. 用于进行索引查找的值,包括常量匹配、字段匹配或函数计算结果匹配
  14. 查询单元需要访问的记录数的预估值,对执行时间的影响
  15. 经过where子句条件过滤后的记录数百分比,优化器计算和优化规则的影响
  16. 索引的实际使用情况,包括索引路径、索引列长度、查找值、访问记录数和过滤后的记录数百分比
  17. 优化器的成本评估和规则,对索引路径选择和使用的影响
  18. 索引路径选择和成本评估,受优化器成本评估、表连接顺序和连接算法的影响
  19. 条件过滤的优化器规则和参数,包括通过直方图精确计算filtered值和控制条件过滤的参数
  20. 执行计划中额外的重要信息,如使用临时表、skip scan、覆盖索引、排序需求、连接算法等的影响

Leave a Comment

您的电子邮箱地址不会被公开。 必填项已用*标注

close
arrow_upward