Index Condition Pushdown Optimization

时间:2022-07-01 15:05:42

Index Condition Pushdown (ICP) is an optimization for the case where MySQL retrieves rows from a table using an index(ICP是MySQL用索引从表中获取数据的一种优化). Without ICP, the storage engine traverses the index to locate rows in the base table and returns them to the MySQL server which evaluates the WHERE condition for the rows(如果没有ICP,存储引擎层将会穿过索引从基表里面定位行,并把结果返回给server层,MySQL server层将会根据where条件判断哪些结果合格并返回给客户端). With ICP enabled, and if parts of the WHERE condition can be evaluated by using only fields from the index, the MySQL server pushes this part of the WHERE condition down to the storage engine(如果用了ICP,如果能根据索引里的字段获取符合where条件,server层就会把where条件下推到存储引擎层). The storage engine then evaluates the pushed index condition by using the index entry and only if this is satisfied is the row read from the table(然后存储引擎层). ICP can reduce the number of times the storage engine must access the base table and the number of times the MySQL server must access the storage engine(ICP可以减少存储引擎层必须访问基表的次数和server层必须访问存储引擎层的次数).

Index Condition Pushdown optimization is used for the rangerefeq_ref, and ref_or_null access methods when there is a need to access full table rows(ICP是为需要使用range,ref,eq_ref和ref_or_null方法访问全表数据优化的).另外MariaDB也针对索引下推做了一个优化,Batched Key Access.

This strategy can be used for InnoDB and MyISAM tables. (Note that index condition pushdown is not supported with partitioned tables in MySQL 5.6; this issue is resolved in MySQL 5.7.) For InnoDB tables, however, ICP is used only for secondary indexes(然而对于innodb表,ICP只适用于二级索引). The goal of ICP is to reduce the number of full-record reads and thereby reduce IO operations(ICP的目标是减少全表扫描的次数和减少IO操作). For InnoDB clustered indexes, the complete record is already read into the InnoDB buffer. Using ICP in this case does not reduce IO(对于InnoDB的聚簇索引,所有的记录都已经在buffer pool里面了,所以使用ICP在这种情况下并没有减少IO).

The idea is to check part of the WHERE condition that refers to index fields (we call it Pushed Index Condition) as soon as we've accessed the index. If the Pushed Index Condition is not satisfied, we won't need to read the whole table record我认为是作者的笔误(概括来说,索引下推总的思想是当我们访问索引的时候,检查索引里的字段是否有满足where条件的,我们叫下推索引条件。如果下推索引条件满足,我们没必要访问全表记录).

To see how this optimization works, consider first how an index scan proceeds when Index Condition Pushdown is not used:

  1. Get the next row, first by reading the index tuple, and then by using the index tuple to locate and read the full table row.

  2. Test the part of the WHERE condition that applies to this table. Accept or reject the row based on the test result.

用一张图表示更容易理解一点:

Index Condition Pushdown Optimization

When Index Condition Pushdown is used, the scan proceeds like this instead:

  1. Get the next row's index tuple (but not the full table row).

  2. Test the part of the WHERE condition that applies to this table and can be checked using only index columns. If the condition is not satisfied, proceed to the index tuple for the next row.

  3. If the condition is satisfied, use the index tuple to locate and read the full table row.

  4. Test the remaining part of the WHERE condition that applies to this table. Accept or reject the row based on the test result.

用一张图说明:

Index Condition Pushdown Optimization

Index Condition Pushdown is enabled by default; it can be controlled with the optimizer_switch system variable by setting the index_condition_pushdown flag. See Section 8.9.2, “Controlling Switchable Optimizations”.

索引下推默认打开的.如果使用了索引下推你就会在执行计划中看到"Using index condition"字样,如:

Mysql [test]> explain select * from tbl where key_col1 between 10 and 11 and key_col2 like '%foo%';
+----+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+
| 1 | SIMPLE | tbl | range | key_col1 | key_col1 | 5 | NULL | 2 | Using index condition |
+----+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------+
1 row in set (0.01 sec)

那么它能带来多少性能提升呢?

答案是取决于它能过滤掉的记录数和要获取结果集的成本。前者取决于语句和结果集,后者取决于表的记录数,在什么样的硬盘上和表中是否有大字段(如:blob)。

来看个例子:

alter table lineitem add index s_r (l_shipdate, l_receiptdate);
select count(*) from lineitemwhere
l_shipdate between '1993-01-01' and '1993-02-01' and
datediff(l_receiptdate,l_shipdate) > 25 and
l_quantity > 40;
没有下推索引条件
-+----------+-------+----------------------+-----+---------+------+--------+-------------+
 table    | type | possible_keys         | key | key_len | ref | rows    | Extra       |
-+----------+-------+----------------------+-----+---------+------+--------+-------------+
 | lineitem | range | s_r                  | s_r | 4       | NULL | 152064 | Using where |
-+----------+-------+----------------------+-----+---------+------+--------+-------------+
使用了下推索引
-+-----------+-------+---------------+-----+---------+------+--------+------------------------------------+
 table     | type | possible_keys | key | key_len | ref | rows     | Extra                              |
-+-----------+-------+---------------+-----+---------+------+--------+------------------------------------+
 | lineitem | range | s_r            | s_r | 4       | NULL | 152064 | Using index condition; Using where |
-+-----------+-------+---------------+-----+---------+------+--------+------------------------------------+

速度上的提升呢?

  • 冷buffer pool:从 5 min 下降到 1 min

  • 热buffer pool:从0.19 sec 下降到 0.07 sec

在MariaDB中有两个变量来查看使用下推索引的状态:

Status variables

There are two server status variables:

Variable name Meaning
Handler_icp_attempts Number of times pushed index condition was checked.
Handler_icp_match Number of times the condition was matched.

That way, the value Handler_icp_attempts - Handler_icp_match shows the number records that the server did not have to read because of Index Condition Pushdown.

如果用的不是MariaDB,我们怎么知道索引下推带来了多少优化呢?我们可以使用这个参数Handler_read_next来看,

参考手册:

http://dev.mysql.com/doc/refman/5.6/en/index-condition-pushdown-optimization.html

https://mariadb.com/kb/en/mariadb/index-condition-pushdown/