How can I improve the performance of a MySQL query involving NULL?

Time: 2022-06-19 06:24:13

I have several million records in the following table:

CREATE TABLE `customers` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `store_id` int(10) unsigned DEFAULT NULL,
  `first_name` varchar(64) DEFAULT NULL,
  `middle_name` varchar(64) DEFAULT NULL,
  `last_name` varchar(64) DEFAULT NULL,
  `email` varchar(128) DEFAULT NULL,
  `phone` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `index_store_email` (`store_id`,`email`),
  KEY `index_store_phone` (`store_id`,`phone`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

Query #1 takes ~800ms:
SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1;

Query #2 takes ~1.5ms:
SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1 AND `email` IS NULL;

Query #3 takes a whopping 5 seconds:
SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1 AND `email` IS NOT NULL;

Notes:

  • I've simplified the table to ask the question, but the query is identical.
  • Yes, my table is optimized.
  • Yes, both fields are indexed, see the create syntax above.
  • There are only a few store_ids, but every record has one.
  • There are very few customers with email set to null.

I find a few things strange here:

  1. Query #1 is the simplest! There are only a few possible INT values. Shouldn't it be the fastest?
  2. Why is Query #3 so slow? I could cut the time in half by running the other two queries and subtracting #2's count from #1's, but I shouldn't have to.
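
For reference, the workaround mentioned in item 2 (deriving the non-NULL count from the total count and the NULL count) can be sketched as a single statement. This is only a sketch against the simplified `customers` table above; the alias `non_null_emails` is made up for illustration, and whether either form beats Query #3 on real data would need to be measured.

```sql
-- Non-NULL emails = all rows for the store minus rows whose email IS NULL.
SELECT
  (SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1)
  - (SELECT COUNT(*) FROM `customers` WHERE `store_id` = 1 AND `email` IS NULL)
  AS non_null_emails;

-- COUNT(col) counts only rows where col is not NULL,
-- so this returns the same number as Query #3.
SELECT COUNT(`email`) AS non_null_emails
FROM `customers`
WHERE `store_id` = 1;
```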

Any thoughts on this seemingly basic question? Feel like I'm missing something simple. Did I sleep through a class in db school?

2 Answers

#1


At times the MySQL query optimizer guesses wrong when it decides which indexes to use. In cases like these, index hints can be useful (http://dev.mysql.com/doc/refman/5.7/en/index-hints.html).

To tell MySQL to consider only the listed indexes:

SELECT * FROM table1 USE INDEX (col1_index,col2_index)
  WHERE col1=1 AND col2=2 AND col3=3;

To force the use of an index, so that a table scan is treated as very expensive:

SELECT * FROM table1 FORCE INDEX (col1_index,col2_index)
  WHERE col1=1 AND col2=2 AND col3=3;

To ignore a certain index:

SELECT * FROM table1 IGNORE INDEX (col3_index)
  WHERE col1=1 AND col2=2 AND col3=3;

To check which index is being used, run the query with the EXPLAIN statement (https://dev.mysql.com/doc/refman/5.7/en/explain-output.html):

EXPLAIN SELECT * FROM table1
  WHERE col1=1 AND col2=2 AND col3=3;
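
Applied to the table in the question, the same idea might look like the sketch below. The index name `index_store_email` comes from the CREATE TABLE statement in the question; whether the hint actually speeds up Query #3 is not something this sketch can promise, so compare the EXPLAIN output and timings with and without it.

```sql
-- Check the `key` column of the output to see which index the optimizer chose.
EXPLAIN SELECT COUNT(*) FROM `customers`
  WHERE `store_id` = 1 AND `email` IS NOT NULL;

-- Same query, hinted towards the (store_id, email) index.
SELECT COUNT(*) FROM `customers` FORCE INDEX (`index_store_email`)
  WHERE `store_id` = 1 AND `email` IS NOT NULL;
```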

#2


Drop the index on just (store_id); it is redundant with the two other indexes, which both begin with store_id.

This will probably also obviate the need for FORCE INDEX, etc.

INDEX(store_id, email) works for all three queries.
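
A minimal sketch of that cleanup follows. The simplified table in the question does not show a single-column index on store_id, so the name `index_store` below is hypothetical; list the real indexes first and substitute the actual name.

```sql
-- List every index on the table, including any single-column index on store_id.
SHOW INDEX FROM `customers`;

-- Drop the redundant single-column index (`index_store` is a hypothetical name).
ALTER TABLE `customers` DROP INDEX `index_store`;
```

Afterwards, running the three queries through EXPLAIN again shows whether the optimizer now picks `index_store_email` without a hint.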
