If a query uses ORDER BY, how should the indexed columns be ordered in MySQL?

Time: 2021-12-30 23:03:53

I am reading an article about how Pinterest shards their MySQL database: https://medium.com/@Pinterest_Engineering/sharding-pinterest-how-we-scaled-our-mysql-fleet-3f341e96ca6f

And here they have an example of a table:

CREATE TABLE board_has_pins (
  board_id INT,
  pin_id INT,
  sequence INT,
  INDEX(board_id, pin_id, sequence)
) ENGINE=InnoDB;

And they are showing how they query from that table:

SELECT pin_id FROM board_has_pins 
WHERE board_id=241294561224164665 ORDER BY sequence 
LIMIT 50 OFFSET 150

What I don't understand here is the ordering of the index. Would it not make more sense if the index was like this since they are ordering by sequence and filtering by board_id?

INDEX(board_id, sequence, pin_id)

Am I missing something here or have I misunderstood how indexing works?

1 Solution

#1


2  

You are correct. The better index for this query is:

INDEX(board_id, sequence, pin_id)

The columns should be in this order:

  • Column(s) involved in equality comparisons. If there are multiple columns, their order does not matter.
  • Column(s) in the ORDER BY clause, in the same order they appear in the ORDER BY.
  • Other columns used to fetch values, like pin_id.
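
The ordering rule in the list above can be sketched as a small helper. This is only an illustration: the function name suggest_index_columns is hypothetical and not part of MySQL or any library, and it simply concatenates the three groups of columns in the order described.

```python
def suggest_index_columns(equality_cols, order_by_cols, other_cols):
    """Return index columns ordered: equality, then ORDER BY, then the rest."""
    ordered = []
    for col in [*equality_cols, *order_by_cols, *other_cols]:
        if col not in ordered:  # skip a column already placed earlier
            ordered.append(col)
    return ordered


# Columns taken from the query in the question.
columns = suggest_index_columns(["board_id"], ["sequence"], ["pin_id"])
index_clause = "INDEX(" + ", ".join(columns) + ")"
print(index_clause)  # INDEX(board_id, sequence, pin_id)
```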

Once the equality conditions find the subset of matching rows, those rows are all tied with respect to their order, because they all have the same value for the column in the equality condition (board_id in this case).

The tie is resolved by the order of the next column in the index. If (and only if) the next column is the one used in the ORDER BY clause, then the rows can be read in index order, with no further work needed to sort them.
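
To make the tie-breaking concrete, here is a rough model in Python rather than MySQL itself. It treats an index as a list of tuples sorted by the index columns, which is an assumption that ignores how InnoDB actually stores entries; the row values are made up for illustration.

```python
# Hypothetical rows of (board_id, pin_id, sequence); the values are made up.
rows = [
    (1, 30, 2),
    (1, 10, 3),
    (1, 20, 1),
    (2, 40, 1),
]

NAMES = ("board_id", "pin_id", "sequence")


def scan(rows, index_cols, board_id):
    """Model an index range scan: sort entries by index_cols, keep one board."""
    positions = [NAMES.index(c) for c in index_cols]
    entries = sorted(rows, key=lambda r: tuple(r[p] for p in positions))
    return [dict(zip(NAMES, r)) for r in entries if r[0] == board_id]


original = scan(rows, ("board_id", "pin_id", "sequence"), board_id=1)
proposed = scan(rows, ("board_id", "sequence", "pin_id"), board_id=1)

original_sequences = [r["sequence"] for r in original]
proposed_sequences = [r["sequence"] for r in proposed]

print(original_sequences)  # [3, 1, 2] -> still needs a sort by sequence
print(proposed_sequences)  # [1, 2, 3] -> already in ORDER BY sequence order
```

With the original column order, the rows for one board come back ordered by pin_id, so an extra sort on sequence is still needed. With sequence directly after board_id, the scan order already matches the ORDER BY.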

I don't know the explanation for the index in the Pinterest blog post you linked to. I guess it's a mistake, because that index is not optimal for the query they showed.
