Optimizing a query with UNION ALL and ORDER BY

Time: 2022-09-12 15:44:07

I have 3 tables (e.g. a, b, c) that record activities on different items (e.g. commenting, liking, etc.) along with the time of each activity. I am trying to build a kind of news feed that shows the most recent activities first. I combined all three tables with a UNION ALL, added a GROUP BY so that activities for the same item are not shown twice, and ordered the result by time DESC. The feed uses infinite scroll, so the query must also be able to page through the results appropriately.

I am wondering if there is any way to optimize this (Each table is about 500-900K and growing). Truncated code is shown below.

SELECT time,item_id FROM (
   SELECT a.time AS time, a.item_id FROM a 
      UNION ALL 
   SELECT b.time AS time, b.item_id FROM b 
      UNION ALL 
   SELECT c.time AS time, c.item_id FROM c
) temp 
GROUP BY item_id 
ORDER BY time DESC 
LIMIT 10

1 Solution

#1


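The query you've written will create a very large temporary table holding every row from all three tables. You're then sorting by a column in that temporary table, which cannot use the indexes on the underlying tables. You should try to limit the rows taken from each table, perhaps like this: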

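SELECT MAX(time) AS time, item_id FROM (
   -- ORDER BY must come before LIMIT, and each limited SELECT
   -- has to be wrapped in parentheses inside a UNION
   (SELECT a.time AS time, a.item_id FROM a ORDER BY a.time DESC LIMIT 10)
      UNION ALL
   (SELECT b.time AS time, b.item_id FROM b ORDER BY b.time DESC LIMIT 10)
      UNION ALL
   (SELECT c.time AS time, c.item_id FROM c ORDER BY c.time DESC LIMIT 10)
) temp
-- MAX(time) keeps the most recent activity for each item
GROUP BY item_id
ORDER BY time DESC
LIMIT 10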

You'll want to make sure time has an index on each table.
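
For instance, the indexes could be created roughly as below. This is only a sketch: the index names are made up, and including item_id after time is my own assumption, added so that each per-table subquery can be answered from the index alone.

```sql
-- Hypothetical index names; table and column names are taken from the question.
-- time comes first so that ORDER BY time DESC LIMIT 10 can read the index in order.
CREATE INDEX idx_a_time_item ON a (time, item_id);
CREATE INDEX idx_b_time_item ON b (time, item_id);
CREATE INDEX idx_c_time_item ON c (time, item_id);
```

A plain single-column index on time would also satisfy the point above; the extra column only saves the lookup of item_id in the table rows.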

I don't really like doing this though, as it may be difficult to "scroll" through the results accurately.

When going to the "next page", you may want to consider adding a WHERE clause to each subquery (for example WHERE a.item_id > num, and likewise for b and c) instead of LIMIT offset, length. That will help with the accuracy.
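
As a rough sketch of that idea: since the feed is sorted by time, the version below filters on time rather than item_id, which is a variation on the suggestion above and not exactly what it describes. It assumes time is a DATETIME column, and @last_time is a hypothetical variable holding the oldest time shown on the previous page (the literal value is just a placeholder).

```sql
-- Hypothetical cursor: the oldest time displayed on the previous page.
SET @last_time = '2022-01-01 00:00:00';

SELECT MAX(time) AS time, item_id FROM (
   -- Each subquery only looks at rows older than the cursor.
   (SELECT a.time AS time, a.item_id FROM a
     WHERE a.time < @last_time ORDER BY a.time DESC LIMIT 10)
      UNION ALL
   (SELECT b.time AS time, b.item_id FROM b
     WHERE b.time < @last_time ORDER BY b.time DESC LIMIT 10)
      UNION ALL
   (SELECT c.time AS time, c.item_id FROM c
     WHERE c.time < @last_time ORDER BY c.time DESC LIMIT 10)
) temp
GROUP BY item_id
ORDER BY time DESC
LIMIT 10;
```

Two caveats with this sketch: rows whose time is exactly equal to @last_time are skipped, and an item that was already shown can reappear if it also has older activity, so the client may still need to de-duplicate.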

When writing the query, you should prefix it with EXPLAIN to see how it is being handled. This will give you a better idea of what's happening: Are temporary tables being created? How large are they? What indexes are being used? etc.
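
For example, against the query from the question (MySQL syntax; MAX(time) is used here only so the statement is also accepted when ONLY_FULL_GROUP_BY is enabled):

```sql
-- EXPLAIN does not run the query; it reports the plan MySQL would use.
EXPLAIN
SELECT MAX(time) AS time, item_id FROM (
   SELECT a.time AS time, a.item_id FROM a
      UNION ALL
   SELECT b.time AS time, b.item_id FROM b
      UNION ALL
   SELECT c.time AS time, c.item_id FROM c
) temp
GROUP BY item_id
ORDER BY time DESC
LIMIT 10;
```

In the output, the Extra column shows "Using temporary" or "Using filesort" when those occur, the key column names the index chosen for each table, and rows is an estimate of how many rows will be examined.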

Another approach could be to use a MySQL trigger to populate a single "feed" table.
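
A minimal sketch of what that might look like is below. The feed table layout, the column types (INT and DATETIME), and the trigger and index names are all assumptions of mine, since the question does not show the table definitions. It only handles inserts; updates and deletes on a, b, and c would need their own triggers.

```sql
-- Hypothetical feed table: one row per item, holding its latest activity time.
CREATE TABLE feed (
   item_id INT NOT NULL PRIMARY KEY,
   time DATETIME NOT NULL,
   KEY idx_feed_time (time)
);

-- On every new activity, insert the item or move its time forward.
CREATE TRIGGER a_after_insert AFTER INSERT ON a
FOR EACH ROW
   INSERT INTO feed (item_id, time) VALUES (NEW.item_id, NEW.time)
   ON DUPLICATE KEY UPDATE time = GREATEST(time, NEW.time);

CREATE TRIGGER b_after_insert AFTER INSERT ON b
FOR EACH ROW
   INSERT INTO feed (item_id, time) VALUES (NEW.item_id, NEW.time)
   ON DUPLICATE KEY UPDATE time = GREATEST(time, NEW.time);

CREATE TRIGGER c_after_insert AFTER INSERT ON c
FOR EACH ROW
   INSERT INTO feed (item_id, time) VALUES (NEW.item_id, NEW.time)
   ON DUPLICATE KEY UPDATE time = GREATEST(time, NEW.time);

-- The news feed then becomes a single indexed read with no UNION or GROUP BY.
SELECT time, item_id FROM feed ORDER BY time DESC LIMIT 10;
```

The trade-off is a little extra work on every insert in exchange for a much cheaper read, and existing rows in a, b, and c would have to be copied into feed once before the triggers take over.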
