Need help with a SQL query involving multiple tables - choosing not to join

Time: 2022-04-26 22:25:17
SELECT i.*, i.id IN (
  SELECT id
  FROM w 
  WHERE w.status='active') AS wish 
FROM i
INNER JOIN r ON i.id=r.id
WHERE r.member_id=1 && r.status='active' 
ORDER BY wish DESC 
LIMIT 0,50

That's a query that I'm trying to run. It doesn't scale well, and I'm wondering if someone here can tell me where I could improve things. I don't join w to r and i because I need to show rows from i that are unrepresented in w. I tried a left join, but it didn't perform too well. This is better, but not ideal yet. All three tables are very large. All three are indexed on the fields I'm joining and selecting on.

Any comments, pointers, or constructive criticisms would be greatly appreciated.

EDIT Addition:

I should have put this in my original question. It's the EXPLAIN output as returned by SQLyog.

id|select_type       |table|type          |possible_keys|key      |key_len|ref  |rows|Extra|  
1 |PRIMARY           |r    |ref           |member_id,id |member_id|3      |const|3120|Using where; Using temporary; Using filesort  
1 |PRIMARY           |i    |eq_ref        |id           |id       |8      |r.id |1   |  
2 |DEPENDENT SUBQUERY|w    |index_subquery|id,status    |id       |8      |func |8   |Using where


EDIT le dorfier - more comments ...

I should mention that the key for w is (member_id, id). So each id can exist multiple times in w, and I only want to know if it exists.

5 Solutions

#1


WHERE x IN (...) is equivalent to an INNER JOIN to a SELECT DISTINCT subquery. In general, the explicit join to a subquery will perform better if the optimizer doesn't turn the IN into a JOIN by itself - which it should:

SELECT i.*
FROM i
INNER JOIN (
    SELECT DISTINCT id
    FROM w 
    WHERE w.status = 'active'
) AS wish 
    ON i.id = wish.id
INNER JOIN r
    ON i.id = r.id
WHERE r.member_id = 1 && r.status = 'active' 
ORDER BY wish.id DESC 
LIMIT 0,50

Which would probably be equivalent to this if you don't need the DISTINCT:

SELECT i.*
FROM i
INNER JOIN w 
    ON w.status = 'active'
    AND i.id = w.id
INNER JOIN r
    ON i.id = r.id
    AND r.member_id = 1 && r.status = 'active' 
ORDER BY i.id DESC 
LIMIT 0,50

Please post your schema.

If you are using wish as an existence flag, try:

SELECT i.*, CASE WHEN w.id IS NOT NULL THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r
    ON i.id = r.id
    AND r.member_id = 1 && r.status = 'active' 
LEFT JOIN w 
    ON w.status = 'active'
    AND i.id = w.id
ORDER BY wish DESC 
LIMIT 0,50

You can use the same technique with a LEFT JOIN to a SELECT DISTINCT subquery. I assume you aren't specifying w.member_id because you want to know whether any member has this? In that case, definitely use the SELECT DISTINCT. You should also have an index on w with id as its first column in order for that to perform well:

SELECT i.*, CASE WHEN w.id IS NOT NULL THEN 1 ELSE 0 END AS wish
FROM i
INNER JOIN r
    ON i.id = r.id
    AND r.member_id = 1 && r.status = 'active' 
LEFT JOIN (
    SELECT DISTINCT w.id
    FROM w 
    WHERE w.status = 'active'
) AS w
    ON i.id = w.id
ORDER BY wish DESC 
LIMIT 0,50
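
The difference between the last two queries can be checked on toy data. The following is only a sketch: it uses Python's sqlite3 module as a stand-in for MySQL, the table contents and the extra columns (name, w.status placement) are made up for illustration, and AND replaces && because SQLite does not accept &&. Since the key on w is (member_id, id), one id can have several active rows in w, and the plain LEFT JOIN then repeats the row from i, while the SELECT DISTINCT version does not:

```python
import sqlite3

# Hypothetical miniature versions of i, r and w; only the columns that the
# queries in this thread touch are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE i (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE r (member_id INTEGER, id INTEGER, status TEXT);
    CREATE TABLE w (member_id INTEGER, id INTEGER, status TEXT,
                    PRIMARY KEY (member_id, id));
    INSERT INTO i VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO r VALUES (1, 1, 'active'), (1, 2, 'active'), (1, 3, 'inactive');
    -- id 1 appears twice in w (two different members), id 2 not at all
    INSERT INTO w VALUES (7, 1, 'active'), (8, 1, 'active');
""")

# Plain LEFT JOIN: one output row per matching row in w.
plain_join = conn.execute("""
    SELECT i.id, CASE WHEN w.id IS NOT NULL THEN 1 ELSE 0 END AS wish
    FROM i
    INNER JOIN r ON i.id = r.id AND r.member_id = 1 AND r.status = 'active'
    LEFT JOIN w ON w.status = 'active' AND i.id = w.id
    ORDER BY wish DESC, i.id
""").fetchall()

# LEFT JOIN to a SELECT DISTINCT subquery: at most one match per id.
distinct_join = conn.execute("""
    SELECT i.id, CASE WHEN w.id IS NOT NULL THEN 1 ELSE 0 END AS wish
    FROM i
    INNER JOIN r ON i.id = r.id AND r.member_id = 1 AND r.status = 'active'
    LEFT JOIN (SELECT DISTINCT w.id FROM w WHERE w.status = 'active') AS w
        ON i.id = w.id
    ORDER BY wish DESC, i.id
""").fetchall()

print(plain_join)     # [(1, 1), (1, 1), (2, 0)]  -> id 1 is duplicated
print(distinct_join)  # [(1, 1), (2, 0)]
```

This says nothing about how MySQL will execute either form on large tables; it only shows that the two queries are not interchangeable when w holds several rows per id.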

#2


I should have put this in my original question. It's the EXPLAIN output as returned by SQLyog.
id|select_type|table|type|possible_keys|key|key_len|ref|rows|Extra|
1|PRIMARY|r|ref|member_id,id|member_id|3|const|3120|Using where; Using temporary; Using filesort
1|PRIMARY|i|eq_ref|id|id|8|r.id|1|
2|DEPENDENT SUBQUERY|w|index_subquery|id,status|id|8|func|8|Using where

#3


Please post the EXPLAIN listing. And explain what the tables and columns mean.

wish appears to be a boolean - and you're ORDERing by it?

EDIT: Well, it looks like it's doing what it's being instructed to do. Cade seems to be thinking expansively on what this all could possibly mean (he probably deserves a vote just for effort.) But I'd really rather you tell us.

Wild guessing just confuses everyone (including you, I'm sure.)

OK, based on new info, here's my (slightly less wild) guess.

SELECT i.*,  
    CASE WHEN EXISTS (SELECT 1 FROM w WHERE w.id = i.id AND w.status = 'active') THEN 1 ELSE 0 END AS wish  
FROM i  
INNER JOIN r ON i.id = r.id AND r.status = 'active'  
WHERE r.member_id = 1

Do you want a row for each match in w? Or do you just want to know, for each i.id, whether there is an active w record? I assumed the latter, so you don't need the ORDER BY - it's for only one ID anyway. And since you're only returning columns from i, if there are multiple rows in r, you'll just get duplicate rows.
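
To illustrate the "just an existence flag" reading, here is a small sketch that runs the EXISTS form next to the IN form from the question. It assumes Python's sqlite3 module as a stand-in for MySQL, and the rows and the name column are invented for the example; an ORDER BY i.id is added only to make the output deterministic:

```python
import sqlite3

# Hypothetical sample data; id 1 has two active rows in w, id 2 only an
# inactive one, id 3 is filtered out because its r row is inactive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE i (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE r (member_id INTEGER, id INTEGER, status TEXT);
    CREATE TABLE w (member_id INTEGER, id INTEGER, status TEXT,
                    PRIMARY KEY (member_id, id));
    INSERT INTO i VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO r VALUES (1, 1, 'active'), (1, 2, 'active'), (1, 3, 'inactive');
    INSERT INTO w VALUES (7, 1, 'active'), (8, 1, 'active'), (9, 2, 'inactive');
""")

# Flag computed with a correlated EXISTS subquery.
exists_flag = conn.execute("""
    SELECT i.id,
           CASE WHEN EXISTS (SELECT 1 FROM w
                             WHERE w.id = i.id AND w.status = 'active')
                THEN 1 ELSE 0 END AS wish
    FROM i
    INNER JOIN r ON i.id = r.id AND r.status = 'active'
    WHERE r.member_id = 1
    ORDER BY i.id
""").fetchall()

# Flag computed with IN, as in the original question.
in_flag = conn.execute("""
    SELECT i.id, i.id IN (SELECT id FROM w WHERE w.status = 'active') AS wish
    FROM i
    INNER JOIN r ON i.id = r.id
    WHERE r.member_id = 1 AND r.status = 'active'
    ORDER BY i.id
""").fetchall()

print(exists_flag)  # [(1, 1), (2, 0)]
print(in_flag)      # [(1, 1), (2, 0)]
```

Both forms give one row per i.id no matter how many rows w holds for that id, which is the behaviour the question asks for. Whether MySQL plans them differently is something only EXPLAIN on the real tables can tell.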

How about posting what you expect to get for a proper answer?

#4


...
ORDER BY wish DESC 
LIMIT 0,50

This appears to be the big expense. You're sorting by a computed column "wish", which cannot benefit from an index. This forces a filesort (as indicated by the EXPLAIN output), which means the whole result set has to be produced and sorted before the LIMIT applies; for a large result that sort spills to disk, and disk I/O is very slow.

When you post questions like this, you should not expect people to guess how you have defined your tables and indexes. It's very simple to get the full definitions:

mysql> SHOW CREATE TABLE w;
mysql> SHOW CREATE TABLE i;
mysql> SHOW CREATE TABLE r;

Then paste the output into your question.

It's not clear what your purpose is for the "wish" column. The "IN" predicate is a boolean expression, so it always results in 0 or 1. But I'm guessing you're trying to use "IN" in hopes of accomplishing a join without doing a join. It would help if you describe what you're trying to accomplish.

Try this:

SELECT i.*
FROM i
 INNER JOIN r ON i.id=r.id
 LEFT OUTER JOIN w ON i.id=w.id AND w.status='active'
WHERE r.member_id=1 AND r.status='active'
 AND w.id IS NULL
LIMIT 0,50;

It uses an additional outer join, but it doesn't incur a filesort according to my test with EXPLAIN.
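
Note what the w.id IS NULL filter does to the result. A minimal sketch, again assuming Python's sqlite3 module in place of MySQL and made-up sample rows, shows that this query returns only the rows of i that have no active row in w, rather than all rows with a flag:

```python
import sqlite3

# Hypothetical sample data: id 1 has active rows in w, id 2 only an
# inactive one, id 3 has an inactive r row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE i (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE r (member_id INTEGER, id INTEGER, status TEXT);
    CREATE TABLE w (member_id INTEGER, id INTEGER, status TEXT,
                    PRIMARY KEY (member_id, id));
    INSERT INTO i VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO r VALUES (1, 1, 'active'), (1, 2, 'active'), (1, 3, 'inactive');
    INSERT INTO w VALUES (7, 1, 'active'), (8, 1, 'active'), (9, 2, 'inactive');
""")

# Anti-join: the status test sits in the ON clause, so an id whose only
# w rows are inactive still gets the NULL-extended row and passes the filter.
unwished = conn.execute("""
    SELECT i.id
    FROM i
     INNER JOIN r ON i.id = r.id
     LEFT OUTER JOIN w ON i.id = w.id AND w.status = 'active'
    WHERE r.member_id = 1 AND r.status = 'active'
      AND w.id IS NULL
    ORDER BY i.id
""").fetchall()

print(unwished)  # [(2,)]
```

So this form fits if only the rows missing from w are wanted. If the rows present in w must be listed as well, one of the flag-based queries is needed instead.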

#5


Have you tried this?

SELECT i.*, w.id as wish FROM i
LEFT OUTER JOIN w ON i.id = w.id
  AND w.status = 'active'
WHERE i.id in (SELECT id FROM r WHERE r.member_id = 1 AND r.status = 'active')
ORDER BY wish DESC
LIMIT 0,50
