Left inner join vs. left outer join - Why does the outer take longer?

Time: 2021-07-03 08:17:16

We have the query below. Using a LEFT OUTER join, it takes 9 seconds to execute. Changing the LEFT OUTER to a LEFT INNER reduces the execution time to 2 seconds, and the same number of rows is returned. Since the same number of rows from the dbo.Accepts table is being processed regardless of the join type, why would the outer take 3x longer?

SELECT CONVERT(varchar(10), a.ReadTime, 101) AS ReadDate,  -- style 101 = mm/dd/yyyy
       a.SubID,
       a.PlantID,
       a.Unit AS UnitID,
       a.SubAssembly,
       m.Lot
  FROM dbo.Accepts a WITH (NOLOCK)
  LEFT OUTER JOIN dbo.Marker m WITH (NOLOCK) ON m.SubID = a.SubID
 WHERE a.LastModifiedTime BETWEEN @LastModifiedTimeStart AND @LastModifiedTimeEnd
   AND a.SubAssembly = '400'

4 Answers

#1


35  

The fact that the same number of rows is returned is only known after the fact. The query optimizer cannot know in advance that every row in Accepts has a matching row in Marker, can it?

Suppose you join two tables A and B, where A has 1 million rows and B has 1 row. If you say A INNER JOIN B, only rows that match in both A and B can appear in the result, so the query plan is free to scan B first, then use an index to do a range scan in A, and perhaps return 10 rows. But if you say A LEFT OUTER JOIN B, then at least all rows in A (those that survive any filter on A itself) have to be returned, so the plan must read all of them no matter what it finds in B. By using an OUTER join you are eliminating one possible optimization.

If you do know that every row in Accepts will have a match in Marker, then why not declare a foreign key to enforce this? The optimizer will see the constraint and, if it is trusted, will take it into account in the plan.
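As a rough sketch of that suggestion, the constraint could be declared as shown here. The constraint name FK_Accepts_Marker_SubID is made up for illustration, and the example assumes dbo.Marker.SubID is a primary key or carries a unique constraint, which the question does not confirm. Whether the plan actually changes afterwards is something to check in the execution plan rather than take for granted.

```sql
-- Assumption: dbo.Marker.SubID is a PRIMARY KEY or has a UNIQUE constraint/index;
-- a foreign key can only reference such a column.
-- FK_Accepts_Marker_SubID is a hypothetical constraint name.
ALTER TABLE dbo.Accepts WITH CHECK
    ADD CONSTRAINT FK_Accepts_Marker_SubID
    FOREIGN KEY (SubID) REFERENCES dbo.Marker (SubID);

-- WITH CHECK validates the existing rows, which is what makes the constraint trusted.
-- is_not_trusted = 0 means the optimizer is allowed to rely on the constraint.
SELECT name, is_disabled, is_not_trusted
  FROM sys.foreign_keys
 WHERE parent_object_id = OBJECT_ID(N'dbo.Accepts');
```

If Marker can hold several rows per SubID, the column is not unique and this constraint cannot be created as written.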

#2


28  

1) In a query window in SQL Server Management Studio, run the command:

SET SHOWPLAN_ALL ON

2) Run your slow query.

3) Your query will not run, but the execution plan will be returned. Store this output.

4) Run your fast version of the query.

5) Again, the query will not run, but the execution plan will be returned. Store this output.

6) Compare the slow query version output to the fast query version output.

7) If you still don't know why one is slower, post both outputs in your question (edit it) and someone here can help from there.
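The steps above might look like this as a single script. The date values and the datetime type of the two variables are placeholders, not taken from the question, and the fast version is written as a plain INNER JOIN on the assumption that this is what the question means by "LEFT INNER". GO is the batch separator used by Management Studio and sqlcmd; SET SHOWPLAN_ALL has to be the only statement in its batch.

```sql
SET SHOWPLAN_ALL ON;
GO

-- Slow version. Placeholder range; substitute the real values.
DECLARE @LastModifiedTimeStart datetime = '20210101',
        @LastModifiedTimeEnd   datetime = '20210201';

SELECT CONVERT(varchar(10), a.ReadTime, 101) AS ReadDate,
       a.SubID, a.PlantID, a.Unit AS UnitID, a.SubAssembly, m.Lot
  FROM dbo.Accepts a WITH (NOLOCK)
  LEFT OUTER JOIN dbo.Marker m WITH (NOLOCK) ON m.SubID = a.SubID
 WHERE a.LastModifiedTime BETWEEN @LastModifiedTimeStart AND @LastModifiedTimeEnd
   AND a.SubAssembly = '400';
GO

-- Fast version, same placeholder range. Variables do not survive a GO,
-- so they are declared again in this batch.
DECLARE @LastModifiedTimeStart datetime = '20210101',
        @LastModifiedTimeEnd   datetime = '20210201';

SELECT CONVERT(varchar(10), a.ReadTime, 101) AS ReadDate,
       a.SubID, a.PlantID, a.Unit AS UnitID, a.SubAssembly, m.Lot
  FROM dbo.Accepts a WITH (NOLOCK)
  INNER JOIN dbo.Marker m WITH (NOLOCK) ON m.SubID = a.SubID
 WHERE a.LastModifiedTime BETWEEN @LastModifiedTimeStart AND @LastModifiedTimeEnd
   AND a.SubAssembly = '400';
GO

-- Turn plan-only mode off again so later statements really execute.
SET SHOWPLAN_ALL OFF;
GO
```

When comparing the two outputs, the join operator, the order in which the tables are accessed, and any scan versus seek differences are the usual places to start.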

#3


6  

This is because the LEFT OUTER Join is doing more work than an INNER Join BEFORE sending the results back.

Logically, the Inner Join looks for all records where the ON clause is true (so when it builds its intermediate result, it only puts in records that match m.SubID = a.SubID). Then it compares those results to your WHERE clause (your last modified time).

The Left Outer Join takes all of the records in your first table. If the ON clause is not true (m.SubID does not equal a.SubID), it simply returns NULL for the second table's columns in that row.

The reason you get the same number of results at the end is probably coincidence due to the WHERE clause, which is logically applied AFTER all of the copying of records.
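To make the difference in semantics concrete, here is a small self-contained sketch using table variables. The names @A and @M and the sample values are invented for illustration and only loosely mirror Accepts and Marker.

```sql
-- Hypothetical stand-ins for dbo.Accepts (@A) and dbo.Marker (@M).
DECLARE @A TABLE (SubID int, Unit varchar(10));
DECLARE @M TABLE (SubID int, Lot  varchar(10));

INSERT INTO @A (SubID, Unit) VALUES (1, 'U1'), (2, 'U2'), (3, 'U3');
INSERT INTO @M (SubID, Lot)  VALUES (1, 'L1'), (2, 'L2');

-- INNER JOIN keeps only matching rows: 2 rows (SubID 1 and 2).
SELECT a.SubID, a.Unit, m.Lot
  FROM @A a
 INNER JOIN @M m ON m.SubID = a.SubID;

-- LEFT OUTER JOIN keeps every row of @A: 3 rows, with Lot = NULL for SubID 3.
SELECT a.SubID, a.Unit, m.Lot
  FROM @A a
  LEFT OUTER JOIN @M m ON m.SubID = a.SubID;

-- Once every row in @A has a match, both joins return the same 3 rows.
INSERT INTO @M (SubID, Lot) VALUES (3, 'L3');

SELECT a.SubID, a.Unit, m.Lot
  FROM @A a
  LEFT OUTER JOIN @M m ON m.SubID = a.SubID;
```

The last query is the situation described in the question: identical row counts from both join types, which the engine can only discover by running the join.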

Join (SQL) Wikipedia

#4


2  

Wait -- did you actually mean that "the same number of rows ... are being processed" or that "the same number of rows are being returned"? In general, the outer join would process many more rows, including those for which there is no match, even if it returns the same number of records.
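One way to see how much work each version does, rather than how many rows it returns, is to turn on the I/O and timing statistics before running the query for real. This is only a sketch: the date values and the datetime type are placeholders, and the numbers reported will depend on the data and indexes.

```sql
SET STATISTICS IO ON;    -- reports logical/physical reads per table
SET STATISTICS TIME ON;  -- reports CPU time and elapsed time

-- Placeholder range; substitute the real values.
DECLARE @LastModifiedTimeStart datetime = '20210101',
        @LastModifiedTimeEnd   datetime = '20210201';

SELECT a.SubID, a.PlantID, a.Unit AS UnitID, a.SubAssembly, m.Lot
  FROM dbo.Accepts a WITH (NOLOCK)
  LEFT OUTER JOIN dbo.Marker m WITH (NOLOCK) ON m.SubID = a.SubID
 WHERE a.LastModifiedTime BETWEEN @LastModifiedTimeStart AND @LastModifiedTimeEnd
   AND a.SubAssembly = '400';

-- Repeat with INNER JOIN in place of LEFT OUTER JOIN, then compare the
-- "logical reads" figures for Accepts and Marker on the Messages tab.

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```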
