How join order affects query performance

Date: 2021-10-10 06:57:41

I'm seeing big differences in execution time for my query, and the order in which the joins (inner and left outer) appear seems to make all the difference. Are there any "ground rules" for the order in which joins should be written?

Both queries below are part of a bigger query. The only difference between them is that the left join is placed last in the faster query.

Slow query: (> 10 minutes)

SELECT [t0].[Ref], [t1].[Key], [t1].[Name],  
    (CASE 
        WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0)
        ELSE CONVERT(NVarChar(250),[t3].[Key])
     END) AS [value], 
    (CASE 
        WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1)
        ELSE CONVERT(NVarChar(250),[t4].[Key])
     END) AS [value2]

FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]

LEFT OUTER JOIN (
    SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
    FROM [dbo].[tblC] AS [t2]
    ) AS [t3] ON [t0].[RefC] = ([t3].[Ref])

INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])

Faster query: (~ 30 seconds)

SELECT [t0].[Ref], [t1].[Key], [t1].[Name],  
    (CASE 
        WHEN [t3].[test] IS NULL THEN CONVERT(NVarChar(250),@p0)
        ELSE CONVERT(NVarChar(250),[t3].[Key])
     END) AS [value], 
    (CASE 
        WHEN 0 = 1 THEN CONVERT(NVarChar(250),@p1)
        ELSE CONVERT(NVarChar(250),[t4].[Key])
     END) AS [value2]

FROM [dbo].[tblA] AS [t0]
INNER JOIN [dbo].[tblB] AS [t1] ON [t0].[RefB] = [t1].[Ref]

INNER JOIN [dbo].[tblD] AS [t4] ON [t0].[RefD] = ([t4].[Ref])

LEFT OUTER JOIN (
    SELECT 1 AS [test], [t2].[Ref], [t2].[Key]
    FROM [dbo].[tblC] AS [t2]
    ) AS [t3] ON [t0].[RefC] = ([t3].[Ref])

3 answers

#1


9  

Generally, INNER JOIN order won't matter, because inner joins are commutative and associative. In both queries you still have t0 inner-joined to t4, so it should make no difference.
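As a rough check of the claim that both orderings are logically equivalent, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The table contents are invented for illustration, and the select list is simplified from the question (the 'none' literal stands in for the @p0 parameter). It only shows that the two orderings return the same rows; it says nothing about their speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblA (Ref INTEGER PRIMARY KEY, RefB INTEGER, RefC INTEGER, RefD INTEGER);
    CREATE TABLE tblB (Ref INTEGER PRIMARY KEY, [Key] TEXT, Name TEXT);
    CREATE TABLE tblC (Ref INTEGER PRIMARY KEY, [Key] TEXT);
    CREATE TABLE tblD (Ref INTEGER PRIMARY KEY, [Key] TEXT);
    -- Made-up sample data: row 4 has no match in tblD, rows 2 and 3 none in tblC.
    INSERT INTO tblA VALUES (1, 10, 100, 1000), (2, 10, NULL, 1000),
                            (3, 11, 101, 1001), (4, 12, 100, 9999);
    INSERT INTO tblB VALUES (10, 'b10', 'Ten'), (11, 'b11', 'Eleven'), (12, 'b12', 'Twelve');
    INSERT INTO tblC VALUES (100, 'c100');
    INSERT INTO tblD VALUES (1000, 'd1000'), (1001, 'd1001');
""")

SELECT_AND_FIRST_JOIN = """
    SELECT t0.Ref, t1.[Key], t1.Name,
           CASE WHEN t3.test IS NULL THEN 'none' ELSE t3.[Key] END AS [value],
           t4.[Key] AS [value2]
    FROM tblA AS t0
    INNER JOIN tblB AS t1 ON t0.RefB = t1.Ref
"""
LEFT_JOIN_C = """
    LEFT OUTER JOIN (SELECT 1 AS test, t2.Ref, t2.[Key] FROM tblC AS t2) AS t3
        ON t0.RefC = t3.Ref
"""
INNER_JOIN_D = """
    INNER JOIN tblD AS t4 ON t0.RefD = t4.Ref
"""

# Same joins, written in the two orders from the question.
left_join_first = SELECT_AND_FIRST_JOIN + LEFT_JOIN_C + INNER_JOIN_D
left_join_last = SELECT_AND_FIRST_JOIN + INNER_JOIN_D + LEFT_JOIN_C

# Sort because SQL gives no ordering guarantee without ORDER BY.
rows_first = sorted(conn.execute(left_join_first).fetchall())
rows_last = sorted(conn.execute(left_join_last).fetchall())

print(rows_first == rows_last)
for row in rows_first:
    print(row)
```

If the two result sets ever differed, the queries would not be equivalent and the timing comparison would be meaningless, so this is worth confirming before looking at plans.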

To rephrase: SQL is declarative. You say "what you want", not "how". The optimiser works out the "how" and will reorder JOINs as needed, in practice also taking WHERE clauses and the like into account.

In complex queries, a cost-based query optimiser won't exhaustively evaluate every permutation, so the written order can occasionally matter.

So, I'd check these:

  • You said these are part of a bigger query, so this section matters less on its own; it is the whole query that matters.

  • Complexity can also be hidden in views, if any of the tables are actually views.

  • Is this repeatable, no matter what order the code runs in?

  • What are the query plan differences?
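For the last check, the idea is to ask the engine for its plan for each ordering and compare them side by side. On SQL Server that means comparing the execution plans of the two queries. The sketch below shows the same idea in SQLite, via Python's sqlite3 module and EXPLAIN QUERY PLAN. The schema is a cut-down, hypothetical version of the one in the question, and the helper name query_plan is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblA (Ref INTEGER PRIMARY KEY, RefC INTEGER, RefD INTEGER);
    CREATE TABLE tblC (Ref INTEGER PRIMARY KEY, [Key] TEXT);
    CREATE TABLE tblD (Ref INTEGER PRIMARY KEY, [Key] TEXT);
""")

def query_plan(sql):
    """Return the plan steps SQLite reports for `sql` (last column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]

left_join_first = """
    SELECT t0.Ref, t3.[Key], t4.[Key]
    FROM tblA AS t0
    LEFT OUTER JOIN tblC AS t3 ON t0.RefC = t3.Ref
    INNER JOIN tblD AS t4 ON t0.RefD = t4.Ref
"""
left_join_last = """
    SELECT t0.Ref, t3.[Key], t4.[Key]
    FROM tblA AS t0
    INNER JOIN tblD AS t4 ON t0.RefD = t4.Ref
    LEFT OUTER JOIN tblC AS t3 ON t0.RefC = t3.Ref
"""

plan_first = query_plan(left_join_first)
plan_last = query_plan(left_join_last)

# Print both plans so they can be compared step by step.
for label, plan in (("left join first", plan_first), ("left join last", plan_last)):
    print(label)
    for step in plan:
        print("   ", step)
```

The exact plan text depends on the SQLite version and on the data and indexes present, so no expected output is shown here.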

See some other SO questions:

#2


1  

If you have more than two tables, the order of the table joins is important and can make a big difference. The first table should get a leading hint. The first table should be the one whose filter is most selective. For example, if you have a member table with 1,000,000 people, you only want the female members, and it is the first table, then you join only about 500,000 records to the next table. If this table is at the end of the join order (say table 4, 5 or 6), then every record (worst case 1,000,000) will be joined. This applies to both inner and outer joins.

The rule: start with the most selective table, then join the next most selective table that logically follows.
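The arithmetic behind this rule can be sketched with a toy index nested-loop join in plain Python. Everything here is hypothetical: the data is scaled down to 1,000 members, and the dictionary stands in for a database index. A real optimiser will often push the filter down on its own, whatever order the query is written in, so this only illustrates why the order the engine ends up choosing matters.

```python
from collections import defaultdict

# Hypothetical data: 1,000 members with alternating gender, one order each.
members = [{"id": i, "gender": "F" if i % 2 == 0 else "M"} for i in range(1000)]
orders = [{"member_id": i, "total": i * 10} for i in range(1000)]

# Index orders by member_id, standing in for a database index.
orders_by_member = defaultdict(list)
for order in orders:
    orders_by_member[order["member_id"]].append(order)

def join_members_to_orders(member_rows):
    """Index nested-loop join; returns (joined rows, number of index probes)."""
    joined, probes = [], 0
    for member in member_rows:
        probes += 1  # one index lookup per outer row
        for order in orders_by_member.get(member["id"], []):
            joined.append((member["id"], member["gender"], order["total"]))
    return joined, probes

# Plan A: apply the selective filter first, then join.
females = [m for m in members if m["gender"] == "F"]
filter_first_rows, filter_first_probes = join_members_to_orders(females)

# Plan B: join every member, then filter the joined rows.
all_rows, filter_last_probes = join_members_to_orders(members)
filter_last_rows = [row for row in all_rows if row[1] == "F"]

print(filter_first_rows == filter_last_rows)    # True: same result either way
print(filter_first_probes, filter_last_probes)  # 500 1000
```

Both plans produce the same rows, but filtering first halves the number of lookups against the second table, which is the effect the answer describes.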

Conversion functions and formatting should come last. Sometimes it is better to wrap the whole SQL statement in parentheses and apply expressions and functions in an outer SELECT.
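One way to read this advice is to do the joins in a derived table and apply the conversions only in the outer SELECT. Below is a small sketch of that shape, again in SQLite via Python's sqlite3 module. The data is made up, and the alias names j and RawKey are assumptions of mine. Whether this changes performance depends on the engine, since many optimisers flatten the derived table back into a single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblA (Ref INTEGER PRIMARY KEY, RefD INTEGER);
    CREATE TABLE tblD (Ref INTEGER PRIMARY KEY, [Key] INTEGER);
    INSERT INTO tblA VALUES (1, 1000), (2, 1001);
    INSERT INTO tblD VALUES (1000, 42), (1001, 43);
""")

sql = """
    SELECT j.Ref,
           CAST(j.RawKey AS TEXT) AS value2   -- conversion applied last, outside the joins
    FROM (
        -- Inner query: joins only, no conversions or formatting.
        SELECT t0.Ref, t4.[Key] AS RawKey
        FROM tblA AS t0
        INNER JOIN tblD AS t4 ON t0.RefD = t4.Ref
    ) AS j
    ORDER BY j.Ref
"""

rows = conn.execute(sql).fetchall()
print(rows)  # [(1, '42'), (2, '43')]
```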

#3


0  

At least in SQLite, I found that it makes a huge difference. The query didn't even need to be very complex for the difference to show. My JOIN statements were inside an embedded clause, however.

Basically, you should start with the most specific restrictions first, as Christian has pointed out.
