哪个序列是SQL引擎执行的查询和子查询?

时间:2023-02-13 15:46:53

Hello I made a SQL test and dubious/curious about one question:

你好,我做了一个SQL测试,对一个问题很怀疑/好奇:

In which sequence are queries and sub-queries executed by the SQL engine?

哪个序列是SQL引擎执行的查询和子查询?

the answers was

答案是

  1. primary query -> sub query -> sub sub query and so on
  2. 主查询 - >子查询 - >子子查询等
  3. sub sub query -> sub query -> prime query
  4. 子子查询 - >子查询 - >素数查询
  5. the whole query is interpreted at one time
  6. 整个查询一次解释
  7. There is no fixed sequence of interpretation, the query parser takes a decision on fly
  8. 没有固定的解释序列,查询解析器在飞行中做出决定

I choosed the last answer (just supposing that it is most reliable w.r.t. others). Now the curiosity:

我选择了最后一个答案(假设它是最可靠的w.r.t.其他人)。现在的好奇心:

where can i read about this and briefly what is the mechanism under all of that?

我在哪里可以读到这一点,简要说明所有这些机制是什么?

Thank you.

谢谢。

5 个解决方案

#1


14  

Option 4 is close.

方案4很接近。

SQL is declarative: you tell the query optimiser what you want and it works out the best (subject to time/"cost" etc) way of doing it. This may vary for outwardly identical queries and tables depending on statistics, data distribution, row counts, parallelism and god knows what else.

SQL是声明性的:您告诉查询优化器您想要什么,并且它可以实现最佳(受时间/“成本”等)方式。对于外观相同的查询和表,这可能会有所不同,具体取决于统计数据,数据分布,行数,并行性,而且上帝知道还有什么。

This means there is no fixed order. But it's not quite "on the fly"

这意味着没有固定的订单。但它并不是“在飞行中”

Even with identical servers, schema, queries, and data I've seen execution plans differ

即使使用相同的服务器,架构,查询和数据,我也看到了执行计划的不同

#2


22  

I think answer 4 is correct. There are a few considerations:

我认为答案4是正确的。有几个注意事项:

type of subquery - is it corrrelated, or not. Consider:

子查询的类型 - 它是否相关。考虑:

SELECT *
FROM   t1
WHERE  id IN (
             SELECT id
             FROM   t2
            )

Here, the subquery is not correlated to the outer query. If the number of values in t2.id is small in comparison to t1.id, it is probably most efficient to first execute the subquery, and keep the result in memory, and then scan t1 or an index on t1.id, matching against the cached values.

这里,子查询与外部查询无关。如果t2.id中的值的数量与t1.id相比较小,则首先执行子查询可能是最有效的,并将结果保存在内存中,然后在t1.id上扫描t1或索引,以匹配缓存的值。

But if the query is:

但如果查询是:

SELECT *
FROM   t1
WHERE  id IN (
             SELECT id
             FROM   t2
             WHERE  t2.type = t1.type
            )

here the subquery is correlated - there is no way to compute the subquery unless t1.type is known. Since the value for t1.type may vary for each row of the outer query, this subquery could be executed once for each row of the outer query.

这里子查询是相关的 - 除非知道t1.type,否则无法计算子查询。由于t1.type的值可能因外部查询的每一行而异,因此可以对外部查询的每一行执行一次该子查询。

Then again, the RDBMS may be really smart and realize there are only a few possible values for t2.type. In that case, it may still use the approach used for the uncorrelated subquery if it can guess that the cost of executing the subquery once will be cheaper that doing it for each row.

然后,RDBMS可能非常聪明,并且意识到t2.type只有几个可能的值。在这种情况下,它仍然可以使用用于不相关子查询的方法,如果它可以猜测执行子查询一次的成本将比为每行执行它更便宜。

#3


1  

The SQL engine tries to optimise the order in which (sub)queries are executed. The part deciding about that is called a query optimizer. The query optimizer knows how many rows are in each table, which tables have indexes and on what fields. It uses that information to decide what part to execute first.

SQL引擎尝试优化(子)查询的执行顺序。决定这一点的部分称为查询优化器。查询优化器知道每个表中有多少行,哪些表具有索引以及哪些字段。它使用该信息来决定首先执行哪个部分。

#4


1  

If you want something to read up on these topics, get a copy of Inside SQL Server 2008: T-SQL Querying. It has two dedicated chapters on how queries are processed logically and physically in SQL Server.

如果您想了解有关这些主题的内容,请获取Inside SQL Server 2008:T-SQL查询的副本。它有两个专门的章节,介绍如何在SQL Server中逻辑和物理地处理查询。

#5


0  

It's usually depends from your DBMS, but ... I think second answer is more plausible. Prime query usually can't be calculated without sub query results.

这通常取决于您的DBMS,但......我认为第二个答案更合理。如果没有子查询结果,通常无法计算Prime查询。

#1


14  

Option 4 is close.

方案4很接近。

SQL is declarative: you tell the query optimiser what you want and it works out the best (subject to time/"cost" etc) way of doing it. This may vary for outwardly identical queries and tables depending on statistics, data distribution, row counts, parallelism and god knows what else.

SQL是声明性的:您告诉查询优化器您想要什么,并且它可以实现最佳(受时间/“成本”等)方式。对于外观相同的查询和表,这可能会有所不同,具体取决于统计数据,数据分布,行数,并行性,而且上帝知道还有什么。

This means there is no fixed order. But it's not quite "on the fly"

这意味着没有固定的订单。但它并不是“在飞行中”

Even with identical servers, schema, queries, and data I've seen execution plans differ

即使使用相同的服务器,架构,查询和数据,我也看到了执行计划的不同

#2


22  

I think answer 4 is correct. There are a few considerations:

我认为答案4是正确的。有几个注意事项:

type of subquery - is it corrrelated, or not. Consider:

子查询的类型 - 它是否相关。考虑:

SELECT *
FROM   t1
WHERE  id IN (
             SELECT id
             FROM   t2
            )

Here, the subquery is not correlated to the outer query. If the number of values in t2.id is small in comparison to t1.id, it is probably most efficient to first execute the subquery, and keep the result in memory, and then scan t1 or an index on t1.id, matching against the cached values.

这里,子查询与外部查询无关。如果t2.id中的值的数量与t1.id相比较小,则首先执行子查询可能是最有效的,并将结果保存在内存中,然后在t1.id上扫描t1或索引,以匹配缓存的值。

But if the query is:

但如果查询是:

SELECT *
FROM   t1
WHERE  id IN (
             SELECT id
             FROM   t2
             WHERE  t2.type = t1.type
            )

here the subquery is correlated - there is no way to compute the subquery unless t1.type is known. Since the value for t1.type may vary for each row of the outer query, this subquery could be executed once for each row of the outer query.

这里子查询是相关的 - 除非知道t1.type,否则无法计算子查询。由于t1.type的值可能因外部查询的每一行而异,因此可以对外部查询的每一行执行一次该子查询。

Then again, the RDBMS may be really smart and realize there are only a few possible values for t2.type. In that case, it may still use the approach used for the uncorrelated subquery if it can guess that the cost of executing the subquery once will be cheaper that doing it for each row.

然后,RDBMS可能非常聪明,并且意识到t2.type只有几个可能的值。在这种情况下,它仍然可以使用用于不相关子查询的方法,如果它可以猜测执行子查询一次的成本将比为每行执行它更便宜。

#3


1  

The SQL engine tries to optimise the order in which (sub)queries are executed. The part deciding about that is called a query optimizer. The query optimizer knows how many rows are in each table, which tables have indexes and on what fields. It uses that information to decide what part to execute first.

SQL引擎尝试优化(子)查询的执行顺序。决定这一点的部分称为查询优化器。查询优化器知道每个表中有多少行,哪些表具有索引以及哪些字段。它使用该信息来决定首先执行哪个部分。

#4


1  

If you want something to read up on these topics, get a copy of Inside SQL Server 2008: T-SQL Querying. It has two dedicated chapters on how queries are processed logically and physically in SQL Server.

如果您想了解有关这些主题的内容,请获取Inside SQL Server 2008:T-SQL查询的副本。它有两个专门的章节,介绍如何在SQL Server中逻辑和物理地处理查询。

#5


0  

It's usually depends from your DBMS, but ... I think second answer is more plausible. Prime query usually can't be calculated without sub query results.

这通常取决于您的DBMS,但......我认为第二个答案更合理。如果没有子查询结果,通常无法计算Prime查询。