Oracle SQL query filter in a join

Time: 2022-09-16 14:30:24

For inner joins, is there any performance difference between applying a filter in the JOIN ON clause and applying it in the WHERE clause? Which is more efficient, or will the optimizer treat them as equivalent?

JOIN ON

SELECT u.name
FROM users u
JOIN departments d
ON u.department_id = d.id
AND d.name         = 'IT'

VS

WHERE

SELECT u.name
FROM users u
JOIN departments d
ON u.department_id = d.id
WHERE d.name       = 'IT'

Oracle 11gR2

2 answers

#1


6  

There should be no difference. The optimizer should generate the same plan in both cases and should be able to apply the predicate before, after, or during the join in either case based on what is the most efficient approach for that particular query.

Of course, the fact that the optimizer can do something, in general, is no guarantee that the optimizer will actually do something in a particular query. As queries get more complicated, it becomes impossible to exhaustively consider every possible query plan which means that even with perfect information and perfect code, the optimizer may not have time to do everything that you'd like it to do. You'd need to take a look at the actual plans generated for the two queries to see if they are actually identical.
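
One way to check whether the two queries really get the same plan is to explain both and compare the output. The following is a minimal sketch, assuming the users and departments tables from the question exist and that the session can write to the default PLAN_TABLE; the statement IDs are arbitrary labels made up for this example.

```sql
-- Sketch only: table names come from the question; the statement IDs
-- 'filter_in_on' and 'filter_in_where' are hypothetical labels.

-- Variant 1: filter in the JOIN ON clause
EXPLAIN PLAN SET STATEMENT_ID = 'filter_in_on' FOR
SELECT u.name
FROM users u
JOIN departments d
ON u.department_id = d.id
AND d.name = 'IT';

-- Variant 2: filter in the WHERE clause
EXPLAIN PLAN SET STATEMENT_ID = 'filter_in_where' FOR
SELECT u.name
FROM users u
JOIN departments d
ON u.department_id = d.id
WHERE d.name = 'IT';

-- Show each plan; compare the "Plan hash value" line and the
-- "Predicate Information" section of the two outputs.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'filter_in_on', 'TYPICAL'));
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'filter_in_where', 'TYPICAL'));
```

If both outputs show the same plan hash value and the same predicates, the optimizer is treating the two forms identically. Note that EXPLAIN PLAN reports the plan the optimizer expects to use; to inspect the plan of a statement that has actually been executed, DBMS_XPLAN.DISPLAY_CURSOR can be used instead.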

#2


0  

I prefer putting the filter criteria in the where clause.

With data warehouse queries, putting the filter criteria in the join makes the query run very much longer.

For example, Table1 is indexed on the field Date, and Table2 is partitioned on the field Partition. Table2 is the largest table in the query and lives on another database server. I use the driving_site hint to tell the optimizer to use the Table2 partitions.

select /*+driving_site(b)*/ a.key, sum(b.money) money
  from schema.table1 a
  join schema2.table2@dblink b
    on a.key = b.key
 where b.partition = to_number(to_char(:i,'yyyymm'))
   and a.date = :i
 group by a.key

If I run the query this way, it takes about 30 to 40 seconds to return the results.

If I instead put the filter criteria in the join, the query runs for about 10 minutes with no results before I cancel it.
