What is the difference between NOT EXISTS, NOT IN, and LEFT JOIN WHERE IS NULL?

Time: 2021-09-09 07:22:34

It seems to me that you can do the same thing in a SQL query using either NOT EXISTS, NOT IN, or LEFT JOIN WHERE IS NULL. For example:

SELECT a FROM table1 WHERE a NOT IN (SELECT a FROM table2)

SELECT a FROM table1 WHERE NOT EXISTS (SELECT * FROM table2 WHERE table1.a = table2.a)

SELECT table1.a FROM table1 LEFT JOIN table2 ON table1.a = table2.a WHERE table2.a IS NULL

I'm not sure if I got all the syntax correct, but these are the general techniques I've seen. Why would I choose to use one over the other? Does performance differ...? Which one of these is the fastest / most efficient? (If it depends on implementation, when would I use each one?)

5 answers

#1


119  

In a nutshell:

NOT IN is a little bit different: it never matches if there is but a single NULL in the list.
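
To make the NULL behaviour concrete, here is a minimal sketch that runs the three queries from the question against an in-memory SQLite database. SQLite, the Python `sqlite3` module, and the sample rows are assumptions chosen for illustration; they are not part of the original answer.

```python
import sqlite3

# Assumption: SQLite stands in for "a SQL database"; sample data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER);
    CREATE TABLE table2 (a INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1), (NULL);  -- one NULL in the list
""")

# NOT IN: comparing against NULL yields unknown, so no row qualifies.
not_in = conn.execute(
    "SELECT a FROM table1 WHERE a NOT IN (SELECT a FROM table2) ORDER BY a"
).fetchall()

# NOT EXISTS: the NULL row never satisfies the equality, so it is ignored.
not_exists = conn.execute(
    "SELECT a FROM table1 WHERE NOT EXISTS "
    "(SELECT * FROM table2 WHERE table1.a = table2.a) ORDER BY a"
).fetchall()

# LEFT JOIN / IS NULL: unmatched table1 rows carry a NULL table2.a.
left_join = conn.execute(
    "SELECT table1.a FROM table1 LEFT JOIN table2 ON table1.a = table2.a "
    "WHERE table2.a IS NULL ORDER BY table1.a"
).fetchall()

print(not_in)      # []
print(not_exists)  # [(2,), (3,)]
print(left_join)   # [(2,), (3,)]
```

If table2.a is declared NOT NULL, or the subquery filters out NULLs, the three forms return the same rows.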

  • In MySQL, NOT EXISTS is a little bit less efficient

  • In SQL Server, LEFT JOIN / IS NULL is less efficient

  • In PostgreSQL, NOT IN is less efficient

  • In Oracle, all three methods are the same.

#2


5  

If the database is good at optimising the query, the first two will be transformed to something close to the third.

For simple situations like the ones in your question, there should be little or no difference, as they will all be executed as joins. In more complex queries, the database might not be able to make a join out of the NOT IN and NOT EXISTS queries. In that case the queries will get a lot slower. On the other hand, a join may also perform badly if there is no index that can be used, so just because you use a join doesn't mean that you are safe. You would have to examine the execution plan of the query to tell if there may be any performance problems.
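
As a rough sketch of what examining the execution plan can look like, the snippet below asks SQLite for the plan of each of the three queries. SQLite, its `EXPLAIN QUERY PLAN` statement, and the index name `idx_table2_a` are assumptions for illustration; other products use their own tools (for example `EXPLAIN` in MySQL and PostgreSQL), and the plans you get depend on the product and version.

```python
import sqlite3

# Assumption: SQLite with an index on table2.a; names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER NOT NULL);
    CREATE TABLE table2 (a INTEGER NOT NULL);
    CREATE INDEX idx_table2_a ON table2 (a);
""")

queries = {
    "NOT IN": "SELECT a FROM table1 WHERE a NOT IN (SELECT a FROM table2)",
    "NOT EXISTS": "SELECT a FROM table1 WHERE NOT EXISTS "
                  "(SELECT * FROM table2 WHERE table1.a = table2.a)",
    "LEFT JOIN": "SELECT table1.a FROM table1 LEFT JOIN table2 "
                 "ON table1.a = table2.a WHERE table2.a IS NULL",
}

def plan(sql):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the textual description.
    return [row[-1] for row in rows]

plans = {name: plan(sql) for name, sql in queries.items()}

for name, steps in plans.items():
    print(name)
    for step in steps:
        print("   ", step)
```

Comparing the printed steps shows whether a given form scans a table or uses the index, which is the kind of difference this answer is pointing at.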

#3


1  

Assuming you are avoiding nulls, they are all ways of writing an anti-join using Standard SQL.

An obvious omission is the equivalent using EXCEPT:

SELECT a FROM table1
EXCEPT
SELECT a FROM table2

Note in Oracle you need to use the MINUS operator (arguably a better name):

SELECT a FROM table1
MINUS
SELECT a FROM table2

Speaking of proprietary syntax, there may also be non-Standard equivalents worth investigating depending on the product you are using e.g. OUTER APPLY in SQL Server (something like):

SELECT t1.a
  FROM table1 t1
       OUTER APPLY 
       (
        SELECT t2.a
          FROM table2 t2
         WHERE t2.a = t1.a
       ) AS dt1
 WHERE dt1.a IS NULL;

#4


0  

When you need to insert data into a table with a multi-field primary key, consider that it can be much faster (I tried this in Access, but I think it holds in any database) not to check first that no records with those values exist in the table. Instead, just insert into the table: records that duplicate an existing key will not be inserted twice.
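
A hedged sketch of the two approaches, assuming SQLite rather than Access. Note that in SQLite a plain INSERT of an existing key is rejected with an error rather than silently skipped, so the "just insert" variant here uses SQLite's `INSERT OR IGNORE`; other products have their own conflict-handling syntax. The table names `checked` and `ignored` and the sample rows are hypothetical.

```python
import sqlite3

# Assumption: SQLite; two identical tables with a two-column primary key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE checked (k1 INTEGER, k2 INTEGER, PRIMARY KEY (k1, k2));
    CREATE TABLE ignored (k1 INTEGER, k2 INTEGER, PRIMARY KEY (k1, k2));
    INSERT INTO checked VALUES (1, 1);
    INSERT INTO ignored VALUES (1, 1);
""")

incoming = [(1, 1), (1, 2), (2, 1), (1, 2)]  # contains duplicate keys

# A plain INSERT of an existing key raises an error in SQLite.
try:
    conn.execute("INSERT INTO ignored VALUES (1, 1)")
    plain_insert_rejected = False
except sqlite3.IntegrityError:
    plain_insert_rejected = True

# Option A: check with NOT EXISTS before each insert.
for k1, k2 in incoming:
    conn.execute(
        "INSERT INTO checked (k1, k2) "
        "SELECT ?, ? WHERE NOT EXISTS "
        "(SELECT 1 FROM checked WHERE k1 = ? AND k2 = ?)",
        (k1, k2, k1, k2),
    )

# Option B: let the primary key do the work and skip conflicting rows.
conn.executemany("INSERT OR IGNORE INTO ignored VALUES (?, ?)", incoming)

checked_rows = conn.execute(
    "SELECT k1, k2 FROM checked ORDER BY k1, k2").fetchall()
ignored_rows = conn.execute(
    "SELECT k1, k2 FROM ignored ORDER BY k1, k2").fetchall()

print(checked_rows)  # [(1, 1), (1, 2), (2, 1)]
print(ignored_rows)  # [(1, 1), (1, 2), (2, 1)]
```

Both options end with the same rows; which one is faster would have to be measured on the product and data in question.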

#5


0  

From a performance perspective, always avoid using inverse keywords like NOT IN, NOT EXISTS, ... because to check the inverse items the DBMS needs to run through all the available items and drop the inverse selection.
