如果列值不在不同列值的组中,SQL将选择行

时间:2021-12-15 22:44:11

For each identifier, how can I return the quantity when the received country is not equal to any of the delivered countries? I need an efficient query for the steps below since my table is huge.

对于每个标识符,当接收国不等于任何一个发送国时,如何返回数量?由于我的表很大,因此需要对下面的步骤进行有效的查询。

These are the steps I would think could do this, of course you don't need to follow them :)

这些步骤我认为可以做到,当然你不需要遵循它们:

  1. Create a group of 'delivered' countries for each identifier.
  2. 为每个标识符创建一组“交付”的国家。
  3. See if 'received' is any of these countries for each identifier. If there is no match, return this result.
  4. 看看每个标识符的“received”是否是这些国家中的任何一个。如果没有匹配,返回此结果。

Starting Table:

从表:

identifier         delivered            received        quantity
-------------      ------------         -----------     ------------
1                  USA                  France          432
1                  France               USA             450
1                  Ireland              Russia          100
2                  Germany              Germany         1,034
3                  USA                  France          50
3                  USA                  USA             120

Result:

结果:

identifier         delivered            received        quantity
-------------      ------------         -----------     ------------
1                  Ireland              Russia          100 

The starting table is about 30,000,000 rows, so self-joins will be impossible unfortunately. I am using something similar to MySQL.

起始表大约有30,000,000行,因此很不幸,不可能进行自连接。我使用的是类似于MySQL的东西。

2 个解决方案

#1


6  

I think LEFT JOIN query should work for you:

我认为左连接查询应该适合您:

SELECT a.*
FROM starting a
     LEFT JOIN starting b
        ON a.id = b.id
           AND a.delivered = b.received
WHERE b.received IS NULL;

Example: SQLFiddle

For optimizing above query, adding following composite index should give you better performance:

为了优化上面的查询,添加以下复合索引应该会给您更好的性能:

ALTER TABLE starting  ADD KEY ix1(id, delivered, received);

#2


0  

You could use a not exists subquery:

可以使用不存在子查询:

SELECT  a.*
FROM    starting a
WHERE   NOT EXISTS
        (
        SELECT  *
        FROM    starting b
        WHERE   a.id = b.id
                AND a.delivered = b.received
        )

This is not a self-join, but the query optimizer is free to execute it as one (and usually does.)

这不是一个自连接,但是查询优化器可以*地执行它(通常是这样)。

#1


6  

I think LEFT JOIN query should work for you:

我认为左连接查询应该适合您:

SELECT a.*
FROM starting a
     LEFT JOIN starting b
        ON a.id = b.id
           AND a.delivered = b.received
WHERE b.received IS NULL;

Example: SQLFiddle

For optimizing above query, adding following composite index should give you better performance:

为了优化上面的查询,添加以下复合索引应该会给您更好的性能:

ALTER TABLE starting  ADD KEY ix1(id, delivered, received);

#2


0  

You could use a not exists subquery:

可以使用不存在子查询:

SELECT  a.*
FROM    starting a
WHERE   NOT EXISTS
        (
        SELECT  *
        FROM    starting b
        WHERE   a.id = b.id
                AND a.delivered = b.received
        )

This is not a self-join, but the query optimizer is free to execute it as one (and usually does.)

这不是一个自连接,但是查询优化器可以*地执行它(通常是这样)。