For each identifier, how can I return the quantity when the received country is not equal to any of the delivered countries? I need an efficient query for the steps below since my table is huge.

对于每个标识符，当接收国不等于任何一个发送国时，如何返回数量?由于我的表很大，因此需要对下面的步骤进行有效的查询。

These are the steps I would think could do this, of course you don't need to follow them :)

这些步骤我认为可以做到，当然你不需要遵循它们:

Create a group of 'delivered' countries for each identifier.
为每个标识符创建一组“交付”的国家。
See if 'received' is any of these countries for each identifier. If there is no match, return this result.
看看每个标识符的“received”是否是这些国家中的任何一个。如果没有匹配，返回此结果。

Starting Table:

从表:

identifier         delivered            received        quantity
-------------      ------------         -----------     ------------
1                  USA                  France          432
1                  France               USA             450
1                  Ireland              Russia          100
2                  Germany              Germany         1,034
3                  USA                  France          50
3                  USA                  USA             120

Result:

结果:

identifier         delivered            received        quantity
-------------      ------------         -----------     ------------
1                  Ireland              Russia          100

The starting table is about 30,000,000 rows, so self-joins will be impossible unfortunately. I am using something similar to MySQL.

起始表大约有30,000,000行，因此很不幸，不可能进行自连接。我使用的是类似于MySQL的东西。

2 个解决方案

#1

I think LEFT JOIN query should work for you:

我认为左连接查询应该适合您:

SELECT a.*
FROM starting a
     LEFT JOIN starting b
        ON a.id = b.id
           AND a.delivered = b.received
WHERE b.received IS NULL;

Example: SQLFiddle

For optimizing above query, adding following composite index should give you better performance:

为了优化上面的查询，添加以下复合索引应该会给您更好的性能:

ALTER TABLE starting  ADD KEY ix1(id, delivered, received);

#2

You could use a not exists subquery:

可以使用不存在子查询:

SELECT  a.*
FROM    starting a
WHERE   NOT EXISTS
        (
        SELECT  *
        FROM    starting b
        WHERE   a.id = b.id
                AND a.delivered = b.received
        )

This is not a self-join, but the query optimizer is free to execute it as one (and usually does.)

这不是一个自连接，但是查询优化器可以*地执行它(通常是这样)。

#1

I think LEFT JOIN query should work for you:

我认为左连接查询应该适合您:

SELECT a.*
FROM starting a
     LEFT JOIN starting b
        ON a.id = b.id
           AND a.delivered = b.received
WHERE b.received IS NULL;

Example: SQLFiddle

For optimizing above query, adding following composite index should give you better performance:

为了优化上面的查询，添加以下复合索引应该会给您更好的性能:

ALTER TABLE starting  ADD KEY ix1(id, delivered, received);

#2

You could use a not exists subquery:

可以使用不存在子查询:

SELECT  a.*
FROM    starting a
WHERE   NOT EXISTS
        (
        SELECT  *
        FROM    starting b
        WHERE   a.id = b.id
                AND a.delivered = b.received
        )

This is not a self-join, but the query optimizer is free to execute it as one (and usually does.)

这不是一个自连接，但是查询优化器可以*地执行它(通常是这样)。

秒客网

如果列值不在不同列值的组中，SQL将选择行

2 个解决方案

#1

Example: SQLFiddle

#2

#1

Example: SQLFiddle

#2

相关文章