MySQL table self-join returns too many rows

Time: 2022-09-20 15:54:32

So I have a table, my_table, with a primary key id (INT) and further columns foo (VARCHAR) and bar (DOUBLE). Each foo should appear only once in my table, with an associated bar value, but I know that I have several rows with identical foos associated with different bars. How do I get a list of the rows that share the same foo value but have different bars (say, differing by more than 10)? I tried:

    SELECT t1.id, t1.bar, t2.id, t2.bar, t1.foo
    FROM my_table t1, my_table t2
    WHERE t1.foo=t2.foo
    AND t1.bar - t2.bar > 10.; 

But I get lots and lots of results (more than the total number of rows in my_table). I feel I must be doing something very obviously stupid, but can't see my mistake.

Ah - thanks SWeko: I think I understand why I'm getting so many results, then. Is there a way in SQL of counting, for each foo, the number of rows with that foo but bars differing by more than 10.?

3 answers

#1


0  

If, for example, you have 5 rows with foo='A' and 10 rows with foo='B', the self-join will pair each A-row with every A-row (including itself) and each B-row with every B-row, so a simple

    SELECT t1.id, t1.bar, t2.id, t2.bar, t1.foo
    FROM my_table t1, my_table t2
    WHERE t1.foo=t2.foo

will return 5*5+10*10=125 rows. Filtering the values will cut that number down, but you might still have (significantly) more rows than you started with. E.g. if we presume that the B-rows have bar values of 5 through 50 in steps of 5, then each B-row will be matched with the B-rows whose bar is more than 10 lower than its own:

bar = 5  - 0 rows that have bar less than -5
bar = 10 - 0 rows that have bar less than 0
bar = 15 - 0 rows that have bar less than 5
bar = 20 - 1 row that has bar less than 10
bar = 25 - 2 rows that have bar less than 15
bar = 30 - 3 rows that have bar less than 20
bar = 35 - 4 rows that have bar less than 25
bar = 40 - 5 rows that have bar less than 30
bar = 45 - 6 rows that have bar less than 35
bar = 50 - 7 rows that have bar less than 40

so you will have 28 results for the B-rows alone, and that number grows roughly with the square of the number of rows that share the same value of foo.
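
The arithmetic above can be checked with a small sketch. This is only an illustration: it uses Python's built-in sqlite3 as a stand-in for MySQL, and the sample data is an assumption (the 'A' rows all get bar = 0.0, since the answer does not say what their values are).

```python
import sqlite3

# Hypothetical sample data matching the example: 5 'A' rows and 10 'B' rows.
# The 'A' bar values (all 0.0) are an assumption; 'B' bars run 5, 10, ..., 50.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, foo TEXT, bar REAL)")
rows = [("A", 0.0)] * 5 + [("B", float(b)) for b in range(5, 55, 5)]
conn.executemany("INSERT INTO my_table (foo, bar) VALUES (?, ?)", rows)

# Unfiltered self-join: every row pairs with every row sharing its foo.
unfiltered = conn.execute(
    "SELECT COUNT(*) FROM my_table t1 JOIN my_table t2 ON t1.foo = t2.foo"
).fetchone()[0]

# Same join with the bar filter, restricted to the 'B' group.
filtered_b = conn.execute(
    "SELECT COUNT(*) FROM my_table t1 JOIN my_table t2 ON t1.foo = t2.foo "
    "WHERE t1.foo = 'B' AND t1.bar - t2.bar > 10"
).fetchone()[0]

print(unfiltered, filtered_b)  # prints: 125 28
```

With only 15 rows in the table, the filtered join already returns 28 rows for the B group, which is why the result can be larger than the table itself.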

#2


2  

To answer your latest question:

Is there a way in SQL of counting, for each foo, the number of rows with that foo but bars differing by more than 10.?

A query like this should work:

    select t1.id, t1.foo, t1.bar, count(t2.id) as dupes
    from my_table t1
      left outer join my_table t2 on t1.foo=t2.foo and (t1.bar - t2.bar) > 10
    group by t1.id, t1.foo, t1.bar;
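
Note that this query counts, for each row, only the rows with the same foo whose bar is lower by more than 10. As a rough sketch (Python's built-in sqlite3 standing in for MySQL, with made-up sample data), here is that query next to a per-foo variant. The variant is an assumption about what "differing by more than 10" should mean: it uses ABS() so the direction does not matter, and t1.id < t2.id so each pair is counted once. The ORDER BY clauses are added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, foo TEXT, bar REAL)")
# Hypothetical data: ids 1-3 are 'A', id 4 is 'B', ids 5-6 are 'C'.
conn.executemany(
    "INSERT INTO my_table (foo, bar) VALUES (?, ?)",
    [("A", 1.0), ("A", 2.0), ("A", 30.0), ("B", 5.0), ("C", 0.0), ("C", 20.0)],
)

# The query from the answer: per row, how many same-foo rows have a bar
# that is lower by more than 10.
per_row = conn.execute("""
    select t1.id, t1.foo, t1.bar, count(t2.id) as dupes
    from my_table t1
      left outer join my_table t2 on t1.foo=t2.foo and (t1.bar - t2.bar) > 10
    group by t1.id, t1.foo, t1.bar
    order by t1.id
""").fetchall()

# Assumed variant: per foo, the number of distinct pairs of rows whose bars
# differ by more than 10 in either direction.
per_foo = conn.execute("""
    select t1.foo, count(*) as pairs
    from my_table t1
      join my_table t2 on t1.foo = t2.foo and t1.id < t2.id
    where abs(t1.bar - t2.bar) > 10
    group by t1.foo
    order by t1.foo
""").fetchall()

print(per_row)
print(per_foo)  # prints: [('A', 2), ('C', 1)]
```

A foo with no conflicting pairs (here 'B') simply does not appear in the per-foo result, so the output doubles as a list of the foo values that need attention.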

#3


-1  

Have you tried the same thing with the "new" JOIN syntax?

    SELECT t1.*,
           t2.*
      FROM my_table t1
      JOIN my_table t2 ON t1.foo = t2.foo
     WHERE (t1.bar - t2.bar) > 10

I don't expect that to fix your problem, but that's at least where I would start.

I might also try this:

    SELECT t1.*,
           t2.*
      FROM my_table t1
      JOIN my_table t2 ON t1.foo = t2.foo AND t1.id != t2.id
     WHERE (t1.bar - t2.bar) > 10
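
For what it's worth, a row joined with itself always has a bar difference of 0, so the `> 10` filter already excludes self-pairs and the extra `t1.id != t2.id` condition should not change the result. A minimal check, again with Python's built-in sqlite3 standing in for MySQL and hypothetical sample data (only the id columns are selected, and an ORDER BY is added, to keep the comparison short):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, foo TEXT, bar REAL)")
# Hypothetical data: ids 1-3 share foo 'A', id 4 is the only 'B' row.
conn.executemany(
    "INSERT INTO my_table (foo, bar) VALUES (?, ?)",
    [("A", 1.0), ("A", 2.0), ("A", 30.0), ("B", 5.0)],
)

# Template for the two queries above; {extra} is the optional id condition.
base = """
    SELECT t1.id, t2.id
      FROM my_table t1
      JOIN my_table t2 ON t1.foo = t2.foo {extra}
     WHERE (t1.bar - t2.bar) > 10
     ORDER BY t1.id, t2.id
"""
without_id_check = conn.execute(base.format(extra="")).fetchall()
with_id_check = conn.execute(base.format(extra="AND t1.id != t2.id")).fetchall()

print(without_id_check)  # prints: [(3, 1), (3, 2)]
print(with_id_check)     # prints: [(3, 1), (3, 2)]
```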
