MySQL: select duplicate records using multiple columns

Date: 2022-01-12 23:16:51

I would like to select records from a table, or insert them into a new blank table, where several of the columns match those of another record in the table. The problem is similar to the question "Find duplicate records in MySQL", but that question compares only one column. Also, one of my columns, say column C in the example below, is an integer. As in the linked question, I want each of the matching rows to be returned. Unfortunately, I am not yet familiar enough with how joins work to figure this out on my own. I know the code below doesn't resemble the actual SQL needed at all; it is just the clearest way I can think of to describe the comparisons I am trying to make.

SELECT ColumnE, ColumnA, ColumnB, ColumnC from table where (
  Row1.ColumnA = Row2.ColumnA &&
  Row1.ColumnB = Row2.ColumnB &&
  Row1.ColumnC = Row2.ColumnC
)

Any help would be appreciated. All of the "select duplicates from MySQL" questions I have seen use only one column for the comparison.

2 Solutions

#1


51  

If you want to count duplicates across multiple columns, use group by:

-- "table" is a placeholder for your table name (TABLE itself is a reserved word in MySQL)
select ColumnA, ColumnB, ColumnC, count(*) as NumDuplicates
from table
group by ColumnA, ColumnB, ColumnC

If you only want the values that are duplicated, keep the groups whose count is greater than 1. You do this with the having clause:

select ColumnA, ColumnB, ColumnC, count(*) as NumDuplicates
from table
group by ColumnA, ColumnB, ColumnC
having NumDuplicates > 1
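
To make the effect of the having clause concrete, here is a small worked sketch. The table name demo_items and its four rows are hypothetical and exist only for illustration; the column names follow the question.

```sql
-- Hypothetical table and data, for illustration only.
create table demo_items (
    ColumnE int primary key,
    ColumnA varchar(10),
    ColumnB varchar(10),
    ColumnC int
);

insert into demo_items (ColumnE, ColumnA, ColumnB, ColumnC) values
    (1, 'x', 'y', 10),
    (2, 'x', 'y', 10),
    (3, 'x', 'y', 20),
    (4, 'p', 'q', 10);

-- Rows 1 and 2 agree on all three columns, so only their group has a count above 1.
select ColumnA, ColumnB, ColumnC, count(*) as NumDuplicates
from demo_items
group by ColumnA, ColumnB, ColumnC
having NumDuplicates > 1;
-- Expected result: a single row ('x', 'y', 10, 2)
```

Row 3 is not reported because it differs in ColumnC, which shows that all three columns take part in the comparison.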

If you actually want all the duplicate rows returned, join the last query back to the original data:

select t.*
from table t join
     (select ColumnA, ColumnB, ColumnC, count(*) as NumDuplicates
      from table
      group by ColumnA, ColumnB, ColumnC
      having NumDuplicates > 1
     ) tsum
     on t.ColumnA = tsum.ColumnA and t.ColumnB = tsum.ColumnB and t.ColumnC = tsum.ColumnC
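
The question also mentions inserting the duplicates into a new blank table. One possible sketch of that, assuming the original table is called source_table and the new one dup_rows (both names are hypothetical), wraps the same join in an insert ... select:

```sql
-- Hypothetical names: source_table is the original table, dup_rows the new one.
-- Create an empty table with the same column definitions as the source.
create table dup_rows like source_table;

-- Copy every row whose (ColumnA, ColumnB, ColumnC) combination occurs more than once.
insert into dup_rows
select t.*
from source_table t join
     (select ColumnA, ColumnB, ColumnC
      from source_table
      group by ColumnA, ColumnB, ColumnC
      having count(*) > 1
     ) tsum
     on t.ColumnA = tsum.ColumnA and t.ColumnB = tsum.ColumnB and t.ColumnC = tsum.ColumnC;
```

Note that create table ... like also copies the index definitions of the source table, so a unique index on the source would apply to dup_rows as well.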

This works as long as none of the column values are NULL. If some of them can be NULL, use this join condition instead:

     on (t.ColumnA = tsum.ColumnA or t.ColumnA is null and tsum.ColumnA is null) and
        (t.ColumnB = tsum.ColumnB or t.ColumnB is null and tsum.ColumnB is null) and
        (t.ColumnC = tsum.ColumnC or t.ColumnC is null and tsum.ColumnC is null)
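
As a possible shorthand, MySQL has a NULL-safe equality operator, <=>, which treats two NULLs as equal. The sketch below rewrites the join with it; source_table is a hypothetical table name, and <=> is MySQL-specific, so this form may not carry over to other databases.

```sql
-- Same join as above, with the NULL-safe comparison written using <=>.
-- source_table is a hypothetical table name.
select t.*
from source_table t join
     (select ColumnA, ColumnB, ColumnC
      from source_table
      group by ColumnA, ColumnB, ColumnC
      having count(*) > 1
     ) tsum
     on t.ColumnA <=> tsum.ColumnA and
        t.ColumnB <=> tsum.ColumnB and
        t.ColumnC <=> tsum.ColumnC;
```

This relies on group by placing NULL values in the same group, so a combination that contains NULLs is still counted as duplicated.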

#2


1  

Why not try using union or creating a temporary table? Personally, I recommend union over a temporary table, because creating the temporary table would take you longer. Try this:

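  select field1, field2 from (
    -- field1 values that occur more than once
    select field1, '' as field2, count(field1) as cnt from list group by field1 having cnt > 1
    union
    -- field2 values that occur more than once
    select '' as field1, field2, count(field2) as cnt from list group by field2 having cnt > 1
  ) dup  -- MySQL requires an alias on a derived table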

Hope this makes sense. :)
