How do I find duplicate entries in a database table?

Date: 2022-04-08 07:34:44

The following query will display all Dewey Decimal numbers that have been duplicated in the "book" table:

SELECT dewey_number, 
 COUNT(dewey_number) AS NumOccurrences
FROM book
GROUP BY dewey_number
HAVING ( COUNT(dewey_number) > 1 )

However, what I'd like to do is have my query display the name of the authors associated with the duplicated entry (the "book" table and "author" table are connected by "author_id"). In other words, the query above would yield the following:

dewey_number | NumOccurrences
------------------------------
5000         | 2
9090         | 3

What I'd like the results to display is something similar to the following:

author_last_name | dewey_number | NumOccurrences
-------------------------------------------------
Smith            | 5000         | 2
Jones            | 5000         | 2
Jackson          | 9090         | 3
Johnson          | 9090         | 3
Jeffers          | 9090         | 3

Any help you can provide is greatly appreciated. In case it matters, I'm using a PostgreSQL database.

UPDATE: Please note that "author_last_name" is not in the "book" table.

6 answers

#1


22  

A nested query can do the job.

SELECT author_last_name, dewey_number, NumOccurrences
FROM author INNER JOIN
     ( SELECT author_id, dewey_number,  COUNT(dewey_number) AS NumOccurrences
        FROM book
        GROUP BY author_id, dewey_number
        HAVING ( COUNT(dewey_number) > 1 ) ) AS duplicates
ON author.id = duplicates.author_id

(I don't know if this is the fastest way to achieve what you want.)

Update: Here is my data

SELECT * FROM author;
 id | author_last_name 
----+------------------
  1 | Fowler
  2 | Knuth
  3 | Lang

SELECT * FROM book;
 id | author_id | dewey_number |         title          
----+-----------+--------------+------------------------
  1 |         1 |          600 | Refactoring
  2 |         1 |          600 | Refactoring
  3 |         1 |          600 | Analysis Patterns
  4 |         2 |          600 | TAOCP vol. 1
  5 |         2 |          600 | TAOCP vol. 1
  6 |         2 |          600 | TAOCP vol. 2
  7 |         3 |          500 | Algebra
  8 |         3 |          500 | Undergraduate Analysis
  9 |         1 |          600 | Refactoring
 10 |         2 |          500 | Concrete Mathematics
 11 |         2 |          500 | Concrete Mathematics
 12 |         2 |          500 | Concrete Mathematics

And here is the result of the above query:

 author_last_name | dewey_number | numoccurrences 
------------------+--------------+----------------
 Fowler           |          600 |              4
 Knuth            |          600 |              3
 Knuth            |          500 |              3
 Lang             |          500 |              2
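
Note that the query above counts duplicates per author and Dewey number. If the goal is instead the output sketched in the question (every author attached to a Dewey number that appears more than once in "book"), one option is to join each book row to a per-number count. Below is a minimal sketch of that idea, run through Python's built-in sqlite3 module as a stand-in for PostgreSQL. The sample rows are hypothetical and mirror the names in the question, and the author.id / book.author_id column names are assumed from this answer's data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, author_last_name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, dewey_number INTEGER);
    -- Hypothetical sample data modelled on the question's desired output.
    INSERT INTO author VALUES
        (1, 'Smith'), (2, 'Jones'), (3, 'Jackson'),
        (4, 'Johnson'), (5, 'Jeffers'), (6, 'Brown');
    INSERT INTO book (author_id, dewey_number) VALUES
        (1, 5000), (2, 5000), (3, 9090), (4, 9090), (5, 9090), (6, 1234);
""")

# The subquery "d" finds Dewey numbers used more than once;
# joining it back to book and author lists every author involved.
query = """
    SELECT a.author_last_name, b.dewey_number, d.NumOccurrences
    FROM book AS b
    JOIN author AS a ON a.id = b.author_id
    JOIN ( SELECT dewey_number, COUNT(*) AS NumOccurrences
           FROM book
           GROUP BY dewey_number
           HAVING COUNT(*) > 1 ) AS d
      ON d.dewey_number = b.dewey_number
    ORDER BY b.dewey_number, b.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Brown's book (1234) is unique and does not appear in the result. An author with several books under the same duplicated number would be listed once per book; adding DISTINCT to the outer SELECT would collapse those rows.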

#2


20  

You probably want this (note that it only works if author_last_name is a column of the book table):

SELECT dewey_number, author_last_name,
 COUNT(dewey_number) AS NumOccurrences
FROM book
GROUP BY dewey_number,author_last_name
HAVING ( COUNT(dewey_number) > 1 )

#3


2  

SELECT dewey_number, author_last_name,
       COUNT(dewey_number) AS NumOccurrences
FROM book
JOIN author USING (author_id)
GROUP BY dewey_number,author_last_name
HAVING COUNT(dewey_number) > 1

If book.author_id can be null then change the join to:

LEFT OUTER JOIN author USING (author_id)

If the author_id column has a different name in each table, you can't use USING; use ON instead:

JOIN author ON author.id = book.author_id

or

LEFT OUTER JOIN author ON author.id = book.author_id
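
To see what the choice of join changes, here is a small sketch that runs the same grouped query with both join types, using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table layout (author.id, book.author_id) and the sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, author_last_name TEXT);
    CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, dewey_number INTEGER);
    INSERT INTO author VALUES (1, 'Fowler');
    -- Two books by Fowler, plus two books with no author recorded.
    INSERT INTO book (author_id, dewey_number) VALUES
        (1, 600), (1, 600), (NULL, 500), (NULL, 500);
""")

# Same query, with only the join keyword swapped in.
template = """
    SELECT dewey_number, author_last_name,
           COUNT(dewey_number) AS NumOccurrences
    FROM book
    {join} author ON author.id = book.author_id
    GROUP BY dewey_number, author_last_name
    HAVING COUNT(dewey_number) > 1
    ORDER BY dewey_number
"""
inner_rows = conn.execute(template.format(join="JOIN")).fetchall()
left_rows = conn.execute(template.format(join="LEFT OUTER JOIN")).fetchall()

print(inner_rows)  # books without an author are dropped by the inner join
print(left_rows)   # the outer join keeps them, with a NULL (None) last name
```

With the inner join only the Fowler group survives. The left outer join also reports the duplicated number 500, grouped under a NULL author name.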

#4


0  

select author_name,dewey_number,Num_of_occur
from author a,(select author_id,dewey_number,count(dewey_number) Num_of_occur
                from   book
                group by author_id,dewey_number
                having count(dewey_number) > 1) dup
where a.author_id = dup.author_id

#5


0  

The simplest and most effective way I found is shown below:

SELECT
    p.id
    , p.full_name
    , (SELECT count(id) FROM tbl_documents as t where t.person_id = p.id) as rows
FROM tbl_people as p
WHERE 
    p.id 
IN (SELECT d.person_id FROM tbl_documents as d 
    GROUP BY d.person_id HAVING count(d.id) > 1) 
ORDER BY 
    p.full_name
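
For comparison, the same result can probably be obtained with a single join plus GROUP BY / HAVING instead of two subqueries. The sketch below checks that both forms agree on a tiny data set, using Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table and column names come from the query above; the sample rows and the doc_count alias are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_people (id INTEGER PRIMARY KEY, full_name TEXT);
    CREATE TABLE tbl_documents (id INTEGER PRIMARY KEY, person_id INTEGER);
    -- Hypothetical data: Bob has 2 documents, Alice 3, Carol 1.
    INSERT INTO tbl_people VALUES (1, 'Bob'), (2, 'Alice'), (3, 'Carol');
    INSERT INTO tbl_documents (person_id) VALUES (1), (1), (2), (2), (2), (3);
""")

# Form used in the answer: IN subquery for filtering, scalar subquery for the count.
subquery_form = """
    SELECT p.id, p.full_name,
           (SELECT COUNT(id) FROM tbl_documents AS t
            WHERE t.person_id = p.id) AS doc_count
    FROM tbl_people AS p
    WHERE p.id IN (SELECT d.person_id FROM tbl_documents AS d
                   GROUP BY d.person_id HAVING COUNT(d.id) > 1)
    ORDER BY p.full_name
"""

# Alternative: join once, then group and filter with HAVING.
join_form = """
    SELECT p.id, p.full_name, COUNT(d.id) AS doc_count
    FROM tbl_people AS p
    JOIN tbl_documents AS d ON d.person_id = p.id
    GROUP BY p.id, p.full_name
    HAVING COUNT(d.id) > 1
    ORDER BY p.full_name
"""
subquery_rows = conn.execute(subquery_form).fetchall()
join_rows = conn.execute(join_form).fetchall()
print(subquery_rows)
print(join_rows)
```

Both queries return Alice and Bob with their document counts and leave out Carol, who has only one document. Which form is faster depends on the database and its indexes, so it is worth checking with EXPLAIN on real data.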

#6


-1  

select * from author
dewey_number    author_last_name
1   Ramu
2   Rajes
1   Samy
1   Ramu

select * from book
authorid    dewey_number
1   1
2   1

select a.dewey_number,a.author_last_name,count(a.dewey_number) from author a
where a.dewey_number in (
select b.dewey_number from book b )
group by a.dewey_number,a.author_last_name

dewey_number    author_last_name    (No column name)
1   Ramu    2
1   Samy    1
