SQL 之 查询操作重复记录

时间:2023-03-09 22:42:01
SQL 之 查询操作重复记录

  有时,我们的数据表中会存在一些冗余数据,这就要求我们查询并操作这些冗余数据。

一、查询表中重复记录

  例如,查找重复记录是根据单个字段(peopleId)来判断

SELECT * FROM Tpeople
WHERE peopleId IN ( SELECT peopleId FROM Tpeople GROUP BY peopleId HAVING COUNT(peopleId) > 1)

二、删除表中多余的重复记录

  例如,重复记录是根据单个字段(peopleId)来判断,只保留最先增加的记录,下面是保留ID最小的记录

DELETE FROM Tpeople
WHERE peopleName IN ( SELECT peopleName FROM Tpeople GROUP BY peopleName HAVING COUNT(peopleName)>1)
AND peopleId NOT IN ( SELECT MIN(peopleId) FROM Tpeople GROUP BY peopleName HAVING COUNT(peopleName)>1)

三、查找表中多余的重复记录(多个字段)
  A,DB2中可以如下查询

SELECT * FROM vitae TA
WHERE (TA.peopleId, TA.seq) IN ( SELECT peopleId,seq FROM vitae TB GROUP BY peopleId,seq HAVING COUNT(*) > 1)

  B,SQLServer如下查询

SELECT * FROM vitae TA
WHERE EXISTS ( SELECT * FROM vitae TB WHERE TB.peopleId=TA.peopleId AND TB.seq =TA.seq GROUP BY peopleId,seq HAVING COUNT(1) > 1)

四、删除表中多余的重复记录(多个字段),只留有最先插入的记录

  A,DB2中可以如下删除

DELETE FROM vitae TA
WHERE (TA.peopleId,TA.seq) IN (SELECT peopleId,seq FROM vitae TB GROUP BY peopleId,seq HAVING COUNT(1) > 1)
AND rowid NOT IN ( SELECT MIN(rowid) FROM vitae TC GROUP BY peopleId,seq HAVING COUNT(1)>1)

  B,SQLServer中如下删除

DELETE FROM vitae TA
WHERE EXISTS ( SELECT * FROM vitae TB WHERE TB.peopleId=TA.peopleId AND TB.seq =TA.seq GROUP BY peopleId,seq HAVING COUNT(1) > 1)
AND rowid NOT IN ( SELECT MIN(rowid) FROM vitae TC GROUP BY peopleId,seq HAVING COUNT(1)>1)