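I have the table below. I need to delete the rows with duplicate refID values while keeping at least one row for each refID; that is, I need to delete rows 4 and 5. Please help me with this.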
+----+-------+--------+
| ID | refID | data   |
+----+-------+--------+
|  1 |  1023 | aaaaaa |
|  2 |  1024 | bbbbbb |
|  3 |  1025 | cccccc |
|  4 |  1023 | ffffff |
|  5 |  1023 | gggggg |
|  6 |  1022 | rrrrrr |
+----+-------+--------+
This is similar to Gordon Linoff's query, but without the subquery:
DELETE t1 FROM table t1
JOIN table t2
ON t2.refID = t1.refID
AND t2.ID < t1.ID
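This uses an inner join to delete only those rows for which another row exists with the same refID but a lower ID.
The benefit of avoiding a subquery is that the search can use an index. This query should work well with a multi-column index on (refID, ID).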
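To preview which rows that join condition matches before deleting anything, here is a minimal sketch that runs the same predicate as a SELECT against the sample data. It assumes Python's sqlite3 module as a stand-in for MySQL, purely so the example is self-contained; the table name `t` and the index name `idx_ref_id` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER PRIMARY KEY, refID INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1023, "aaaaaa"), (2, 1024, "bbbbbb"), (3, 1025, "cccccc"),
     (4, 1023, "ffffff"), (5, 1023, "gggggg"), (6, 1022, "rrrrrr")],
)
# Composite index on (refID, ID), as the answer suggests (name is hypothetical).
conn.execute("CREATE INDEX idx_ref_id ON t (refID, ID)")

# Same join predicate as the DELETE, run as a SELECT to list the rows it would hit.
# DISTINCT is needed because row 5 pairs with both row 1 and row 4.
doomed = [row[0] for row in conn.execute(
    "SELECT DISTINCT t1.ID FROM t t1 "
    "JOIN t t2 ON t2.refID = t1.refID AND t2.ID < t1.ID "
    "ORDER BY t1.ID"
)]
print(doomed)  # [4, 5]
```

Only rows that have a lower-ID sibling with the same refID show up, so the lowest ID of each group is the one that survives.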
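I would do:
delete from t where
ID not in (select min(ID) from t group by refID having count(*) > 1)
and refID in (select refID from t group by refID having count(*) > 1);
The criteria are that the refID is one of the duplicated values and that the ID differs from the min(ID) of its duplicate group. It will perform better if refID is indexed.
Otherwise, provided you can run it several times, issue the following query repeatedly until it deletes nothing:
delete from t
where
ID in (select max(ID) from t group by refID having count(*) > 1);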
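The "repeat until nothing is deleted" approach can be sketched as a small loop. This is only an illustration under stated assumptions: it uses Python's sqlite3 module instead of MySQL, and the table name `t` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER PRIMARY KEY, refID INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1023, "aaaaaa"), (2, 1024, "bbbbbb"), (3, 1025, "cccccc"),
     (4, 1023, "ffffff"), (5, 1023, "gggggg"), (6, 1022, "rrrrrr")],
)

passes = 0
while True:
    # Each pass removes the highest ID from every refID group that still has duplicates.
    cur = conn.execute(
        "DELETE FROM t WHERE ID IN "
        "(SELECT MAX(ID) FROM t GROUP BY refID HAVING COUNT(*) > 1)"
    )
    if cur.rowcount == 0:  # nothing deleted: no duplicates remain
        break
    passes += 1

remaining = [row[0] for row in conn.execute("SELECT ID FROM t ORDER BY ID")]
print(passes, remaining)  # 2 [1, 2, 3, 6]
```

The number of passes equals the size of the largest duplicate group minus one. Note that MySQL rejects a DELETE whose subquery reads the target table directly (error 1093), so there the subquery would have to be wrapped in a derived table; SQLite has no such restriction.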
In MySQL, you can do this using a join in the delete:
delete t
from table t left join
(select min(id) as id
from table t
group by refId
) tokeep
on t.id = tokeep.id
where tokeep.id is null;
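For each refId, the subquery computes the minimum value of the id column (assumed to be unique across the whole table). It uses a left join for the matching, so any row that does not match has NULL for tokeep.id. Those are the rows that get deleted.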
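To see the anti-join at work, the following sketch runs the same left join as a SELECT and then deletes the unmatched rows by ID. It assumes Python's sqlite3 module as a stand-in for MySQL (SQLite does not accept the multi-table `delete t from ...` form), and the table name `t` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER PRIMARY KEY, refID INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 1023, "aaaaaa"), (2, 1024, "bbbbbb"), (3, 1025, "cccccc"),
     (4, 1023, "ffffff"), (5, 1023, "gggggg"), (6, 1022, "rrrrrr")],
)

# Keep-list: the minimum ID per refID. Rows without a match in it are surplus.
surplus = [row[0] for row in conn.execute(
    "SELECT t.ID FROM t "
    "LEFT JOIN (SELECT MIN(ID) AS id FROM t GROUP BY refID) tokeep "
    "ON t.ID = tokeep.id "
    "WHERE tokeep.id IS NULL "
    "ORDER BY t.ID"
)]

# Delete the surplus rows one by one, using the IDs found above.
conn.executemany("DELETE FROM t WHERE ID = ?", [(i,) for i in surplus])
remaining = [row[0] for row in conn.execute("SELECT ID FROM t ORDER BY ID")]
print(surplus, remaining)  # [4, 5] [1, 2, 3, 6]
```

Running the SELECT first is a cheap way to confirm the keep-list logic before committing to the delete.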
Another variant, which in some cases is a bit faster than the answers from Marcus and NJ73:
DELETE ourTable
FROM ourTable JOIN
(SELECT MIN(ID) AS ID, targetField
FROM ourTable
GROUP BY targetField HAVING COUNT(*) > 1) t2
ON ourTable.targetField = t2.targetField AND ourTable.ID != t2.ID;
Hope this helps someone. On large tables, Marcus's answer stalls.