My table is:
id user_id date created_at
1 123 2020-02-02 2020-02-02 10:00:00
2 123 2020-02-02 2020-02-02 10:00:01
3 789 2020-02-12 2020-02-12 12:00:00
4 456 2020-02-10 2020-02-10 10:00:00
5 456 2020-02-10 2020-02-10 10:00:01
I want to delete the duplicate entries. My desired output is:
id user_id date created_at
1 123 2020-02-02 2020-02-02 10:00:00
3 789 2020-02-12 2020-02-12 12:00:00
4 456 2020-02-10 2020-02-10 10:00:00
I tried the following query:
DELETE
`a`
FROM
`table1` AS `a`,
`table1` AS `b`
WHERE
`a`.`id` < `b`.`id` AND `a`.`user_id` <=> `b`.`user_id`
But it takes too long, and the error I get is:
Lock wait timeout exceeded; try restarting transaction
My table has more than 9,500,000 entries.
What would be a better alternative query?
Try a correlated subquery:
DELETE t1
FROM table1 t1
WHERE EXISTS ( SELECT NULL
FROM table1 t2
WHERE t1.user_id = t2.user_id
AND t1.id > t2.id )
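If the process again runs too long and the connection times out, try deleting in chunks: add ORDER BY id DESC LIMIT ???? with a suitable row count, and run the statement repeatedly until affected rows = 0.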
In any case, an index on (user_id, id) will speed up the query.
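The chunked approach above can be sketched as a small driver loop. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in so it runs anywhere, the SQL is written in SQLite's dialect and would need adapting for MySQL (as far as I know, MySQL accepts ORDER BY and LIMIT only in the single-table form of DELETE), and the names delete_duplicates_in_chunks, build_demo_db, idx_user_id_id and chunk_size are hypothetical.

```python
import sqlite3

# Sample rows copied from the question: (id, user_id, date, created_at).
ROWS = [
    (1, 123, "2020-02-02", "2020-02-02 10:00:00"),
    (2, 123, "2020-02-02", "2020-02-02 10:00:01"),
    (3, 789, "2020-02-12", "2020-02-12 12:00:00"),
    (4, 456, "2020-02-10", "2020-02-10 10:00:00"),
    (5, 456, "2020-02-10", "2020-02-10 10:00:01"),
]


def build_demo_db():
    """Create an in-memory table1 with the sample rows and the suggested index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table1 ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT, created_at TEXT)"
    )
    # Hypothetical index name; the columns follow the (user_id, id) suggestion.
    conn.execute("CREATE INDEX idx_user_id_id ON table1 (user_id, id)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", ROWS)
    conn.commit()
    return conn


def delete_duplicates_in_chunks(conn, chunk_size):
    """Delete rows that have an older row for the same user_id, at most
    chunk_size rows per statement, until a pass affects 0 rows.
    Returns the total number of rows deleted."""
    total = 0
    while True:
        cur = conn.execute(
            """
            DELETE FROM table1
            WHERE id IN (
                SELECT t1.id
                FROM table1 AS t1
                WHERE EXISTS (
                    SELECT 1 FROM table1 AS t2
                    WHERE t2.user_id = t1.user_id AND t2.id < t1.id
                )
                ORDER BY t1.id DESC
                LIMIT ?
            )
            """,
            (chunk_size,),
        )
        conn.commit()  # commit each chunk so no transaction stays open for long
        if cur.rowcount == 0:  # affected rows = 0: nothing left to delete
            break
        total += cur.rowcount
    return total
```

Calling delete_duplicates_in_chunks(build_demo_db(), 1) on the sample data removes ids 5 and 2 in two passes and stops on the third. On a large table the chunk size is a tuning knob: smaller chunks hold locks for less time but need more round trips.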
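Assuming there are no NULLs, GROUP BY the unique column (user_id here) and SELECT the MIN id as the row to keep. Then simply delete every row whose id is not among the kept ones: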
DELETE table1
FROM table1
LEFT OUTER JOIN (
SELECT MIN(id) AS id, user_id
FROM table1
GROUP BY user_id
) AS KeepRows ON
table1.id = KeepRows.id
WHERE
KeepRows.id IS NULL
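To check the keep-the-MIN-id idea against the sample data from the question, here is a minimal sketch. It again assumes SQLite (via Python's sqlite3) as a stand-in for MySQL, expresses the same idea with NOT IN instead of the LEFT OUTER JOIN, and the function name keep_first_row_per_user is hypothetical.

```python
import sqlite3


def keep_first_row_per_user(conn):
    """Delete every row whose id is not the smallest id for its user_id.
    Returns the number of rows deleted."""
    cur = conn.execute(
        "DELETE FROM table1 "
        "WHERE id NOT IN (SELECT MIN(id) FROM table1 GROUP BY user_id)"
    )
    conn.commit()
    return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, user_id INTEGER)")
# (id, user_id) pairs from the question; the date columns are omitted for brevity.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, 123), (2, 123), (3, 789), (4, 456), (5, 456)],
)

deleted = keep_first_row_per_user(conn)
remaining = [row[0] for row in conn.execute("SELECT id FROM table1 ORDER BY id")]
print(deleted, remaining)  # prints: 2 [1, 3, 4]
```

The remaining ids match the desired output in the question. This runs as a single statement, so on a table with millions of rows it may still be worth combining with chunking or running it during a quiet period.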