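I am trying to resolve these duplicates without deleting the second entry. I am looking for a solution in which the duplicate record for an identifier is assigned a new identifier derived from the maximum identifier for its account_id.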
select account_id, identifier, count(*)
from books
group by account_id, identifier
HAVING count(*) > 1;
account_id | identifier | count
------------+------------+-------
111 | 155 | 2
111 | 198 | 2
111 | 178 | 2
111 | 167 | 2
111 | 196 | 2
111 | 156 | 2
111 | 150 | 2
111 | 223 | 2
For example (handling only the first record), and taking the maximum identifier for the account_id to be 223, for the following record:
account_id | identifier | count
------------+------------+-------
111 | 155 | 2
the identifier of the duplicate should be set to (max_identifier + 1):
account_id | identifier | count
------------+------------+-------
111 | 155 | 1
111 | 224 | 1
Likewise, it should do this in a loop for all the records, without breaking the other identifiers and records.
#
# Table name: books
#
# id :bigint not null, primary key
# account_id :bigint not null
# identifier :bigint not null
# Indexes
# unique_account_identifier (account_id,identifier) UNIQUE
This should do it:
UPDATE books
SET identifier = to_update.new_identifier
FROM (
    SELECT
        dup.id,
        -- current maximum for the account, plus a running number per duplicate
        (
            SELECT MAX(identifier) FROM books WHERE account_id = dup.account_id
        ) + (
            ROW_NUMBER() OVER (PARTITION BY dup.account_id ORDER BY dup.id)
        ) AS new_identifier
    FROM books dup
    -- keep only rows that have an earlier row with the same (account_id, identifier)
    WHERE EXISTS (
        SELECT *
        FROM books orig
        WHERE dup.account_id = orig.account_id
          AND dup.identifier = orig.identifier
          AND orig.id < dup.id
    )
) to_update
WHERE books.id = to_update.id;
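To try the renumbering logic on a throwaway database first, here is a minimal sketch in Python. It makes several assumptions that are not in the question: it uses an in-memory SQLite database as a stand-in for PostgreSQL, the helper names `renumber_duplicates` and `demo` are hypothetical, and the sample rows (including account 222) are made up. It numbers the duplicates on the Python side instead of with ROW_NUMBER(), but follows the same rule: the row with the smallest id keeps its identifier, and every later duplicate gets the next free value above the account's maximum.

```python
import sqlite3


def renumber_duplicates(conn):
    """Give every later duplicate a fresh identifier above its account's maximum.

    Hypothetical helper for illustration; returns the number of rows changed.
    """
    # Next free identifier per account, read once before any update.
    next_free = {
        account_id: max_identifier + 1
        for account_id, max_identifier in conn.execute(
            "SELECT account_id, MAX(identifier) FROM books GROUP BY account_id"
        )
    }
    # Rows that have an earlier row (smaller id) with the same
    # (account_id, identifier); the earliest row of each group is left alone.
    duplicates = conn.execute(
        """
        SELECT dup.id, dup.account_id
        FROM books dup
        WHERE EXISTS (
            SELECT 1
            FROM books orig
            WHERE orig.account_id = dup.account_id
              AND orig.identifier = dup.identifier
              AND orig.id < dup.id
        )
        ORDER BY dup.account_id, dup.id
        """
    ).fetchall()
    updates = []
    for row_id, account_id in duplicates:
        updates.append((next_free[account_id], row_id))
        next_free[account_id] += 1
    conn.executemany("UPDATE books SET identifier = ? WHERE id = ?", updates)
    return len(updates)


def demo():
    # Made-up sample data: two duplicated pairs for account 111, one for 222.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books ("
        "id INTEGER PRIMARY KEY, "
        "account_id INTEGER NOT NULL, "
        "identifier INTEGER NOT NULL)"
    )
    rows = [(111, 155), (111, 155), (111, 198), (111, 198),
            (111, 223), (222, 7), (222, 7)]
    conn.executemany(
        "INSERT INTO books (account_id, identifier) VALUES (?, ?)", rows
    )
    changed = renumber_duplicates(conn)
    # Creating the unique index only succeeds if no duplicates remain.
    conn.execute(
        "CREATE UNIQUE INDEX unique_account_identifier "
        "ON books (account_id, identifier)"
    )
    result = conn.execute(
        "SELECT id, account_id, identifier FROM books ORDER BY id"
    ).fetchall()
    conn.close()
    return changed, result


if __name__ == "__main__":
    print(demo())
```

In this sketch the unique index is created only after the renumbering, as a check that no duplicate (account_id, identifier) pairs are left.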