psql - Resolve duplicates without deleting them, setting each duplicate's identifier from the max identifier for its account_id


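I am trying to resolve these duplicates without deleting the second entry. I am looking for a solution in which a record with a duplicated identifier is assigned a new identifier derived from the maximum identifier value for its account_id.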

select account_id, identifier, count(*)
from books                                           
group by account_id, identifier   
HAVING count(*) > 1;

 account_id | identifier | count 
------------+------------+-------
        111 |        155 |     2
        111 |        198 |     2
        111 |        178 |     2
        111 |        167 |     2
        111 |        196 |     2
        111 |        156 |     2
        111 |        150 |     2
        111 |        223 |     2

For example, handling only the first record and taking the maximum identifier for this account_id to be 223, for the following record:

 account_id | identifier | count 
------------+------------+-------
        111 |        155 |     2

the identifier of one of the two rows should be set to (max_identifier + 1):

 account_id | identifier | count 
------------+------------+-------
        111 |        155 |     1
        111 |        224 |     1

Likewise, this should be done in a loop for all duplicated records, without disturbing the other identifiers and records.

#
# Table name: books
#
#  id                 :bigint           not null, primary key
#  account_id         :bigint           not null
#  identifier         :bigint           not null

# Indexes
#  unique_account_identifier             (account_id,identifier) UNIQUE
postgresql duplicates psql uniqueidentifier
1 Answer

This should do it:

UPDATE books
SET identifier = new_identifier
FROM (
  SELECT
    id,
    (
      SELECT MAX(identifier) FROM books WHERE account_id = dup.account_id
    ) + (
      ROW_NUMBER() OVER (PARTITION BY account_id ORDER BY id)
    ) AS new_identifier
  FROM books dup
  WHERE EXISTS(
    SELECT *
    FROM books orig
    WHERE dup.account_id = orig.account_id
      AND dup.identifier = orig.identifier
      AND orig.id < dup.id
  )
) to_update
WHERE books.id = to_update.id;
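
Before running an update like this, it may help to preview which rows would change and to verify the result afterwards. The following is only a sketch under stated assumptions: it uses the same logic as the statement above against the `books` table, it assumes the unique index on (account_id, identifier) is not yet in place (otherwise the duplicates could not exist), and it reuses the index name `unique_account_identifier` from the schema comment in the question.

```sql
-- Preview: list every later duplicate (higher id) with the identifier it would receive.
SELECT
  dup.id,
  dup.account_id,
  dup.identifier AS old_identifier,
  (SELECT MAX(identifier) FROM books WHERE account_id = dup.account_id)
    + ROW_NUMBER() OVER (PARTITION BY dup.account_id ORDER BY dup.id)
    AS new_identifier
FROM books dup
WHERE EXISTS (
  SELECT 1
  FROM books orig
  WHERE orig.account_id = dup.account_id
    AND orig.identifier = dup.identifier
    AND orig.id < dup.id
)
ORDER BY dup.account_id, dup.id;

-- After the update: the duplicate check from the question should return no rows.
SELECT account_id, identifier, count(*)
FROM books
GROUP BY account_id, identifier
HAVING count(*) > 1;

-- Assumption: the unique index does not exist yet. Creating it once the
-- duplicates are gone prevents new ones from being inserted.
CREATE UNIQUE INDEX unique_account_identifier
  ON books (account_id, identifier);
```

With the sample data in the question (eight duplicated pairs for account 111, maximum identifier 223), the preview would be expected to show new identifiers 224 through 231, assuming 223 really is the highest identifier for that account. Running the update and the check inside a single transaction (BEGIN ... COMMIT) makes it possible to ROLLBACK if the check still returns rows.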