How to delete duplicate rows that have the same data in every column except one

Question (votes: 0, answers: 1)
I am trying to delete duplicate rows while keeping the most recently updated row.

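Here is an illustration of my dataset:

I am trying to delete the two rows with the older data (in this example the ones loaded on 11/19, highlighted in yellow).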

Here is the dataset as text:

    employee  ID      punch_start          punch_end          punch_hours  date_load
    John Doe  276567  Sep 30 2023 2:50PM   Oct 1 2023 6:00AM  15.16666667  11/19/23 2:45 PM
    Jane Doe  140037  Sep 30 2023 10:00PM  Oct 1 2023 7:05AM  9.083333333  11/19/23 2:45 PM
    John Doe  276567  Sep 30 2023 2:50PM   Oct 1 2023 6:00AM  15.16666667  11/20/23 2:45 PM
    Jane Doe  140037  Sep 30 2023 10:00PM  Oct 1 2023 7:05AM  9.083333333  11/20/23 2:45 PM

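To make the sample easier to reproduce, a setup script along the following lines could be used. This is only a sketch: the question does not state the column types, so the types below are assumptions, and the timestamps are written as ISO 8601 literals rather than in the display format shown in the sample rows.

```sql
-- Hypothetical setup for the sample data.
-- Column types are assumed; they are not given in the question.
CREATE TABLE [dbo].[dataset] (
    [employee]    varchar(100) NOT NULL,
    [ID]          int          NOT NULL,
    [punch_start] datetime     NOT NULL,
    [punch_end]   datetime     NOT NULL,
    [punch_hours] float        NOT NULL,
    [date_load]   datetime     NOT NULL
);

-- Two distinct punches, each loaded twice (11/19 and 11/20).
INSERT INTO [dbo].[dataset]
    ([employee], [ID], [punch_start], [punch_end], [punch_hours], [date_load])
VALUES
    ('John Doe', 276567, '2023-09-30T14:50:00', '2023-10-01T06:00:00', 15.16666667, '2023-11-19T14:45:00'),
    ('Jane Doe', 140037, '2023-09-30T22:00:00', '2023-10-01T07:05:00', 9.083333333, '2023-11-19T14:45:00'),
    ('John Doe', 276567, '2023-09-30T14:50:00', '2023-10-01T06:00:00', 15.16666667, '2023-11-20T14:45:00'),
    ('Jane Doe', 140037, '2023-09-30T22:00:00', '2023-10-01T07:05:00', 9.083333333, '2023-11-20T14:45:00');
```

After a successful cleanup, only the two rows with a [date_load] of 11/20 should remain.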
I tried using this code (based on this post):

    -- Number the rows within each group of duplicates, newest [date_load] first
    with todelete as (
        select
             [employee]
            ,[ID]
            ,[punch_start]
            ,[punch_end]
            ,[punch_hours]
            ,row_number() over (
                partition by
                     [employee]
                    ,[ID]
                    ,[punch_start]
                    ,[punch_end]
                    ,[punch_hours]
                order by [date_load] desc) as seqnum
        from [dbo].[dataset]
    )
    -- Preview the rows that would be removed
    select * from todelete where seqnum > 1;

    -- Then remove them
    delete from todelete where seqnum > 1;
But the result (of the select *) is:

How can we modify the code so that it distinguishes, and selects only, the rows with the older [date_load]?

sql t-sql duplicates
1 answer
0 votes
The DELETE statement in SQL Server may well have enough syntax to support this, but I usually use a MERGE for this kind of thing.

    merge into [dbo].[dataset] as tgt
    using (
        -- Every row except the newest one in each group of duplicates
        select *
        from (
            select
                 [employee]
                ,[ID]
                ,[punch_start]
                ,[punch_end]
                ,[punch_hours]
                ,[date_load]
                ,row_number() over (
                    partition by
                         [employee]
                        ,[ID]
                        ,[punch_start]
                        ,[punch_end]
                        ,[punch_hours]
                    order by [date_load] desc) as seqnum
            from [dbo].[dataset]
        ) ranked
        where seqnum > 1
    ) as mrg
    on (
            tgt.employee    = mrg.employee
        and tgt.id          = mrg.id
        and tgt.punch_start = mrg.punch_start
        and tgt.punch_end   = mrg.punch_end
        and tgt.punch_hours = mrg.punch_hours
        and tgt.date_load   = mrg.date_load   -- match only the older copies
    )
    when matched then delete;
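For comparison, SQL Server does also accept a DELETE that targets a CTE directly, as long as the CTE reads from a single base table. One thing to watch in the question's attempt: a CTE is only in scope for the one statement that immediately follows it, so a SELECT followed by a separate DELETE cannot both use the same `with todelete` definition. A minimal sketch, assuming the table and column names from the question:

```sql
-- Sketch: number the duplicates and delete the older ones in one statement.
-- Names come from the question; nothing here has been run against the real table.
WITH todelete AS (
    SELECT
        [date_load],
        ROW_NUMBER() OVER (
            PARTITION BY [employee], [ID], [punch_start], [punch_end], [punch_hours]
            ORDER BY [date_load] DESC          -- newest row gets seqnum = 1
        ) AS seqnum
    FROM [dbo].[dataset]
)
DELETE FROM todelete                           -- removes the underlying rows of [dbo].[dataset]
WHERE seqnum > 1;                              -- keep only the newest row per group
```

To preview what would be removed, the same CTE can be run on its own with `SELECT * FROM todelete WHERE seqnum > 1;` in place of the DELETE. Running the DELETE inside a transaction first is a cheap way to check the row count before committing.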
    