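Here is an illustration of my dataset:
I am trying to delete the two rows with the older data (in this case 11/19, highlighted in yellow).
Here is the dataset as text: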
employee  ID      punch_start          punch_end          punch_hours  date_load
John Doe  276567  Sep 30 2023 2:50PM   Oct 1 2023 6:00AM  15.16666667  11/19/23 2:45 PM
Jane Doe  140037  Sep 30 2023 10:00PM  Oct 1 2023 7:05AM  9.083333333  11/19/23 2:45 PM
John Doe  276567  Sep 30 2023 2:50PM   Oct 1 2023 6:00AM  15.16666667  11/20/23 2:45 PM
Jane Doe  140037  Sep 30 2023 10:00PM  Oct 1 2023 7:05AM  9.083333333  11/20/23 2:45 PM
I tried using this code (based on this post):
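-- Rank the rows in each group of duplicates, newest date_load first
with todelete as (
    select
         [employee]
        ,[ID]
        ,[punch_start]
        ,[punch_end]
        ,[punch_hours]
        ,row_number() over
        (
            partition by
                 [employee]
                ,[ID]
                ,[punch_start]
                ,[punch_end]
                ,[punch_hours]
            order by [date_load] desc) as seqnum
    from [dbo].[dataset]
)
-- Preview the older rows (seqnum > 1). A CTE serves only the one statement
-- that follows it, so to delete, replace this select with:
--   delete from todelete where seqnum > 1;
select * from todelete where seqnum > 1;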
But the result (from the select *) is:
How can we modify the code to distinguish and select only the rows with the older [date_load]?
-- Rank each group of duplicates newest-first, then match the ranked rows
-- with seqnum > 1 (the older loads) back to the table and delete them.
merge into [dbo].[dataset] as tgt using (
    select *
    from (
        select
             [employee]
            ,[ID]
            ,[punch_start]
            ,[punch_end]
            ,[punch_hours]
            ,[date_load]
            ,row_number() over
            (
                partition by
                     [employee]
                    ,[ID]
                    ,[punch_start]
                    ,[punch_end]
                    ,[punch_hours]
                order by [date_load] desc) as seqnum
        from [dbo].[dataset]
    ) ranked
    where seqnum > 1
) as mrg
ON (
    tgt.employee = mrg.employee and
    tgt.id = mrg.id and
    tgt.punch_start = mrg.punch_start and
    tgt.punch_end = mrg.punch_end and
    tgt.punch_hours = mrg.punch_hours and
    tgt.date_load = mrg.date_load
)
WHEN MATCHED THEN DELETE;
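As an alternative, here is a minimal sketch that deletes through the CTE directly. It assumes SQL Server (the bracketed identifiers and the merge syntax suggest T-SQL) and that [date_load] is a date/time column rather than text. A CTE is visible only to the single statement that immediately follows it, so the delete has to be that statement.

```sql
-- Sketch, assuming SQL Server and a datetime [date_load] column.
-- The CTE numbers the rows in each duplicate group, newest load first.
with todelete as (
    select
        row_number() over (
            partition by [employee], [ID], [punch_start], [punch_end], [punch_hours]
            order by [date_load] desc   -- newest load gets seqnum = 1
        ) as seqnum
    from [dbo].[dataset]
)
-- Deleting from the CTE removes the underlying table rows:
-- everything except the newest row in each group.
delete from todelete
where seqnum > 1;
```

To check which rows would go before deleting anything, run the same CTE with `select * from todelete where seqnum > 1;` in place of the delete. If [date_load] is actually stored as text, the ordering would be lexical rather than chronological, so it may need converting to a date/time type first.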