Keep the latest value when dates are duplicated

Question (votes: 0, answers: 3)

I have a table named "staff":

Account   Score   UpdateTime                UpdateTime_order               
K1897     A       2023-09-08 14:57:58.113   1
K1897     B       2023-09-08 14:57:57.896   2
K1897     B       2023-08-01 10:07:57.487   3
K1897     B       2023-06-28 07:23:57.696   4
K1897     B       2023-06-05 14:20:13.789   5
K1898     C       2023-06-04 14:20:13.789   1
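
To reproduce the sample locally, the table could be set up roughly as below. This is only a sketch: the column types are assumptions, because the question does not state them. DATETIME2(3) is chosen here so the millisecond values are stored exactly as shown.

```sql
-- Assumed schema: the question does not give the column types.
CREATE TABLE staff (
    Account          VARCHAR(10)  NOT NULL,
    Score            CHAR(1)      NOT NULL,
    UpdateTime       DATETIME2(3) NOT NULL,  -- keeps milliseconds exactly
    UpdateTime_order INT          NOT NULL
);

-- Sample rows copied from the table above.
INSERT INTO staff (Account, Score, UpdateTime, UpdateTime_order)
VALUES
    ('K1897', 'A', '2023-09-08 14:57:58.113', 1),
    ('K1897', 'B', '2023-09-08 14:57:57.896', 2),
    ('K1897', 'B', '2023-08-01 10:07:57.487', 3),
    ('K1897', 'B', '2023-06-28 07:23:57.696', 4),
    ('K1897', 'B', '2023-06-05 14:20:13.789', 5),
    ('K1898', 'C', '2023-06-04 14:20:13.789', 1);
```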

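Each staff member can receive only one score per day, so the score for account K1897 on 2023-09-08 should be A (the score changed from B to A that day).

To solve this, I decided to convert the datetime to a date and, where a date is duplicated, keep only the row with the latest UpdateTime.

For example, for account K1897 I drop the row with UpdateTime_order = 2, because its original UpdateTime 2023-09-08 14:57:57.896 is earlier than 2023-09-08 14:57:58.113: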

Account   Score   UpdateTime   UpdateTime_order               
K1897     A       2023-09-08   1      
K1897     B       2023-08-01   3
K1897     B       2023-06-28   4
K1897     B       2023-06-05   5
K1898     C       2023-06-04   1

Then I renumber UpdateTime_order based on the new result.

Expected result:

Account   Score   UpdateTime   UpdateTime_order             
K1897     A       2023-09-08   1
K1897     B       2023-08-01   2
K1897     B       2023-06-28   3
K1897     B       2023-06-05   4
K1898     C       2023-06-04   1

My code:

;WITH CTE_staff AS (
select 
Account, 
Score, 
CAST([UpdateTime] AS Date) UpdateDate,
UpdateTime, 
UpdateTime_order
FROM staff
)
select 
Account, 
Score, 
UpdateDate,
ROW_NUMBER()OVER(PARTITION BY Account ORDER BY UpdateTime DESC) as UpdateTime_order3
from(
select *,
ROW_NUMBER()OVER(PARTITION BY Account, UpdateDate ORDER BY UpdateTime DESC) as UpdateTime_order2 
from CTE_staff
) jj
where jj.UpdateTime_order2=1

It runs successfully, but I think I wrote it in a complicated way by creating new columns. Is there a simpler way to do this?

Fiddle: https://dbfiddle.uk/dJ1qw3Lt

sql sql-server sql-server-2008
3 answers

1 vote

Fiddle: https://dbfiddle.uk/_NZfe5JW

WITH RankedStaff AS (
    SELECT
        Account,
        Score,
        UpdateTime,
        ROW_NUMBER() OVER (PARTITION BY Account, CONVERT(DATE, UpdateTime) ORDER BY UpdateTime DESC) AS RowNum
    FROM
        staff
)
SELECT
    Account,
    Score,
    UpdateTime
FROM
    RankedStaff
WHERE
    RowNum = 1;

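This query uses a common table expression (CTE) named RankedStaff to assign a row number to each row within partitions defined by the combination of Account and the date part of UpdateTime. ROW_NUMBER() numbers the rows in each partition by UpdateTime in descending order, so the row with the latest UpdateTime gets row number 1.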

The final SELECT statement keeps only the rows where RowNum equals 1, so the result contains exactly one row per account per day: the one with the latest UpdateTime.
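
The query above keeps the full UpdateTime and does not recompute the order column the question asks for. A possible extension, offered as a sketch rather than as part of the original answer, adds a second ROW_NUMBER() in the outer SELECT. Because window functions in the SELECT list are evaluated after the WHERE filter, the numbering only sees the surviving rows. The output aliases UpdateDate and UpdateTime_order are taken from the question's expected result.

```sql
WITH RankedStaff AS (
    SELECT
        Account,
        Score,
        UpdateTime,
        -- 1 = latest row for this account on this calendar day
        ROW_NUMBER() OVER (PARTITION BY Account, CONVERT(DATE, UpdateTime)
                           ORDER BY UpdateTime DESC) AS RowNum
    FROM staff
)
SELECT
    Account,
    Score,
    CONVERT(DATE, UpdateTime) AS UpdateDate,
    -- renumber the remaining rows per account, newest first
    ROW_NUMBER() OVER (PARTITION BY Account ORDER BY UpdateTime DESC) AS UpdateTime_order
FROM RankedStaff
WHERE RowNum = 1
ORDER BY Account, UpdateTime_order;
```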


1 vote

Here is how to get the most recently updated row for each account and date with a single ROW_NUMBER():

SELECT Account, Score, UpdateTime, UpdateTime_order
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY Account, CAST(UpdateTime AS Date) ORDER BY UpdateTime DESC) AS RN
  FROM staff
) AS S
WHERE rn = 1
ORDER BY UpdateTime DESC

You can also use GROUP BY with the aggregate function MAX(). First, determine the maximum UpdateTime for each account and each day:

SELECT Account, MAX(UpdateTime) AS MAX_UpdateTime
FROM staff
GROUP BY Account, CAST(UpdateTime AS DATE)

Then join this dataset back to the table:

SELECT s.*
FROM staff s
INNER JOIN (
  SELECT Account, MAX(UpdateTime) AS MAX_UpdateTime
  FROM staff
  GROUP BY Account, CAST(UpdateTime AS DATE)
) AS t ON s.Account = t.Account AND s.UpdateTime = t.MAX_UpdateTime
ORDER BY s.UpdateTime DESC
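
The join returns the original columns, including the old UpdateTime_order. If the date-only value and a fresh ordering are wanted, as in the question's expected result, the same join could be extended along these lines. This is a sketch, not part of the original answer; the aliases UpdateDate and UpdateTime_order are borrowed from the question.

```sql
-- Sketch: GROUP BY / MAX() join, returning the date and a recomputed order.
SELECT s.Account,
       s.Score,
       CAST(s.UpdateTime AS DATE) AS UpdateDate,
       ROW_NUMBER() OVER (PARTITION BY s.Account ORDER BY s.UpdateTime DESC) AS UpdateTime_order
FROM staff s
INNER JOIN (
  -- latest UpdateTime per account and calendar day
  SELECT Account, MAX(UpdateTime) AS MAX_UpdateTime
  FROM staff
  GROUP BY Account, CAST(UpdateTime AS DATE)
) AS t ON s.Account = t.Account AND s.UpdateTime = t.MAX_UpdateTime
ORDER BY s.Account, UpdateTime_order;
```

One difference from the ROW_NUMBER() approach is worth keeping in mind: if two rows for the same account share exactly the same maximum UpdateTime, the join returns both of them, whereas ROW_NUMBER() keeps only one (chosen arbitrarily among the ties).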



0 votes

This should return the expected result you described:

with cte_staff as
  (select Account, Score, CAST([UpdateTime] AS Date) UpdateDate, UpdateTime_order,
      -- rn = 1 marks the latest row for each account on each day
      row_number() over (partition by Account, cast([UpdateTime] as Date) order by UpdateTime desc) rn
   from staff
  )
select Account, Score, UpdateDate, 
  -- renumber the surviving rows per account, newest date first
  row_number() over (partition by Account order by UpdateDate desc) UpdateTime_order
from cte_staff
where rn = 1
order by Account, UpdateTime_order; 


Account     Score   UpdateDate  UpdateTime_order
K1897       A       2023-09-08  1
K1897       B       2023-08-01  2
K1897       B       2023-06-28  3
K1897       B       2023-06-05  4
K1898       C       2023-06-04  1
