Get the sum for each 5-minute time interval

Problem description

I have a table (#tmstmp) with 2 columns: dt (DATETIME) and payload (INT). Ultimately, I want to sum payload for each 5-minute interval.

Code

Setup

DECLARE @start DATETIME = N'2024-1-1 12:00:00';
DROP TABLE IF EXISTS #tmstmp
                     , #numbers;
CREATE TABLE #tmstmp (
  dt DATETIME PRIMARY KEY
  , payload INT NOT NULL
);

CREATE TABLE #numbers (
  n INT PRIMARY KEY
);
WITH numbers (n) AS (
  SELECT 0 AS n
  UNION ALL
  SELECT n + 1 AS n
    FROM numbers
   WHERE n < 100
)
INSERT
  INTO #numbers
SELECT n
  FROM numbers;

WITH rnd (mins, secs) AS (
  SELECT n2.n AS mins
         , CAST(ABS(CHECKSUM(NEWID())) % 60 AS INT) AS secs
   FROM #numbers AS n1
        , #numbers as n2
  WHERE n1.n < 5
    AND n2.n < 15
), tmstmp (dt) AS (
  SELECT DATEADD(SECOND, secs, DATEADD(MINUTE, mins, @start)) AS dt
    FROM rnd
) 
INSERT  
  INTO #tmstmp
SELECT DISTINCT dt
       , -1 AS payload
  FROM tmstmp
 ORDER BY dt;

UPDATE #tmstmp
   SET payload = CAST(ABS(CHECKSUM(NEWID())) % 10 AS INT);
GO

Non-overlapping slots are easy

DECLARE @start DATETIME = N'2024-1-1 12:00:00';
DECLARE @slotDuration INT = 5;

WITH agg (slot, sum_payload) AS (
  SELECT DATEDIFF(MINUTE, @start, dt) / @slotDuration AS slot
         , SUM(payload) AS sum_payload
    FROM #tmstmp
   GROUP BY DATEDIFF(MINUTE, @start, dt) / @slotDuration
)
SELECT DATEADD(MINUTE, slot * @slotDuration, @start) AS [from]
       , DATEADD(MINUTE, (slot + 1) * @slotDuration, @start) AS [to]
       , sum_payload
  FROM agg;
from                 to                   sum_payload
2024-01-01 12:00:00  2024-01-01 12:05:00  124
2024-01-01 12:05:00  2024-01-01 12:10:00  106
2024-01-01 12:10:00  2024-01-01 12:15:00  95

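Final goal: get running slots

However, I would like to get every interval in the range, i.e.

12:00-12:05
12:01-12:06
12:02-12:07

and so on until the last slot.

I could build the limits for the whole range beforehand and use them in a JOIN, like this: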

DECLARE @start DATETIME = N'2024-1-1 12:00:00';
DECLARE @slotDuration INT = 5;
DECLARE @intervals INT = (SELECT DATEDIFF(MINUTE, @start, MAX(dt)) FROM #tmstmp);

WITH ranges ([from], [to], slot) AS (
  SELECT DATEADD(MINUTE, n, @start) AS [from]
         , DATEADD(MINUTE, n + @slotDuration, @start) AS [to]
         , n AS slot
    FROM #numbers
   WHERE n <= @intervals
), tm_mult (slot, [from], [to], dt, payload) AS (
  SELECT slot
         , [from]
         , [to]
         , dt
         , payload
    FROM #tmstmp
   INNER JOIN ranges
      ON [from] <= dt
     AND dt < [to]
)
SELECT MIN([from]) AS [from]
       , MAX([to]) AS [to]
       , SUM(payload) AS sum_payload
  FROM tm_mult
 GROUP BY slot
 ORDER BY slot;
from                 to                   sum_payload
2024-01-01 12:00:00  2024-01-01 12:05:00  124
2024-01-01 12:01:00  2024-01-01 12:06:00  120
2024-01-01 12:02:00  2024-01-01 12:07:00  125
...                  ...                  ...
2024-01-01 12:14:00  2024-01-01 12:19:00  19

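While this works in this toy example, my real data contains hundreds of thousands of timestamps and, worst of all, I have little influence on the indexes. My gut tells me that my inequality JOIN creates quite a lot of duplication, and I wonder whether this is the canonical way of doing it or whether there is a more "SQL-onic" way. (Just as pythonistas like to call code "pythonic" if it uses concepts inherent to the language rather than solving the problem with generic tools.)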

sql sql-server window-functions
1 Answer

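Window functions in SQL (WINDOW - microsoft.com / OVER - microsoft.com) are a great asset to add to your SQL tool belt. They are also quite canonical: the OVER clause has been available since SQL Server 2005, although the ROWS BETWEEN frame used below requires SQL Server 2012 or later.

Here is an example: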

SELECT
    [From],
    DATEADD(MINUTE, 1, [To]) [To],
    payload
FROM (
    SELECT
        dt,
        MIN(dt) OVER(ORDER BY dt ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) [From],
        dt [To],
        SUM(payload) OVER(ORDER BY dt ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) payload
    FROM (
        SELECT
            DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0) dt,
            SUM(payload) payload
        FROM #tmstmp
        GROUP BY DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0)
    ) q
) q
WHERE DATEDIFF(MINUTE, [From], [To]) > 3

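I would like to draw attention to 4 PRECEDING and to DATEADD(MINUTE, DATEDIFF(MINUTE, 0, dt), 0). Since the latter effectively floors the datetime to the minute, the minute 12:04 covers everything from 2024-01-01 12:04:00.000 up to 2024-01-01 12:04:59.999, but not 2024-01-01 12:05:00.000. (Strictly speaking, DATETIME is only accurate to about 3 ms, so the last representable value in that minute is 12:04:59.997.) Hopefully this is the behavior you are looking for.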
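
To make the flooring concrete, here is a minimal sketch that applies the same expression to three hand-picked values. The sample timestamps and the alias dt_floor are assumptions chosen for illustration only; they are not taken from the question's data.

```sql
-- Hypothetical sample values, used only to show how the expression floors to the minute.
SELECT v.dt
       , DATEADD(MINUTE, DATEDIFF(MINUTE, 0, v.dt), 0) AS dt_floor
  FROM (VALUES
         (CAST('2024-01-01T12:04:00.000' AS DATETIME))
         , (CAST('2024-01-01T12:04:59.997' AS DATETIME))
         , (CAST('2024-01-01T12:05:00.000' AS DATETIME))
       ) AS v (dt);
-- The first two rows floor to 2024-01-01 12:04:00.000,
-- the third stays at 2024-01-01 12:05:00.000.
```

DATEDIFF counts the minute boundaries crossed since the base date 0 (1900-01-01), and DATEADD adds that many whole minutes back, which is why the seconds and milliseconds disappear.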

Here is a fiddle.
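
One caveat about the ROWS frame in the query above: it counts rows, not minutes, so it only lines up with 5-minute slots if every minute has at least one timestamp. The following is a sketch of one possible way around that, not a tuned solution. It reuses #numbers from the question's setup as a minute calendar and assumes that table covers the whole time range; the names per_minute, dense and @lastMinute are made up for this example.

```sql
DECLARE @start DATETIME = '2024-01-01T12:00:00';
DECLARE @slotDuration INT = 5;
DECLARE @lastMinute INT = (SELECT DATEDIFF(MINUTE, @start, MAX(dt)) FROM #tmstmp);

WITH per_minute (minute_no, payload) AS (
  -- Collapse the raw rows to at most one row per minute.
  SELECT DATEDIFF(MINUTE, @start, dt) AS minute_no
         , SUM(payload) AS payload
    FROM #tmstmp
   GROUP BY DATEDIFF(MINUTE, @start, dt)
), dense (minute_no, payload) AS (
  -- One row for every minute (0 for empty ones), so "N rows" really means "N minutes".
  -- Assumption: #numbers contains every minute offset from 0 to @lastMinute.
  SELECT n.n AS minute_no
         , COALESCE(p.payload, 0) AS payload
    FROM #numbers AS n
    LEFT JOIN per_minute AS p
      ON p.minute_no = n.n
   WHERE n.n <= @lastMinute
)
SELECT DATEADD(MINUTE, minute_no, @start) AS [from]
       , DATEADD(MINUTE, minute_no + @slotDuration, @start) AS [to]
       -- The frame offset must be a literal: 4 = @slotDuration - 1.
       , SUM(payload) OVER (ORDER BY minute_no
                            ROWS BETWEEN CURRENT ROW AND 4 FOLLOWING) AS sum_payload
  FROM dense
 ORDER BY minute_no;
```

Each timestamp is read once for the per-minute aggregation instead of being matched to up to five ranges by the inequality JOIN. Unlike the JOIN version, a slot without any timestamps shows up with a sum of 0 rather than being omitted.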
