How to get the cycle end date when computing data acquisition cycles


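I have a list of data acquisition timestamps. Timestamps that lie close together belong to one cycle, and I want to enumerate these cycles. A new cycle starts whenever the gap between two consecutive timestamps exceeds 100 seconds.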

CREATE TABLE [Cycles](
    [Cycle] [int] NOT NULL,
    [CycleStart] [datetime] NOT NULL,
    [CycleEnd] [datetime] NOT NULL,
 CONSTRAINT [PK_Cycles] PRIMARY KEY CLUSTERED 
(
    [Cycle] DESC
))
INSERT INTO [Cycles] VALUES
(10,'2023-12-04T09:00:00','2023-12-04T10:00:00'),
(11,'2023-12-04T21:00:00','2023-12-04T22:00:00'),
(12,'2023-12-04T23:00:00','2023-12-05T00:00:00')
CREATE TABLE [Data](
    [datatimestamp] [datetime] NOT NULL,
 CONSTRAINT [PK_Data] PRIMARY KEY NONCLUSTERED 
(
    [datatimestamp] ASC
))
INSERT INTO [Data] VALUES
('2023-12-05T00:05:20'),
('2023-12-05T00:05:21'),
('2023-12-05T00:05:22'),
('2023-12-05T00:10:01'),
('2023-12-05T00:10:02'),
('2023-12-05T00:10:03')

So I need to add:

Cycles
13
14

Here is what I have so far as a SELECT:

DECLARE @lastCycle int = (SELECT TOP 1 Cycle FROM Cycles ORDER BY Cycle DESC);
DECLARE @lastCycleEnd datetime = (SELECT TOP 1 CycleEnd FROM Cycles ORDER BY Cycle DESC);
WITH marks AS (
    SELECT datatimestamp, 
    CASE 
        WHEN DATEDIFF(Second, LAG(datatimestamp, 1, DATEADD(Second, -101, datatimestamp)) OVER (ORDER BY datatimestamp), datatimestamp) > 100 
        THEN 1 ELSE 0 
    END AS NextC
    FROM [Data] 
    WHERE datatimestamp > @lastCycleEnd 
)
SELECT @lastCycle + ROW_NUMBER() OVER (ORDER BY d.datatimestamp) AS Cycle, d.datatimestamp AS CycleBegin 
FROM [Data] d
INNER JOIN marks m On m.datatimestamp = d.datatimestamp
WHERE m.NextC = 1

This returns the new cycles and their start times. For the sample data the result looks like this:

Cycle CycleBegin
13 2023-12-05 00:05:20
14 2023-12-05 00:10:01

How can I also get CycleEnd as a third column?

sql-server lag lead
2 Answers
0 votes

You are close. Add one more step that computes the running sum of NextC, ordered by date. This numbers each "set" of timestamps; then group by that column.

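with cte1 as (
  -- flag = 1 marks the first timestamp of a cycle:
  -- either there is no previous row (lag is null) or the gap exceeds 100 seconds
  select datatimestamp, case when datediff(second, lag(datatimestamp) over (order by datatimestamp), datatimestamp) <= 100 then 0 else 1 end as flag
  from data
), cte2 as (
  -- the running sum of the flags numbers the cycles 1, 2, ...
  select *, sum(flag) over (order by datatimestamp) as grpnum
  from cte1
)
select min(datatimestamp) as CycleStart, max(datatimestamp) as CycleEnd
from cte2
group by grpnum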

0 votes

Once you have the data from the marks CTE, don't filter on NextC. That is, given:

datatimestamp NextC
2023-12-05T00:05:20 1
2023-12-05T00:05:21 0
2023-12-05T00:05:22 0
2023-12-05T00:10:01 1
2023-12-05T00:10:02 0
2023-12-05T00:10:03 0

compute SUM(NextC) OVER(ORDER BY datatimestamp) instead, which gives every timestamp a value that identifies its group, i.e.

datatimestamp Cycle
2023-12-05T00:05:20 1
2023-12-05T00:05:21 1
2023-12-05T00:05:22 1
2023-12-05T00:10:01 2
2023-12-05T00:10:02 2
2023-12-05T00:10:03 2
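
As a quick way to check both intermediate results, the following sketch computes NextC and its running sum side by side. It inlines the six sample timestamps from the question through a VALUES list, so it runs without any tables. The names src and v(ts) are introduced here only for illustration, and the explicit ROWS frame is an optional choice that gives the same result as the default frame as long as the timestamps are unique.

```sql
WITH src AS (
    -- Sample timestamps from the question, inlined for a self-contained run
    SELECT CAST(v.ts AS datetime) AS datatimestamp
    FROM (VALUES ('2023-12-05T00:05:20'),
                 ('2023-12-05T00:05:21'),
                 ('2023-12-05T00:05:22'),
                 ('2023-12-05T00:10:01'),
                 ('2023-12-05T00:10:02'),
                 ('2023-12-05T00:10:03')) AS v(ts)
), marks AS (
    -- NextC = 1 when the gap to the previous timestamp exceeds 100 seconds;
    -- the LAG default of "101 seconds earlier" forces a 1 on the first row
    SELECT datatimestamp,
           CASE
               WHEN DATEDIFF(second,
                             LAG(datatimestamp, 1, DATEADD(second, -101, datatimestamp))
                                 OVER (ORDER BY datatimestamp),
                             datatimestamp) > 100
               THEN 1 ELSE 0
           END AS NextC
    FROM src
)
SELECT datatimestamp,
       NextC,
       -- The running sum turns the 0/1 markers into a group number per timestamp
       SUM(NextC) OVER (ORDER BY datatimestamp ROWS UNBOUNDED PRECEDING) AS Cycle
FROM marks
ORDER BY datatimestamp;
```

A second CTE is needed because a window function cannot be nested directly inside another window function.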

You can then group by this column and take the minimum and maximum datetime to get the start and end times. So your final query would be:

DECLARE @lastCycle int = (SELECT TOP 1 Cycle FROM Cycles ORDER BY Cycle DESC);
DECLARE @lastCycleEnd datetime = (SELECT TOP 1 CycleEnd FROM Cycles ORDER BY Cycle DESC);
WITH marks AS (
    -- NextC = 1 marks the first timestamp of each new cycle
    SELECT datatimestamp, 
    CASE 
        WHEN DATEDIFF(Second, LAG(datatimestamp, 1, DATEADD(Second, -101, datatimestamp)) OVER (ORDER BY datatimestamp), datatimestamp) > 100 
        THEN 1 ELSE 0 
    END AS NextC
    FROM [Data] 
    WHERE datatimestamp > @lastCycleEnd 
), marks2 AS (
    -- the running sum assigns the same number to every timestamp of a cycle
    SELECT m.datatimestamp, SUM(m.NextC) OVER (ORDER BY m.datatimestamp) AS Cycle
    FROM marks AS m
)
SELECT @lastCycle + ROW_NUMBER() OVER (ORDER BY m.Cycle) AS Cycle, 
    MIN(m.datatimestamp) AS CycleBegin,
    MAX(m.datatimestamp) AS CycleEnd
FROM marks2 AS m 
GROUP BY m.Cycle;

Example on db<>fiddle
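
Since the question's goal is to add the new cycles to the Cycles table, the same grouping can feed an INSERT. This is only a sketch under a few assumptions: Cycles already contains at least one row, the first timestamp after @lastCycleEnd always opens a new cycle (its gap to the previous cycle is not checked, as in the queries above), and CycleOffset is a name made up here for the running sum.

```sql
-- Sketch: append the newly detected cycles to Cycles (assumes Cycles is not empty)
DECLARE @lastCycle int = (SELECT MAX(Cycle) FROM Cycles);
DECLARE @lastCycleEnd datetime = (SELECT TOP 1 CycleEnd FROM Cycles ORDER BY Cycle DESC);

WITH marks AS (
    -- NextC = 1 on the first row (LAG returns NULL) or after a gap of more than 100 seconds
    SELECT datatimestamp,
           CASE
               WHEN DATEDIFF(second,
                             LAG(datatimestamp) OVER (ORDER BY datatimestamp),
                             datatimestamp) <= 100
               THEN 0 ELSE 1
           END AS NextC
    FROM [Data]
    WHERE datatimestamp > @lastCycleEnd
), marks2 AS (
    -- CycleOffset (hypothetical name) counts the new cycles as 1, 2, ...
    SELECT datatimestamp,
           SUM(NextC) OVER (ORDER BY datatimestamp ROWS UNBOUNDED PRECEDING) AS CycleOffset
    FROM marks
)
INSERT INTO [Cycles] (Cycle, CycleStart, CycleEnd)
SELECT @lastCycle + CycleOffset,
       MIN(datatimestamp),
       MAX(datatimestamp)
FROM marks2
GROUP BY CycleOffset;
```

With the sample data this should add cycle 13 (00:05:20 to 00:05:22) and cycle 14 (00:10:01 to 00:10:03). Because CycleOffset already runs 1, 2, ..., it can be added to @lastCycle directly instead of going through ROW_NUMBER(). Running the statement a second time inserts nothing, since no timestamp is later than the new last CycleEnd.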
