我有一个时间戳列表(WTIME)、与该时间戳相关的一些值(VAL)和一个类别(WTYPE)。每个时间戳只会有一个条目。我需要转换数据,以便 WTYPES 的每个“分组”有 1 行。
DECLARE @Table TABLE (
WTIME DATETIME,
VAL FLOAT,
WTYPE INT
)
INSERT INTO @TABLE
VALUES
('2023-07-31 01:00:00', 10, 1),
('2023-07-31 02:00:00', 11, 1),
('2023-07-31 03:00:00', 12, 1),
('2023-07-31 04:00:00', 4, 2),
('2023-07-31 05:00:00', 6, 2),
('2023-07-31 06:00:00', 15, 1)
在此示例中,有三个 WTYPE 1 时间戳,然后切换到 WTYPE 2,然后返回 WTYPE 1 - 所以我需要输出 3 行(即 3 次更改 WTYPE)和开始时间,结束时间、平均值、WTYPE 和行数。
我知道这可以通过将数据集重新连接到自身来完成,但是对于我的目的来说,这种方法运行得太慢了,而且我从骨子里感觉到,它可以单独使用窗口函数来实现,我就是无法理解它!
您可以将其视为经典的间隙和岛屿问题,并为每个岛屿指定一个唯一的编号,然后对其进行分组,但下面的代码可以在按
WTIME
排序的一次扫描中完成所有操作,因此应该比这更有效 - Fiddle .
其想法是计算整个表的累积总和和计数,同时保留每个岛的最后一行,并回顾之前保存的行以计算差异。
WITH T AS
(
SELECT *,
MinWTime = MIN(WTIME) OVER (ORDER BY WTIME ROWS UNBOUNDED PRECEDING) ,
CumeVal = SUM(VAL) OVER (ORDER BY WTIME ROWS UNBOUNDED PRECEDING) ,
CumeCount = COUNT(*) OVER (ORDER BY WTIME ROWS UNBOUNDED PRECEDING) ,
NextType = LEAD(WTYPE) OVER (ORDER BY WTIME),
NextWTime = LEAD(WTIME) OVER (ORDER BY WTIME)
FROM @TABLE
), LastInIsland AS
(
SELECT *,
SUM_VAL = CumeVal - LAG(CumeVal,1,0) OVER (ORDER BY WTIME),
NUM_ROWS = CumeCount - LAG(CumeCount,1,0) OVER (ORDER BY WTIME) ,
START_TIME = LAG(NextWTime,1,MinWTime) OVER (ORDER BY WTIME)
FROM T
WHERE NextType <> WTYPE OR NextType IS NULL
)
SELECT START_TIME,
END_TIME = CASE WHEN NextType IS NOT NULL THEN WTIME END,
WTYPE,
AVG_VAL = SUM_VAL/ NUM_ROWS,
NUM_ROWS
FROM LastInIsland
ORDER BY WTIME