Suppose I have the following table.
CREATE TABLE transaction (
"ID" INTEGER PRIMARY KEY,
"NAME" VARCHAR(4),
"TIMESTAMP" INTEGER,
"QUANTITY" INTEGER
);
-- Numeric literals are used because every target column except "NAME" is INTEGER.
INSERT INTO transaction
("ID", "NAME", "TIMESTAMP", "QUANTITY")
VALUES
(1, 'dani', 1686311907, 1),
(2, 'dani', 1686312071, 4),
(3, 'dani', 1686748928, 2),
(4, 'pet', 1687937005, 2),
(5, 'pet', 1688109281, 6);
which corresponds to:
| ID | NAME | TIMESTAMP | QUANTITY |
| --- | ---- | ---------- | -------- |
| 1 | dani | 1686311907 | 1 |
| 2 | dani | 1686312071 | 4 |
| 3 | dani | 1686748928 | 2 |
| 4 | pet | 1687937005 | 2 |
| 5 | pet | 1688109281 | 6 |
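The table describes a series of transactions. What I want is, for each name and each timestamp, the cumulative sum of QUANTITY within each working slot. A working slot groups the timestamps of one name whose gap to the previous timestamp is less than 10800 seconds (3 hours). In other words, the cumulative sum must reset at every new working slot, for each name.
What I expect to get is a table like this: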
| ID | NAME | TIMESTAMP | QUANTITY | CUM_QUANTITY |
| --- | ---- | ---------- | -------- | ------------ |
| 1 | dani | 1686311907 | 1 | 1 |
| 2 | dani | 1686312071 | 4 | 5 |
| 3 | dani | 1686748928 | 2 | 2 | # new working slot
| 4 | pet | 1687937005 | 2 | 2 | # new name
| 5 | pet | 1688109281 | 6 | 6 | # new working slot
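So far, I have tried adding an intermediate column, NEW_DAY, that flags whether a record starts a new working slot, i.e. whether it is at least 3 hours after the previous record of the same name: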
select
*,
CASE WHEN coalesce(
"TIMESTAMP" - lag("TIMESTAMP", 1) over (
partition by "NAME"
order by
"TIMESTAMP" asc
),
0
) < 3600 * 3 THEN 0 ELSE 1 END AS NEW_DAY
from
transaction
ORDER BY
"NAME",
"TIMESTAMP"
This gives the following table:
| ID | NAME | TIMESTAMP | QUANTITY | new_day |
| --- | ---- | ---------- | -------- | ------- |
| 1 | dani | 1686311907 | 1 | 0 |
| 2 | dani | 1686312071 | 4 | 0 |
| 3 | dani | 1686748928 | 2 | 1 |
| 4 | pet | 1687937005 | 2 | 0 |
| 5 | pet | 1688109281 | 6 | 1 |
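My plan was then to window over NAME and new_day and aggregate with SUM(QUANTITY), but this does not work: new_day only takes the values 0 and 1, whereas it should count up 0, 1, ... to the last working slot of each user, so that every slot gets its own number.
Does anyone know how to proceed?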
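One direction that might work is to take a running sum of the 0/1 flag per name, which numbers the slots 0, 1, 2, ..., and then partition the cumulative sum by name and slot number. The following is only a sketch under assumptions: it assumes a database with standard window functions and common table expressions (PostgreSQL-style syntax), and the names `flagged`, `slotted`, `new_slot` and `slot_id` are hypothetical, introduced here for illustration.

```sql
-- Sketch only: flagged, slotted, new_slot and slot_id are illustrative names.
WITH flagged AS (
    SELECT
        *,
        -- 1 when the gap to the previous row of the same name is >= 3 hours.
        -- The first row per name has no predecessor, so lag() is NULL,
        -- the comparison is NULL, and the CASE falls through to 0.
        CASE
            WHEN "TIMESTAMP" - lag("TIMESTAMP") OVER (
                     PARTITION BY "NAME" ORDER BY "TIMESTAMP", "ID"
                 ) >= 3600 * 3
            THEN 1 ELSE 0
        END AS new_slot
    FROM transaction
),
slotted AS (
    SELECT
        *,
        -- Running sum of the flag: it increases by 1 at every slot boundary,
        -- so rows in the same slot share the same slot_id.
        sum(new_slot) OVER (
            PARTITION BY "NAME" ORDER BY "TIMESTAMP", "ID"
        ) AS slot_id
    FROM flagged
)
SELECT
    "ID", "NAME", "TIMESTAMP", "QUANTITY",
    -- Cumulative sum that restarts for every (name, slot) pair.
    sum("QUANTITY") OVER (
        PARTITION BY "NAME", slot_id ORDER BY "TIMESTAMP", "ID"
    ) AS "CUM_QUANTITY"
FROM slotted
ORDER BY "NAME", "TIMESTAMP";
```

On the sample data this should give slot numbers 0, 0, 1 for dani and 0, 1 for pet, and therefore CUM_QUANTITY values 1, 5, 2, 2, 6, matching the expected table. Adding "ID" to the ordering is a design choice to keep rows with identical timestamps from being summed together as peers; whether that tie-break is appropriate depends on the real data.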