需要根据连续日期的ID来选择非活动数据。
Sample Data:
2020-04-19,SQA0199,ACTIVE
2020-04-20,SQA0199,INACTIVE
2020-04-21,SQA0199,INACTIVE
2020-04-22,SQA0199,INACTIVE
2020-04-23,SQA0199,ACTIVE
2020-04-24,SQA0199,INACTIVE
2020-04-25,SQA0199,INACTIVE
2020-04-26,SQA0199,INACTIVE
Sample Script:
SELECT
ROW_NUMBER() OVER (PARTITION BY SQA_ID ORDER BY timestamp) AS "row number",
timestamp, SQA_ID
FROM SQA_SMS_INACTIVE where status='INACTIVE';
Desire Output:
2020-04-20,SQA0199,1
2020-04-21,SQA0199,2
2020-04-22,SQA0199,3
2020-04-24,SQA0199,1
2020-04-25,SQA0199,2
2020-04-26,SQA0199,3
我的脚本的输出有继续计数的行数。请帮助我
附 LAG()
和 SUM()
窗口功能,您可以创建行的分组,行号将以此为基础。
WITH
pre AS (
SELECT *,
DATEDIFF(
timestamp,
LAG(timestamp) OVER (PARTITION BY SQA_ID ORDER BY timestamp)
) <> 1 AS flag
FROM SQA_SMS_INACTIVE
WHERE status = 'INACTIVE'
),
cte AS (
SELECT timestamp, SQA_ID,
SUM(COALESCE(flag, 0) <> 0) OVER (PARTITION BY SQA_ID ORDER BY timestamp) grp
FROM pre
)
SELECT timestamp, SQA_ID,
ROW_NUMBER() OVER (PARTITION BY SQA_ID, grp ORDER BY timestamp) AS `row number`
FROM cte
请看 演示. 结果。
| timestamp | SQA_ID | row number |
| ---------- | ------- | ---------- |
| 2020-04-20 | SQA0199 | 1 |
| 2020-04-21 | SQA0199 | 2 |
| 2020-04-22 | SQA0199 | 3 |
| 2020-04-24 | SQA0199 | 1 |
| 2020-04-25 | SQA0199 | 2 |
| 2020-04-26 | SQA0199 | 3 |