Could you help me with the following SQL task? Here is a table:
| date | hour | flag |
|---------------|----------|----------|
| 2024-04-13 | 23 | True |
| 2024-04-13 | 22 | True |
| 2024-04-13 | 21 | True |
| 2024-04-13 | 20 | False |
| 2024-04-13 | 19 | True |
| 2024-04-13 | 18 | True |
|-------------------------------------|
| 2024-04-12 | 20 | False |
| 2024-04-12 | 19 | True |
| 2024-04-12 | 18 | True |
| 2024-04-12 | 17 | True |
| 2024-04-12 | 16 | True |
| 2024-04-12 | 15 | True |
|-------------------------------------|
| 2024-04-11 | 22 | True |
| 2024-04-11 | 18 | True |
| 2024-04-11 | 15 | False |
| 2024-04-11 | 10 | True |
| 2024-04-11 | 9 | False |
| 2024-04-11 | 8 | True |
|-------------------------------------|
| 2024-04-10 | 10 | True |
| 2024-04-10 | 9 | True |
| 2024-04-10 | 6 | True |
| 2024-04-10 | 3 | False |
|-------------------------------------|
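What I need to compute is, for each day, how many consecutive True flags there are before the first False flag appears (starting from the top). So the result should look like this: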
| date | count |
|---------------|----------|
| 2024-04-13 | 3 |
| 2024-04-12 | 0 |
| 2024-04-11 | 2 |
| 2024-04-10 | 3 |
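Note: hours are sorted in descending order, but there may be gaps, as on 2024-04-11 (the result is still 2, because we are counting rows).
Thanks in advance.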
It can probably be improved, but this does work.
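SET @count := 0, @aux := 0, @prevdate := null, @end := 0, @prevend := 0;
SELECT date, max(counter_final) AS count
FROM (
  SELECT
    *,
    -- 1 if this row has the same date as the previous row, 0 on the first row of a date
    (@aux := IF(@prevdate = date, 1 , 0)) AS aux,
    -- running streak of True flags; restarts on a new date, resets to 0 on a False flag
    (@count := IF(flag = true, IF(@aux = 0, 1, @count + 1), 0)) AS counter,
    (@prevdate := date) AS prevdate,
    -- sticky marker: 1 once a False flag has been seen on an earlier row of this date
    (@prevend := IF(@aux = 0, 0, IF(@prevend = 1, 1, @end))) AS prevend,
    -- 1 if the current row is a False flag
    (@end := IF(@count = 0 AND flag = 0, 1, 0)) AS end,
    -- keep the streak only while no False flag has been seen on this date
    IF(@end + @prevend = 0, @count, 0) AS counter_final
  FROM mytable
  ORDER BY date DESC, hour DESC
) aux_table
GROUP BY date ORDER BY date DESC;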
Result
date count
2024-04-13 3
2024-04-12 0
2024-04-11 2
2024-04-10 3
Full example: SQL Fiddle
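For readers who find the variable bookkeeping hard to follow, the intent of the query above can be restated outside SQL. This is only a minimal Python sketch, not the query itself: the `ROWS` list and the `leading_true_counts` function are hypothetical names, and the sticky "a False was already seen" marker is replaced by simply stopping at the first False of each date.

```python
from itertools import groupby

# Hypothetical in-memory copy of the question's table: (date, hour, flag).
ROWS = [
    ("2024-04-13", 23, True), ("2024-04-13", 22, True), ("2024-04-13", 21, True),
    ("2024-04-13", 20, False), ("2024-04-13", 19, True), ("2024-04-13", 18, True),
    ("2024-04-12", 20, False), ("2024-04-12", 19, True), ("2024-04-12", 18, True),
    ("2024-04-12", 17, True), ("2024-04-12", 16, True), ("2024-04-12", 15, True),
    ("2024-04-11", 22, True), ("2024-04-11", 18, True), ("2024-04-11", 15, False),
    ("2024-04-11", 10, True), ("2024-04-11", 9, False), ("2024-04-11", 8, True),
    ("2024-04-10", 10, True), ("2024-04-10", 9, True), ("2024-04-10", 6, True),
    ("2024-04-10", 3, False),
]


def leading_true_counts(rows):
    """Per date, count the True rows that come before the first False row."""
    # Same ordering as the query: date DESC, hour DESC.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]), reverse=True)
    result = {}
    for date, group in groupby(ordered, key=lambda r: r[0]):
        count = 0
        for _date, _hour, flag in group:
            if not flag:
                break  # the first False ends the leading run for this date
            count += 1  # rows are counted, so gaps between hours do not matter
        result[date] = count
    return result


if __name__ == "__main__":
    for date, count in leading_true_counts(ROWS).items():
        print(date, count)
```

Because rows are counted rather than hour differences, the gap between hours 22 and 18 on 2024-04-11 still yields 2.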
You could probably do something like this:
SELECT date, max(countOver) AS TruthFlags
FROM (
  SELECT COUNT(CASE WHEN flag = 'True' THEN 1 END)
           OVER(PARTITION BY Date ORDER BY hour DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
       * CASE WHEN COUNT(CASE WHEN flag = 'False' THEN 1 END)
                     OVER(PARTITION BY Date ORDER BY hour DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) > 0
              THEN 0 ELSE 1 END AS countOver
  , *
  FROM
  (
VALUES (N'2024-04-13', 23, N'True')
, (N'2024-04-13', 22, N'True')
, (N'2024-04-13', 21, N'True')
, (N'2024-04-13', 20, N'False')
, (N'2024-04-13', 19, N'True')
, (N'2024-04-13', 18, N'True')
, (N'2024-04-12', 20, N'False')
, (N'2024-04-12', 19, N'True')
, (N'2024-04-12', 18, N'True')
, (N'2024-04-12', 17, N'True')
, (N'2024-04-12', 16, N'True')
, (N'2024-04-12', 15, N'True')
, (N'2024-04-11', 22, N'True')
, (N'2024-04-11', 18, N'True')
, (N'2024-04-11', 15, N'False')
, (N'2024-04-11', 10, N'True')
, (N'2024-04-11', 9, N'False')
, (N'2024-04-11', 8, N'True')
, (N'2024-04-10', 10, N'True')
, (N'2024-04-10', 9, N'True')
, (N'2024-04-10', 6, N'True')
, (N'2024-04-10', 3, N'False')
) t (date,hour,flag)
) x
GROUP BY date
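The idea is to build a "zeroing" running total of False flags, going from the last hour to the first. Once a False flag has been seen, this expression:
CASE WHEN COUNT(CASE WHEN flag = 'False' THEN 1 END) OVER(PARTITION BY Date ORDER BY hour DESC) > 0 THEN 0 ELSE 1 END
yields 0; otherwise it yields 1.
Then you simply multiply this zeroer by the running count of True flags.
Finally, I use
max(countOver)
to find the peak of the running True flag count.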
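To try the running-total idea without a database server, it can be run through Python's built-in `sqlite3` module. This is a sketch under assumptions: it needs a Python build bundled with SQLite 3.25 or newer (for window functions), the table name `flags` and the names `FLAG_ROWS`, `QUERY`, and `leading_true_counts_sql` are made up for illustration, and a named `WINDOW` clause is used only to avoid repeating the frame definition.

```python
import sqlite3

# Hypothetical copy of the VALUES list from the answer, flags kept as strings.
FLAG_ROWS = [
    ("2024-04-13", 23, "True"), ("2024-04-13", 22, "True"), ("2024-04-13", 21, "True"),
    ("2024-04-13", 20, "False"), ("2024-04-13", 19, "True"), ("2024-04-13", 18, "True"),
    ("2024-04-12", 20, "False"), ("2024-04-12", 19, "True"), ("2024-04-12", 18, "True"),
    ("2024-04-12", 17, "True"), ("2024-04-12", 16, "True"), ("2024-04-12", 15, "True"),
    ("2024-04-11", 22, "True"), ("2024-04-11", 18, "True"), ("2024-04-11", 15, "False"),
    ("2024-04-11", 10, "True"), ("2024-04-11", 9, "False"), ("2024-04-11", 8, "True"),
    ("2024-04-10", 10, "True"), ("2024-04-10", 9, "True"), ("2024-04-10", 6, "True"),
    ("2024-04-10", 3, "False"),
]

QUERY = """
SELECT date, MAX(countOver) AS TruthFlags
FROM (
    SELECT date,
           -- running count of True flags, latest hour first
           COUNT(CASE WHEN flag = 'True' THEN 1 END) OVER w
           -- zeroer: 0 once any False flag has been seen, 1 before that
           * CASE WHEN COUNT(CASE WHEN flag = 'False' THEN 1 END) OVER w > 0
                  THEN 0 ELSE 1 END AS countOver
    FROM flags
    WINDOW w AS (PARTITION BY date ORDER BY hour DESC
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
) x
GROUP BY date
ORDER BY date DESC
"""


def leading_true_counts_sql(rows):
    """Load rows into an in-memory SQLite table and run the running-total query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE flags (date TEXT, hour INTEGER, flag TEXT)")
        conn.executemany("INSERT INTO flags VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for date, count in leading_true_counts_sql(FLAG_ROWS):
        print(date, count)
```

On 2024-04-12 the very first row is False, so the zeroer is 0 for every row of that date and the maximum stays 0, which matches the expected result.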