Gaps and islands in school vacation periods


I have to work with this periods table:

periods

id  | starts_on  |  ends_on   
----+------------+------------
678 | 2019-12-21 | 2019-12-22
534 | 2019-12-23 | 2020-01-04
679 | 2019-12-28 | 2019-12-29
  9 | 2020-01-01 | 2020-01-01
776 | 2020-01-04 | 2020-01-05
  7 | 2020-01-06 | 2020-01-06
777 | 2020-01-11 | 2020-01-12

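It lists all the time periods during which students do not have to go to school. Unfortunately, some periods overlap. This happens when a weekend or a public holiday falls within a school vacation (each of them has its own row in periods).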

With the help of "Find rows with adjourning date ranges and accumulate their durations" and "Gaps and islands for school vacations in a country with federal states", I came up with the following query:

SELECT p.id, p.starts_on, p.ends_on, grp,
      (Max(ends_on) OVER (PARTITION BY grp) - Min(starts_on) OVER (PARTITION BY grp) 
      ) + 1 AS duration, Array_agg(p.id) OVER (PARTITION BY grp) 
FROM (SELECT p.*,
            Count(*) FILTER (WHERE prev_eo < starts_on - INTERVAL '1 day') OVER
                (PARTITION BY 1 
                  ORDER BY starts_on
                ) AS grp 
      FROM (SELECT p.*,
                  lag(ends_on) OVER (PARTITION BY 1 ORDER BY starts_on) AS prev_eo 
            FROM (SELECT p.id, p.starts_on, p.ends_on FROM periods p
            WHERE starts_on > '2019-12-15' AND
                  starts_on < '2020-01-15' ) p 
          ) p 
  ) p;

What I get

Result:

id  | starts_on  |  ends_on   | grp | duration |   array_agg   
----+------------+------------+-----+----------+---------------
678 | 2019-12-21 | 2019-12-22 |   0 |       15 | {678,534,679}
534 | 2019-12-23 | 2020-01-04 |   0 |       15 | {678,534,679}
679 | 2019-12-28 | 2019-12-29 |   0 |       15 | {678,534,679}
  9 | 2020-01-01 | 2020-01-01 |   1 |        1 | {9}
776 | 2020-01-04 | 2020-01-05 |   2 |        3 | {776,7}
  7 | 2020-01-06 | 2020-01-06 |   2 |        3 | {776,7}
777 | 2020-01-11 | 2020-01-12 |   3 |        2 | {777}

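The first three rows are grp 0 (ids 678, 534 and 679).

What I want

But ids 9, 776 and 7 should also belong to that grp. Unfortunately, they overlap. Is it possible to get a result like this (I don't care about the order)?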

id  | starts_on  |  ends_on   | grp | duration |   array_agg   
----+------------+------------+-----+----------+---------------
678 | 2019-12-21 | 2019-12-22 |   0 |       17 | {678,534,679,9,776,7}
534 | 2019-12-23 | 2020-01-04 |   0 |       17 | {678,534,679,9,776,7}
679 | 2019-12-28 | 2019-12-29 |   0 |       17 | {678,534,679,9,776,7}
  9 | 2020-01-01 | 2020-01-01 |   0 |       17 | {678,534,679,9,776,7}
776 | 2020-01-04 | 2020-01-05 |   0 |       17 | {678,534,679,9,776,7}
  7 | 2020-01-06 | 2020-01-06 |   0 |       17 | {678,534,679,9,776,7}
777 | 2020-01-11 | 2020-01-12 |   1 |        2 | {777}

I want to know how long the whole island (grp 0) is in days and which period ids it contains.

Sandbox: https://rextester.com/SHVL41709

sql postgresql gaps-and-islands
1 Answer

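This is an interesting variation on your other question. The problem is that lag() only looks at the immediately preceding row to check for an overlap. Instead, you want to look at all preceding rows.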
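To see the difference on the sample rows, here is a small Python sketch (outside the database, purely illustrative; the names `rows` and `comparison` are my own). For each row it records the end date that lag() would see next to the latest end date among all earlier rows.

```python
from datetime import date

# Sample rows from the question: (id, starts_on, ends_on), sorted by starts_on.
rows = [
    (678, date(2019, 12, 21), date(2019, 12, 22)),
    (534, date(2019, 12, 23), date(2020, 1, 4)),
    (679, date(2019, 12, 28), date(2019, 12, 29)),
    (9,   date(2020, 1, 1),   date(2020, 1, 1)),
    (776, date(2020, 1, 4),   date(2020, 1, 5)),
    (7,   date(2020, 1, 6),   date(2020, 1, 6)),
    (777, date(2020, 1, 11),  date(2020, 1, 12)),
]

comparison = []     # (id, end seen by lag(), latest end among all earlier rows)
running_max = None  # latest ends_on seen so far
for i, (pid, starts_on, ends_on) in enumerate(rows):
    lag_end = rows[i - 1][2] if i > 0 else None
    comparison.append((pid, lag_end, running_max))
    running_max = ends_on if running_max is None else max(running_max, ends_on)

for pid, lag_end, max_end in comparison:
    print(pid, lag_end, max_end)
```

For id 9 the previous row ends on 2019-12-29, which looks like a gap before 2020-01-01, while the latest end among all earlier rows is 2020-01-04, so the island actually continues.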

Fortunately, you can use a cumulative max() for this:

SELECT p.id, p.starts_on, p.ends_on, grp,
      (Max(ends_on) OVER (PARTITION BY grp) - Min(starts_on) OVER (PARTITION BY grp) 
      ) + 1 AS duration, Array_agg(p.id) OVER (PARTITION BY grp) 
FROM (SELECT p.*,
            Count(*) FILTER (WHERE prev_eo < starts_on - INTERVAL '1 day') OVER
                (PARTITION BY 1 
                  ORDER BY starts_on
                ) AS grp 
      FROM (SELECT p.*,
                  MAX(ends_on) OVER (ORDER BY starts_on ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_eo 
            FROM (SELECT p.id, p.starts_on, p.ends_on 
                  FROM periods p
                  WHERE starts_on > '2019-12-15' AND
                        starts_on < '2020-01-15'
                 ) p 
          ) p 
  ) p;
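The grouping logic of the query can also be sketched in plain Python, which may help when checking results by hand. This is a minimal sketch, not the query itself: the function name `islands` and the dictionary keys are hypothetical, and it assumes ends_on is never earlier than starts_on. A new island starts whenever the latest end date seen so far lies more than one day before the current start date.

```python
from datetime import date, timedelta

def islands(periods):
    """Group (id, starts_on, ends_on) tuples into islands of
    overlapping or directly adjacent periods."""
    groups = []
    max_end = None  # latest ends_on among all rows processed so far
    for pid, starts_on, ends_on in sorted(periods, key=lambda p: p[1]):
        # Same test as in the query: prev_eo < starts_on - 1 day opens a new island.
        if max_end is None or max_end < starts_on - timedelta(days=1):
            groups.append({"ids": [], "start": starts_on, "end": ends_on})
        max_end = ends_on if max_end is None else max(max_end, ends_on)
        group = groups[-1]
        group["ids"].append(pid)
        group["end"] = max(group["end"], ends_on)
    for group in groups:
        # Inclusive day count, like (max(ends_on) - min(starts_on)) + 1.
        group["duration"] = (group["end"] - group["start"]).days + 1
    return groups

rows = [
    (678, date(2019, 12, 21), date(2019, 12, 22)),
    (534, date(2019, 12, 23), date(2020, 1, 4)),
    (679, date(2019, 12, 28), date(2019, 12, 29)),
    (9,   date(2020, 1, 1),   date(2020, 1, 1)),
    (776, date(2020, 1, 4),   date(2020, 1, 5)),
    (7,   date(2020, 1, 6),   date(2020, 1, 6)),
    (777, date(2020, 1, 11),  date(2020, 1, 12)),
]

result = islands(rows)
for grp, group in enumerate(result):
    print(grp, group["duration"], group["ids"])
```

On the sample rows this yields two islands: ids 678, 534, 679, 9, 776 and 7 with a duration of 17 days, and id 777 with a duration of 2 days, matching the desired output in the question.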

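I'm not sure what PARTITION BY 1 is supposed to do (partitioning by a constant puts all rows into a single partition), but I did not include it in the max() window.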

Here is a rextester.

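Anticipating your next question, here is one challenge: if start dates are equal, the cumulative max is not stable. In that case, you either want to remove the duplicates or make the ordering used for the cumulative max stable.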
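A small Python sketch of that last point, using two hypothetical periods that share a start date (they are not in the question's data, and the helper `previous_max_ends` is my own name). It shows that the "latest earlier end" each row sees depends on the order of the tied rows, and that sorting on a unique tie-breaker pins the order down.

```python
from datetime import date

def previous_max_ends(periods):
    """For each row, in the given order, return (id, latest ends_on among earlier rows)."""
    out, running_max = [], None
    for pid, starts_on, ends_on in periods:
        out.append((pid, running_max))
        running_max = ends_on if running_max is None else max(running_max, ends_on)
    return out

# Two hypothetical periods with the same start date.
a = (1, date(2020, 1, 1), date(2020, 1, 10))
b = (2, date(2020, 1, 1), date(2020, 1, 2))

# Ordering by starts_on alone allows either order of a and b,
# and the two orders give different "previous max" values.
order_ab = previous_max_ends([a, b])
order_ba = previous_max_ends([b, a])

# Adding a unique tie-breaker (here the id) makes the order deterministic.
stable = previous_max_ends(sorted([b, a], key=lambda p: (p[1], p[0])))

print(order_ab)
print(order_ba)
print(stable)
```

In the query, the analogous change would presumably be ORDER BY starts_on, id in the max() window; that is my reading of "stable", not something the answer spells out.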
