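I have the following code. Ultimately I want to summarize my raw data: overlapping periods should be merged, and the minimum and maximum dates of each merged period should be output.
The code below gives the longest PATH for each ID. So far, so good. Unfortunately, I don't know how to use the ID to determine the longest overall PATH per group.
Original table: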
ID | DATEN | BEGINN_ZEITSTEMPEL | ENDE_ZEITSTEMPEL | GRUPPE |
---|---|---|---|---|
1 | Datensatz1A | 2023-05-01 00:00:00 | 2023-06-01 00:00:00 | 1 |
2 | Datensatz2A | 2023-04-01 00:00:00 | 2023-05-15 00:00:00 | 1 |
3 | Datensatz3A | 2023-03-01 00:00:00 | 2023-05-30 00:00:00 | 1 |
4 | Datensatz4A | 2023-02-01 00:00:00 | 2023-05-29 00:00:00 | 1 |
5 | Datensatz5A | 2023-05-14 00:00:00 | 2023-05-15 00:00:00 | 2 |
6 | Datensatz1B | 2023-05-29 00:00:00 | 2023-05-30 00:00:00 | 2 |
7 | Datensatz2B | 2023-05-01 00:00:00 | 2023-08-01 00:00:00 | 3 |
8 | Datensatz3B | 2023-05-01 00:00:00 | 2023-09-01 00:00:00 | 3 |
9 | Datensatz4B | 2023-05-01 00:00:00 | 2023-06-01 00:00:00 | 3 |
10 | Datensatz5B | 2021-05-01 00:00:00 | 2022-06-01 00:00:00 | 3 |
Final table:
START_ID | DATEN | BEGINN_ZEITSTEMPEL | ENDE_ZEITSTEMPEL | GRUPPE | PATH |
---|---|---|---|---|---|
1 | Datensatz1A | 2023-02-01 00:00:00 | 2023-06-01 00:00:00 | 1 | 2 -> 4 -> 3 -> 1 |
5 | Datensatz5A | 2023-05-14 00:00:00 | 2023-05-15 00:00:00 | 2 | 5 |
6 | Datensatz1B | 2023-05-29 00:00:00 | 2023-05-30 00:00:00 | 2 | 6 |
7 | Datensatz3B | 2023-05-01 00:00:00 | 2023-09-01 00:00:00 | 3 | 9 -> 7 -> 8 |
10 | Datensatz5B | 2021-05-01 00:00:00 | 2022-06-01 00:00:00 | 3 | 10 |
WITH CTE as (
Select 1 as ID, 'Datensatz1A' as Daten, TO_DATE('2023-05-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Beginn_Zeitstempel, TO_DATE('2023-06-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Ende_Zeitstempel, 1 as Gruppe FROM DUAL
UNION Select 2 as ID, 'Datensatz2A' as Daten, TO_DATE('2023-04-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Beginn_Zeitstempel, TO_DATE('2023-05-15 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Ende_Zeitstempel, 1 as Gruppe FROM DUAL
UNION Select 3 as ID, 'Datensatz3A' as Daten, TO_DATE('2023-03-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Beginn_Zeitstempel, TO_DATE('2023-05-30 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Ende_Zeitstempel, 1 as Gruppe FROM DUAL
UNION Select 4 as ID, 'Datensatz4A' as Daten, TO_DATE('2023-02-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Beginn_Zeitstempel, TO_DATE('2023-05-29 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Ende_Zeitstempel, 1 as Gruppe FROM DUAL
UNION Select 5 as ID, 'Datensatz5A' as Daten, TO_DATE('2023-05-14 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Beginn_Zeitstempel, TO_DATE('2023-05-15 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Ende_Zeitstempel, 2 as Gruppe FROM DUAL
UNION Select 6 as ID, 'Datensatz1B' as Daten, TO_DATE('2023-05-29 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Beginn_Zeitstempel, TO_DATE('2023-05-30 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Ende_Zeitstempel, 2 as Gruppe FROM DUAL
UNION Select 7 as ID, 'Datensatz2B' as Daten, TO_DATE('2023-05-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Beginn_Zeitstempel, TO_DATE('2023-08-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Ende_Zeitstempel, 3 as Gruppe FROM DUAL
UNION Select 8 as ID, 'Datensatz3B' as Daten, TO_DATE('2023-05-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Beginn_Zeitstempel, TO_DATE('2023-09-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Ende_Zeitstempel, 3 as Gruppe FROM DUAL
UNION Select 9 as ID, 'Datensatz4B' as Daten, TO_DATE('2023-05-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Beginn_Zeitstempel, TO_DATE('2023-06-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Ende_Zeitstempel, 3 as Gruppe FROM DUAL
UNION Select 10 as ID, 'Datensatz5B' as Daten, TO_DATE('2021-05-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Beginn_Zeitstempel, TO_DATE('2022-06-01 00:00:00', 'YYYY-MM-DD HH24:MI:SS') as Ende_Zeitstempel, 3 as Gruppe FROM DUAL
),
RecursiveCTE (Start_ID, End_ID, Daten, Beginn_Zeitstempel, Ende_Zeitstempel, Gruppe, Path) AS (
  SELECT
    ID,
    ID,
    Daten,
    Beginn_Zeitstempel,
    Ende_Zeitstempel,
    Gruppe,
    CAST(ID AS VARCHAR2(100)) AS PATH
  FROM CTE
  WHERE NOT EXISTS (
    SELECT 1
    FROM CTE c
    WHERE c.ID = CTE.ID
      AND c.Gruppe = CTE.Gruppe
      AND c.Beginn_Zeitstempel < CTE.Beginn_Zeitstempel
  )
  UNION ALL
  SELECT
    r.Start_ID,
    c.ID,
    c.Daten,
    LEAST(r.Beginn_Zeitstempel, c.Beginn_Zeitstempel),
    GREATEST(r.Ende_Zeitstempel, c.Ende_Zeitstempel),
    c.Gruppe,
    r.Path || ' -> ' || c.ID AS Pfad
    --r.Path || c.ID AS PATH
  FROM RecursiveCTE r
  JOIN CTE c ON r.End_ID <> c.ID
    AND r.Gruppe = c.Gruppe
    AND r.Ende_Zeitstempel >= c.Beginn_Zeitstempel
    AND r.Ende_Zeitstempel <= c.Ende_Zeitstempel
    AND INSTR(r.Path, c.ID) = 0
),
-- SELECT * from RecursiveCTE
MaxPathCTE AS (
  SELECT
    Start_ID,
    Gruppe,
    MAX(PATH) KEEP (DENSE_RANK LAST ORDER BY LENGTH(PATH)) AS MAX_PATH
  FROM RecursiveCTE
  GROUP BY Start_ID, Gruppe
)
SELECT
  r.Start_ID,
  r.Daten,
  r.Beginn_Zeitstempel,
  r.Ende_Zeitstempel,
  r.Gruppe,
  r.Path
FROM RecursiveCTE r
INNER JOIN MaxPathCTE m ON r.Start_ID = m.Start_ID AND r.Gruppe = m.Gruppe AND r.Path = m.MAX_PATH
ORDER BY r.Start_ID;
From Oracle 12c onward, you can use MATCH_RECOGNIZE to perform row-by-row pattern matching:
SELECT *
FROM   table_name
MATCH_RECOGNIZE (
  PARTITION BY Gruppe
  ORDER BY Ende_Zeitstempel DESC, Beginn_Zeitstempel DESC
  MEASURES
    FIRST(id) AS start_id,
    LAST(ID) AS last_id,
    COUNT(*) AS num_rows,
    MIN(Beginn_Zeitstempel) AS Beginn_Zeitstempel,
    FIRST(Ende_Zeitstempel) AS Ende_Zeitstempel
  PATTERN (overlapping* non_overlapping)
  DEFINE
    overlapping AS MIN(Beginn_Zeitstempel) <= NEXT(Ende_Zeitstempel)
)
Which, for the sample data:
CREATE TABLE table_name (id, daten, Beginn_Zeitstempel, Ende_Zeitstempel, Gruppe) AS
Select 1, 'Datensatz1A', DATE '2023-05-01', DATE '2023-06-01', 1 FROM DUAL UNION ALL
Select 2, 'Datensatz2A', DATE '2023-04-01', DATE '2023-05-15', 1 FROM DUAL UNION ALL
Select 3, 'Datensatz3A', DATE '2023-03-01', DATE '2023-05-30', 1 FROM DUAL UNION ALL
Select 4, 'Datensatz4A', DATE '2023-02-01', DATE '2023-05-29', 1 FROM DUAL UNION ALL
Select 5, 'Datensatz5A', DATE '2023-05-14', DATE '2023-05-15', 2 FROM DUAL UNION ALL
Select 6, 'Datensatz1B', DATE '2023-05-29', DATE '2023-05-30', 2 FROM DUAL UNION ALL
Select 7, 'Datensatz2B', DATE '2023-05-01', DATE '2023-08-01', 3 FROM DUAL UNION ALL
Select 8, 'Datensatz3B', DATE '2023-05-01', DATE '2023-09-01', 3 FROM DUAL UNION ALL
Select 9, 'Datensatz4B', DATE '2023-05-01', DATE '2023-06-01', 3 FROM DUAL UNION ALL
Select 10, 'Datensatz5B', DATE '2021-05-01', DATE '2022-06-01', 3 FROM DUAL
Outputs:
GRUPPE | START_ID | LAST_ID | NUM_ROWS | BEGINN_ZEITSTEMPEL | ENDE_ZEITSTEMPEL |
---|---|---|---|---|---|
1 | 1 | 2 | 4 | 2023-02-01 00:00:00 | 2023-06-01 00:00:00 |
2 | 6 | 6 | 1 | 2023-05-29 00:00:00 | 2023-05-30 00:00:00 |
2 | 5 | 5 | 1 | 2023-05-14 00:00:00 | 2023-05-15 00:00:00 |
3 | 8 | 9 | 3 | 2023-05-01 00:00:00 | 2023-09-01 00:00:00 |
3 | 10 | 10 | 1 | 2021-05-01 00:00:00 | 2022-06-01 00:00:00 |
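If the PATH column from the question is also needed, or MATCH_RECOGNIZE is not available, the same merge can be sketched with analytic functions. This is only a rough alternative, not part of the query above: it assumes the `table_name` sample table, Oracle 11gR2 or later (for LISTAGG), and that periods which merely touch (start equals a previous end) count as overlapping. The names `new_island`, `island_no` and `id_path` are made up for illustration.

```sql
WITH flagged AS (
  SELECT t.*,
         -- A row starts a new merged period when it begins after every
         -- earlier period in the group has already ended. For the first row
         -- the window is empty, MAX is NULL, and the ELSE branch yields 1.
         CASE
           WHEN Beginn_Zeitstempel <= MAX(Ende_Zeitstempel) OVER (
                  PARTITION BY Gruppe
                  ORDER BY Beginn_Zeitstempel, Ende_Zeitstempel, id
                  ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
                )
           THEN 0
           ELSE 1
         END AS new_island
  FROM   table_name t
),
numbered AS (
  SELECT f.*,
         -- Running sum of the flags numbers the merged periods per group.
         SUM(new_island) OVER (
           PARTITION BY Gruppe
           ORDER BY Beginn_Zeitstempel, Ende_Zeitstempel, id
         ) AS island_no
  FROM   flagged f
)
SELECT Gruppe,
       COUNT(*)                AS num_rows,
       MIN(Beginn_Zeitstempel) AS Beginn_Zeitstempel,
       MAX(Ende_Zeitstempel)   AS Ende_Zeitstempel,
       -- IDs of the merged rows, listed in start-time order.
       LISTAGG(id, ' -> ') WITHIN GROUP (
         ORDER BY Beginn_Zeitstempel, Ende_Zeitstempel, id
       )                       AS id_path
FROM   numbered
GROUP BY Gruppe, island_no
ORDER BY Gruppe, MIN(Beginn_Zeitstempel);
```

Note that `id_path` lists the IDs by start time, so the order within a path may differ from the one shown in the question's desired table.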