我是 PostgreSQL 中的
WITH RECURSIVE
新手。我有一个相当标准的递归查询,它遵循邻接列表。例如,如果我有:
1 -> 2
2 -> 3
3 -> 4
3 -> 5
5 -> 6
它产生:
1
1,2
1,2,3
1,2,3,4
1,2,3,5
1,2,3,5,6
我想要的是:
1,2,3,4
1,2,3,5,6
但我不知道如何在 Postgres 中执行此操作。这似乎是“选择最长的路径”或“选择不包含在另一条路径中的路径”。我也许可以看到如何通过连接本身来做到这一点,但这似乎效率很低。
示例查询是:
WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
SELECT g.id, g.link, g.data, 1, ARRAY[g.id], false
FROM graph g
UNION ALL
SELECT g.id, g.link, g.data, sg.depth + 1, path || g.id, g.id = ANY(path)
FROM graph g, search_graph sg
WHERE g.id = sg.link AND NOT cycle
)
SELECT * FROM search_graph;
您已经通过
cycle
找到了触手可及的解决方案,只需在末尾添加谓词即可。
但是将你的中断条件调整一级,目前你附加的一个节点太多了:
WITH RECURSIVE search AS (
SELECT id, link, data, ARRAY[g.id] AS path, (link = id) AS cycle
FROM graph g
WHERE NOT EXISTS (SELECT FROM graph WHERE link = g.id)
UNION ALL
SELECT g.id, g.link, g.data, s.path || g.id, g.link = ANY(s.path)
FROM search s
JOIN graph g ON g.id = s.link
WHERE NOT s.cycle
)
SELECT *
FROM search
WHERE cycle;
-- WHERE cycle IS NOT FALSE; -- alternative if link can be NULL
还包括一个启动条件,如@wildplasser提到的。
cycle
的初始条件是(link = id)
以捕获捷径循环。如果您有 CHECK
约束不允许在表中这样做,则没有必要。
具体实施取决于缺失的细节。
这是假设所有图都以循环或
link IS NULL
终止,并且同一个表中存在从 link
到 id
的 FK 约束。
确切的实施取决于缺失的细节。如果 link
实际上不是链接(没有引用完整性),则需要适应...
只需将额外的子句添加到最终查询中,例如:
WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
SELECT g.id, g.link, g.data, 1, ARRAY[g.id], false
FROM graph g
-- BTW: you should add a START-CONDITION here, like:
-- WHERE g.id = 1
-- or even (to find ALL linked lists):
-- WHERE NOT EXISTS ( SELECT 13
-- FROM graph nx
-- WHERE nx.link = g.id
-- )
UNION ALL
SELECT g.id, g.link, g.data, sg.depth + 1, path || g.id, g.id = ANY(path)
FROM graph g, search_graph sg
WHERE g.id = sg.link AND NOT cycle
)
SELECT * FROM search_graph sg
WHERE NOT EXISTS ( -- <<-- extra condition
SELECT 42 FROM graph nx
WHERE nx.id = sg.link
);
请注意:
not exists(...)
-子句尝试连接与递归并集的第二个leg相同的记录。
所以:它们是相互排斥的。 WITH recursive graph (child, parent) AS (
SELECT 2, 1
UNION
SELECT 3, 2
UNION
SELECT 4, 2
UNION
SELECT 6, 5
UNION
SELECT 7, 6
UNION
SELECT 6, 7
),
paths (start, node, depth, path, has_cycle, terminated) AS (
SELECT
ARRAY[g1.parent],
false,
false
FROM graph g1
WHERE true
AND NOT EXISTS (SELECT 1 FROM graph g2 WHERE g1.parent = g2.child)
UNION ALL
SELECT
p.path || g.child,
g.child = ANY(p.path),
g.parent is null AS terminated
FROM paths p
LEFT OUTER JOIN graph g ON g.parent = p.node
WHERE NOT has_cycle
)
SELECT * from path WHERE terminated
;
因此,技巧是通过使用
terminated
来使用
LEFT OUTER JOIN
列,然后仅选择终止的路径。