我需要为表示流网络的有向图确定Strahler number或Strahler流顺序。我可以得到信息forwards and backwards using WITH RECURSIVE
queries,但似乎我需要做一些不同的事情来确定Strahler数。
例如,这是一个19段流网络,有10个支流和一个插座。每个段的上游部分由节点ID表示。
表格结构中的相同数据,其中段由to_node
连接,对于水池出口是零。
CREATE TABLE streams (
node integer PRIMARY KEY,
to_node integer REFERENCES streams(node),
expected_order integer
);
INSERT INTO streams(node, to_node, expected_order) VALUES
(1, NULL, 4),
(2, 1, 4),
(3, 2, 3),
(4, 2, 3),
(5, 4, 3),
(6, 3, 2),
(7, 3, 2),
(8, 5, 2),
(9, 5, 2),
(10, 6, 1),
(11, 6, 1),
(12, 7, 1),
(13, 7, 1),
(14, 8, 1),
(15, 8, 1),
(16, 9, 1),
(17, 9, 1),
(18, 4, 1),
(19, 1, 1);
Strahler数的预期结果(expected_order
)在这里可视化:
有三条规则(来自GRASS 7.0 Manual):
从我在挖掘中找到的解决这个问题的是这个计算can be done with SQL(除了我认为他们的“SQL脚本”是为MS SQL Server编写的)。但是,我还没有找到可以用PostgreSQL 9.1完成的任务。
我最好的尝试之一是计算每个节点的上游节点数量,这些节点正确识别支流(第一顺序),而不是其他节点:
WITH RECURSIVE search_graph AS (
SELECT node AS start_node, node
FROM streams
-- Connect downstream towards outlet(s)
UNION ALL
SELECT sg.start_node, n.node
FROM streams n
JOIN search_graph sg ON n.to_node = sg.node
)
SELECT start_node, count(sg.node) as upstream_nodes, expected_order
FROM search_graph sg
JOIN streams s ON sg.start_node = s.node
GROUP BY start_node, expected_order
ORDER BY upstream_nodes DESC, start_node;
start_node | upstream_nodes | expected_order
------------+----------------+----------------
1 | 19 | 4
2 | 17 | 4
4 | 9 | 3
3 | 7 | 3
5 | 7 | 3
6 | 3 | 2
7 | 3 | 2
8 | 3 | 2
9 | 3 | 2
10 | 1 | 1
11 | 1 | 1
12 | 1 | 1
13 | 1 | 1
14 | 1 | 1
15 | 1 | 1
16 | 1 | 1
17 | 1 | 1
18 | 1 | 1
19 | 1 | 1
(19 rows)
一个想法是使用nth_value(value any, nth integer)
window function与appropriately set window frame range。但是,我不确定如何设置它,或者是否可以设置它来识别Strahler数字。另一个[不太令人兴奋]的想法是为每个Strahler编号手动运行迭代,我希望我的真实世界数据可以在5到8个订单(迭代)之间。这可以用DO
statement完成。但任何更好的想法都会受到欢迎。
我对CTE有限制。递归CTE不能对自身进行LEFT JOIN。刚刚结束了它的功能。
现场测试:https://www.db-fiddle.com/f/8z58LCVhD62YvkeJjriW8d/0
create or replace function strahler(_parent int) returns table(node int, sn int)
as
$$
select
s.node,
case
-- If the node is a leaf (has no children), its Strahler number is one.
when count(st.*) = 0 then
1
when count(st.*) >= 2 then
case
-- If the node has one child with Strahler number i,
-- and all other children have Strahler numbers less than i,
-- then the Strahler number of the node is i again.
when min(st.sn) < max(st.sn) then
max(st.sn)
-- If the node has two or more children with Strahler number i,
-- and no children with greater number,
-- then the Strahler number of the node is i + 1.
when min(st.sn) = max(st.sn) then
max(st.sn) + 1
end
end as sn
from streams s
left join lateral strahler(s.node) st on true
where _parent = 0 or s.to_node = _parent
group by s.node
$$ language 'sql';
测试:
select st.*, s.expected_order
from strahler(0) st
join streams s on st.node = s.node
order by st.node;
输出:
| node | sn | expected_order |
| ---- | --- | -------------- |
| 1 | 4 | 4 |
| 2 | 4 | 4 |
| 3 | 3 | 3 |
| 4 | 3 | 3 |
| 5 | 3 | 3 |
| 6 | 2 | 2 |
| 7 | 2 | 2 |
| 8 | 2 | 2 |
| 9 | 2 | 2 |
| 10 | 1 | 1 |
| 11 | 1 | 1 |
| 12 | 1 | 1 |
| 13 | 1 | 1 |
| 14 | 1 | 1 |
| 15 | 1 | 1 |
| 16 | 1 | 1 |
| 17 | 1 | 1 |
| 18 | 1 | 1 |
| 19 | 1 | 1 |
这是最初的计划
现场测试:https://www.db-fiddle.com/f/8z58LCVhD62YvkeJjriW8d/1
with recursive search_graph as (
select node as start_node, node
from streams
union all
select sg.start_node, n.node
from streams n
join search_graph sg on n.to_node = sg.node
)
, get_kids as
(
select
s.node as kid,
count(sg.*) - 1 as kid_kids,
s.expected_order
from streams s
join search_graph sg on s.node = sg.start_node
group by s.node, s.expected_order
order by kid_kids
)
, order_safe as
(
select
row_number() over(s) eo,
gk.kid,
gk.kid_kids,
gk_kid.to_node as parent,
gk_p.kid_kids as siblings
from get_kids gk
left join streams gk_kid on gk.kid = gk_kid.node
left join get_kids gk_p on gk_kid.to_node = gk_p.kid
window s as (order by gk_p.kid_kids /* siblings */, gk_kid.to_node /* parent */)
)
select * from order_safe;
输出:
| eo | kid | kid_kids | parent | siblings |
| --- | --- | -------- | ------ | -------- |
| 1 | 11 | 0 | 6 | 2 |
| 2 | 10 | 0 | 6 | 2 |
| 3 | 12 | 0 | 7 | 2 |
| 4 | 13 | 0 | 7 | 2 |
| 5 | 15 | 0 | 8 | 2 |
| 6 | 14 | 0 | 8 | 2 |
| 7 | 17 | 0 | 9 | 2 |
| 8 | 16 | 0 | 9 | 2 |
| 9 | 6 | 2 | 3 | 6 |
| 10 | 7 | 2 | 3 | 6 |
| 11 | 9 | 2 | 5 | 6 |
| 12 | 8 | 2 | 5 | 6 |
| 13 | 5 | 6 | 4 | 8 |
| 14 | 18 | 0 | 4 | 8 |
| 15 | 3 | 6 | 2 | 16 |
| 16 | 4 | 8 | 2 | 16 |
| 17 | 19 | 0 | 1 | 18 |
| 18 | 2 | 16 | 1 | 18 |
| 19 | 1 | 18 | | |
最初的计划是以安全的顺序评估每个节点(将由eo字段促进),从具有较少兄弟节点的节点开始,直到具有许多兄弟节点的节点。然后在将要评估的每个节点上,还将检查其直接子节点(递归CTE将对其自身执行LEFT JOIN),然后执行必要的Strahler的三个条件。但是,CTE有一个限制,递归CTE不能对自己进行LEFT JOIN。