我正在尝试查找与字符串相关的所有元组,而不仅仅是与之匹配的元组。这是我做的:
from itertools import chain
data = [('A','B'),('B','C'),('B','D'),('B','F'),('F','W'),('W','H'),('G','Z')]
init = 'A'
filtered_init = [item for item in data if item[0] == init or item[1] == init]
elements = list(dict.fromkeys([ i for i in chain(*filtered_init)]))
elements.remove(init)
dat = []
for i in elements:
sync = [item for item in data if item[0] == i or item[1] == i]
dat.append(sync)
print(dat)
结果是:
[('A', 'B'), ('B', 'C'), ('B', 'D'), ('B', 'F')]
但是,它仅包含A-B相关级别。我想找到的是与init
字符串相关的所有元组,如下图所示:
换句话说,[('A','B'),('B','C'),('B','D'),('B','F'),('F','W'),('W','H')]
查找所有可到达init
的边。我如何获得它们?
您的问题是在由connected component定义的无向图中找到init
的edge list data structure。
此数据结构不适用于此问题,因此第一步是将其转换为adjacency list。从那里,我们可以应用任何标准的graph traversal算法,例如depth first search。完成后,我们可以将结果转换回想要输出的边缘列表格式。
from collections import defaultdict
def find_connected_component(edge_list, start):
# convert to adjacency list
edges = defaultdict(list)
for a, b in edge_list:
edges[a].append(b)
edges[b].append(a)
# depth-first search
stack = [start]
seen = set()
while stack:
node = stack.pop()
if node not in seen:
seen.add(node)
stack.extend(edges[node])
# convert back to edge list
return [ edge for edge in edge_list if edge[0] in seen ]
用法:
>>> find_connected_component(data, init)
[('A', 'B'), ('B', 'C'), ('B', 'D'), ('B', 'F'), ('F', 'W'), ('W', 'H')]
为了提高效率,您可以使用DSU。此解决方案有效O(N)
from functools import reduce
import random
parent = dict()
init = 'A'
data = [('A','B'),('B','C'),('B','D'),('B','F'),('F','W'),('W','H'),('G','Z')]
def make_set(v):
parent[v] = v
def find_set(v):
if v == parent[v]:
return v
parent[v] = find_set(parent[v])
return parent[v]
def union_sets(a, b):
a, b = map(find_set, [a, b])
if a != b:
if random.randint(0, 1):
a, b = b, a
parent[b] = a;
elements = set(reduce(lambda x, y: x+y, data))
for v in elements:
parent[v] = v
for u, v in data:
union_sets(u, v)
init_set = find_set(init)
edges_in_answer = [e for e in data if find_set(e[0]) == init_set]
print(edges_in_answer)
输出:[(('A','B'),('B','C'),('B','D'),('B','F'),('F ','W'),('W','H')]
非常幼稚的解决方案,对于复杂的树可能不是有效的。
data = [('A', 'B'), ('B', 'C'), ('B', 'D'), ('B', 'F'),
('F', 'W'), ('W', 'H'), ('G', 'Z')]
init = ['A']
result = []
while init:
initNEW = init.copy()
init = []
new = 0
for edge in data:
for vertex in initNEW:
if edge[0] == vertex:
result.append(edge)
init.append(edge[1])
new += 1
for i in range(len(result) - new, len(result)):
data.remove(result[i])
print(result)
# [('A', 'B'), ('B', 'C'), ('B', 'D'), ('B', 'F'), ('F', 'W'), ('W', 'H')]