基于传递关系组合元组

问题描述 投票:0回答:3

我有一个元组列表:

[(10,22), (10,20), (10,69), (34,18), (18,17), (89,990), (86,80), (174,175), (543,542)]

我想得到这样的结果:

[(10,22,20,69), (34,18,17), (89,990), (86, 80), (174,175), (543,542)]

我想将至少有一个共同元素的任何元组组合在一起。我该怎么做?

python tuples
3个回答
3
投票

写的[在问题的初始版本中,后来被编辑]听起来你想要传递闭包,但实际上你还需要一个对称闭包,这样你就可以合并

(10,22)
(10,20)
(10,69)
.

def merge_sets(sets):
    for i in range(len(sets) - 1, -1, -1):
        for j in range(i - 1, -1, -1):
            if sets[i] & sets[j]: # check for intersection (non-empty -> True)
                sets[j] |= sets[i] # merge i-th set into j-th set
                sets.pop(i) # remove i-th set
                break # terminate inner loop and continue with next i
    return sets

tuples = [(10,22), (10,20), (10,69), (34,18), (18,17), (89,990), (86,80), (174,175), (543,542)]
sets = [set(tuple) for tuple in tuples]
merged_sets = merge_sets(sets)
merged_tuples = [tuple(s) for s in merged_sets]
# merged_tuples: [(20, 69, 22, 10), (17, 34, 18), (89, 990), (80, 86), (174, 175), (542, 543)]

这段代码很简单。它首先将元组转换为集合,以便我们可以方便地检查共享元素(集合交集)。最后,我们将集合转换回元组。

函数

merge_sets
将所有集合相互比较。每当两个集合相交时,我们合并它们并继续。

三点值得进一步解释:

  • 我们可以安全地将第 i 个集合合并到第 j 个集合中,因为我们已经将前者与中间的所有集合进行了比较。 (如果我们反过来做,我们将不得不比较我们已经部分比较过的集合:中间可能有一些集合与第 j 个集合相交,但不与第 i 个集合相交。举一个最小的例子,尝试输入
    [(1, 2), (3, 4), (2, 3)]
    (从左到右遍历)或
    [(2, 3), (3, 4), (1, 2)]
    (从右到左遍历,如在我的代码中)。)因为我们合并集合的方向与循环进行的方向相同(此处:从右到左;选择在算法上是任意的,但请参阅下一个要点),我们可以简单地继续循环,而不必在每次合并后都在较早的时间点重新启动它们。
  • 两个循环都倒计时,因为从右侧删除 Python
    list
    元素比从左侧更快。 (为了从左边快速移除,我们可以使用
    collections.deque
    。)
  • 为什么这个问题最好用传统的循环结构来解决(而不是更高级的 Python 结构,例如列表推导或
    map
    )的原因是我们正在处理可变对象(我们将修改列表以及包含项目)。即使使用传统的循环,我们也需要小心地以正确的方式修改项目(通过在循环进程的方向上合并它们;参见第一个要点)。

感谢用户 Crazy Chucky 提醒我非空集的计算结果为

True
.


0
投票

compatible 元组链接在一起,即至少共享一个元素的元组。每条链将是原始数据的连接组件。

def cc_from_tuples(pairs:list[tuple]) -> list[tuple]:
    """Get Connected Components"""
    ccs = []
    # new instance of the data
    s_pairs = set(pairs) # or pairs.copy()
    # iterative classification of the pairs
    while s_pairs:
        # initialize connected component
        cc = set(s_pairs.pop())
        # temporary set
        s_pairs_tmp = set()
        # classify tuples as either part of the cc or as next candidate cc
        for t in s_pairs:
            if cc.intersection(t):
                cc.update(t)
            else:
                s_pairs_tmp.add(t)
        # add completed connected component
        ccs.append(tuple(cc)) # casting
        # iteration step
        s_pairs = s_pairs_tmp

    return css


ts = [(10,22), (10,20), (10,69), (34,18), (18,17), (89,990), (86,80), (174,175), (543,542)]
cc_from_tuples(ts)
#[(17, 18, 34), (10, 20, 69, 22), (542, 543), (89, 990), (80, 86), (174, 175)]

ts = [(1, 0), (2, 3), (1, 4)]
cc_from_tuples(ts)
#[(0, 1, 4), (2, 3)]

ts = [(1, 0), (2, 3), (1, 2)]
cc_from_tuples(ts)
#[(0, 1, 2, 3)]

如果结果需要按元组长度递减排序,修改函数的返回语句如下

return sorted(ccs, key=len, reverse=True)


0
投票

我想知道您是否可以通过采用 Lover of Structure 的答案 背后的想法并移动 O(n^2) 部分以对整数进行运算来加快速度。这是一个尝试。在这个版本中,我不需要假设输入是一个集合列表:

from collections import defaultdict

tuples = [(10, 22), (10, 20), (10, 69), (34, 18), (18, 17),
          (89, 990), (86, 80), (174, 175), (543, 542)]
values_to_index = defaultdict(int)

# This part is a single pass over the elements
for i, t in enumerate(tuples):
    for v in t:
        values_to_index[v] |= (1 << i)

# Merging the sets is done with integer operations
# Instead of popping elements, it seems simpler to construct
# a new list, since that will be shorter than the input,
# reducing the number of searches
merged_sets = []
for s in values_to_index.values():
    # To avoid unnecessary checks for disjointness, keep track of the
    # elements that merge pre-existing sets and clean up as you go
    merges = []
    for i in range(len(merged_sets)):
        if merged_sets[i] & s:
            merged_sets[i] |= s
            merges.append(i)
    if len(merges) == 0:
        merged_sets.append(s)
    elif len(merges) > 1:
        k = merges[0]
        for i in merges[:0:-1]:
            merged_sets[k] |= merged_sets.pop(i)

# Now we have a list of integers telling us which set each tuple goes to
output = [set() for _ in range(len(merged_sets))]
for output_index, bits in enumerate(merged_sets):
    while bits:
        b = bits & (~bits + 1)
        output[output_index].update(tuples[b.bit_length() - 1])
        bits ^= b
result = [tuple(s) for s in output]

位迭代器由 https://stackoverflow.com/a/8898977/2988730.

提供
© www.soinside.com 2019 - 2024. All rights reserved.