Efficient Python way to find all distinct intersections and differences between two lists of arrays


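This is a generalization of my previous question. I need to find all intersections and differences between two different lists of arrays. I have a deficient working version that is neither very Pythonic nor efficient. Its main problem is that it relies on sorted lists, so I lose the original order within the lists, which is very important to preserve.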

import itertools as it

import numpy as np


def decomposition(row1, row2):
    '''Function that decomposes two sets in three sets formed by the intersection and the two non intersected parts
    Example
    -------
    >>> t1 = np.array([1,2,3])
    >>> t2 = np.array([3,4,5,6])  
    >>> decomposition(t1, t2)
    [[1, 2], [3], [4, 5, 6]]
    '''
    # .tolist() yields built-in Python ints rather than NumPy scalars
    d1 = np.setdiff1d(row1, row2).tolist()
    d2 = np.intersect1d(row1, row2).tolist()
    d3 = np.setdiff1d(row2, row1).tolist()
    ds = (d1, d2, d3)
    return [d for d in ds if d]

def unpythonic(x, y):
    brute = [decomposition(a, b) for a in x for b in y]
    brute_flat = [item for sublist in brute for item in sublist]
    brute_flat.sort()
    flatty = list(k for k,_ in it.groupby(brute_flat))
    almost_final = []
    for c in range(len(flatty)):
        for d in range(c+1, len(flatty)):
            if not set(flatty[c]).isdisjoint(set(flatty[d])):
                almost_final.append(decomposition(flatty[c], flatty[d]))
    almost_final_flat = [item for sublist in almost_final for item in sublist]
    almost_final_flat.sort()
    final = list(k for k,_ in it.groupby(almost_final_flat))
    if not final:
        final = flatty
    return final

Applying the function to this example:

u = [np.array([0, 6, 7, 10]), np.array([1, 2, 5, 9])]
v = [np.array([7, 10]), np.array([0, 3, 4, 5])]

we obtain:

unpythonic(u, v)
[[0], [1, 2, 9], [3, 4], [5], [6], [7, 10]]

But the desired result should follow the original order of the lists of arrays, say, considering u first and then v:

[[0], [6], [7, 10], [1, 2, 9], [5], [3, 4]]

Any suggestions? Thanks in advance!

EDIT: I just noticed that the unpythonic function is not even correct! I have now applied it to the new example v = [np.array([7, 10]), np.array([8]), np.array([0, 3, 4, 5])], and it did not work well :(


python numpy set intersection difference
1 Answer

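There is no need to keep the arguments of unpythonic separate: the result is the same as when x and y are joined into a single list. So you have a collection of sets and want to find its separate pieces. To find them, iterate over all elements and record, for each value, which sets it was encountered in. Then, if two elements have exactly the same set of containing sets (for example, both occur in the second and the fourth set), they are returned together as one group.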

from collections import defaultdict


def pythonic(s):
    """
    >>> pythonic([[0, 6, 7, 10], [1, 2, 5, 9], [7, 10], [0, 3, 4, 5]])
    [[0], [6], [7, 10], [1, 2, 9], [5], [3, 4]]
    >>> pythonic([[7, 10], [8], [0, 3, 4, 5], [0, 6, 7, 10], [1, 2, 5, 9]])
    [[7, 10], [8], [0], [3, 4], [5], [6], [1, 2, 9]]
    >>> pythonic([[0, 1, 4, 5], [1, 2, 3, 4], [3, 4, 5, 6]])
    [[0], [1], [4], [5], [2], [3], [6]]
    """
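    # Map each element to the indices of the input lists that contain it.
    all_elements = defaultdict(list)
    for i, ss in enumerate(s):
        for elem in ss:
            all_elements[elem].append(i)
    # Group elements that share exactly the same set of indices.
    # Named "groups" to avoid shadowing the built-in reversed().
    groups = defaultdict(list)
    for k, v in all_elements.items():
        groups[frozenset(v)].append(k)  # a tuple could be used, but "frozenset" feels "safer"
    # dicts keep insertion order (Python 3.7+), so the original order is preserved
    return list(groups.values())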
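
To make the "same set of sets" idea easier to inspect, here is a minimal sketch of a variant that also reports which input lists each group came from. The name `partition_with_sources` is hypothetical and used only for this illustration. It assumes the elements are hashable, and it calls `.tolist()` on any input that provides that method (as NumPy arrays do), so the groups contain plain Python values. The input below is u joined with the v from the question's EDIT, written as plain lists.

```python
from collections import defaultdict


def partition_with_sources(sequences):
    """Return (source_indices, values) pairs, one per distinct membership pattern.

    Hypothetical helper name, used only for this sketch.
    """
    membership = defaultdict(set)
    for index, seq in enumerate(sequences):
        # NumPy arrays provide tolist(), which yields built-in Python scalars;
        # plain lists and tuples are iterated as they are.
        values = seq.tolist() if hasattr(seq, "tolist") else seq
        for value in values:
            membership[value].add(index)

    # dicts keep insertion order (Python 3.7+), so groups appear in the
    # order in which their first element was encountered.
    groups = defaultdict(list)
    for value, indices in membership.items():
        groups[frozenset(indices)].append(value)
    return [(sorted(indices), values) for indices, values in groups.items()]


u = [[0, 6, 7, 10], [1, 2, 5, 9]]
v = [[7, 10], [8], [0, 3, 4, 5]]
result = partition_with_sources(u + v)
for sources, values in result:
    print(sources, values)
# [0, 4] [0]
# [0] [6]
# [0, 2] [7, 10]
# [1] [1, 2, 9]
# [1, 4] [5]
# [3] [8]
# [4] [3, 4]
```

Each group is listed with the indices of the inputs that contain it, so [7, 10] is shown as shared by inputs 0 and 2. Dropping the indices gives the same kind of output as pythonic.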