我想计算两个数组的累积交集。例如
> cumulative_intersection([1,3,4,5,2,6], [4,1,5,2,3,7])
[[], [1], [1,4], [1,4,5], [1,4,5,3,2], [1,4,5,3,2]]
我可以使用循环强行实现这一点。例如,在Python中,
a1 = [1,3,4,5,2,6]
a2 = [4,1,5,2,3,7]
n = len(a1)
all_intersects = list()
for i in range(n):
intersect = np.intersect1d(a1[:i], a2[:i])
all_intersects.append(intersect)
是否有更有效的算法来计算?我的直觉是可以对数减少计算量,因为每个连续的列表都是前一个列表的超集。
如果将交集算法替换为不对结果进行排序(从而花费线性时间),则蛮力算法的渐近复杂度变为O(n 2),这已经是问题的下界渐近线复杂。
为了演示此下限,为便于讨论,我们假设您有一个O(1)方法来确定需要向第i个交集添加哪些数字才能获得第(i + 1)个交集。因此,如果C(i)是复制第i个交点的时间,则您的总运行时间为
现在,复制数组是O(n)操作,在两个数组相同的最坏情况下,第i个交点的长度为i + 1。因此,最糟糕的情况是:
话虽这么说,您可以通过修改O(n)贪婪算法以生成两个列表的交集,以一个恒定因子击败天真的算法的运行时间。
def intersection(a: list,b: list): ''' Computes the interection of lists a and b. Assumes that a and b have the same length''' a_leftover = set() b_leftover = set() stored = set() # We only want unique values n = len(a) result = [] for i in range(n): a_elm = a[i] b_elm = b[i] if a_elm not in stored and a_elm == b_elm: result.append(a_elm) stored.add(a_elm) else: if a_elm not in stored: if a_elm in b_leftover: # a_elm was previously encountered in b result.append(a_elm) stored.add(a_elm) b_leftover.remove(a_elm) else: a_leftover.add(a_elm) if b_elm not in stored: if b_elm in a_leftover: # b_elf was previously encountered in a result.append(b_elm) stored.add(b_elm) a_leftover.remove(b_elm) else: b_leftover.add(b_elm) return result
您需要做的就是修改算法以存储并返回中间结果。
def cumulative_intersection(a: list,b: list):
''' Computes the cumulative interection of lists a and b. Assumes that a and b have the same
length'''
a_leftover = set()
b_leftover = set()
stored = set() # We only want unique values
n = len(a)
result = []
results = []
for i in range(n):
a_elm = a[i]
b_elm = b[i]
if a_elm not in stored and a_elm == b_elm:
result.append(a_elm)
stored.add(a_elm)
else:
if a_elm not in stored:
if a_elm in b_leftover:
# a_elm was previously encountered in b
result.append(a_elm)
stored.add(a_elm)
b_leftover.remove(a_elm)
else:
a_leftover.add(a_elm)
if b_elm not in stored:
if b_elm in a_leftover:
# b_elf was previously encountered in a
result.append(b_elm)
stored.add(b_elm)
a_leftover.remove(b_elm)
else:
b_leftover.add(b_elm)
results.append(result[:])
return results
也许是这样: