How to find repeating patterns in an array using Python

Question

I want to find the repeating sequence in an array:

[1102, 200, 250, 648, 200, 22, 223,
 5, 648, 98, 102, 22, 223,
 5, 648, 98, 102, 22, 223, 5, 648, 98, 102, 22, 223,
 5, 648, 98, 102, 22, 223, 5, 648, 98, 102, 22]

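As you can see, some numbers in the array above repeat: the run 5, 648, 98, 102, 22, 223 keeps recurring. Note, however, that I do not know this run in advance. All I have is an array produced by some computation, and my task is to find the short segment of it that repeats.

So far I have tried NumPy, but that requires the short sequence to be given beforehand so it can be searched for in the main array.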

Answer 1

This can be done with Floyd's cycle-finding algorithm, also known as the "tortoise and hare" algorithm.
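A minimal sketch of Floyd's algorithm follows, as an illustration rather than the answerer's own code. It assumes the sequence is generated by a successor function, so that each value fully determines the next one. The raw array in the question does not satisfy this directly (200 is followed by 250 in one place and by 22 in another), so the values would first need to be mapped to states with a unique successor. The names `floyd_cycle` and `successor`, and the toy mapping, are hypothetical.

```python
def floyd_cycle(f, x0):
    """Return (cycle_length, cycle_start) for the sequence x0, f(x0), f(f(x0)), ..."""
    # Phase 1: the hare moves two steps per tortoise step until they meet inside the cycle.
    tortoise = f(x0)
    hare = f(f(x0))
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))

    # Phase 2: restart the tortoise at x0; moving both one step at a time,
    # they meet at the first element of the cycle.
    cycle_start = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        cycle_start += 1

    # Phase 3: walk the hare once around the cycle to measure its length.
    cycle_length = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        cycle_length += 1

    return cycle_length, cycle_start


# Toy successor mapping (an assumption for illustration): 0 -> 1 -> 2 -> 3 -> 4 -> 2 -> ...
successor = {0: 1, 1: 2, 2: 3, 3: 4, 4: 2}
print(floyd_cycle(successor.__getitem__, 0))  # → (3, 2)
```

The result reads as "a cycle of length 3 that begins after 2 steps". The algorithm needs only constant extra memory, which is its main appeal when the sequence is produced on the fly.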

Answer 2

You can compute the autocorrelation of the data.

For example, using pandas' autocorrelation_plot:

from pandas.plotting import autocorrelation_plot

# lst is the input list from the question; plotting requires matplotlib
autocorrelation_plot(lst)

The peak at lag=6 indicates a positive correlation at a lag of 6 items.
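To read the lag off numerically instead of from a plot, the same quantity can be computed by hand. The sketch below uses the standard sample autocorrelation (centered products divided by the total sum of squares); `autocorr` and `best_lag` are hypothetical helper names, and the toy `signal` is made up for illustration.

```python
def autocorr(values, lag):
    """Sample autocorrelation of values at the given lag (1 <= lag < len(values))."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    c0 = sum(d * d for d in centered)
    return sum(centered[t] * centered[t + lag] for t in range(n - lag)) / c0


def best_lag(values, max_lag=None):
    """Return the lag in 1..max_lag with the highest autocorrelation."""
    if max_lag is None:
        max_lag = len(values) // 2
    return max(range(1, max_lag + 1), key=lambda lag: autocorr(values, lag))


# Toy signal with an exact period of 6 (made-up data, not the question's array).
signal = [1, 5, 2, 8, 3, 9] * 5
print(best_lag(signal))  # → 6
```

For this toy signal the autocorrelation at lag 6 is about 0.8 rather than 1, because only 24 of the 30 samples have a partner six positions later. A high autocorrelation suggests a period but does not prove an exact repeat, so the candidate lag is worth checking against the data.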

Another approach is to iterate over sublists and count how often each one occurs.

For example, using itertools.batched and collections.Counter:

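from collections import Counter
from itertools import chain
from itertools import batched  # Python >= 3.12; see the recipe below for older versions

# lst is the input list from the question.
# Split lst into consecutive, non-overlapping chunks of every size n from 2 to
# len(lst) // 2 and count how often each chunk occurs.
c = Counter(chain.from_iterable(batched(lst, n) for n in range(2, len(lst)//2+1)))

c.most_common(1)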

Output reported in the answer:

[((223, 5, 648, 98, 102, 22), 7)]

For Python < 3.12, use the batched recipe:

from itertools import islice

def batched(iterable, n):
    # batched('ABCDEFG', 3) --> ABC DEF G
    if n < 1:
        raise ValueError('n must be at least one')
    it = iter(iterable)
    while batch := tuple(islice(it, n)):
        yield batch
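Counting batches only sees chunks aligned to multiples of n from the start of the list. As an alternative sketch, not taken from either answer, the following brute-force check looks for the shortest period that the tail of the list follows. It assumes the data eventually settles into an exact repeat that runs to the end of the list; `find_trailing_period` and `min_repeats` are hypothetical names.

```python
def find_trailing_period(seq, min_repeats=2):
    """Return (start, period) for the shortest period followed by the tail of seq,
    or None if no period covers at least min_repeats repetitions."""
    n = len(seq)
    for period in range(1, n // min_repeats + 1):
        # Extend the periodic tail backwards while seq[i] == seq[i + period].
        start = n - period
        while start > 0 and seq[start - 1] == seq[start - 1 + period]:
            start -= 1
        if n - start >= min_repeats * period:
            return start, period
    return None


data = [1102, 200, 250, 648, 200, 22, 223,
        5, 648, 98, 102, 22, 223,
        5, 648, 98, 102, 22, 223, 5, 648, 98, 102, 22, 223,
        5, 648, 98, 102, 22, 223, 5, 648, 98, 102, 22]

start, period = find_trailing_period(data)
print(start, period)                 # → 5 6
print(data[start:start + period])    # → [22, 223, 5, 648, 98, 102]
```

The reported pattern is a rotation of 5, 648, 98, 102, 22, 223 because the periodic part already begins at the 22 at index 5. The search is quadratic in the worst case, which is fine for short arrays like this one.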
