Python 单个循环花费的时间怎么可能比多个循环花费的时间还要多？

Question

我有 2 个代码可以完成类似的任务：

class Solution1:
    def f(self, s: str) -> bool:
        def calculate(s, i, digit, length, maxi):
            count = 1
            while i < length:
                if s[i] == digit: count += 1
                else: break
                i += 1
            
            return max(maxi, count), i

        
        maxi1 = 0
        maxi0 = 0
        length = len(s)
        i = 0
        while i<length:
            if s[i] == '0': maxi0, i = calculate(s,i,'0', length, maxi0)
            else: maxi1, i = calculate(s,i,'1', length, maxi1)
        
        return maxi1 > maxi0


class Solution3:
    def f(self, s: str) -> bool:
        s1 = s.split('0')
        s0 = s.split('1')
        r1 = max([len(i) for i in s1])
        r0 = max([len(i) for i in s0])
        return r1>r0

使用

%timeit

,

运行以下代码

x = '1100011111100111010111'*10

sol1 = Solution1().f
sol3 = Solution3().f

% timeit -r 30 -n 3000 sol1(x,)
% timeit -r 30 -n 3000 sol3(x,)

它给了我，

3000 loops, best of 30: 77.5 µs per loop
3000 loops, best of 30: 29.6 µs per loop

奇怪的是，第二个代码几乎花了一半的时间。

split()

迭代循环一次并已使用两次，然后

len

为每个子部分占用

O(N)

时间，并为每个子字符串调用，这意味着总共

O(N)

时间，除此之外，

max

两者的工作原理完全相同。有人能解释一下这是怎么发生的吗？是

cpython

在幕后为内置函数做一些事情吗？

令我惊讶的是：

class Solution4:
    def f(self, s: str) -> bool:
        return len(max(s.split('0'))) > len(max(s.split('1')))

需要 7x 更少的时间

max

用于

strings

+

split() twice

。

Answer 1

kindall 的评论中已经提供了一个代码如何可能比另一个代码慢的问题的答案。为了完整起见，这里再次：

像 max() 这样的内置函数是用 C 编写的，比等效的 Python 代码快得多。那么您的解决方案示例并不严格执行与其他两个相同的操作（它检查最大的字符串，而不是具有最大长度的字符串）。在这种情况下，它们在功能上是等效的（因为由五个零组成的字符串比由三个零组成的字符串“更大”），但比较两个字符串（如您所知）比两个函数调用及其结果的比较更快。 – 亲切

有两种方法可以更快地完成工作：

优化所使用方法的代码
选择完全不同的方法

因此，我们不必想知道使用其他函数/方法/模块执行相同的操作可以显着缩短执行时间，而是让我们惊讶的是选择另一种方法/方法来解决相同的问题也可以做到这一点。下面是另一种方法的代码以及有关解决方案背后的核心思想的解释：

在寻找解决方案时，往往有一些最明显的事情人们不会考虑。这里对同一任务的讨论计算Python中重复序列的最长出现次数令人惊讶的是没有提到下面提供的方法。

称为“解决方案 5”的方法采用相反的方法，即在字符串的所有子字符串的生成列表中搜索最大子字符串长度。

它会迭代增加子字符串长度来搜索子字符串的第一次出现。使用高度优化的子字符串搜索并仅查找其第一次出现，与解决方案 4 相比，在字符串较长的情况下花费的时间更少，而对于非常长的字符串则花费的时间极少（对于非常短的字符串 f4() 似乎更快） .

因此，让我们将最快的解决方案 f4() 与解决同一问题的另一种方法 f5() 进行比较。

class Solution4:
    def f4(self, s: str) -> bool:
        return len(max(s.split('0'))) > len(max(s.split('1')))

def f4(s: str) -> bool: # cut out time for class overhead
    return len(max(s.split('0'))) > len(max(s.split('1')))

def f5(s):
    foundMax0=False; foundMax1=False
    for l in range(1,len(s)+1):
        if not foundMax0: 
            # print(f'{l=}, {s.find("0"*l)=}') 
            if s.find("0"*l) is -1: 
                max0=l-1; foundMax0=True
        if not foundMax1: 
            # print(f'{l=}, {s.find("1"*l)=}') 
            if s.find("1"*l) is -1: 
                max1=l-1; foundMax1=True
        if foundMax0 and foundMax1:
            break
    return max1 > max0

from time import perf_counter as T
s = '1100011111100111010111'*10 # *1000
sT=T(); sol4 =Solution4().f4(s); eT=T(); print(f'{eT-sT:11.8f}')
sT=T(); sol4f=            f4(s); eT=T(); print(f'{eT-sT:11.8f}')
sT=T(); sol5 =            f5(s); eT=T(); print(f'{eT-sT:11.8f}')
print( f5(s) )
# Times vary, but the last solution is more than ten times faster than the up to now fastest one on a large string and still 1.5 times faster on a small one:  
# s*10   : 0.00002643
# s*10   : 0.00001687
# s*10   : 0.00000914
# s*1000 : 0.00116592
# s*1000 : 0.00104705
# s*1000 : 0.00007862

上面的代码还有很大的优化空间。

Kelly Bundy

在评论中提出了两种优化方法，

f(6)

和f(7)：

def f5(s):
    foundMax0=False; foundMax1=False
    for l in range(1,len(s)+1):
        if not foundMax0:
            if s.find("0"*l) is -1: 
                max0=l-1; foundMax0=True
        if not foundMax1:
            if s.find("1"*l) is -1: 
                max1=l-1; foundMax1=True
        if foundMax0 and foundMax1:
            break
    return max1 > max0
    
def f6(s):
    s0, s1 = '0', '1'
    while s1 in s:
        if s0 not in s:
            return True
        s0 += '0'
        s1 += '1'
    return False

def f7(s):
    n = 1
    while '1'*n in s:
        if '0'*n not in s:
            return True
        n += 1
    return False

import random
print(all(f7(s)==f6(s)==f7(s) for _ in range(1000) for s in [''.join(random.choices('01', k=1000))]))

from time import perf_counter as T
#s = '1100011111100111010111'*10 # *1000
s=''.join(random.choices('01', k=1000000))
sT=T(); sol5 =            f5(s); eT=T(); print(f't5={eT-sT:11.8f}')
sT=T(); sol6 =            f6(s); eT=T(); print(f't6={eT-sT:11.8f}')
sT=T(); sol7 =            f7(s); eT=T(); print(f't7={eT-sT:11.8f}')

使用

f6(s)

和

f7(s)

的代码加速是通过不获取 0 和 1 游程长度的最大值来实现的，当只找到一个最大值时停止搜索。

无论如何，有关任何进一步速度优化的一般消息如下：

想了解循环的比较时间，考虑更改这些循环所处理的数据也会对最终运行时间产生重大影响。

Python 单个循环花费的时间怎么可能比多个循环花费的时间还要多？

问题描述投票：0回答：1

1个回答

最新问题

Python 单个循环花费的时间怎么可能比多个循环花费的时间还要多？

问题描述 投票：0回答：1

1个回答

最新问题

问题描述投票：0回答：1