Why do two list comprehensions with an "if" condition run faster than a single loop with two conditions?

Question (votes: 0, answers: 3)
import time
from random import random
from typing import List


def test(arr: List[int] | None = None) -> None:
    if not arr:
        raise TypeError("Variable arr must exist!")
    
    opp = arr.pop()

    def check_time(func, msg):
        t0 = time.perf_counter()
        func()
        print(f"{msg}\ntime - {time.perf_counter() - t0}")

    def first_method():
        more_arr = [e for e in arr if e > opp]
        less_arr = [e for e in arr if e < opp]
        return less_arr, more_arr

    def second_method():
        more_arr, less_arr = [], []
        for e in arr:
            if e > opp:
                more_arr.append(e)
            elif e < opp:
                less_arr.append(e)
        return less_arr, more_arr

    check_time(first_method, "first_method")
    check_time(second_method, "second_method")
    """
    [RESULT]
    first_method
    time - 0.1035286999976961
    second_method
    time - 0.12783881399809616
    """


def main() -> None:
    test([int(random() * 1000) for _ in range(1_000_000)])


if __name__ == '__main__':
    main()

Results:

first_method: time - 0.10790603799978271

second_method: time - 0.1264369229975273

I don't understand why first_method() is faster than second_method().

From an optimization standpoint, how does the "if" condition in a list comprehension work?

python if-statement list-comprehension
3 Answers
1 vote

Adding this third method shows that looking up the append method on every iteration takes a significant amount of time:

    def third_method():
        more_arr, less_arr = [], []
        m = more_arr.append
        l = less_arr.append
        
        for e in arr:
            if e > opp:
                m(e)
            elif e < opp:
                l(e)
        return less_arr, more_arr

This approach is usually slightly faster than the first method.
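One way to see where the lookup cost comes from is to compare the bytecode of a comprehension with that of an explicit loop. The sketch below assumes CPython; the names with_comprehension, with_loop and opnames are hypothetical helpers introduced only for this illustration. It checks whether the dedicated LIST_APPEND opcode appears in each version.

```python
import dis


def with_comprehension(arr, opp):
    return [e for e in arr if e > opp]


def with_loop(arr, opp):
    out = []
    for e in arr:
        if e > opp:
            out.append(e)  # attribute lookup plus a method call on every hit
    return out


def opnames(func_or_code):
    """Collect opcode names from a function, including nested code objects."""
    code = getattr(func_or_code, "__code__", func_or_code)
    names = {ins.opname for ins in dis.get_instructions(code)}
    for const in code.co_consts:
        # Before Python 3.12 the comprehension body is a nested code object.
        if hasattr(const, "co_code"):
            names |= opnames(const)
    return names


comp_ops = opnames(with_comprehension)
loop_ops = opnames(with_loop)

print("LIST_APPEND in comprehension:", "LIST_APPEND" in comp_ops)
print("LIST_APPEND in loop:", "LIST_APPEND" in loop_ops)
```

On CPython the comprehension should report LIST_APPEND while the loop should not: the loop has to load the append attribute and call it each time, which is the overhead that caching the bound method in third_method() reduces.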


0 votes

You can use the cProfile module to see information about the functions and how much time each one takes internally.

Run it once:

if __name__ == '__main__':
    import cProfile
    cProfile.run('main()')
first_method
time - 0.06351780006662011
second_method
time - 0.1859593999106437
         1999012 function calls in 0.493 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.493    0.493 <string>:1(<module>)
        2    0.005    0.003    0.250    0.125 test.py:13(check_time)
        1    0.000    0.000    0.061    0.061 test.py:18(first_method)
        1    0.036    0.036    0.036    0.036 test.py:19(<listcomp>)
        1    0.025    0.025    0.025    0.025 test.py:20(<listcomp>)
        1    0.120    0.120    0.183    0.183 test.py:23(second_method)
        1    0.006    0.006    0.493    0.493 test.py:44(main)
        1    0.178    0.178    0.237    0.237 test.py:45(<listcomp>)
        1    0.000    0.000    0.250    0.250 test.py:7(test)
        1    0.000    0.000    0.493    0.493 {built-in method builtins.exec}
        2    0.000    0.000    0.000    0.000 {built-in method builtins.print}
        4    0.000    0.000    0.000    0.000 {built-in method time.perf_counter}
   998993    0.064    0.000    0.064    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'pop' of 'list' objects}
  1000000    0.059    0.000    0.059    0.000 {method 'random' of '_random.Random' objects}
ncalls: the number of calls.

tottime: the total time spent in the given function (excluding time spent in calls to sub-functions).

percall: tottime divided by ncalls.

cumtime: the cumulative time spent in this function and all sub-functions (from invocation until exit). This figure is accurate even for recursive functions.

percall: cumtime divided by the number of primitive calls.

filename:lineno(function): identifies each function.

   998993    0.064    0.000    0.064    0.000 {method 'append' of 'list' objects}

The 998993 append calls alone account for 0.064 s, on top of the 0.120 s spent in second_method itself, whereas the two list comprehensions take 0.036 s and 0.025 s. The explicit appends cost more time than the list comprehensions.
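To make the expensive entries easier to spot, the profile can be sorted by tottime with the standard pstats module. This is only a minimal sketch: split_with_loop is a hypothetical stand-in for second_method, and the input size and pivot value of 500 are arbitrary assumptions.

```python
import cProfile
import io
import pstats
from random import randint


def split_with_loop(arr, opp):
    more_arr, less_arr = [], []
    for e in arr:
        if e > opp:
            more_arr.append(e)
        elif e < opp:
            less_arr.append(e)
    return less_arr, more_arr


data = [randint(0, 1_000) for _ in range(100_000)]

# Profile only the call we care about.
profiler = cProfile.Profile()
profiler.enable()
split_with_loop(data, 500)
profiler.disable()

# Write the report to a string buffer, sorted by time spent in each function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(5)  # show at most five entries
report = buffer.getvalue()
print(report)
```

Keep in mind that cProfile records every call to a C function such as list.append, while the LIST_APPEND step inside a comprehension is not a function call at all. Profiling therefore probably inflates the measured cost of the loop relative to the comprehensions, so the timings should be confirmed without the profiler.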


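0 votes

Performance will vary with your hardware platform and Python version.

I modified the original code to use randint() instead of doing an unnecessary calculation with random(). This has no effect on the timings.

On my machine, first_method() is slower than second_method().

It is worth noting that in first_method() the arr list is iterated twice (once per comprehension), whereas in second_method() it is iterated only once.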

from random import randint
from time import perf_counter

def test(arr: list) -> None:
    opp = arr.pop()

    def check_time(func, msg):
        t0 = perf_counter()
        func()
        print(f"{msg} duration = {perf_counter() - t0:.4f}s")  

    def first_method():
        more_arr = [e for e in arr if e > opp]
        less_arr = [e for e in arr if e < opp]
        return less_arr, more_arr

    def second_method():
        more_arr, less_arr = [], []
        for e in arr:
            if e > opp:
                more_arr.append(e)
            elif e < opp:
                less_arr.append(e)
        return less_arr, more_arr

    check_time(first_method, 'First')
    check_time(second_method, 'Second')


def main() -> None:
    test([randint(0, 1_000) for _ in range(1_000_000)])

if __name__ == '__main__':
    main()

Output:

First duration = 0.0472s
Second duration = 0.0367s

Platform:

Python 3.11.2
macOS 13.2.1
CPU 3GHz 10-Core Intel Xeon W
RAM 32GB
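Because a single perf_counter() measurement is noisy, the comparison is more trustworthy when each function is run several times. The following is a rough sketch using the standard timeit module; the list size, the number and repeat values, and the results dictionary are assumptions chosen for illustration, not part of the original benchmark.

```python
import timeit
from random import randint

arr = [randint(0, 1_000) for _ in range(100_000)]
opp = arr.pop()


def first_method():
    more_arr = [e for e in arr if e > opp]
    less_arr = [e for e in arr if e < opp]
    return less_arr, more_arr


def second_method():
    more_arr, less_arr = [], []
    for e in arr:
        if e > opp:
            more_arr.append(e)
        elif e < opp:
            less_arr.append(e)
    return less_arr, more_arr


NUMBER = 5  # calls per repetition
results = {}
for func in (first_method, second_method):
    # repeat() returns one total per repetition; the minimum is the least noisy.
    best = min(timeit.repeat(func, number=NUMBER, repeat=3))
    results[func.__name__] = best / NUMBER
    print(f"{func.__name__}: {results[func.__name__]:.4f}s per call")
```

Which method comes out ahead may still differ between Python versions and machines, as the two sets of measurements in this thread suggest.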