Why are any() and all() inefficient at processing booleans?


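I was playing with timeit and with 'and', 'or', any() and all() when I realized something that I thought I could share here. Here is the script used to measure performance: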

def recursion(n):
    """A slow way to return a True or a False boolean."""
    return True if n == 0 else recursion(n-1)       

def my_function():
    """The function where you perform all(), any(), or, and."""
    a = False and recursion(10)

if __name__ == "__main__":
    import timeit
    setup = "from __main__ import my_function"
    print(timeit.timeit("my_function()", setup=setup))

Here are some timings:

a = False and recursion(10)
0.08799480279344607

a = True or recursion(10)
0.08964192798430304

As expected, 'True or recursion(10)' and 'False and recursion(10)' are very fast to compute, because only the first term matters and the operation returns immediately.

a = recursion(10) or True # recursion() is False
1.4154556830951606 

a = recursion(10) and False # recursion() is True
1.364157978046478

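Putting 'or True' or 'and False' at the end of the line does not speed up the computation, because those operands are evaluated second and the whole recursion has to run first. It is annoying, but it is logical: operands are evaluated left to right.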
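The left-to-right behaviour described above can be checked without timing anything. This is a minimal sketch; noisy() and the calls list are hypothetical names introduced only for illustration and are not part of the original script.

```python
calls = []

def noisy(label, value):
    """Hypothetical helper: record that it ran, then return value."""
    calls.append(label)
    return value

# The left operand decides the result, so the right one is never evaluated.
a = True or noisy("right of or", False)
b = False and noisy("right of and", True)

# The left operand is always evaluated, even if the right one is a constant.
c = noisy("left of or", True) or True
d = noisy("left of and", False) and False

print(calls)  # ['left of or', 'left of and']
```

Only the calls placed on the left-hand side are recorded, which matches the timings: the cost is paid whenever the slow call comes first.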

More surprisingly, all() and any() always perform worst, whatever the case:

a = all(i for i in (recursion(10), False)) # recursion() is False
1.8326778537880273

a = all(i for i in (False, recursion(10))) # recursion() is False
1.814645767348111

I would have expected the second evaluation to be much faster than the first.

a = any(i for i in (recursion(10), True)) # recursion() is True
1.7959248761901563

a = any(i for i in (True, recursion(10))) # recursion() is True
1.7930442127481

Same expectation here.

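So, if performance matters in your application, any() and all() seem to be far from a convenient way of writing a big 'or' and a big 'and' respectively. Why is that?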

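Edit: based on the comments, it seems that building the tuple is what is slow. I see no reason why Python itself could not use something like this: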

def all_faster(*args):
    Result = True
    for arg in args:
        if not Result:
            return False
        Result = Result and arg
    return True

def any_faster(*args):
    Result = False
    for arg in args:
        if Result:
            return True
        Result = Result or arg
    return False

It is faster than the built-in functions, and it seems to have a short-circuit mechanism.

a = any_faster(False, False, False, False, True)
0.39678611016915966

a = any_faster(True, False, False, False, False)
0.29465180389252055

a = any_faster(recursion(10), False) # recursion() is True
1.5922580174283212

a = any_faster(False, recursion(10)) # recursion() is True
1.5799157924820975

a = all_faster(False, recursion(10)) # recursion() is True
1.6116566893888375

a = all_faster(recursion(10), False) # recursion() is True
1.6004807187900951

Edit2: OK, it is faster when the arguments are passed one by one, but slower with generators.

python-3.x micro-optimization
2 answers
1
vote

any() and all() short-circuit just fine.

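The problem is that, in both cases, you have to build the tuple before passing it to any(), so the order makes no difference: the time taken is the same. Let's break it down with a variable: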

t = (True, recursion(10))   # recursion is called
a = any(i for i in t)       # this is very fast, only boolean testing

By the time you reach the second line, the time has already been spent.
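A call counter makes the same point without relying on timings. This is a sketch under assumptions: expensive() is a hypothetical stand-in for recursion(10), and call_count exists only to observe how often it runs.

```python
call_count = 0

def expensive():
    """Hypothetical stand-in for recursion(10); counts its own calls."""
    global call_count
    call_count += 1
    return True

# The tuple literal is fully built before any() sees its first element,
# so expensive() runs once per line whatever its position in the tuple.
first = any(i for i in (expensive(), True))
second = any(i for i in (True, expensive()))

print(call_count)  # 2
```

Both lines pay for one call, which is why swapping the order of the tuple elements did not change the measured time.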

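That is the difference with the short-circuiting of 'and' and 'or', where the second operand is not evaluated at all once the first one decides the result.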

The interesting case for any() and all() is when the data is evaluated while it is being tested:

any(recursion(x) for x in (10,20,30))

If you want to avoid the evaluation, you can pass a tuple of lambdas (inline functions) to any() and call the functions:

Now:

a = any(i() for i in (lambda:recursion(10), lambda:True))

and:

a = any(i() for i in (lambda:True,lambda:recursion(10)))

have very different execution times (the latter is instantaneous).
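To see which lambdas actually get called, the slow call can be replaced by one that leaves a trace. This is only an illustrative sketch: tracked() and the evaluated list are hypothetical names, and tracked() simply returns True instead of doing real work.

```python
evaluated = []

def tracked(name):
    """Hypothetical helper: note that it was evaluated, then return True."""
    evaluated.append(name)
    return True

# Each lambda is called only when any() pulls it from the generator.
slow_first = any(f() for f in (lambda: tracked("slow placed first"), lambda: True))
fast_first = any(f() for f in (lambda: True, lambda: tracked("slow placed second")))

print(evaluated)  # ['slow placed first']
```

When the cheap lambda comes first, any() stops before the slow one is ever called, which is where the difference in execution time comes from.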


2
votes

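Actually, any() is equivalent to a chain of 'or' and all() is equivalent to a chain of 'and', short-circuiting included. The problem lies in the way you perform the benchmark.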

Consider the following:

def slow_boolean_gen(n, value=False):
    for x in range(n - 1):
        yield value
    yield not value

generator = slow_boolean_gen(10)

print([x for x in generator])
# [False, False, False, False, False, False, False, False, False, True]

and the following micro-benchmarks:

%timeit generator = slow_boolean_gen(10, True); next(generator) or next(generator) or next(generator) or next(generator) or next(generator) or next(generator) or next(generator) or next(generator) or next(generator) or next(generator)
# 492 ns ± 35.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, False); next(generator) or next(generator) or next(generator) or next(generator) or next(generator) or next(generator) or next(generator) or next(generator) or next(generator) or next(generator)
# 1.18 µs ± 12.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, True); next(generator) and next(generator) and next(generator) and next(generator) and next(generator) and next(generator) and next(generator) and next(generator) and next(generator) and next(generator)
# 1.19 µs ± 11.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, False); next(generator) and next(generator) and next(generator) and next(generator) and next(generator) and next(generator) and next(generator) and next(generator) and next(generator) and next(generator)
# 473 ns ± 6.27 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit generator = slow_boolean_gen(10, True); any(x for x in generator)
# 745 ns ± 15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, False); any(x for x in generator)
# 1.29 µs ± 12.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, True); all(x for x in generator)
# 1.3 µs ± 22.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, False); all(x for x in generator)
# 721 ns ± 8.05 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit generator = slow_boolean_gen(10, True); any([x for x in generator])
# 1.03 µs ± 28.8 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, False); any([x for x in generator])
# 1.09 µs ± 27.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, True); all([x for x in generator])
# 1.05 µs ± 11.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, False); all([x for x in generator])
# 1.02 µs ± 11.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

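You can clearly see that the short-circuiting works, but if you build the list first, that takes a constant amount of time which offsets any performance gain you would get from short-circuiting.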
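The same effect can be observed by counting how many items are pulled from the source rather than by timing. This is a minimal sketch; counting_gen() and the lazy/eager lists are hypothetical names used only for this illustration.

```python
def counting_gen(values, consumed):
    """Yield values, appending each one to consumed as it is produced."""
    for v in values:
        consumed.append(v)
        yield v

data = [False, False, True, False, False]

# Generator expression: any() stops pulling items at the first True.
lazy = []
lazy_result = any(x for x in counting_gen(data, lazy))

# List comprehension: the whole list is built before any() starts.
eager = []
eager_result = any([x for x in counting_gen(data, eager)])

print(len(lazy), len(eager))  # 3 5
```

With the generator expression only three items are produced, while the list comprehension produces all five before any() gets to look at the first one.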

EDIT:

A manual implementation does not give us any performance gain:

def all_(values):
    result = True
    for value in values:
        result = result and value
        if not result:
            break
    return result

def any_(values):
    result = False
    for value in values:
        result = result or value
        if result:
            break
    return result

%timeit generator = slow_boolean_gen(10, True); any_(x for x in generator)
# 765 ns ± 6.76 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, False); any_(x for x in generator)
# 1.48 µs ± 8.97 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, True); all_(x for x in generator)
# 1.47 µs ± 5.71 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
%timeit generator = slow_boolean_gen(10, False); all_(x for x in generator)
# 765 ns ± 8.76 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)