另一个问题要求计算列表理解中列表的累积最大值,例如:
input: [3, 4, 2, 8, 9, 3, 3, 4, 20, 1]
output: [3, 4, 4, 8, 9, 9, 9, 9, 20, 20]
我按照要求提出了有一个不错的列表理解,但它使用输入作为列表给出:
[m
for m in list_input[:1]
for x in list_input
for m in [max(m, x)]]
我的问题/挑战:如果输入是 iterator 或任何其他可迭代对象怎么办?应该仍然只是一个列表理解并且只需要线性时间。当然不使用
accumulate(iterable, max)
。我找到了一个,我喜欢它,但也想看看其他人的解决方案:-)
测试代码,留下
cumulative_maximum
来实现(在线尝试!):
from itertools import accumulate
def cumulative_maximum(iterable):
return [your list comprehension here]
def test(iterable):
iterable = tuple(iterable)
expect = list(accumulate(iterable, max))
output = cumulative_maximum(iterable)
print(output == expect, output)
output = cumulative_maximum(iter(iterable))
print(output == expect, output)
test([3, 4, 2, 8, 9, 3, 3, 4, 20, 1])
test([])
test([3])
test([3, 1])
如果我们可以使用 walrus 运算符,则可以:
return [m := x if idx == 0 else max(m, x) for idx, x in enumerate(iterable)]
enumerate
、if
和else
严格用于初始化m
。
这是一个 Python 3.8+ 解决方案。
这是我的和基准:
def cumulative_maximum(iterable):
return [m
for it in [iter(iterable)]
for m in it
for xs in [[m], it]
for x in xs
for m in [max(m, x)]]
首先我确保我有一个 iterator,称之为
it
。然后我用第一个元素初始化最大变量m
(如果没有,我们就停在那里)。然后迭代器 it
只有剩余的元素,所以我通过先遍历 m
然后再遍历 xs = [m]
将 xs = it
“链接”回它前面。对于每个我都会检查 x
值并更新/输出 m
。
iterable = iter([random.random() for _ in range(1000)])
基准:
Python 3.10.4:
35.9 us 35.9 us 35.9 us Kelly2
42.3 us 42.4 us 42.5 us enke
43.3 us 43.4 us 43.5 us loop2
55.8 us 55.8 us 56.5 us loop
63.6 us 63.6 us 63.7 us Kraigolas2
74.8 us 74.9 us 75.2 us accumulate_lambda
106.1 us 106.4 us 106.5 us accumulate_max
120.6 us 120.7 us 120.7 us Kelly
161.8 us 162.4 us 162.4 us Kraigolas
Python 3.9.2:
39.5 us 39.6 us 39.6 us Kelly2
45.9 us 46.1 us 46.2 us enke
47.4 us 47.6 us 47.6 us loop2
55.5 us 55.6 us 55.6 us loop
65.6 us 65.7 us 65.7 us Kraigolas2
74.4 us 74.6 us 74.6 us accumulate_lambda
93.0 us 93.1 us 93.2 us accumulate_max
108.7 us 108.9 us 109.0 us Kelly
143.8 us 143.8 us 143.8 us Kraigolas
注:
Kelly2
版本用更快的max(m, x)
替换了x if m < x else m
。tee
的建议比我的慢一点,因为它通过 tee
迭代器进行迭代,而我的则直接在给定可迭代对象(的迭代器)上进行迭代。此外,它还有一个缺点,即在另一个结构中额外缓冲内存中除第一个元素之外的所有元素,因为第一个 tee
迭代器在第二个迭代器运行时停留在开头。accumulate
和loop
解决方案不使用列表理解,但出于好奇/作为参考,我还是将它们包含在内。完整的基准代码:
def cumulative_maximum_Kelly(iterable):
return [m
for it in [iter(iterable)]
for m in it
for xs in [[m], it]
for x in xs
for m in [max(m, x)]]
def cumulative_maximum_Kelly2(iterable):
return [m
for it in [iter(iterable)]
for m in it
for xs in [[m], it]
for x in xs
for m in [x if m < x else m]]
def cumulative_maximum_Kraigolas(iterable):
return [m := x if idx == 0 else max(m, x) for idx, x in enumerate(iterable)]
def cumulative_maximum_Kraigolas2(iterable):
return [m := x if not idx else x if m < x else m for idx, x in enumerate(iterable)]
def cumulative_maximum_enke(iterable):
return [m
for first, second in [tee(iterable)]
for m in [next(first, [])]
for x in second
for m in [m if m > x else x]]
def cumulative_maximum_accumulate_max(iterable):
return list(accumulate(iterable, max))
def cumulative_maximum_accumulate_lambda(iterable):
return list(accumulate(iterable, lambda m, x: x if m < x else m))
def cumulative_maximum_loop(iterable):
result = []
it = iter(iterable)
for m in it:
result.append(m)
for x in it:
if m < x:
m = x
result.append(m)
return result
def cumulative_maximum_loop2(iterable):
result = []
append = result.append
it = iter(iterable)
for m in it:
append(m)
for x in it:
if m < x:
m = x
append(m)
return result
funcs = [
cumulative_maximum_Kelly,
cumulative_maximum_Kelly2,
cumulative_maximum_enke,
cumulative_maximum_Kraigolas,
cumulative_maximum_Kraigolas2,
cumulative_maximum_accumulate_max,
cumulative_maximum_accumulate_lambda,
cumulative_maximum_loop,
cumulative_maximum_loop2,
]
from timeit import repeat
from random import random
from itertools import tee, accumulate
import sys
tss = [[] for _ in funcs]
for _ in range(10):
lst = [random() for _ in range(1000)]
expect = funcs[0](iter(lst))
for func, ts in zip(funcs, tss):
assert func(iter(lst)) == expect
t = min(repeat(lambda: func(iter(lst)), number=100)) / 100
ts.append(t)
for func, ts in sorted(zip(funcs, tss), key=lambda func_ts: sorted(func_ts[1])):
print(*('%5.1f us ' % (t * 1e6) for t in sorted(ts)[:3]),
func.__name__.removeprefix('cumulative_maximum_'))
print()
print(sys.version)