为什么“set.union()”函数比集合并运算符“|”慢得多在这段代码中?

问题描述 投票:0回答:2

我一直在重新审视我的解决方案https://adventofcode.com/2023/day/16。这是我的旧代码:

from functools import cache
from helper import directions_2d
# [(0, 1), (-1, 0), (0, -1), (1, 0)]


def behavior(drt, tile):
    (row, col) = drt
    match tile:
        case "/":
            return [(-col, -row)]
        case "\\":
            return [(col, row)]
        case "|":
            if row == 0:
                return [(-1, 0), (1, 0)]
        case "-":
            if col == 0:
                return [(0, -1), (0, 1)]
    return [drt]


def run(input_data: str):
    part1 = 0
    part2 = 0
    light_map = [list(s) for s in input_data.splitlines()]
    lengths = (len(light_map), len(light_map[0]))
    starts = []
    for drt in directions_2d():
        dim = 0 if drt[0] == 0 else 1
        for i in range(lengths[dim]):
            start = [0, 0]
            start[dim] = i
            start[1 - dim] = -1 if drt[1 - dim] == 1 else lengths[1 - dim]
            starts.append([(tuple(start), drt)])

    @cache
    def run_line(row, col, drt):
        energized = set()
        (new_row, new_col) = (row, col)
        new_drt_list = [drt]
        while new_drt_list == [drt]:
            (new_row, new_col) = (new_row + drt[0], new_col + drt[1])
            if any(not 0 <= (new_row, new_col)[i] < lengths[i] for i in (0, 1)):
                new_drt_list = []
            else:
                new_drt_list = behavior(drt, light_map[new_row][new_col])
                energized.add((new_row, new_col))
        return new_row, new_col, new_drt_list, energized

    beams: list[tuple[int, int]]
    for beams in starts:
        current_i = 0
        energized = set()
        while current_i < len(beams):
            (row, col), drt = beams[current_i]
            new_row, new_col, new_drt_list, new_energized = run_line(row, col, drt)
            for new_drt in new_drt_list:
                if ((new_row, new_col), new_drt) not in beams:
                    beams.append(((new_row, new_col), new_drt))
            # this line
            energized = energized.union(new_energized)
            current_i += 1

        if beams[0] == ((0, -1), (0, 1)):
            part1 = len(energized)
        part2 = max(part2, len(energized))

    return part1, part2

如果我用 union 运算符替换 union 函数,如下所示:

energized |= new_energized
,运行时间从 28-30 秒下降到 2-3 秒。

我搜索了文档、指南以及我能找到的所有内容;他们都告诉我 union 函数和运算符具有相似的性能,没有相差 10 倍。

为了确定,我运行了代码(两个版本)大约 10 次。

该存储库已公开于 https://github.com/ultinvincible/AdventOfCode_Py

python python-3.x performance set
2个回答
0
投票

所以我发现了问题:

energized = energized.union(new_energized)

实际上等于

energized |= new_energized

但相当于

energized = energized | new_energized

区别在于

a=a.union(b)
a=a|b
创建一个新的
set
对象并将其分配给旧引用,而
a|=b
修改集合,这就是代码速度如此之快的原因。

我真傻。


0
投票

这与使用

mylist.append(x)
添加到列表末尾与使用
mylist = mylist + [x]
之间的区别相同。前者是摊销常数时间,因为它会添加到末尾(除非它调整大小,这就是为什么它是amoritized常数时间),后者是线性时间,因为您总是创建一个全新的列表对象。因此,在一个循环中,前者将是线性的,后者将是二次的。使用Shlemiel the Painter算法

的经典性能陷阱
© www.soinside.com 2019 - 2024. All rights reserved.