Why is the simpler loop slower?


Called with n = 10**8, the simple loop is consistently and significantly slower for me than the complex one, and I don't understand why:

def simple(n):
    while n:
        n -= 1

def complex(n):
    while True:
        if not n:
            break
        n -= 1

Times in seconds:

simple 4.340795516967773
complex 3.6490490436553955
simple 4.374553918838501
complex 3.639145851135254
simple 4.336690425872803
complex 3.624480724334717
Python: 3.11.4 (main, Sep  9 2023, 15:09:21) [GCC 13.2.1 20230801]

Here is the loop part of the bytecode, as shown by dis.dis(simple):

  6     >>    6 LOAD_FAST                0 (n)
              8 LOAD_CONST               1 (1)
             10 BINARY_OP               23 (-=)
             14 STORE_FAST               0 (n)

  5          16 LOAD_FAST                0 (n)
             18 POP_JUMP_BACKWARD_IF_TRUE     7 (to 6)

And for complex:

 10     >>    4 LOAD_FAST                0 (n)
              6 POP_JUMP_FORWARD_IF_TRUE     2 (to 12)

 11           8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

 12     >>   12 LOAD_FAST                0 (n)
             14 LOAD_CONST               2 (1)
             16 BINARY_OP               23 (-=)
             20 STORE_FAST               0 (n)

  9          22 JUMP_BACKWARD           10 (to 4)

So it looks like complex does more work per iteration (two jumps instead of one). Then why is it faster?

This seems to be a Python 3.11 phenomenon; see the comments.

Benchmark script:

from time import time
import sys

def simple(n):
    while n:
        n -= 1

def complex(n):
    while True:
        if not n:
            break
        n -= 1

for f in [simple, complex] * 3:
    t = time()
    f(10**8)
    print(f.__name__, time() - t)

print('Python:', sys.version)
1 Answer

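I checked the CPython source code (Python 3.11.6) and found that, of the instructions in the disassembled bytecode, only JUMP_BACKWARD seems to run the warmup function, which triggers Python 3.11's specialization once it has executed enough times: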

PyObject* _Py_HOT_FUNCTION
_PyEval_EvalFrameDefault(PyThreadState *tstate, _PyInterpreterFrame *frame, int throwflag)
{
    /* ... */
        TARGET(JUMP_BACKWARD) {
            _PyCode_Warmup(frame->f_code);
            JUMP_TO_INSTRUCTION(JUMP_BACKWARD_QUICK);
        }
    /* ... */
}
static inline void
_PyCode_Warmup(PyCodeObject *code)
{
    if (code->co_warmup != 0) {
        code->co_warmup++;
        if (code->co_warmup == 0) {
            _PyCode_Quicken(code);
        }
    }
}
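
The counter logic above can be modelled in a few lines of Python. This is only a toy sketch of my reading of the C code: `CodeModel` and `run_loop` are hypothetical names, and the starting value of -8 is an assumption for illustration, not something taken from the excerpt.

```python
class CodeModel:
    """Toy stand-in for PyCodeObject; not a real CPython API."""

    def __init__(self, warmup_start=-8):  # assumed starting value
        self.co_warmup = warmup_start
        self.quickened = False

    def warmup(self):
        # Mirrors _PyCode_Warmup: count up towards zero, quicken on reaching it.
        if self.co_warmup != 0:
            self.co_warmup += 1
            if self.co_warmup == 0:
                self.quickened = True  # stands in for _PyCode_Quicken(code)


def run_loop(code, iterations, has_jump_backward):
    # In this model only a JUMP_BACKWARD instruction bumps the counter.
    for _ in range(iterations):
        if has_jump_backward:
            code.warmup()
    return code.quickened


print("loop with JUMP_BACKWARD quickened:", run_loop(CodeModel(), 100, True))
print("loop without JUMP_BACKWARD quickened:", run_loop(CodeModel(), 100, False))
```

In this model a loop that never executes JUMP_BACKWARD never reaches the quickening step, no matter how many iterations it runs, which would match what the timings for simple suggest.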

Specialization seems to speed up several of the bytecodes used here, which gives a significant speedup:

void
_PyCode_Quicken(PyCodeObject *code)
{
    /* ... */
            switch (opcode) {
                case EXTENDED_ARG:  /* ... */
                case JUMP_BACKWARD: /* ... */
                case RESUME:        /* ... */
                case LOAD_FAST:     /* ... */
                case STORE_FAST:    /* ... */
                case LOAD_CONST:    /* ... */
            }
    /* ... */
}
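
One way to check this on a given interpreter is to compare the opcode names of both functions before and after running them. This is a sketch, assuming Python 3.11 or later for the `adaptive` argument of `dis.get_instructions`; the helper name `opnames` is mine, and the exact opcode names you see will depend on the Python version.

```python
import dis
import sys


def simple(n):
    while n:
        n -= 1


def complex(n):
    while True:
        if not n:
            break
        n -= 1


def opnames(func, adaptive=False):
    # The `adaptive` keyword exists only on Python 3.11+; older versions
    # fall back to the plain disassembly.
    if sys.version_info >= (3, 11):
        instructions = dis.get_instructions(func, adaptive=adaptive)
    else:
        instructions = dis.get_instructions(func)
    return [ins.opname for ins in instructions]


for f in (simple, complex):
    f(10**5)  # run the loop so any warmup counter has a chance to expire
    print(f.__name__, "has JUMP_BACKWARD:", "JUMP_BACKWARD" in opnames(f))
    print(f.__name__, "after running:", sorted(set(opnames(f, adaptive=True))))
```

If the explanation above is right, on 3.11 the second line printed for complex should show quickened or specialized instruction names, while the one for simple should still show the original ones.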