Why does subclassing in Python slow things down so much?

Question

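I was working on a simple class that extends dict, and I realized that key lookup and pickling are very slow.

I thought it was a problem with my class, so I ran some trivial benchmarks: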

(venv) marco@buzz:~/sources/python-frozendict/test$ python --version
Python 3.9.0a0
(venv) marco@buzz:~/sources/python-frozendict/test$ sudo pyperf system tune --affinity 3
[sudo] password for marco: 
Tune the system configuration to run benchmarks

Actions
=======

CPU Frequency: Minimum frequency of CPU 3 set to the maximum frequency

System state
============

CPU: use 1 logical CPUs: 3
Perf event: Maximum sample rate: 1 per second
ASLR: Full randomization
Linux scheduler: No CPU is isolated
CPU Frequency: 0-3=min=max=2600 MHz
CPU scaling governor (intel_pstate): performance
Turbo Boost (intel_pstate): Turbo Boost disabled
IRQ affinity: irqbalance service: inactive
IRQ affinity: Default IRQ affinity: CPU 0-2
IRQ affinity: IRQ affinity: IRQ 0,2=CPU 0-3; IRQ 1,3-17,51,67,120-131=CPU 0-2
Power supply: the power cable is plugged

Advices
=======

Linux scheduler: Use isolcpus=<cpu list> kernel parameter to isolate CPUs
Linux scheduler: Use rcu_nocbs=<cpu list> kernel parameter (with isolcpus) to not schedule RCU on isolated CPUs
(venv) marco@buzz:~/sources/python-frozendict/test$ python -m pyperf timeit --rigorous --affinity 3 -s '                    
x = {0:0, 1:1, 2:2, 3:3, 4:4}
' 'x[4]'
.........................................
Mean +- std dev: 35.2 ns +- 1.8 ns
(venv) marco@buzz:~/sources/python-frozendict/test$ python -m pyperf timeit --rigorous --affinity 3 -s '
class A(dict):
    pass             

x = A({0:0, 1:1, 2:2, 3:3, 4:4})
' 'x[4]'
.........................................
Mean +- std dev: 60.1 ns +- 2.5 ns
(venv) marco@buzz:~/sources/python-frozendict/test$ python -m pyperf timeit --rigorous --affinity 3 -s '
x = {0:0, 1:1, 2:2, 3:3, 4:4}
' '5 in x'
.........................................
Mean +- std dev: 31.9 ns +- 1.4 ns
(venv) marco@buzz:~/sources/python-frozendict/test$ python -m pyperf timeit --rigorous --affinity 3 -s '
class A(dict):
    pass

x = A({0:0, 1:1, 2:2, 3:3, 4:4})
' '5 in x'
.........................................
Mean +- std dev: 64.7 ns +- 5.4 ns
(venv) marco@buzz:~/sources/python-frozendict/test$ python
Python 3.9.0a0 (heads/master-dirty:d8ca2354ed, Oct 30 2019, 20:25:01) 
[GCC 9.2.1 20190909] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from timeit import timeit
>>> class A(dict):
...     def __reduce__(self):                 
...         return (A, (dict(self), ))
... 
>>> timeit("dumps(x)", """
... from pickle import dumps
... x = {0:0, 1:1, 2:2, 3:3, 4:4}
... """, number=10000000)
6.70694484282285
>>> timeit("dumps(x)", """
... from pickle import dumps
... x = A({0:0, 1:1, 2:2, 3:3, 4:4})
... """, number=10000000, globals={"A": A})
31.277778962627053
>>> timeit("loads(x)", """
... from pickle import dumps, loads
... x = dumps({0:0, 1:1, 2:2, 3:3, 4:4})
... """, number=10000000)
5.767975459806621
>>> timeit("loads(x)", """
... from pickle import dumps, loads
... x = dumps(A({0:0, 1:1, 2:2, 3:3, 4:4}))
... """, number=10000000, globals={"A": A})
22.611666693352163

The results are really surprising. Key lookup is about 2x slower, while pickle is roughly 4x to 5x slower.

How can this be? Other methods, such as get(), __eq__() and __init__(), as well as iteration over keys(), values() and items(), are as fast as with dict.
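For readers without pyperf, the lookup comparison can be approximated with the standard library alone. This is only a rough sketch: the helper name per_call_ns and the repetition count are assumptions made for illustration, and plain timeit is noisier than the tuned pyperf runs above, so the absolute numbers will differ.

```python
from timeit import timeit


class A(dict):
    pass


data = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
plain, sub = dict(data), A(data)


def per_call_ns(stmt, obj, number=200_000):
    """Hypothetical helper: average time of one `stmt` execution, in ns."""
    total = timeit(stmt, globals={"x": obj}, number=number)
    return total / number * 1e9


# Time the two operations reported as slow, plus get() as a control.
results = {
    (stmt, type(obj).__name__): per_call_ns(stmt, obj)
    for stmt in ("x[4]", "5 in x", "x.get(4)")
    for obj in (plain, sub)
}

for (stmt, kind), ns in results.items():
    print(f"{stmt:10} {kind:5} {ns:8.1f} ns")
```

Including x.get(4) gives a baseline for a method that is not expected to differ between dict and the subclass.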

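EDIT: I took a look at the Python 3.9 source code, and in Objects/dictobject.c it seems that the __getitem__() method is implemented by dict_subscript(). dict_subscript() slows down subclasses only when the key is missing, since the subclass can implement __missing__() and the function checks whether it exists. But the benchmark used an existing key.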
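The __missing__() hook mentioned here can be seen from pure Python. The sketch below uses a made-up subclass name (Defaulting) purely for illustration; it shows that the hook is only consulted by subscription with an absent key, not by lookups of existing keys, get(), or the in operator.

```python
class Defaulting(dict):
    """Hypothetical dict subclass; the name is illustrative only."""

    def __missing__(self, key):
        # Invoked by dict's subscription code only when the key is absent.
        return f"<no {key!r}>"


d = Defaulting({0: 0, 4: 4})

present = d[4]      # key exists, so __missing__ is never consulted
absent = d[5]       # key is absent, so __missing__(5) supplies the value
via_get = d.get(5)  # get() does not use __missing__ and returns None
contained = 5 in d  # membership tests do not use __missing__ either

print(present, absent, via_get, contained)
```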

But I noticed something: __getitem__() is defined with the flag METH_COEXIST. And __contains__(), the other method that is 2x slower, has the same flag. From the official documentation:

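"The method will be loaded in place of existing definitions. Without METH_COEXIST, the default is to skip repeated definitions. Since slot wrappers are loaded before the method table, the existence of a sq_contains slot, for example, would generate a wrapped method named __contains__() and preclude the loading of a corresponding PyCFunction with the same name. With the flag defined, the PyCFunction will be loaded in place of the wrapper object and will co-exist with the slot. This is helpful because calls to PyCFunctions are optimized more than wrapper object calls."

So, if I understood correctly, METH_COEXIST should in theory speed things up, but it seems to have the opposite effect. Why?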
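The effect of METH_COEXIST is visible from Python by looking at what kind of descriptor sits in dict's namespace. The following is a small sketch that assumes CPython (other implementations may report different types); __setitem__ is included only as a contrast, on the assumption that it is exposed through an ordinary slot wrapper.

```python
import types

getitem_descr = dict.__dict__["__getitem__"]
contains_descr = dict.__dict__["__contains__"]
setitem_descr = dict.__dict__["__setitem__"]

# On CPython, the two METH_COEXIST methods are plain C method descriptors,
# while __setitem__ is a wrapper generated from a type slot.
for name, descr in [
    ("__getitem__", getitem_descr),
    ("__contains__", contains_descr),
    ("__setitem__", setitem_descr),
]:
    print(f"{name:13} {type(descr).__name__}")

coexist_are_methods = (
    isinstance(getitem_descr, types.MethodDescriptorType)
    and isinstance(contains_descr, types.MethodDescriptorType)
)
setitem_is_wrapper = isinstance(setitem_descr, types.WrapperDescriptorType)
```

This only confirms which object was loaded into the type's dictionary; it does not by itself explain the timing difference.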

EDIT 2: I discovered something more.

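__getitem__() and __contains__() are flagged as METH_COEXIST because they are declared twice in PyDict_Type. Each of them appears once in the slot tp_methods, where they are explicitly declared as __getitem__() and __contains__(). But the official documentation says that tp_methods is not inherited by subclasses.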

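So a subclass of dict does not call __getitem__(), but calls the subslot mp_subscript. Indeed, mp_subscript is contained in the slot tp_as_mapping, which allows a subclass to inherit its subslots.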

The problem is that both __getitem__() and mp_subscript use the same function, dict_subscript. Is it possible that it is only the way it was inherited that slows it down?
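What the subclass sees at the Python level can be checked directly. This sketch, again assuming CPython, shows that an empty subclass defines nothing of its own and that attribute lookup resolves to dict's descriptor. It inspects only the Python-visible side and says nothing about how the C-level mp_subscript slot is filled for the subclass.

```python
class A(dict):
    pass


# The subclass namespace does not contain its own __getitem__.
defined_in_subclass = "__getitem__" in vars(A)

# Attribute lookup walks the MRO and returns dict's descriptor object.
same_descriptor = A.__getitem__ is dict.__getitem__

x = A({0: 0, 1: 1, 2: 2, 3: 3, 4: 4})
value = x[4]

print(defined_in_subclass, same_descriptor, A.__mro__, value)
```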
