Why does Python allocate memory for strings differently?

Question

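The string type uses a Unicode representation. A Unicode string can take up to 4 bytes per character, depending on the encoding.

I know that Python uses three internal representations for Unicode strings: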

1 byte
2 bytes
4 bytes
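
The three widths can be made visible by measuring how much a string grows when more of the same character is appended. This is only a sketch, assuming CPython 3.3 or later; `bytes_per_char` is a hypothetical helper name, and the fixed header size of a string object is an implementation detail that the subtraction cancels out.

```python
import sys

def bytes_per_char(ch):
    """Estimate storage per character by comparing two lengths of the same string."""
    short, longer = ch * 10, ch * 20
    # The fixed per-object overhead is identical for both, so it cancels out.
    return (sys.getsizeof(longer) - sys.getsizeof(short)) // 10

samples = [
    ("ASCII letter", "a"),
    ("euro sign (needs 2 bytes)", "\u20ac"),
    ("emoji (needs 4 bytes)", "\U0001F600"),
]
for label, ch in samples:
    print(label, bytes_per_char(ch))
```

The widest character in a string decides the width used for every character in it, which is why a single emoji makes the whole string more expensive.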

But I am still confused about how memory is allocated for strings.

>>> import sys

>>> print(sys.getsizeof("hello world hello world"))
72

>>> print(sys.getsizeof(["hello world hello world"]))
64

>>> print(sys.getsizeof(("hello world hello world",)))
48

Why does this happen? When I put the same string into a list or a tuple, the reported size goes down. Why?

Answer 1

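The getsizeof call is not recursive: when you call it on a container, it does not include the sizes of the objects the container holds. In your example, the only call that returns the real size of the str object is print(sys.getsizeof("hello world hello world")); the other two calls only give you the size of a 1-item list and the size of a 1-item tuple.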
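
A quick way to see this is to compare containers holding strings of very different lengths. The following is a minimal sketch, assuming CPython; the variable names are made up for illustration and the absolute numbers depend on the interpreter version and platform.

```python
import sys

short = "hi"
long_text = "hello world " * 100

# The string objects themselves differ a lot in size ...
print(sys.getsizeof(short), sys.getsizeof(long_text))

# ... but a one-item container only stores a reference to its item,
# so its reported size does not depend on what that item is.
print(sys.getsizeof([short]), sys.getsizeof([long_text]))
print(sys.getsizeof((short,)), sys.getsizeof((long_text,)))
```

Each pair of container sizes should match, because getsizeof only counts the container's own header and its slot for one reference.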

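To get the full size of a composite object, you have to use a recipe for a recursive function that returns the size of the object plus the sizes of all its attributes and contained objects (if any).

Something along these lines: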


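from sys import getsizeof

def getfullsize(obj, seen=None):
    """Return the size in bytes of obj plus everything reachable from it."""
    if seen is None:
        seen = set()
    # Count each object only once, even if it is referenced from several places.
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = getsizeof(obj)
    # Sized containers; str and bytes are skipped because their items are not separate objects.
    if not isinstance(obj, (str, bytes)) and hasattr(type(obj), "__len__"):
        is_mapping = hasattr(type(obj), "values")
        for item in obj:
            if is_mapping:
                # Iterating a mapping yields keys, so add the matching value too.
                size += getfullsize(obj[item], seen)
            size += getfullsize(item, seen)
    if hasattr(obj, "__dict__"):
        size += getfullsize(obj.__dict__, seen)
    if hasattr(obj, "__slots__"):
        slots = obj.__slots__
        # __slots__ may be a single string naming one attribute.
        for attr in (slots,) if isinstance(slots, str) else slots:
            if (item := getattr(obj, attr, None)) is not None:
                size += getfullsize(item, seen)
    return size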
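
For the flat, one-item containers from the question, the total can also be checked by hand without any recursion. This is a minimal sketch, not a replacement for the recipe; `shallow_and_total` is a hypothetical helper that only looks one level deep and assumes the items are not shared or nested.

```python
import sys

def shallow_and_total(container):
    """Return (size of the container alone, size including its direct items)."""
    shallow = sys.getsizeof(container)
    # Only one level deep: good enough for a flat list or tuple of strings.
    total = shallow + sum(sys.getsizeof(item) for item in container)
    return shallow, total

text = "hello world hello world"
for container in ([text], (text,)):
    shallow, total = shallow_and_total(container)
    print(type(container).__name__, shallow, total)
```

The total is always larger than the size of the string alone, which resolves the apparent paradox: the container does not shrink the string, it was simply never counted.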
