How should I investigate a memory leak when using the Google Cloud Datastore Python library?

Question

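I have a web application that uses Google's Datastore, and it runs out of memory after enough requests.

I have narrowed this down to a Datastore query. A minimal PoC is provided below; a slightly longer version that includes memory measurement is on GitHub.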

from google.cloud import datastore
from google.oauth2 import service_account

def test_datastore(entity_type: str) -> None:
    creds = service_account.Credentials.from_service_account_file("/path/to/creds")
    client = datastore.Client(credentials=creds, project="my-project")
    query = client.query(kind=entity_type, namespace="my-namespace")
    query.keys_only()
    for result in query.fetch(1):
        print(f"[+] Got a result: {result}")

for n in range(0,100):
    test_datastore("my-entity-type")

Profiling the process RSS shows growth of roughly 1 MiB per iteration. This happens even when no results are returned. Below is the output from my GitHub gist:

[+] Iteration 0, memory usage 38.9 MiB bytes
[+] Iteration 1, memory usage 45.9 MiB bytes
[+] Iteration 2, memory usage 46.8 MiB bytes
[+] Iteration 3, memory usage 47.6 MiB bytes
..
[+] Iteration 98, memory usage 136.3 MiB bytes
[+] Iteration 99, memory usage 137.1 MiB bytes
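
A per-iteration measurement like the one above can be sketched as follows. This is not the code from the gist: the helper names `peak_rss_mib` and `measure` are made up for illustration, and the sketch uses the standard `resource` module, which is only available on Unix-like systems and reports peak RSS rather than current RSS. On Windows a third-party package such as psutil would be needed instead.

```python
import resource
import sys
from typing import Callable, List


def peak_rss_mib() -> float:
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure(workload: Callable[[], None], iterations: int) -> List[float]:
    """Run workload repeatedly and record peak RSS after each call."""
    readings = []
    for n in range(iterations):
        workload()
        readings.append(peak_rss_mib())
        print(f"[+] Iteration {n}, peak RSS {readings[-1]:.1f} MiB")
    return readings


# Hypothetical usage with the PoC function from the question:
# measure(lambda: test_datastore("my-entity-type"), 100)
```

Because the value is a peak, it never decreases; steady growth across many iterations is the signal to look for.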

At the same time, however, Python's mprof shows a flat graph (run as mprof run python datastore_test.py).

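The question

Is there something wrong with the way I am calling Datastore, or is this likely an underlying problem in a library?

The environment is Python 3.7.4 on Windows 10 (also tested with 3.8 on Debian in Docker), with google-cloud-datastore==1.11.0 and grpcio==1.28.1.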

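Edit 1

To clarify, this is not the typical Python allocator behavior, where memory is requested from the OS but not immediately released from the allocator's internal arenas and pools. Below is a graph from Kubernetes, where my affected application runs.

It shows:

Memory grows linearly until about 2 GiB, at which point the application effectively crashes because it is out of memory (technically Kubernetes evicts the pod, but that is not relevant here).
The same web application running, but with no interaction with GCP Storage or Datastore.
With only the GCP Storage interaction added (very slight growth over time, possibly normal).
With only the GCP Datastore interaction added (much larger memory growth, about 512 MiB in an hour). The Datastore query is exactly the same as the PoC code in this post.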

Edit 2

To be absolutely sure about Python's own memory usage, I checked it with the gc module. Before exiting, the program reports:

gc: done, 15966 unreachable, 0 uncollectable, 0.0156s elapsed

I also forced garbage collection manually by calling gc.collect() on every iteration of the loop, which made no difference.
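
The check described above can be sketched roughly like this. It is a minimal illustration using only the standard `gc` module; `collect_and_report` is a hypothetical helper name, not something taken from the gist.

```python
import gc
from typing import Tuple


def collect_and_report(verbose: bool = False) -> Tuple[int, int]:
    """Force a full collection and return (unreachable, uncollectable)."""
    previous_flags = gc.get_debug()
    if verbose:
        # DEBUG_STATS makes the collector print statistics to stderr.
        gc.set_debug(previous_flags | gc.DEBUG_STATS)
    try:
        unreachable = gc.collect()       # unreachable objects found
        uncollectable = len(gc.garbage)  # objects that could not be freed
    finally:
        gc.set_debug(previous_flags)
    return unreachable, uncollectable


# Hypothetical usage inside the PoC loop:
# for n in range(100):
#     test_datastore("my-entity-type")
#     print(collect_and_report())
```

If the second number stays at zero while RSS keeps growing, the collector is not the part that is holding on to memory.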

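Since there are no uncollectable objects, it seems unlikely that the leak comes from objects allocated through Python's internal memory management. It is therefore more likely that an external C library is leaking memory.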
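
One way to test that reasoning is to compare RSS with what `tracemalloc` reports, since `tracemalloc` only sees memory allocated through Python's own allocators. The sketch below is an illustration under that assumption; `python_heap_growth` is a hypothetical helper name.

```python
import gc
import tracemalloc
from typing import Callable


def python_heap_growth(workload: Callable[[], None], iterations: int) -> int:
    """Return the net growth, in bytes, of memory allocated through Python."""
    tracemalloc.start()
    try:
        workload()  # warm-up so one-time caches do not count as growth
        gc.collect()
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            workload()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


# Hypothetical usage with the PoC function from the question:
# print(python_heap_growth(lambda: test_datastore("my-entity-type"), 100))
```

If this number stays close to zero while the process RSS grows by about 1 MiB per iteration, the growth is happening outside Python's allocators, which would be consistent with a leak in native code.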

Potentially related

There is an open grpc issue that I cannot be sure is related, but it has many similarities to my problem.

Answer 1

I have narrowed the memory leak down to the creation of the datastore.Client object.

With the following proof-of-concept code, memory usage does not grow:

from google.cloud import datastore
from google.oauth2 import service_account

def test_datastore(client, entity_type: str) -> None:
    query = client.query(kind=entity_type, namespace="my-namespace")
    query.keys_only()
    for result in query.fetch(1):
        print(f"[+] Got a result: {result}")

creds = service_account.Credentials.from_service_account_file("/path/to/creds")
client = datastore.Client(credentials=creds, project="my-project")

for n in range(0,100):
    test_datastore(client, "my-entity-type")

This makes sense for a small script, where the client object can be created once and shared safely between requests.

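In many other applications it is harder (or impossible) to pass the client object around safely. I would expect the library to free the memory when the client goes out of scope; otherwise this problem could occur in any long-running program.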
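
Where passing the client around is awkward, one possible workaround is a lazily created, process-wide instance. The sketch below is only an illustration: `LazyShared` is a hypothetical helper, and whether a single datastore.Client may be used from several threads at once is an assumption that should be checked against the library's documentation.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyShared(Generic[T]):
    """Create an object on first use and return the same instance afterwards."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._instance: Optional[T] = None
        self._lock = threading.Lock()

    def get(self) -> T:
        # The lock ensures the factory runs at most once across threads.
        with self._lock:
            if self._instance is None:
                self._instance = self._factory()
            return self._instance


# Hypothetical usage with the names from the PoC:
# shared = LazyShared(
#     lambda: datastore.Client(credentials=creds, project="my-project")
# )
# client = shared.get()
```

This keeps the number of client objects at one per process, which avoids the repeated construction that the PoC in the question performs.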

Edit 1

I have narrowed this down to grpc. The environment variable GOOGLE_CLOUD_DISABLE_GRPC can be set (to any value) to disable grpc.
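
If the variable cannot be set in the deployment environment, it can also be set from Python. This sketch assumes the library reads the variable when it is imported, so it sets it before the import; that ordering is an assumption, not something confirmed above. `disable_grpc` is a hypothetical helper name and the value "1" is arbitrary.

```python
import os


def disable_grpc() -> None:
    """Set the variable that asks the Google Cloud libraries to avoid grpc."""
    os.environ["GOOGLE_CLOUD_DISABLE_GRPC"] = "1"


disable_grpc()

# Import the library only after the variable is set, for example:
# from google.cloud import datastore
```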

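Once this was set, the memory graph for my application in Kubernetes changed.

Further investigation with valgrind shows that the leak is likely related to the use of OpenSSL in grpc, which I have documented in a ticket on the bug tracker.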
