Dask distributed fails to deserialize with numpy arrays and scipy.sparse matrices


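I repeatedly get the following error on different tasks in the graph (which task fails changes between executions). It possibly happens when some tasks return numpy arrays and scipy.sparse matrices.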

distributed.protocol.pickle - INFO - Failed to deserialize b'\x80\x04'
Traceback (most recent call last):
  File "/home/user/venv/lib/python3.5/site-packages/distributed/protocol/pickle.py", line 59, in loads
    return pickle.loads(x)
EOFError: Ran out of input
distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "/home/user/venv/lib/python3.5/site-packages/distributed/protocol/core.py", line 119, in loads
    value = _deserialize(head, fs)
  File "/home/user/venv/lib/python3.5/site-packages/distributed/protocol/serialize.py", line 158, in deserialize
    return f(header, frames)
  File "/home/user/venv/lib/python3.5/site-packages/distributed/protocol/serialize.py", line 20, in <lambda>
    deserializers = {None: lambda header, frames: pickle.loads(b''.join(frames))}
  File "/home/user/venv/lib/python3.5/site-packages/distributed/protocol/pickle.py", line 59, in loads
    return pickle.loads(x)
EOFError: Ran out of input
distributed.comm.utils - ERROR - truncated data stream (485 bytes): [b'', b"\x92\x83\xa6report\xc2\xa4keys\x91\xd9P('_avro_body-read-block-bag-from-delayed-67c7a9690149de9743ed970f873fa1d6', 283)\xa2op\xabdelete-data\x86\xa8priority\x93\x00\x01\xcc\xce\xa6nbytes\x81\xd9:('bag-from-delayed-67c7a9690149de9743ed970f873fa1d6', 283)\xce\x00 \x86p\xa8duration\xcb@\x18\x16m\x88xX\x00\xa7who_has\x81\xd9:('bag-from-delayed-67c7a9690149de9743ed970f873fa1d6', 283)\x91\xb5tcp://127.0.0.1:38623\xa2op\xaccompute-task\xa3key\xd9K('pluck-map-process_features_sparse-d94d304dc59efb780c39bfb0ca4df37f', 283)", b'\x83\xabbytestrings\x90\xa7headers\x81\x92\x01\xa4task\x83\xabcompression\x91\xc0\xa5count\x01\xa7lengths\x91\x02\xa4keys\x91\x92\x01\xa4task', b'\x80\x04']
distributed.worker - INFO - Connection to scheduler broken. Reregistering
distributed.worker - INFO - -------------------------------------------------
distributed.worker - INFO -         Registered to:       tcp://127.0.0.1:8786
distributed.worker - INFO - -------------------------------------------------

It is always an EOFError: Ran out of input error, with buffers of different sizes (sometimes as small as a few bytes), and the whole cluster runs on a single machine.
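For context, the b'\x80\x04' payload in the log is only the two-byte header of a protocol-4 pickle, so the unpickler is handed a stream that ends right after the header. The following is a minimal sketch using the standard-library pickle module, not the distributed code path; the helper name is_truncated_pickle is hypothetical and exists only for illustration.

```python
import pickle


def is_truncated_pickle(payload: bytes) -> bool:
    """Return True if unpickling fails with EOFError (the stream ended early)."""
    try:
        pickle.loads(payload)
    except EOFError:
        return True
    return False


# A complete protocol-4 pickle starts with the same two bytes seen in the log.
full = pickle.dumps({"a": 1}, protocol=4)
header_only = full[:2]  # PROTO opcode (0x80) followed by the protocol number (4)

print(header_only)
print(is_truncated_pickle(header_only))  # header without a body
print(is_truncated_pickle(full))         # complete pickle
```

This only shows that the bytes arriving at pickle.loads were incomplete; it does not say where in the pipeline the rest of the payload was lost.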

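Ideally I would like a fix for the actual problem, but I would also appreciate ways to investigate it and understand what might be going wrong. Right now I am stuck and do not know how to approach the issue.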

Running client.get_versions(check=True) completes without errors, and the problem persists after updating all packages (i.e. numpy, scipy, dask, dask-distributed, cloudpickle).

dask dask-distributed
1 Answer

The cloudpickle project (which dask uses) recently merged a patch fixing an issue that can cause this error.

Some details are explained in this comment: https://github.com/ray-project/ray/issues/2685#issuecomment-423182347

...and more details can be found in the related issues/PRs in the cloudpickle GitHub repo.

FWIW, I ran into this error today (including the b'\x80\x04' part), and updating cloudpickle to 0.8.0 seems to have fixed it.
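To check whether an environment already has cloudpickle 0.8.0 or newer, something like the sketch below may help. It assumes Python 3.8+ (for importlib.metadata) and only inspects the local environment, so it would have to be run wherever the scheduler and workers are started. The helpers version_tuple and installed_version are hypothetical names introduced here, and the version parsing is deliberately simplistic.

```python
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Turn '0.8.0' into (0, 8, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


MINIMUM = (0, 8, 0)  # the version reported above as making the error go away

found = installed_version("cloudpickle")
if found is None:
    print("cloudpickle is not installed in this environment")
elif version_tuple(found) < MINIMUM:
    print(f"cloudpickle {found} is older than 0.8.0; consider upgrading")
else:
    print(f"cloudpickle {found} is at least 0.8.0")
```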
