pkl file too large to load


I've recently been learning Python and deep learning. My teacher sent me a pkl file containing the data I need. The file is 9.6 GB, but my machine only has 16 GB of RAM. When I tried to load the entire file with

```python
pickle.load(open('data.pkl', 'rb'))
```
my computer crashed :(

Then I tried loading the pkl file in buffered chunks, and my computer froze again :( Here is the buffering code:

```python
import pickle
import gc

block_size = 512 * 1024 * 1024  # read 512 MB at a time
data = b''
count_num = 0
with open('../data.pkl', 'rb') as f:
    while True:
        buffer = f.read(block_size)
        if not buffer:
            break
        count_num += 1
        data += buffer  # accumulate the raw pickle bytes in memory
        print("read " + str(count_num * 512) + " MB")
        gc.collect()
print("finish")
```
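Note that `pickle.load` already reads from the file object incrementally on its own, so buffering the bytes by hand cannot lower peak memory; it only adds a second in-memory copy of the raw stream on top of the final unpickled object. A minimal sketch of the direct form:

```python
import pickle

# pickle.load consumes the open file object directly; pre-reading the
# bytes roughly doubles peak usage, because the 9.6 GB blob and the
# unpickled object exist at the same time.
with open('../data.pkl', 'rb') as f:
    data = pickle.load(f)
```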
After that, I tried splitting the large file into smaller files, but I can't load the resulting small files: I get `UnpicklingError: pickle data was truncated` and `UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified.` Below is the splitting code:
```python
import pickle
import gc

block_size = 10 * 1024 * 1024  # 10 MB per chunk
count_num = 0
with open('../data.pkl', 'rb') as f:
    while True:
        buffer = f.read(block_size)
        if not buffer:
            break
        count_num += 1
        print("read " + str(count_num * 10) + " MB")
        # re-pickle each chunk of raw bytes into its own file
        with open("data/wiki-data-statement-" + str(count_num) + ".pkl", "wb") as fw:
            pickle.dump(buffer, fw)
        print("split block " + str(count_num))
        gc.collect()
print("finish")
```

I need some advice on how to solve this problem. Any suggestions for other tools that could handle this task would be greatly appreciated. Thanks!

Tags: python, json, pickle, jsonpickle