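I have a dataset of about 28 GB of images in COCO format, with three JSON annotation files (train, test, and val) that correspond to those images. I now want to split the dataset into parts of about 3.5 GB each. Here is a function I found: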
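import json
import os

# approx_size_per_part and output_dir are expected to be defined at module level.
def split_and_save_data(data, num_items, output_file_prefix):
    data_part = []
    current_size = 0
    part_idx = 1
    for item in data['images']:
        # Add the on-disk size of this image to the running total
        image_path = item['file_name']
        current_size += os.path.getsize(image_path)
        data_part.append(item)
        # Write out a part once it reaches the size limit or the item limit
        if current_size >= approx_size_per_part or len(data_part) == num_items:
            output_file = os.path.join(output_dir, f'{output_file_prefix}_part{part_idx}.json')
            with open(output_file, 'w') as file:
                json.dump(data_part, file)
            # Start a new, empty part
            data_part = []
            current_size = 0
            part_idx += 1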
But when I call the function:
split_and_save_data(test_data, num_train_items, 'val')
it raises this error:
FileNotFoundError: [Errno 2] No such file or directory: '132a855ee8b23533d8ae69af0049c038171a06ddfcac892c3c6d7e6b4091c642.png'
Can anyone help me?
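
One likely cause, assuming the `file_name` entries in the JSON are bare file names: `os.path.getsize` resolves them against the current working directory, not against the folder that holds the images, so the file is not found. The sketch below is one possible way to handle that, not a definitive fix. The function name `split_coco_by_size` and the parameters `image_dir` and `max_bytes` are hypothetical and not from the original code. It joins each `file_name` with `image_dir`, writes the last partially filled part (the function above never saves it), and keeps the matching `annotations` so each part remains a COCO-style dict. It splits by size only and drops the `num_items` limit for brevity.

```python
import json
import os


def split_coco_by_size(data, image_dir, output_dir, prefix, max_bytes):
    """Split a COCO dict into parts whose images total roughly max_bytes each.

    Returns the list of JSON paths that were written.
    """
    # Group annotations by image id so each part can carry its own annotations.
    anns_by_image = {}
    for ann in data.get("annotations", []):
        anns_by_image.setdefault(ann["image_id"], []).append(ann)

    os.makedirs(output_dir, exist_ok=True)
    written = []
    images, size = [], 0

    def flush():
        """Write the images collected so far as one part, then reset."""
        nonlocal images, size
        if not images:
            return
        # Copy shared keys (categories, info, ...) and replace the two lists.
        part = {k: v for k, v in data.items() if k not in ("images", "annotations")}
        part["images"] = images
        part["annotations"] = [
            ann for img in images for ann in anns_by_image.get(img["id"], [])
        ]
        path = os.path.join(output_dir, f"{prefix}_part{len(written) + 1}.json")
        with open(path, "w") as f:
            json.dump(part, f)
        written.append(path)
        images, size = [], 0

    for img in data["images"]:
        # Assumption: file_name is relative to image_dir, not to the working directory.
        img_path = os.path.join(image_dir, img["file_name"])
        size += os.path.getsize(img_path)
        images.append(img)
        if size >= max_bytes:
            flush()

    flush()  # save the final, partially filled part
    return written
```

With the val annotations loaded into `val_data`, a call might look like `split_coco_by_size(val_data, "images/val", "parts", "val", 3.5 * 1024**3)`, where both directory names are placeholders for your own layout. If the error persists, printing `os.path.join(image_dir, img["file_name"])` for the failing image should show whether the joined path matches where the file actually lives.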