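I have a zipped folder that contains a subfolder, and that subfolder holds more than 60,000 images. I would like to know whether there is a way to read all the images in the subfolder without extracting them (the image folder is about 100 GB).
I considered using the zipfile package in Python. However, I could not use the module's open function, because I do not know how to iterate over the whole subfolder. Any input on how to do this would be great.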
```
import zipfile

with zipfile.ZipFile("/home/diliptmonson/abc.zip", "r") as zip_ref:
    train_images = zip_ref.open('train/86760c00-21bc-11ea-a13a-137349068a90.jpg')
```
from zipfile import ZipFile
import numpy as np
import cv2
import os

# https://thispointer.com/python-how-to-get-the-list-of-all-files-in-a-zip-archive/
with ZipFile("abc.zip", "r") as zip_ref:
    # Get the list of file names in the zip
    list_of_files = zip_ref.namelist()
    # Iterate over the file names
    for elem in list_of_files:
        #print(elem)
        ext = os.path.splitext(elem)[-1]  # Get extension of elem
        if ext == ".jpg":
            # Read the member's raw bytes into memory (nothing is extracted to disk)
            in_bytes = zip_ref.read(elem)
            # Decode bytes to image (np.frombuffer replaces the deprecated np.fromstring)
            img = cv2.imdecode(np.frombuffer(in_bytes, np.uint8), cv2.IMREAD_COLOR)
            # Show image for testing
            cv2.imshow('img', img)
            cv2.waitKey(1000)
cv2.destroyAllWindows()
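Since the question is specifically about one subfolder, here is a minimal sketch of how the same idea could be restricted to members under a given prefix. The function name `iter_subfolder_images`, the default subfolder `"train"`, and the extension tuple are assumptions made for illustration; only `zipfile` calls from the standard library are used, and the decoding step is left to the caller.

```python
import zipfile
from pathlib import PurePosixPath


def iter_subfolder_images(archive, subfolder="train", extensions=(".jpg", ".jpeg")):
    """Yield (member_name, raw_bytes) for each image under `subfolder`.

    `archive` may be a path or a binary file-like object.
    The names and defaults here are hypothetical, not from the original post.
    """
    # Member names inside a zip always use forward slashes
    prefix = subfolder.rstrip("/") + "/"
    with zipfile.ZipFile(archive, "r") as zip_ref:
        for info in zip_ref.infolist():
            # Skip directory entries and anything outside the subfolder
            if info.is_dir() or not info.filename.startswith(prefix):
                continue
            # Compare extensions case-insensitively (.jpg, .JPG, ...)
            if PurePosixPath(info.filename).suffix.lower() not in extensions:
                continue
            # Read one member at a time so only a single image is in memory
            yield info.filename, zip_ref.read(info)
```

Because this is a generator, the roughly 60,000 images are read one by one as the loop advances; each `raw_bytes` value could then be passed to `cv2.imdecode` as in the answer above.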
Use a for loop:
# namelist() lists all members of the archive
for file in zip_ref.namelist():
    opened_file = zip_ref.open(file)
    # do stuff with your file
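One thing to watch with this loop is that `namelist()` can also return directory entries, and `open()` gives a stream rather than the full contents. The sketch below illustrates both points under some assumptions: the helper name `find_jpeg_members` is hypothetical, and checking the first three bytes against the usual JPEG signature is only a rough heuristic, not a full validation.

```python
import zipfile

# Typical leading bytes of a JPEG file (heuristic check only)
JPEG_MAGIC = b"\xff\xd8\xff"


def find_jpeg_members(archive):
    """Return names of members whose first bytes look like a JPEG header."""
    jpeg_names = []
    with zipfile.ZipFile(archive, "r") as zip_ref:
        for info in zip_ref.infolist():
            if info.is_dir():
                continue  # directory entries carry no file data
            # open() returns a file-like stream, so only the header is read
            with zip_ref.open(info) as opened_file:
                header = opened_file.read(len(JPEG_MAGIC))
            if header == JPEG_MAGIC:
                jpeg_names.append(info.filename)
    return jpeg_names
```

Using `open()` in a `with` block closes each member stream promptly, which matters when looping over tens of thousands of entries.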