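I'm learning image classification with Keras. I've downloaded a sample dataset of donuts and waffles, but the images come in different sizes. To standardize their size, I load the images from their directories, resize them, and store them in numpy arrays: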
from os import listdir
import numpy as np
from numpy import asarray
from PIL import Image
test_data_dir = 'v_data/train/donuts_and_waffles/'
validation_data_dir = 'v_data/test/donuts_and_waffles/'
loaded_test_donuts = list()
for filename in listdir(test_data_dir + 'donuts/'):
    image1 = Image.open(test_data_dir + 'donuts/' + filename)
    img_resized = image1.resize((224,224))
    img_data = asarray(img_resized)
    loaded_test_donuts.append(img_data)
loaded_test_waffles = list()
for filename in listdir(test_data_dir + 'waffles/'):
    image1 = Image.open(test_data_dir + 'waffles/' + filename)
    img_resized = image1.resize((224,224))
    img_data = asarray(img_resized)
    loaded_test_waffles.append(img_data)
loaded_validation_donuts = list()
for filename in listdir(validation_data_dir + 'donuts/'):
    image1 = Image.open(validation_data_dir + 'donuts/' + filename)
    img_resized = image1.resize((224,224))
    img_data = asarray(img_resized)
    loaded_validation_donuts.append(img_data)
loaded_validation_waffles = list()
for filename in listdir(validation_data_dir + 'waffles/'):
    image1 = Image.open(validation_data_dir + 'waffles/' + filename)
    img_resized = image1.resize((224,224))
    img_data = asarray(img_resized)
    loaded_validation_waffles.append(img_data)
test_data = list()
validation_data = list()
test_data.append(np.array(loaded_test_donuts))
test_data.append(np.array(loaded_test_waffles))
validation_data.append(np.array(loaded_validation_donuts))
validation_data.append(np.array(loaded_validation_waffles))
test_data = np.array(test_data)
validation_data = np.array(validation_data)
Then I want to create an ImageDataGenerator for my data:
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow(
    # how can I pass test_data here to make it work (and with which parameters)?
)
validation_generator = test_datagen.flow(
    # how can I pass validation_data here to make it work (and with which parameters)?
)
How can I do this? I have tried it like this:
train_generator = train_datagen.flow(
    test_data,  # does not work
    batch_size=batch_size)
validation_generator = test_datagen.flow(
    validation_data,  # does not work
    batch_size=batch_size)
But then I get this error:
Traceback (most recent call last):
...
ValueError: ('Input data in `NumpyArrayIterator` should have rank 4. You passed an array with shape', (2, 770, 224, 224, 3))
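I suggest creating a folder that contains n subfolders, one per class (for example "dog" and "cat"), running the preprocessing step first, and then saving the resulting images like this: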
from PIL import Image
import glob
import os
from keras.preprocessing import image
W = 500
H = 825
def process(img, w, h):
    # placeholder for your preprocessing; in your case it is just a resize
    # (PIL expects the size as (width, height))
    return img.resize((w, h))
for folder in glob.glob("*"):  # goes through every class folder
    ims = glob.glob(os.path.join(folder, "*.png"))  # image paths in the folder, assuming the images are PNG
    for im in ims:
        img = Image.open(im)
        print(im)
        if img.size != (W, H):
            imgr = process(img, W, H)
            imgr.save(im)  # overwrites the original file
Then split your data into train and validation folders and run:
traingen = image.ImageDataGenerator(rescale=1./255)
validationgen = image.ImageDataGenerator(rescale=1./255)
# target_size is (height, width); s is your batch size
train = traingen.flow_from_directory("train", target_size=(H, W), batch_size=s, shuffle=True)
val = validationgen.flow_from_directory("validation", target_size=(H, W), batch_size=32, shuffle=False)
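The split into train and validation folders is not spelled out in this answer. One way it might be done is sketched below using only the standard library. The function name `split_into_train_validation`, the 20% validation fraction, and the fixed seed are all assumptions made for illustration, and the destination folder is assumed to live outside the source folder.

```python
import random
import shutil
from pathlib import Path


def split_into_train_validation(source_dir, dest_dir, validation_fraction=0.2, seed=0):
    """Copy each class subfolder of source_dir into dest_dir/train and dest_dir/validation.

    Returns a dict mapping class name to (number of train files, number of validation files).
    """
    source_dir, dest_dir = Path(source_dir), Path(dest_dir)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {}
    # One subfolder per class, e.g. "donuts" and "waffles".
    for class_dir in sorted(p for p in source_dir.iterdir() if p.is_dir()):
        files = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(files)
        n_val = int(len(files) * validation_fraction)
        subsets = {"validation": files[:n_val], "train": files[n_val:]}
        for subset, subset_files in subsets.items():
            target = dest_dir / subset / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for f in subset_files:
                shutil.copy2(f, target / f.name)  # copy, so the originals stay untouched
        counts[class_dir.name] = (len(files) - n_val, n_val)
    return counts
```

The resulting `train` and `validation` folders each contain one subfolder per class, which is the layout `flow_from_directory` reads class labels from.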
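It's hard to say what isn't working without an error message, but I think the problem is that you are passing lists to the ImageDataGenerators. You can easily fix this by converting the lists to numpy arrays: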
test_data = list()
validation_data = list()
test_data.append(np.array(loaded_test_donuts))
test_data.append(np.array(loaded_test_waffles))
validation_data.append(np.array(loaded_validation_donuts))
validation_data.append(np.array(loaded_validation_waffles))
test_data = np.array(test_data)
validation_data = np.array(validation_data)
Edit: a better approach is to stack the arrays instead of appending them to a list and converting:
test_data = np.vstack((np.array(loaded_test_donuts), np.array(loaded_test_waffles)))
validation_data = np.vstack((np.array(loaded_validation_donuts), np.array(loaded_validation_waffles)))
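Stacking along the first axis gives the rank-4 shape (samples, height, width, channels) that the error message asks for, but the stacked array no longer records which image belongs to which class. Here is a minimal sketch of building matching labels alongside it, using small synthetic arrays in place of the real images. The helper name `stack_with_labels` and the class order (donuts = 0, waffles = 1) are assumptions for illustration.

```python
import numpy as np


def stack_with_labels(class_arrays):
    """Concatenate per-class image arrays into one rank-4 array plus integer labels.

    class_arrays: list of arrays, each shaped (n_i, height, width, channels).
    The position of an array in the list is used as its class index.
    """
    x = np.concatenate(class_arrays, axis=0)
    y = np.concatenate([np.full(len(a), idx) for idx, a in enumerate(class_arrays)])
    return x, y


# Synthetic stand-ins for np.array(loaded_test_donuts) and np.array(loaded_test_waffles).
donuts = np.zeros((3, 224, 224, 3), dtype=np.uint8)
waffles = np.ones((2, 224, 224, 3), dtype=np.uint8)

x_train, y_train = stack_with_labels([donuts, waffles])
print(x_train.shape)  # (5, 224, 224, 3)
print(y_train)        # [0 0 0 1 1]
```

With the real arrays, the result could then be passed as `train_datagen.flow(x_train, y_train, batch_size=batch_size)`, since `flow` accepts the labels as its second argument.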