Training results produced with tf.data show no improvement

Question (votes: 0, answers: 1)

I am trying to use tf.data to augment a dataset I have. The dataset is laid out locally on my computer as follows:

datasets/fruits/{class_name}/*jpg

{class_name} is one of 7 different fruits: strawberry, mango, broccoli, grape, apple, lemon, and orange.

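Here is the code I wrote for the data augmentation step. As you can see, I am not actually augmenting the data yet; I only load the images with tf.data.Dataset.from_tensor_slices and rescale the pixels: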

import os
import random

import tensorflow as tf
from tensorflow.data import AUTOTUNE
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers.experimental import preprocessing
from imutils import paths
from sklearn.preprocessing import LabelEncoder

INIT_LR = 1e-2 # learning rate
BS = 32 # batch size
EPOCHS = 50 # number of epochs

# load images with tensorflow

def load_images(imagePath, label):
    image = tf.io.read_file(imagePath)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    image = tf.image.resize(image, (64, 64))
    return (image, label)

# augment helper function
def augment(image, label, aug):
    image = aug(image)

    return (image, label)

# get all image paths and save them as strings with format **/{class_name}/*jpg
allImages = list(paths.list_images("datasets/fruits"))
random.shuffle(allImages) # shuffle the images

# perform 0.75/0.25 train/test split
i = int(len(allImages) * 0.25)
trainPaths = allImages[i:]

# get labels by getting {class_name} from **/{class_name}/*jpg
trainLabels = [p.split(os.path.sep)[-2] for p in trainPaths]
testPaths = allImages[:i]
testLabels = [p.split(os.path.sep)[-2] for p in testPaths]

# encode the class names as integers with LabelEncoder, then one-hot encode them
labelEncoder = LabelEncoder()
labelEncoder = labelEncoder.fit(trainLabels)
trainLabels = labelEncoder.transform(trainLabels)
trainLabels = to_categorical(trainLabels)
testLabels = labelEncoder.transform(testLabels)
testLabels = to_categorical(testLabels)

# load the train and test data into a tf.data.Dataset
trainDS = tf.data.Dataset.from_tensor_slices((trainPaths, trainLabels))
trainDS = (
    trainDS
    .shuffle(32, seed=42)
    .map(load_images, num_parallel_calls=AUTOTUNE)
    .batch(BS)
    .cache()
)

# rescale the pixels to [0, 1]

trainAug = tf.keras.Sequential(
    [
        preprocessing.Rescaling(scale=1.0/255),
    ]
)

trainDS = (
    trainDS
    .map(lambda x, y: augment(x, y, trainAug), num_parallel_calls=AUTOTUNE)
    .prefetch(AUTOTUNE)
)

testDS = tf.data.Dataset.from_tensor_slices(( testPaths, testLabels ))
testDS = (
    testDS
    .shuffle(32)
    .map(load_images, num_parallel_calls=AUTOTUNE)
    .batch(BS)
    .cache()
)

testAug = tf.keras.Sequential(
    [
        preprocessing.Rescaling(scale=1.0/255),
    ]
)

testDS = (
    testDS
    .map(lambda x, y: augment(x, y, testAug), num_parallel_calls=AUTOTUNE)
    .prefetch(AUTOTUNE)
)

# I don't think there are any issues with this part, but here I am setting up the optimizer and model for training
sgd = SGD(learning_rate=INIT_LR, momentum=0.9, weight_decay=INIT_LR/EPOCHS)
# MiniVGGNet is my own model class (definition not shown); the class count comes from the fitted encoder
model = MiniVGGNet.build(64, 64, 3, num_classes=len(labelEncoder.classes_))
model.compile(loss="categorical_crossentropy", optimizer=sgd, metrics=["accuracy"])
training_history = model.fit(
    x=trainDS,
    validation_data=testDS,
    epochs=EPOCHS
)

These are the training results I got. As you can see, val_loss and val_accuracy show no sign of improvement.

[Image: training results using tf.data to load images]

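I suspect the problem lies in how the image data is loaded and processed rather than in the model I am using. With the same model, I was able to get reasonable results when training on images loaded with plain Python methods instead of the TensorFlow approach. These are the results I got:

[Image: training results using standard python methods to load data]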

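I would expect similar training results regardless of how I load the image data. If anyone can shed light on my problem, I would be very grateful! I have been stuck on this for several days.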

I tried loading the dataset with tf.data and training on it, but got strange results.

tensorflow-datasets
1 answer
0 votes

It turned out I was rescaling the pixels twice.

image = tf.image.convert_image_dtype(image, dtype=tf.float32)

already rescales the pixels to [0, 1] automatically, and then I rescaled them again in the preprocessing step:

trainAug = tf.keras.Sequential(
    [
        preprocessing.Rescaling(scale=1.0/255),
    ]
)
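
To make the effect concrete, here is a minimal sketch of the arithmetic in plain Python. The helpers `convert_uint8_to_float` and `rescale` are hypothetical stand-ins I wrote for illustration, not TensorFlow functions; they assume that converting uint8 to float32 divides by 255 and that a Rescaling layer multiplies by its `scale`.

```python
# Plain-Python stand-ins for the two rescaling steps (illustrative only,
# not TensorFlow APIs).

def convert_uint8_to_float(pixels):
    """Mimic converting uint8 pixels to float32: divide each value by 255."""
    return [p / 255.0 for p in pixels]


def rescale(pixels, scale):
    """Mimic a Rescaling(scale=...) layer: multiply each value by `scale`."""
    return [p * scale for p in pixels]


raw = [0, 128, 255]                  # uint8 values as decoded from a JPEG
once = convert_uint8_to_float(raw)   # values now lie in [0, 1]
twice = rescale(once, 1.0 / 255)     # second rescale squeezes them into [0, 1/255]

print("max after one rescale: ", max(once))
print("max after two rescales:", max(twice))
```

After the second rescale every pixel is at most 1/255, so all images look almost identically black to the network, which would be consistent with the flat validation metrics.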

After removing

preprocessing.Rescaling(scale=1.0/255),

training started producing reasonable results. The documentation mentions this automatic rescaling behavior (https://www.tensorflow.org/api_docs/python/tf/image/convert_image_dtype).
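
A quick way to catch this kind of mistake is to check the pixel range right after loading. The sketch below is only an illustration under a few assumptions: `load_image_from_bytes` is a hypothetical variant of `load_images` that takes JPEG bytes instead of a path, and a synthetic all-white image stands in for a real file so the snippet needs no data on disk.

```python
import tensorflow as tf


def load_image_from_bytes(jpeg_bytes):
    """Decode, convert to float32 (this already scales to [0, 1]), and resize."""
    image = tf.image.decode_jpeg(jpeg_bytes, channels=3)
    image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    image = tf.image.resize(image, (64, 64))
    return image


# Synthetic all-white 8x8 RGB image, encoded in memory as a JPEG.
white = tf.constant(255, shape=(8, 8, 3), dtype=tf.uint8)
jpeg_bytes = tf.io.encode_jpeg(white)

image = load_image_from_bytes(jpeg_bytes)
max_once = float(tf.reduce_max(image))            # close to 1.0
max_twice = float(tf.reduce_max(image / 255.0))   # what an extra 1/255 rescale would give

print("max after loading:", max_once)
print("max with an extra 1/255 rescale:", max_twice)
```

If the maximum after loading is already close to 1.0, no further Rescaling(scale=1.0/255) layer is needed in the pipeline.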
