I am doing transfer learning with TensorFlow and Keras using ResNet_50. The problem I am running into is that my model seems to perform well in terms of accuracy, but my val_loss is very high, and when I try to make predictions the accuracy is very low.
Here are the relevant parts of the code:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_hub as hub

# Create an ImageDataGenerator with data augmentation for training
data_generator = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.5,
    rotation_range=30,       # Random rotations
    width_shift_range=0.2,   # Horizontal shifts
    height_shift_range=0.2,  # Vertical shifts
    shear_range=0.2,         # Shear transformations
    zoom_range=0.2,          # Zoom
    horizontal_flip=True,    # Horizontal flips
)
# Load and preprocess training data
train_data_flow = data_generator.flow_from_directory(
    dataset_path,
    target_size=(224, 224),   # Resize images to 224x224
    batch_size=32,
    class_mode='categorical',
    subset='training'         # Use training subset
)
# Load and preprocess validation data
val_data_flow = data_generator.flow_from_directory(
    dataset_path,
    target_size=(224, 224),   # Resize images to 224x224
    batch_size=32,
    class_mode='categorical',
    subset='validation'       # Use validation subset
)
Loading the model:
# Load the model from TensorFlow Hub
model_url = "https://tfhub.dev/tensorflow/resnet_50/feature_vector/1"
hub_layer = hub.KerasLayer(model_url, input_shape=(224, 224, 3), trainable=False)
# Create a Sequential model with dropout and batch normalization
model = keras.Sequential([
    hub_layer,
    layers.Dropout(0.2),          # Lower dropout rate
    layers.Dense(256, activation='relu'),
    layers.BatchNormalization(),  # Batch normalization
    layers.Dropout(0.5),          # Dropout
    layers.Dense(9, activation='softmax')
])
# Build the Sequential model
model.build((None, 224, 224, 3))
# Summary of the model
model.summary()
Then I compile:
# Compile the model
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
Finally:
# Fit the model (note: no early stopping callback is passed here)
history = model.fit(
    train_data_flow,
    validation_data=val_data_flow,
    epochs=15  # Number of epochs
)
Here is the training output:
Epoch 1/15
81/81 [==============================] - 489s 6s/step - loss: 0.3773 - accuracy: 0.8832 - val_loss: 0.7994 - val_accuracy: 0.7476
Epoch 2/15
81/81 [==============================] - 489s 6s/step - loss: 0.3316 - accuracy: 0.8980 - val_loss: 0.8229 - val_accuracy: 0.7378
Epoch 3/15
81/81 [==============================] - 488s 6s/step - loss: 0.3468 - accuracy: 0.8879 - val_loss: 0.8221 - val_accuracy: 0.7362
Epoch 4/15
81/81 [==============================] - 489s 6s/step - loss: 0.3148 - accuracy: 0.9011 - val_loss: 0.8380 - val_accuracy: 0.7362
Epoch 5/15
81/81 [==============================] - 488s 6s/step - loss: 0.3250 - accuracy: 0.8972 - val_loss: 0.7680 - val_accuracy: 0.7409
Epoch 6/15
81/81 [==============================] - 491s 6s/step - loss: 0.3100 - accuracy: 0.9026 - val_loss: 0.7220 - val_accuracy: 0.7616
Epoch 7/15
81/81 [==============================] - 491s 6s/step - loss: 0.2844 - accuracy: 0.9120 - val_loss: 0.7259 - val_accuracy: 0.7651
Epoch 8/15
81/81 [==============================] - 490s 6s/step - loss: 0.2811 - accuracy: 0.9007 - val_loss: 0.7722 - val_accuracy: 0.7511
Epoch 9/15
81/81 [==============================] - 490s 6s/step - loss: 0.2689 - accuracy: 0.9167 - val_loss: 0.7943 - val_accuracy: 0.7433
Epoch 10/15
81/81 [==============================] - ETA: 0s - loss: 0.2569 - accuracy: 0.9182
Prediction:
Model Accuracy on Test Set: 0.2021069059695669
Without knowing more about the training, validation, and test datasets, it is difficult to point to a single issue as the problem in the code.
That said, by around epoch 8 your model appears to be overfitting. You can see this by looking at the val_loss and val_accuracy values: over the course of the training loop you should expect validation loss to decrease and validation accuracy to increase. Once either of them starts moving in the opposite direction, you are likely overfitting, which could also explain some (if not all) of your poor performance on the test set.
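This pattern can also be checked programmatically from the history object that model.fit returns; here is a minimal sketch using the val_loss values copied from the training log above (in your code they would come from history.history['val_loss']):

```python
# Per-epoch validation loss, copied from the training log above
val_loss = [0.7994, 0.8229, 0.8221, 0.8380, 0.7680, 0.7220, 0.7259, 0.7722, 0.7943]

# Find the epoch (1-indexed) with the lowest validation loss;
# training past this point is where overfitting likely begins.
best_epoch = min(range(len(val_loss)), key=lambda i: val_loss[i]) + 1
print(best_epoch)  # -> 6 (epoch 6 had val_loss 0.7220)
```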
One possible solution is to stop fitting the model as soon as overfitting appears, for example by setting the epochs parameter lower than 15, perhaps epochs=8.
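Rather than hard-coding the epoch count, you could let Keras stop training automatically with the EarlyStopping callback. A sketch, assuming the same model and data generators as in your question (patience=3 is just a reasonable starting value, not a prescription):

```python
from tensorflow import keras

# Stop training when val_loss has not improved for 3 consecutive
# epochs, and roll back to the weights from the best epoch seen.
early_stop = keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=3,
    restore_best_weights=True,
)

# Then pass the callback to fit, e.g.:
# history = model.fit(
#     train_data_flow,
#     validation_data=val_data_flow,
#     epochs=15,
#     callbacks=[early_stop],
# )
```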
Another solution might be to set a different learning rate with an explicitly specified optimizer; see this answer for a more thorough explanation of the topic.
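For example, instead of passing optimizer='adam' (which uses the default learning rate of 1e-3), you can construct the optimizer yourself. The value 1e-4 below is only an assumption; a common starting point when training a head on top of a frozen pretrained backbone:

```python
from tensorflow import keras

# Adam with a smaller learning rate than the default 1e-3
optimizer = keras.optimizers.Adam(learning_rate=1e-4)

# Then compile with it, e.g.:
# model.compile(
#     optimizer=optimizer,
#     loss='categorical_crossentropy',
#     metrics=['accuracy'],
# )
```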