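I am using a CNN for bird classification, as shown below. However, it does reasonably well on the training set (50% accuracy) but very poorly on the validation set (<1%).
The data looks like this: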
train_images.shape
(4829, 100, 100, 3)
train_labels.shape
(4829, 200)
test_images.shape
(1204, 100, 100, 3)
test_labels.shape
(1204, 200)
For the model, I used transfer learning with MobileNetV2 and fine-tuned it.
import tensorflow as tf

# Load MobileNetV2 pre-trained on ImageNet, without its classification head
base_model = tf.keras.applications.MobileNetV2(input_shape=(100, 100, 3),
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = True
# Fine-tune from this layer onwards
fine_tune_at = 100
# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False
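The post does not show how the images are scaled before they reach the base model. MobileNetV2's ImageNet weights expect inputs scaled to [-1, 1], which in Keras is what `tf.keras.applications.mobilenet_v2.preprocess_input` does. Below is a minimal NumPy sketch of that scaling, assuming the images are stored as raw 0-255 pixel values. The helper name `scale_for_mobilenet` and the random stand-in batch are hypothetical.

```python
import numpy as np

def scale_for_mobilenet(images):
    """Map raw 0-255 pixel values to [-1, 1] (assumes raw 0-255 input)."""
    return images.astype(np.float32) / 127.5 - 1.0

# Hypothetical stand-in for a small batch of train_images.
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(8, 100, 100, 3), dtype=np.uint8)

scaled = scale_for_mobilenet(batch)
print(scaled.dtype, scaled.min(), scaled.max())
```

If the images are already scaled to [0, 1] or [-1, 1], this step would need adjusting or skipping, so it is worth printing `train_images.min()` and `train_images.max()` first.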
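To prevent overfitting, I used data augmentation, L2 regularization, dropout, early stopping, and learning-rate decay, but they do not seem to have any effect.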
from tensorflow.keras import regularizers

inputs = tf.keras.Input(shape=(100, 100, 3))
# Data augmentation (`augment` is defined elsewhere)
x = augment(inputs, training=True)
x = base_model(x, training=True)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(580, activation='relu',
                          kernel_regularizer=regularizers.l2(0.01),
                          bias_regularizer=regularizers.l2(0.01))(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(200, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)
model.compile(loss="categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Stop training if val_loss doesn't improve for 5 epochs, then restore the best weights
early_stopping = EarlyStopping(monitor="val_loss",  # watch the validation loss
                               patience=5,
                               restore_best_weights=True)
# Multiply the learning rate by 0.2 if val_loss doesn't improve for 3 epochs
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=3, min_lr=1e-6)
history = model.fit(train_images, train_labels, epochs=50,
                    validation_split=0.2,
                    callbacks=[
                        early_stopping,
                        reduce_lr
                    ])
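One thing worth checking in the `fit` call above: Keras takes the `validation_split` fraction from the last samples of the arrays, before any shuffling. If `train_images` happens to be ordered by class (the post does not say), the validation slice could contain classes the model never saw during training. Below is a minimal NumPy sketch of that check and of shuffling before the split, using small synthetic one-hot labels as a stand-in. `classes_missing_from_train` is a hypothetical helper.

```python
import numpy as np

def classes_missing_from_train(labels, validation_split=0.2):
    """Return class ids that appear only in the trailing validation slice.

    Mimics how Keras carves out validation data: the last
    `validation_split` fraction of the arrays, taken before shuffling.
    """
    split_at = int(len(labels) * (1.0 - validation_split))
    class_ids = labels.argmax(axis=1)
    train_classes = set(class_ids[:split_at].tolist())
    val_classes = set(class_ids[split_at:].tolist())
    return sorted(val_classes - train_classes)

# Synthetic stand-in: 10 classes, 20 samples each, sorted by class.
num_classes, per_class = 10, 20
sorted_ids = np.repeat(np.arange(num_classes), per_class)
sorted_labels = np.eye(num_classes, dtype=np.float32)[sorted_ids]

# Sorted data: the last two classes land entirely in the validation slice.
print(classes_missing_from_train(sorted_labels))  # → [8, 9]

# Shuffle with one permutation (apply the same one to the images) before fit.
rng = np.random.default_rng(42)
perm = rng.permutation(len(sorted_labels))
shuffled_labels = sorted_labels[perm]
print(classes_missing_from_train(shuffled_labels))
```

In the code above this would mean applying the same permutation to `train_images` and `train_labels` before calling `model.fit`, or passing an explicit `validation_data` split instead of `validation_split`.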
Can anyone suggest how to solve this problem?
Improve the training data, reduce the model's complexity, and reduce the number of features.
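To judge whether the training data is the limiting factor, it can help to count images per class. With 4829 training images spread over 200 classes, the average is only about 24 images per class, and chance accuracy is 0.5%. Below is a minimal NumPy sketch with synthetic one-hot labels standing in for `train_labels`. `per_class_counts` is a hypothetical helper.

```python
import numpy as np

def per_class_counts(one_hot_labels):
    """Count samples per class from a one-hot label matrix."""
    return one_hot_labels.sum(axis=0).astype(int)

# Synthetic stand-in with the same shape as train_labels: (4829, 200).
rng = np.random.default_rng(0)
class_ids = rng.integers(0, 200, size=4829)
labels = np.eye(200, dtype=np.float32)[class_ids]

counts = per_class_counts(labels)
print("mean per class:", counts.mean())         # 4829 / 200 = 24.145
print("smallest class:", counts.min())
print("chance accuracy:", 1 / labels.shape[1])  # 1 / 200 = 0.005
```

Running the same function on the real `train_labels` would show whether some classes have far fewer images than others.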