CNN overfits and performs poorly on the validation set

Question (votes: 0, answers: 1)

I'm using a CNN for bird classification, as shown below. However, it performs reasonably on the training set (50% accuracy) and very poorly on the validation set (<1%).

The data looks like this:

train_images.shape   # (4829, 100, 100, 3)
train_labels.shape   # (4829, 200)
test_images.shape    # (1204, 100, 100, 3)
test_labels.shape    # (1204, 200)

For the model, I used transfer learning with MobileNet and fine-tuned it.

import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.optimizers import Adam

# Load the pre-trained MobileNetV2 model (ImageNet weights, no classification head)
base_model = tf.keras.applications.MobileNetV2(input_shape=(100, 100, 3),
                                               include_top=False,
                                               weights='imagenet')

base_model.trainable = True

# Fine-tune from this layer onwards
fine_tune_at = 100

# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
  layer.trainable = False

To prevent overfitting I used data augmentation, L2 regularization, dropout, early stopping, and learning-rate decay, but none of them seem to help.
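(`augment` in the code below is my data-augmentation pipeline, which isn't shown. A minimal sketch of that kind of pipeline using Keras preprocessing layers, where the specific layers and parameters are illustrative placeholders rather than my exact code:)

import tensorflow as tf

# Sketch of an augmentation pipeline like `augment`; layer choices are placeholders
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
], name="augment")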

from tensorflow.keras import regularizers

inputs = tf.keras.Input(shape=(100, 100, 3))
x = augment(inputs, training=True)   # data augmentation
x = base_model(x, training=True)

x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(580, activation='relu',
                          kernel_regularizer=regularizers.l2(0.01),
                          bias_regularizer=regularizers.l2(0.01))(x)
x = tf.keras.layers.Dropout(0.2)(x)

outputs = tf.keras.layers.Dense(200, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

model.compile(loss="categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])

from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Stop training if val_loss doesn't improve for 5 consecutive epochs,
# then restore the weights from the best epoch
early_stopping = EarlyStopping(monitor="val_loss",
                               patience=5,
                               restore_best_weights=True)

# Reduce the learning rate by 5x when val_loss plateaus for 3 epochs
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=3, min_lr=1e-6)

history = model.fit(train_images, train_labels, epochs=50,
                    validation_split=0.2,
                    callbacks=[
                      early_stopping,
                      reduce_lr
                    ])

The performance looks like this: [training and validation curves attached as images]

Can anyone suggest how to fix this?

tensorflow deep-learning computer-vision conv-neural-network overfitting-underfitting
1 Answer

0 votes

Improve your training data, reduce the model's complexity, and reduce the number of features.
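For example, one way to reduce the model's complexity here is to keep the MobileNetV2 base frozen as a pure feature extractor and train only a small head on top. A minimal sketch (the dropout rate and learning rate are assumptions, not tuned values):

import tensorflow as tf

# Keep the pre-trained base frozen and train only a small classification head
base_model = tf.keras.applications.MobileNetV2(input_shape=(100, 100, 3),
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False

inputs = tf.keras.Input(shape=(100, 100, 3))
x = base_model(inputs, training=False)           # run BatchNorm in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.3)(x)              # assumed dropout rate
outputs = tf.keras.layers.Dense(200, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

model.compile(loss="categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(1e-3),   # assumed learning rate
              metrics=['accuracy'])

Once this simpler head trains cleanly, the top layers of the base can be unfrozen gradually with a much lower learning rate.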
