How can I improve my model by reducing val_loss and increasing val_accuracy?

Problem description

I am currently doing binary classification on videos and am struggling to understand how to reduce the model's val_loss and increase its val_accuracy. I am fairly sure the problem is overfitting, but I am not sure what I can change to improve the model over time.

I am using a combination of MobileNet + LSTM. These are the layers:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, TimeDistributed, Dense, Dropout, Flatten, LSTM, Bidirectional
from tensorflow.keras.applications import MobileNetV2

def create_model():

    model = Sequential()

    # Load MobileNetV2 with ImageNet weights and fine-tune only its last 20 layers
    mobilenet = MobileNetV2(include_top=False, weights="imagenet")
    mobilenet.trainable = True
    for layer in mobilenet.layers[:-20]:
        layer.trainable = False

    model.add(Input(shape = (16, 64, 64, 3)))
    
    model.add(TimeDistributed(mobilenet))
    
    model.add(Dropout(0.25))
                                    
    model.add(TimeDistributed(Flatten()))
    
    lstm_fw = LSTM(units=32)
    lstm_bw = LSTM(units=32, go_backwards = True)  

    model.add(Bidirectional(lstm_fw, backward_layer = lstm_bw))
   
    model.add(Dropout(0.25))

    model.add(Dense(256,activation='relu'))
    model.add(Dropout(0.25))

    model.add(Dense(128,activation='relu'))
    model.add(Dropout(0.25))

    model.add(Dense(64,activation='relu'))
    model.add(Dropout(0.25))

    model.add(Dense(32,activation='relu'))
    model.add(Dropout(0.25))
    
    
    model.add(Dense(len(CLASSES_LIST), activation = 'softmax'))

    model.summary()
    
    return model

Here are the callbacks and the model compilation:

from tensorflow import keras
from tensorflow.keras.callbacks import EarlyStopping

# Create EarlyStopping callback to monitor the validation accuracy
early_stopping_callback = EarlyStopping(monitor = 'val_accuracy', patience = 10, restore_best_weights = True)

# Create ReduceLROnPlateau callback to reduce overfitting by decreasing the learning rate
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='val_loss',
                                                  factor=0.5,
                                                  patience=5,
                                                  min_lr=0.0001,
                                                  verbose=1)
 
# Compiling the model 
model.compile(loss = 'categorical_crossentropy', optimizer = 'sgd', metrics = ["accuracy"])
 
# Fitting the model 
result= model.fit(x = features_train, y = labels_train, epochs = 50, batch_size = 8 ,
                                             shuffle = True, validation_split = 0.2, callbacks = [early_stopping_callback,reduce_lr])
Epoch 1/50
180/180 [==============================] - 47s 240ms/step - loss: 0.6998 - accuracy: 0.4903 - val_loss: 0.6903 - val_accuracy: 0.5306 - lr: 0.0100
Epoch 2/50
180/180 [==============================] - 40s 225ms/step - loss: 0.6873 - accuracy: 0.5326 - val_loss: 0.6853 - val_accuracy: 0.5583 - lr: 0.0100
Epoch 3/50
180/180 [==============================] - 42s 232ms/step - loss: 0.6750 - accuracy: 0.5688 - val_loss: 0.6653 - val_accuracy: 0.6333 - lr: 0.0100
Epoch 4/50
180/180 [==============================] - 43s 237ms/step - loss: 0.6505 - accuracy: 0.6201 - val_loss: 0.6440 - val_accuracy: 0.6167 - lr: 0.0100
Epoch 5/50
180/180 [==============================] - 43s 240ms/step - loss: 0.6145 - accuracy: 0.6597 - val_loss: 0.6203 - val_accuracy: 0.6611 - lr: 0.0100
Epoch 6/50
180/180 [==============================] - 44s 247ms/step - loss: 0.5757 - accuracy: 0.6840 - val_loss: 0.6140 - val_accuracy: 0.6639 - lr: 0.0100
Epoch 7/50
180/180 [==============================] - 43s 238ms/step - loss: 0.5453 - accuracy: 0.7493 - val_loss: 0.5832 - val_accuracy: 0.6972 - lr: 0.0100
Epoch 8/50
180/180 [==============================] - 43s 237ms/step - loss: 0.5219 - accuracy: 0.7660 - val_loss: 0.5975 - val_accuracy: 0.6722 - lr: 0.0100
Epoch 9/50
180/180 [==============================] - 43s 237ms/step - loss: 0.4905 - accuracy: 0.7778 - val_loss: 0.6300 - val_accuracy: 0.6750 - lr: 0.0100
Epoch 10/50
180/180 [==============================] - 43s 237ms/step - loss: 0.4247 - accuracy: 0.8069 - val_loss: 0.7396 - val_accuracy: 0.6806 - lr: 0.0100
Epoch 11/50
180/180 [==============================] - 43s 238ms/step - loss: 0.4199 - accuracy: 0.8153 - val_loss: 0.7280 - val_accuracy: 0.6833 - lr: 0.0100
Epoch 12/50
180/180 [==============================] - ETA: 0s - loss: 0.3750 - accuracy: 0.8354
Epoch 12: ReduceLROnPlateau reducing learning rate to 0.005999999865889549.
180/180 [==============================] - 43s 238ms/step - loss: 0.3750 - accuracy: 0.8354 - val_loss: 0.7351 - val_accuracy: 0.6944 - lr: 0.0100
Epoch 13/50
180/180 [==============================] - 43s 239ms/step - loss: 0.3464 - accuracy: 0.8625 - val_loss: 0.7303 - val_accuracy: 0.6861 - lr: 0.0060
Epoch 14/50
180/180 [==============================] - 43s 241ms/step - loss: 0.3239 - accuracy: 0.8771 - val_loss: 0.7414 - val_accuracy: 0.6972 - lr: 0.0060
Epoch 15/50
180/180 [==============================] - 43s 241ms/step - loss: 0.2701 - accuracy: 0.8993 - val_loss: 0.8252 - val_accuracy: 0.7000 - lr: 0.0060
Epoch 16/50
180/180 [==============================] - 44s 244ms/step - loss: 0.2850 - accuracy: 0.8875 - val_loss: 0.7776 - val_accuracy: 0.6750 - lr: 0.0060
Epoch 17/50
180/180 [==============================] - ETA: 0s - loss: 0.2117 - accuracy: 0.9278
Epoch 17: ReduceLROnPlateau reducing learning rate to 0.003600000031292438.
180/180 [==============================] - 43s 240ms/step - loss: 0.2117 - accuracy: 0.9278 - val_loss: 0.9213 - val_accuracy: 0.6861 - lr: 0.0060
Epoch 18/50
180/180 [==============================] - 43s 241ms/step - loss: 0.1874 - accuracy: 0.9333 - val_loss: 0.9246 - val_accuracy: 0.7111 - lr: 0.0036
Epoch 19/50
180/180 [==============================] - 44s 244ms/step - loss: 0.1764 - accuracy: 0.9417 - val_loss: 0.9533 - val_accuracy: 0.7000 - lr: 0.0036
Epoch 20/50
180/180 [==============================] - 45s 248ms/step - loss: 0.1595 - accuracy: 0.9493 - val_loss: 1.0038 - val_accuracy: 0.6889 - lr: 0.0036
Epoch 21/50
180/180 [==============================] - 44s 245ms/step - loss: 0.1404 - accuracy: 0.9590 - val_loss: 1.0586 - val_accuracy: 0.6944 - lr: 0.0036
Epoch 22/50
180/180 [==============================] - ETA: 0s - loss: 0.1331 - accuracy: 0.9576
Epoch 22: ReduceLROnPlateau reducing learning rate to 0.0021599999628961085.
180/180 [==============================] - 45s 249ms/step - loss: 0.1331 - accuracy: 0.9576 - val_loss: 1.0285 - val_accuracy: 0.7139 - lr: 0.0036
Epoch 23/50
180/180 [==============================] - 45s 247ms/step - loss: 0.1027 - accuracy: 0.9708 - val_loss: 1.1426 - val_accuracy: 0.7056 - lr: 0.0022
Epoch 24/50
180/180 [==============================] - 44s 247ms/step - loss: 0.1057 - accuracy: 0.9653 - val_loss: 1.1398 - val_accuracy: 0.7167 - lr: 0.0022
Epoch 25/50
180/180 [==============================] - 44s 246ms/step - loss: 0.0929 - accuracy: 0.9694 - val_loss: 1.2270 - val_accuracy: 0.7056 - lr: 0.0022
Epoch 26/50
180/180 [==============================] - 45s 249ms/step - loss: 0.0982 - accuracy: 0.9715 - val_loss: 1.2117 - val_accuracy: 0.7111 - lr: 0.0022
Epoch 27/50
180/180 [==============================] - ETA: 0s - loss: 0.0745 - accuracy: 0.9778
Epoch 27: ReduceLROnPlateau reducing learning rate to 0.0012959999497979878.
180/180 [==============================] - 45s 251ms/step - loss: 0.0745 - accuracy: 0.9778 - val_loss: 1.2147 - val_accuracy: 0.7083 - lr: 0.0022
Epoch 28/50
180/180 [==============================] - 45s 249ms/step - loss: 0.0742 - accuracy: 0.9750 - val_loss: 1.2625 - val_accuracy: 0.7056 - lr: 0.0013
Epoch 29/50
180/180 [==============================] - 45s 251ms/step - loss: 0.0797 - accuracy: 0.9757 - val_loss: 1.2876 - val_accuracy: 0.7194 - lr: 0.0013
Epoch 30/50
180/180 [==============================] - 45s 250ms/step - loss: 0.0655 - accuracy: 0.9799 - val_loss: 1.3786 - val_accuracy: 0.6944 - lr: 0.0013
Epoch 31/50
180/180 [==============================] - 44s 246ms/step - loss: 0.0770 - accuracy: 0.9736 - val_loss: 1.3411 - val_accuracy: 0.7056 - lr: 0.0013
Epoch 32/50
180/180 [==============================] - ETA: 0s - loss: 0.0592 - accuracy: 0.9826
Epoch 32: ReduceLROnPlateau reducing learning rate to 0.0007775999838486314.
180/180 [==============================] - 44s 247ms/step - loss: 0.0592 - accuracy: 0.9826 - val_loss: 1.3883 - val_accuracy: 0.7111 - lr: 0.0013
Epoch 33/50
180/180 [==============================] - 44s 244ms/step - loss: 0.0513 - accuracy: 0.9847 - val_loss: 1.4247 - val_accuracy: 0.7139 - lr: 7.7760e-04
Epoch 34/50
180/180 [==============================] - 45s 248ms/step - loss: 0.0580 - accuracy: 0.9819 - val_loss: 1.4371 - val_accuracy: 0.7083 - lr: 7.7760e-04
Epoch 35/50
180/180 [==============================] - 45s 251ms/step - loss: 0.0629 - accuracy: 0.9812 - val_loss: 1.4415 - val_accuracy: 0.7028 - lr: 7.7760e-04
Epoch 36/50
180/180 [==============================] - 45s 248ms/step - loss: 0.0507 - accuracy: 0.9861 - val_loss: 1.4447 - val_accuracy: 0.7111 - lr: 7.7760e-04
Epoch 37/50
180/180 [==============================] - ETA: 0s - loss: 0.0674 - accuracy: 0.9799
Epoch 37: ReduceLROnPlateau reducing learning rate to 0.0004665599903091788.
180/180 [==============================] - 45s 247ms/step - loss: 0.0674 - accuracy: 0.9799 - val_loss: 1.4476 - val_accuracy: 0.7139 - lr: 7.7760e-04
Epoch 38/50
180/180 [==============================] - 44s 245ms/step - loss: 0.0374 - accuracy: 0.9882 - val_loss: 1.4594 - val_accuracy: 0.7111 - lr: 4.6656e-04
Epoch 39/50
180/180 [==============================] - 44s 243ms/step - loss: 0.0477 - accuracy: 0.9840 - val_loss: 1.4670 - val_accuracy: 0.7000 - lr: 4.6656e-04
1 Answer

The model appears to be overfitting the training data. This is likely due to the following:

Causes of overfitting

  1. The model may be too complex for the given data, which leads to overfitting.
  2. The dataset used for training may be too small, causing the model to memorize the training examples.
  3. The dropout rate is relatively low; adding more dropout may help regularize the model.

Solutions

  1. Consider reducing the model's complexity: fewer hidden units in the Dense layers, fewer Dense layers, etc.

  2. Check the size and diversity of the dataset. Ask yourself whether the training data really is a suitable dataset... (a small augmentation sketch follows after this list)

  3. Increase the dropout rate to 50% for stronger regularization.
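
For point 2, if collecting more videos is not an option, simple clip-level augmentation can increase the effective diversity of the training data. The sketch below is illustrative and not part of the original pipeline: it assumes each clip in features_train has shape (16, 64, 64, 3) with pixel values in [0, 1], and it applies the same random flip and brightness jitter to every frame of a clip so the temporal sequence stays consistent.

import numpy as np

def augment_clip(clip, rng=np.random.default_rng()):
    # clip: one video of shape (16, 64, 64, 3); pixel values assumed to be in [0, 1]
    # Horizontal flip, applied to all frames of the clip at once
    if rng.random() < 0.5:
        clip = clip[:, :, ::-1, :]
    # Small global brightness jitter, clipped back into the valid range
    clip = np.clip(clip + rng.uniform(-0.1, 0.1), 0.0, 1.0)
    return clip

# Hypothetical usage: augment the training clips before calling model.fit
# features_train_aug = np.stack([augment_clip(c) for c in features_train])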

Simplified code to avoid overfitting (dropout and complexity)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, TimeDistributed, Dense, Dropout, Flatten, LSTM, Bidirectional
from tensorflow.keras.applications import MobileNetV2

def create_model():
    model = Sequential()
    
    # Load MobileNetV2 with ImageNet weights
    mobilenet = MobileNetV2(include_top=False, weights="imagenet")
    mobilenet.trainable = True
    
    # Freeze only the first layers
    for layer in mobilenet.layers[:-20]:
        layer.trainable = False

    model.add(Input(shape=(16, 64, 64, 3)))

    model.add(TimeDistributed(mobilenet))

    model.add(Dropout(0.5))  # Increased dropout for more regularization

    model.add(TimeDistributed(Flatten()))

    lstm_fw = LSTM(units=32)
    lstm_bw = LSTM(units=32, go_backwards=True)

    model.add(Bidirectional(lstm_fw, backward_layer=lstm_bw))

    model.add(Dropout(0.5))  # Increased dropout for more regularization

    # Reduced the number of dense layers and hidden units in dense layers to reduce complexity
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))  # Increased dropout for more regularization

    model.add(Dense(64, activation='relu'))
    model.add(Dropout(0.5))  # Increased dropout for more regularization

    model.add(Dense(32, activation='relu'))
    model.add(Dropout(0.5))  # Increased dropout for more regularization

    model.add(Dense(len(CLASSES_LIST), activation='softmax'))

    model.summary()

    return model
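
As a rough way to check whether these changes actually shrink the gap between training and validation, the simplified model can be trained with the same compile settings and callbacks as in the question and the loss curves compared. This is only a sketch that reuses the variable names from the question (features_train, labels_train, early_stopping_callback, reduce_lr); matplotlib is assumed to be available for the plot.

import matplotlib.pyplot as plt

model = create_model()
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])

# Same fit call as in the question, reusing the existing callbacks
result = model.fit(x=features_train, y=labels_train, epochs=50, batch_size=8,
                   shuffle=True, validation_split=0.2,
                   callbacks=[early_stopping_callback, reduce_lr])

# A shrinking gap between loss and val_loss suggests less overfitting
plt.plot(result.history['loss'], label='loss')
plt.plot(result.history['val_loss'], label='val_loss')
plt.xlabel('epoch')
plt.legend()
plt.show()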
