Problem training this U-Net model correctly


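I am trying to train a U-Net model to segment humans from images. I am using several callbacks, including ReduceLROnPlateau and EarlyStopping. However, my model is not learning: the loss stalls at about 0.48, even though the model does not segment correctly. I am happy to share any material relevant to this problem.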

Callbacks:

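from tensorflow.keras.callbacks import (
    CSVLogger, EarlyStopping, ModelCheckpoint, ReduceLROnPlateau)

resume_callbacks = [
    # Saves the model after every epoch (save_best_only is left at its default)
    ModelCheckpoint('/content/gdrive/MyDrive/unet.h5', monitor="val_loss", verbose=1),
    # Reduces lr when metric stops improving
    ReduceLROnPlateau(monitor='val_loss', patience=3, factor=0.1, verbose=1),
    # Saves only when val_loss improves on the best value seen so far
    ModelCheckpoint('/content/gdrive/MyDrive/unet_checkpoint_latest.hdf5', monitor="val_loss", mode="min", save_best_only=True, verbose=1),
    CSVLogger(csv_path),
    EarlyStopping(monitor='val_loss', patience=8),
]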

Fit function:

# Starting/resuming training
model.fit(
    train_dataset,
    validation_data=test_dataset,
    epochs=100,
    steps_per_epoch=train_steps,
    validation_steps=valid_steps,
    callbacks=resume_callbacks,
    initial_epoch=50)

Hyperparameters:

input_shape= (256, 256, 3)
batch_size= 8
epochs= 100
lr= 1e-4
model_path= "/content/gdrive/MyDrive/unet.h5"
csv_path= "/content/gdrive/MyDrive/data.csv"
checkpoint_path= "/content/gdrive/MyDrive/unet_checkpoint_latest.hdf5"

Output:

Epoch 57: ReduceLROnPlateau reducing learning rate to 9.99999943962493e-12.

Epoch 57: val_loss did not improve from 0.48361
568/568 [==============================] - 86s 152ms/step - loss: 0.4759 - mean_io_u: 
0.3723 - recall: 0.3187 - precision: 0.5634 - val_loss: 0.4836 - val_mean_io_u: 0.3718 - 
val_recall: 0.3082 - val_precision: 0.5464 - lr: 1.0000e-10
Epoch 58/100
568/568 [==============================] - ETA: 0s - loss: 0.4759 - mean_io_u: 0.3723 - 
recall: 0.3187 - precision: 0.5638
Epoch 58: saving model to /content/gdrive/MyDrive/unet.h5

Epoch 58: val_loss did not improve from 0.48361
568/568 [==============================] - 86s 152ms/step - loss: 0.4759 - mean_io_u: 
0.3723 - recall: 0.3187 - precision: 0.5638 - val_loss: 0.4836 - val_mean_io_u: 0.3718 - 
val_recall: 0.3094 - val_precision: 0.5461 - lr: 1.0000e-11
Epoch 59/100
568/568 [==============================] - ETA: 0s - loss: 0.4760 - mean_io_u: 0.3723 - 
recall: 0.3185 - precision: 0.5635
Epoch 59: saving model to /content/gdrive/MyDrive/unet.h5

Epoch 59: val_loss did not improve from 0.48361
568/568 [==============================] - 89s 156ms/step - loss: 0.4760 - mean_io_u: 
0.3723 - recall: 0.3185 - precision: 0.5635 - val_loss: 0.4836 - val_mean_io_u: 0.3718 - 
val_recall: 0.3083 - val_precision: 0.5463 - lr: 1.0000e-11
Epoch 60/100
568/568 [==============================] - ETA: 0s - loss: 0.4760 - mean_io_u: 0.3723 - 
recall: 0.3184 - precision: 0.5634
Epoch 60: saving model to /content/gdrive/MyDrive/unet.h5

Epoch 60: ReduceLROnPlateau reducing learning rate to 9.999999092680235e-13.

Epoch 60: val_loss did not improve from 0.48361
568/568 [==============================] - 86s 152ms/step - loss: 0.4760 - mean_io_u: 
0.3723 - recall: 0.3184 - precision: 0.5634 - val_loss: 0.4836 - val_mean_io_u: 0.3718 - 
val_recall: 0.3069 - val_precision: 0.5466 - lr: 1.0000e-11
Epoch 61/100
568/568 [==============================] - ETA: 0s - loss: 0.4758 - mean_io_u: 0.3723 - 
recall: 0.3187 - precision: 0.5636
Epoch 61: saving model to /content/gdrive/MyDrive/unet.h5

Epoch 61: val_loss did not improve from 0.48361
568/568 [==============================] - 86s 151ms/step - loss: 0.4758 - mean_io_u: 
0.3723 - recall: 0.3187 - precision: 0.5636 - val_loss: 0.4836 - val_mean_io_u: 0.3718 - 
val_recall: 0.3079 - val_precision: 0.5464 - lr: 1.0000e-12
<keras.callbacks.History at 0x7f06bcffeaf0>
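
For readers following the log: the lr column drops by a factor of 10 at epoch 57 and again at epoch 60, which matches factor=0.1 with patience=3 while val_loss stays flat. Below is a minimal sketch in plain Python (not the Keras implementation) of how a plateau-based rule shrinks the learning rate under those settings. The function name simulate_plateau_lr and the constant val_loss list are hypothetical, and the sketch ignores the callback's min_delta and cooldown options. The min_lr argument mirrors the ReduceLROnPlateau parameter of the same name, which sets a floor on the learning rate.

```python
def simulate_plateau_lr(val_losses, initial_lr=1e-4, factor=0.1,
                        patience=3, min_lr=0.0):
    """Return the learning rate in effect at each epoch (simplified model)."""
    lr = initial_lr
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    lrs = []
    for loss in val_losses:
        lrs.append(lr)  # lr used during this epoch
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Multiply by factor, but never go below the floor
                lr = max(lr * factor, min_lr)
                wait = 0
    return lrs


# Hypothetical flat validation loss, similar to the stalled run in the log
flat_losses = [0.4836] * 25

no_floor = simulate_plateau_lr(flat_losses)
with_floor = simulate_plateau_lr(flat_losses, min_lr=1e-6)

print("without min_lr:", no_floor[-1])
print("with min_lr=1e-6:", with_floor[-1])
```

Under this simplified model, a flat val_loss shrinks the learning rate by a factor of 10 every three epochs, so after a couple of dozen stalled epochs it is far too small for the weights to change noticeably. That would be consistent with the identical loss values in the log, although it does not by itself explain why the loss stalled in the first place.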
python tensorflow deep-learning image-segmentation unet-neural-network