"Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence" when training a DL model


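I am building a plant disease identification model. My dataset covers 38 diseases, with roughly 2,000 images per disease. During training, however, some epochs are skipped because of an OUT_OF_RANGE error. Can anyone help me solve this?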

import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Input

train_dir = 'dataset/train'
valid_dir = 'dataset/valid'
batch_size = 32

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

valid_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='categorical'
)

valid_generator = valid_datagen.flow_from_directory(
    valid_dir,
    target_size=(150, 150),
    batch_size=batch_size,
    class_mode='categorical'
)

model = Sequential([
    Input(shape=(150, 150, 3)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(38, activation='softmax')  # Adjust output units based on the number of disease classes
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batch_size,
    epochs=10,
    validation_data=valid_generator,
    validation_steps=valid_generator.samples // batch_size
)

model.save('plant_disease_model.h5')

class_indices = train_generator.class_indices
disease_names = list(class_indices.keys())
print("Mapping of Class Indices to Disease Names:", class_indices)

Terminal:

Found 70295 images belonging to 38 classes.
Found 17572 images belonging to 38 classes.
2024-04-23 19:50:32.085744: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Epoch 1/10
\.venv\Lib\site-packages\keras\src\trainers\data_adapters\py_dataset_adapter.py:120: UserWarning: Your `PyDataset` class should call `super().__init__(**kwargs)` in its constructor. `**kwargs` can include `workers`, `use_multiprocessing`, `max_queue_size`. Do not pass these arguments to `fit()`, as they will be ignored.
  self._warn_if_super_not_called()
2196/2196 ━━━━━━━━━━━━━━━━━━━━ 905s 411ms/step - accuracy: 0.4608 - loss: 1.8737 - val_accuracy: 0.7432 - val_loss: 0.8556
Epoch 2/10
   1/2196 ━━━━━━━━━━━━━━━━━━━━ 12:02 329ms/step - accuracy: 0.6875 - loss: 0.7820
2024-04-23 20:05:37.996528: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
         [[{{node IteratorGetNext}}]]
C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\contextlib.py:155: UserWarning: Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches. You may need to use the `.repeat()` function when building your dataset.
  self.gen.throw(typ, value, traceback)
2024-04-23 20:05:38.068817: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
         [[{{node IteratorGetNext}}]]
2196/2196 ━━━━━━━━━━━━━━━━━━━━ 0s 49us/step - accuracy: 0.6875 - loss: 0.7820 - val_accuracy: 0.7500 - val_loss: 0.2462

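As shown above, epoch 1 completed successfully, but epoch 2 was cut short by the error. Likewise, epochs 3, 5, 7, and 9 completed successfully, while epochs 4, 6, 8, and 10 hit the same error.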
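For reference, the step counts in the log can be checked against the dataset sizes. The sketch below uses only the image counts and batch size printed above; the helper name `batch_counts` is hypothetical. It shows that floor division (`samples // batch_size`) leaves one partial batch per pass that `steps_per_epoch` does not count. Whether that leftover batch is what triggers the OUT_OF_RANGE warning is an assumption; the log itself does not confirm it.

```python
import math

# Figures taken from the terminal output above.
train_samples = 70295   # "Found 70295 images belonging to 38 classes."
valid_samples = 17572   # "Found 17572 images belonging to 38 classes."
batch_size = 32


def batch_counts(samples, batch_size):
    """Hypothetical helper: return (full batches, total batches, leftover images)."""
    full = samples // batch_size               # what `samples // batch_size` yields
    total = math.ceil(samples / batch_size)    # batches in one full pass over the data
    leftover = samples - full * batch_size     # images in the final partial batch
    return full, total, leftover


train_full, train_total, train_leftover = batch_counts(train_samples, batch_size)
valid_full, valid_total, valid_leftover = batch_counts(valid_samples, batch_size)

print("train:", train_full, "full batches,", train_total, "total,", train_leftover, "images left over")
print("valid:", valid_full, "full batches,", valid_total, "total,", valid_leftover, "images left over")
```

The 2196 steps shown in the progress bar match the floor-division result for the training set, which is one batch short of a full pass.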

python tensorflow machine-learning keras deep-learning
1 Answer

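Yes, I am running into the same problem, and it seems to occur on every other epoch. I also searched online, and it appears to be a bug of some kind in TensorFlow 2.16. You can refer to the following link: https://github.com/tensorflow/tensorflow/issues/62963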
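One thing that may be worth trying in the meantime, offered as an assumption rather than a confirmed fix: call `fit()` without `steps_per_epoch` and `validation_steps`, so that Keras works out the number of batches from the generators itself instead of using the floor-divided counts from the question. The wrapper name `fit_without_step_limits` below is hypothetical; `model`, `train_generator`, and `valid_generator` are the objects defined in the question's code.

```python
def fit_without_step_limits(model, train_generator, valid_generator, epochs=10):
    """Hypothetical wrapper: train without explicit step counts.

    Assumption: leaving out `steps_per_epoch` and `validation_steps` lets
    Keras consume each generator for exactly one full pass per epoch,
    including the final partial batch.
    """
    return model.fit(
        train_generator,
        epochs=epochs,
        validation_data=valid_generator,
    )


# Usage with the objects from the question:
# history = fit_without_step_limits(model, train_generator, valid_generator, epochs=10)
```

If the warning persists with this change, the behavior described in the linked issue is the more likely explanation.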
