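I am using InceptionV3 and a GRU for video classification. My dataset is labeled with two classes. I have extracted frames from the videos (sequence_length = 30) and stored them in a list, frames_list, indexed by video index, frame index, height, width, and channels. I then converted it to a 4D numpy array: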
four_dim_list = [frame for video in frames_list for frame in video]
frame_list = np.array(four_dim_list)
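frame_list has length 3660 (122 video samples * 30 frames), and label_list contains 3660 labels (one label per frame).
For training, I used the pretrained InceptionV3 model followed by two stacked GRU layers (16 units and 8 units). The implementation is as follows: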
import tensorflow as tf
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.applications.inception_v3 import preprocess_input
from tensorflow.keras.layers import GRU, Dense, TimeDistributed, Masking, Flatten, MaxPooling2D, MaxPooling3D
num_frames=30
sample=122
BATCH_SIZE=30
EPOCHS = 5
IMG_SIZE = 224
# Preprocess frames
preprocessed_frames = preprocess_input(frame_list)
# InceptionV3 model with pretrained weights
inception_model = tf.keras.applications.InceptionV3(weights='imagenet', include_top=False, input_shape=(IMG_SIZE, IMG_SIZE, 3))
# GRU model
model = tf.keras.Sequential()
# TimeDistributed layer with the InceptionV3-based feature extractor
model.add(tf.keras.layers.TimeDistributed(inception_model, input_shape=(num_frames, IMG_SIZE, IMG_SIZE, 3)))
model.add(tf.keras.layers.Masking(mask_value=0.0))
num_elements = 5 * 5 * 2048
model.add(tf.keras.layers.Reshape((num_frames, num_elements)))
model.add(tf.keras.layers.GRU(units=16, return_sequences=True))
model.add(tf.keras.layers.GRU(units=8))
model.add(tf.keras.layers.Dropout(0.4))
model.add(tf.keras.layers.Dense(units=8, activation='relu'))
model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
#Model Training
model.fit(preprocessed_frames, label_list, batch_size=BATCH_SIZE, epochs=EPOCHS)
When I run the code, I get an error saying the input layer dimensions are incompatible.
Here are the model summary and the error:
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/inception_v3/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5
87910968/87910968 [==============================] - 0s 0us/step
Model: "sequential"
_________________________________________________________________
Layer (type)                        Output Shape              Param #
=================================================================
time_distributed (TimeDistributed)  (None, 30, 5, 5, 2048)    21802784
masking (Masking)                   (None, 30, 5, 5, 2048)    0
reshape (Reshape)                   (None, 30, 51200)         0
gru (GRU)                           (None, 30, 16)            2458464
gru_1 (GRU)                         (None, 8)                 624
dropout (Dropout)                   (None, 8)                 0
dense (Dense)                       (None, 8)                 72
dense_1 (Dense)                     (None, 1)                 9
=================================================================
Total params: 24261953 (92.55 MB)
Trainable params: 24227521 (92.42 MB)
Non-trainable params: 34432 (134.50 KB)
_________________________________________________________________
Epoch 1/5
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-17-e2802569fee1> in <cell line: 38>()
36
37 #Train the model
---> 38 model.fit(preprocessed_frames, label_list, batch_size=BATCH_SIZE, epochs=EPOCHS)
1 frames
/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py in error_handler(*args, **kwargs)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py in tf__train_function(iterator)
13 try:
14 do_return = True
---> 15 retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
16 except:
17 do_return = False
ValueError: in user code:
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1338, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1322, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1303, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1080, in train_step
y_pred = self(x, training=True)
File "/usr/local/lib/python3.10/dist-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/input_spec.py", line 298, in assert_input_compatibility
raise ValueError(
ValueError: Input 0 of layer "sequential" is incompatible with the layer: expected shape=(None, 30, 224, 224, 3), found shape=(30, 224, 224, 3)
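Could anyone help me solve this and understand what is going wrong? I am new to video processing, so my explanation may be clumsy; please bear with me.
“When I run the code, I get an error saying the input layer dimensions are incompatible.”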
ValueError: Input 0 of layer "sequential" is incompatible with the layer:
expected shape=(None, 30, 224, 224, 3), found shape=(30, 224, 224, 3)
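This one is simple.
Each time you run a
model.add(tf.keras.layers. ... etc
command, you add a layer.
One of your layers has an input that is not set up correctly (for example, it was given too few dimensions).
The problem is on this line: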
model.add(tf.keras.layers.TimeDistributed(inception_model, input_shape=(num_frames, IMG_SIZE, IMG_SIZE, 3)))
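Your setting there is:
# per-sample shape (no batch axis); it matches the "found shape" only because BATCH_SIZE is also 30
input_shape = (num_frames, IMG_SIZE, IMG_SIZE, 3)
Keras prepends the batch axis (None) to it, so every batch passed to model.fit has to look like the "expected" shape below. frame_list is 4D, which is why a batch of it has one axis too few:
# is the "expected shape"
expected_shape = (None, num_frames, IMG_SIZE, IMG_SIZE, 3)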
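One way to get from the 4D frame_list to the expected 5D shape is to regroup the frames by video. The sketch below is not from the original post. It uses a small IMG_SIZE of 8 and zero-filled stand-in arrays so that it runs quickly, and it assumes that frames are stored video by video (as the list comprehension in the question produces them) and that every frame of a video carries the same label.

```python
import numpy as np

NUM_VIDEOS = 122   # from the question
NUM_FRAMES = 30    # from the question
IMG_SIZE = 8       # small stand-in for 224 so the sketch runs quickly

# Hypothetical stand-ins for the question's frame_list and label_list.
frame_list = np.zeros((NUM_VIDEOS * NUM_FRAMES, IMG_SIZE, IMG_SIZE, 3), dtype=np.float32)
label_list = np.repeat(np.arange(NUM_VIDEOS) % 2, NUM_FRAMES)  # one label per frame

# What a batch looks like today: slicing the first axis of a 4D array gives a 4D batch.
flat_batch = frame_list[:30]

# Regroup consecutive frames into videos: (videos, frames, height, width, channels).
video_array = frame_list.reshape(NUM_VIDEOS, NUM_FRAMES, IMG_SIZE, IMG_SIZE, 3)

# The model outputs one prediction per video, so keep one label per video.
labels_per_frame = label_list.reshape(NUM_VIDEOS, NUM_FRAMES)
assert (labels_per_frame == labels_per_frame[:, :1]).all()  # same label within each video
video_labels = labels_per_frame[:, 0]

print(flat_batch.shape)    # (30, 8, 8, 3)
print(video_array.shape)   # (122, 30, 8, 8, 3)
print(video_labels.shape)  # (122,)
```

With the real data, video_array and video_labels would take the place of preprocessed_frames and label_list in model.fit. batch_size would then count videos rather than frames, so a value well below 30 is probably needed to fit in memory.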