How to feed already-extracted video frames into an LSTM?


I want to do anomaly detection on about a thousand videos. I have already extracted features for every frame of every video (using VGG16), so I now have one feature file per video.

When I load one of these files from disk, I get a np.ndarray of shape (nb_frames, 25088). The 25088 components correspond to the flattened output of VGG16 (output shape 1x7x7x512).
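For reference, here is a minimal sketch of how per-frame features of that shape could be produced with VGG16; the exact extraction pipeline is not shown in the question, so the frame loading and resizing assumed here are illustrative only:

import numpy as np
from keras.applications.vgg16 import VGG16, preprocess_input

# VGG16 without its classifier head: each 224x224 frame -> (7, 7, 512), i.e. 25088 values when flattened
extractor = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

def extract_features(frames):
    # frames: np.ndarray of shape (nb_frames, 224, 224, 3), already resized (assumption)
    features = extractor.predict(preprocess_input(frames.astype('float32')))
    return features.reshape(len(frames), -1)  # -> (nb_frames, 25088)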

I want to feed the LSTM K frames at a time. I have been trying for days now, but I am getting desperate and cannot make it work...

self.model = Sequential()
# LSTM needs 3 dimensional data (nb_samples, timesteps, input_dim)
self.model.add(CuDNNLSTM(32, return_sequences=True, batch_input_shape=(BATCH_SIZE, SIZE_WINDOW, 25088)))
self.model.add(Dropout(0.2))
self.model.add(Dense(1, activation='softmax'))
self.model.compile(loss='binary_crossentropy', optimizer="rmsprop", metrics=['accuracy'])
self.model.summary()

for (X_train, y_train) in self.batch_generator():
    self.model.fit(X_train, y_train, epochs=10)

Here is my generator:

def batch_generator(self):
    # for all feature extracted files
    for video in self.videos:
        # videos[0] contains the path to the file
        # videos[1] contains the target (abnormal or not)
        x_train = np.load(video[0])  # load the video's features from disk

        nb_frames = x_train.shape[0]
        data = x_train.shape[1]

        # I've seen on stackoverflow I have to do that...
        x_train = x_train.reshape(nb_frames, data, 1)

        # The target is defined at video level, not frame level, so the same y is applied to every frame
        # of the current video
        y_train = np.array([video[1]] * nb_frames)

        # the output shape (the output *shape* is 2 dimensional according to someone on stackoverflow)
        y_train = y_train.reshape(y_train.shape[0], 1)

        nb_windows = len(x_train) // SIZE_WINDOW

        for window_index in range(0, nb_windows):
            start = window_index * SIZE_WINDOW
            end = (window_index + 1) * SIZE_WINDOW
            yield x_train[start:end], y_train[start:end]

I get this error:

ValueError: Error when checking input: expected cu_dnnlstm_input 
to have shape (30, 25088) but got array with shape (25088, 1)

30 is the number of frames I want the LSTM to process at a time.

Also, whenever I try to change the order of the dimensions, I get the same error, just with different values...
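For context, the mismatch can be reproduced by checking the shapes directly; the sketch below uses the variable names from the question with illustrative numbers:

import numpy as np

SIZE_WINDOW = 30
nb_frames, data = 120, 25088             # illustrative values
x_train = np.zeros((nb_frames, data))    # stand-in for the loaded features

# the reshape used in the generator above
x_train = x_train.reshape(nb_frames, data, 1)
print(x_train[0:SIZE_WINDOW].shape)      # (30, 25088, 1)

# Keras treats the first axis of the yielded array as the batch axis, so each
# sample it sees has shape (25088, 1), while the LSTM was configured to expect
# samples of shape (SIZE_WINDOW, 25088).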

Edit: here is my code after applying the solution from the first answer. It now gives me a ValueError: cannot reshape:

        for window_index in range(0, nb_windows):
            start = window_index * SIZE_WINDOW
            end = (window_index + 1) * SIZE_WINDOW

            chunk = np.array(x_train[start:end])
            chunk = chunk.reshape(int(nb_frames / SIZE_WINDOW), SIZE_WINDOW, data)

            yield chunk, y_train[start:end]

The error persists even if I do the reshape here instead:

        [...]
        # I've seen on stackoverflow I have to do that...
        # x_train = x_train.reshape(nb_frames, data, 1)
        x_train = x_train.reshape(int(nb_frames / SIZE_WINDOW), SIZE_WINDOW, data)
        [...]
python tensorflow keras lstm vgg-net
1 Answer

Change the reshape to:

# drop the trailing frames so the number of frames is a multiple of SIZE_WINDOW
x_train = x_train[:len(x_train) - (len(x_train) % SIZE_WINDOW)]
x_train = x_train.reshape(len(x_train) // SIZE_WINDOW, SIZE_WINDOW, data)

Sorry, that was my mistake.
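Putting that fix together with the generator from the question, a corrected version might look like the sketch below. The names (batch_generator, self.videos, SIZE_WINDOW) come from the question; shaping y as one label per timestep is an assumption that matches the model as written (return_sequences=True followed by Dense(1)), not part of the answer itself:

def batch_generator(self):
    for video in self.videos:
        x_train = np.load(video[0])              # (nb_frames, 25088)
        nb_frames, data = x_train.shape

        # drop trailing frames so nb_frames is a multiple of SIZE_WINDOW
        x_train = x_train[:nb_frames - (nb_frames % SIZE_WINDOW)]

        # one 3-D batch per video: (nb_windows, SIZE_WINDOW, 25088)
        x_train = x_train.reshape(-1, SIZE_WINDOW, data)
        nb_windows = x_train.shape[0]

        # the video-level target, repeated for every window and every timestep:
        # (nb_windows, SIZE_WINDOW, 1)
        y_train = np.full((nb_windows, SIZE_WINDOW, 1), video[1])

        yield x_train, y_train

Note that batch_input_shape in the question pins the batch size to BATCH_SIZE; if the number of windows varies from video to video, using input_shape=(SIZE_WINDOW, 25088) instead avoids that constraint.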
