Input 0 is incompatible with layer repeat_vector_40: expected ndim=2, found ndim=1


I am developing an LSTM autoencoder model for anomaly detection. My Keras model is set up as follows:

from keras.models import Sequential

from keras import Model, layers
from keras.layers import Layer, Conv1D, Input, Masking, Dense, RNN, LSTM, Dropout, RepeatVector, TimeDistributed, Reshape

def create_RNN_with_attention():
    x=Input(shape=(X_train_dt.shape[1], X_train_dt.shape[2]))
    RNN_layer_1 = LSTM(units=64, return_sequences=False)(x)
    attention_layer = attention()(RNN_layer_1)
    dropout_layer_1 = Dropout(rate=0.2)(attention_layer)
    repeat_vector_layer = RepeatVector(n=X_train_dt.shape[1])(dropout_layer_1)
    RNN_layer_2 = LSTM(units=64, return_sequences=True)(repeat_vector_layer)
    dropout_layer_1 = Dropout(rate=0.2)(RNN_layer_2)
    output = TimeDistributed(Dense(X_train_dt.shape[2], trainable=True))(dropout_layer_1)
    model=Model(x,output)
    model.compile(loss='mae', optimizer='adam')    
    return model

Note the attention layer I added, attention_layer. Before adding it, the model compiled fine, but after adding the attention_layer the model throws the following error:
ValueError: Input 0 is incompatible with layer repeat_vector_40: expected ndim=2, found ndim=1

My attention layer is set up as follows:

import keras.backend as K
class attention(Layer):
    def __init__(self,**kwargs):
        super(attention,self).__init__(**kwargs)
 
    def build(self,input_shape):
        self.W=self.add_weight(name='attention_weight', shape=(input_shape[-1],1), 
                               initializer='random_normal', trainable=True)
        self.b=self.add_weight(name='attention_bias', shape=(input_shape[1],1), 
                               initializer='zeros', trainable=True)        
        super(attention, self).build(input_shape)
 
    def call(self,x):
        # Alignment scores. Pass them through tanh function
        e = K.tanh(K.dot(x,self.W)+self.b)
        # Remove dimension of size 1
        e = K.squeeze(e, axis=-1)   
        # Compute the weights
        alpha = K.softmax(e)
        # Reshape to tensorFlow format
        alpha = K.expand_dims(alpha, axis=-1)
        # Compute the context vector
        context = x * alpha
        context = K.sum(context, axis=1)
        return context

The idea of the attention mask is to let the model focus on the more salient features as it trains.

Why am I getting this error, and how can I fix it?

python tensorflow keras lstm attention-model
2 Answers

1 vote

I think the problem is with this line:

RNN_layer_1 = LSTM(units=64, return_sequences=False)(x)

This layer outputs a tensor of shape (batch_size, 64). That means you are outputting a single vector and then running the attention mechanism over the batch dimension instead of the sequence dimension. It also means your output has the batch dimension squashed away, which no keras layer will accept. That is why the RepeatVector layer raises the error: it expects a tensor of shape at least (batch_dimension, dim).
If you want to run the attention mechanism over the sequence, then you should change the line mentioned above to:

RNN_layer_1 = LSTM(units=64, return_sequences=True)(x)
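
Applying that single change to the function from the question (only return_sequences is changed; the second Dropout is renamed for clarity), the model builds: the attention layer now returns a (batch, 64) context vector, which RepeatVector can expand back to (batch, timesteps, 64).

def create_RNN_with_attention():
    x = Input(shape=(X_train_dt.shape[1], X_train_dt.shape[2]))
    RNN_layer_1 = LSTM(units=64, return_sequences=True)(x)        # keep the time axis for attention
    attention_layer = attention()(RNN_layer_1)                     # -> (batch, 64) context vector
    dropout_layer_1 = Dropout(rate=0.2)(attention_layer)
    repeat_vector_layer = RepeatVector(n=X_train_dt.shape[1])(dropout_layer_1)  # -> (batch, timesteps, 64)
    RNN_layer_2 = LSTM(units=64, return_sequences=True)(repeat_vector_layer)
    dropout_layer_2 = Dropout(rate=0.2)(RNN_layer_2)
    output = TimeDistributed(Dense(X_train_dt.shape[2]))(dropout_layer_2)
    model = Model(x, output)
    model.compile(loss='mae', optimizer='adam')
    return model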

0 votes

In attention models, the RepeatVector layer is usually not used. That layer repeats the input vector as many times as there are output time steps. But when an attention mechanism is used, there is no need to repeat the output vector, because the importances are applied across all time steps.

More specifically, in your model, RNN_layer_1 (the LSTM layer) is first fed into the attention layer. Then, by applying the attention mechanism (with return_sequences=True, rather than RepeatVector to repeat vectors), the importances are determined for each time step. Finally, with TimeDistributed Dense, the output is computed at each time step.

Therefore, the RepeatVector layer is not needed here and should be removed.
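
For illustration, one way to follow this suggestion (a hypothetical variant, not from the original post) is to make the attention layer return the weighted sequence instead of a summed context vector, so the time dimension is preserved end to end and RepeatVector is no longer needed:

class attention_weighted(attention):
    # Hypothetical variant of the question's attention layer: same weights,
    # but it returns the weighted sequence rather than summing over time.
    def call(self, x):
        e = K.squeeze(K.tanh(K.dot(x, self.W) + self.b), axis=-1)
        alpha = K.expand_dims(K.softmax(e), axis=-1)
        return x * alpha                                    # shape (batch, timesteps, units) is preserved

def create_RNN_attention_no_repeat():
    x = Input(shape=(X_train_dt.shape[1], X_train_dt.shape[2]))
    h = LSTM(units=64, return_sequences=True)(x)
    h = attention_weighted()(h)                             # still (batch, timesteps, 64)
    h = Dropout(rate=0.2)(h)
    h = LSTM(units=64, return_sequences=True)(h)            # no RepeatVector needed
    h = Dropout(rate=0.2)(h)
    output = TimeDistributed(Dense(X_train_dt.shape[2]))(h)
    model = Model(x, output)
    model.compile(loss='mae', optimizer='adam')
    return model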
