Masking zero-padded inputs in a Keras LSTM without an Embedding layer


I am training an LSTM in Keras:

iclf = Sequential()
iclf.add(Bidirectional(LSTM(units=10, return_sequences=True, recurrent_dropout=0.3), input_shape=(None,2048)))
iclf.add(TimeDistributed(Dense(1, activation='sigmoid')))

The input to each cell is a 2048-dimensional vector that is known in advance and does not need to be learned (they are ELMo embeddings of the words in the input sentence, if you like). So there is no Embedding layer here.

Since the input sequences have variable length, they are padded with pad_sequences:

X = pad_sequences(sequences=X, padding='post', truncating='post', value=0.0, dtype='float32')
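
To make the shapes concrete, here is a hypothetical illustration (the sentence lengths are made up, not from the question): with 2048-dimensional inputs, padding turns a ragged list of per-sentence arrays into one float32 tensor whose shorter rows end in all-zero vectors.

import numpy as np
from keras.preprocessing.sequence import pad_sequences

# three made-up "sentences" of 5, 3 and 7 timesteps, each timestep a 2048-dim vector
X = [np.random.rand(n, 2048).astype('float32') for n in (5, 3, 7)]
X = pad_sequences(sequences=X, padding='post', truncating='post', value=0.0, dtype='float32')
print(X.shape)  # (3, 7, 2048); the last rows of the shorter sentences are all zeros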

Now I want to tell the LSTM to ignore these padded elements. The official way is an Embedding layer with mask_zero=True, but there is no Embedding layer here. How can I tell the LSTM to mask the zero elements?

keras lstm embedding
1 Answer

As @Today suggested in the comments, you can use a Masking layer.
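
Applied directly to the model from the question, a minimal sketch (my adaptation, not part of the original answer) looks like this: the Masking layer sits in front of the Bidirectional LSTM and skips every timestep whose entire 2048-dim vector equals the mask value.

from keras.models import Sequential
from keras.layers import Bidirectional, LSTM, Dense, TimeDistributed, Masking

iclf = Sequential()
# mask timesteps where all 2048 features equal 0.0, i.e. the padded positions
iclf.add(Masking(mask_value=0.0, input_shape=(None, 2048)))
iclf.add(Bidirectional(LSTM(units=10, return_sequences=True, recurrent_dropout=0.3)))
iclf.add(TimeDistributed(Dense(1, activation='sigmoid')))

And here is a complete toy problem, an LSTM autoencoder that reconstructs the padded sequences: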

# lstm autoencoder: recreate the input sequence
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM, Masking
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.preprocessing.sequence import pad_sequences
from keras.utils import plot_model

# define input sequence
sequence = array([[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9], 
                  [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
                  [0.3, 0.4, 0.5, 0.6]])
# use dtype='float32' when padding; the default dtype='int32'
# would truncate these float values to 0
sequence = pad_sequences(sequence, padding='post', dtype='float32')


# reshape input into [samples, timesteps, features]
n_obs = len(sequence)
n_in = 9  # length of the longest sequence; pad_sequences pads every row to this
sequence = sequence.reshape((n_obs, n_in, 1))

# define model
model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(n_in, 1)))  # mask timesteps whose features all equal 0.0
model.add(LSTM(100, activation='relu'))  # input shape is inherited from the Masking layer
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
plot_model(model, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = model.predict(sequence, verbose=0)
print(yhat[0,:,0])
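
As a quick sanity check (my addition, not part of the original answer), you can ask the Masking layer directly which timesteps it considers valid; compute_mask returns True wherever at least one feature differs from mask_value:

from keras import backend as K

masking_layer = model.layers[0]
mask = masking_layer.compute_mask(K.constant(sequence))
print(K.eval(mask))  # boolean array of shape (3, 9); False marks the padded timesteps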