我正在Keras训练LSTM:
iclf = Sequential()
iclf.add(Bidirectional(LSTM(units=10, return_sequences=True, recurrent_dropout=0.3), input_shape=(None,2048)))
iclf.add(TimeDistributed(Dense(1, activation='sigmoid')))
每个单元格的输入是一个2048矢量,它是已知的并且不需要学习(如果您愿意,它们是输入句子中单词的ELMo嵌入)。因此,这里没有嵌入层。
由于输入序列的长度可变,因此使用pad_sequences
对其进行填充:
X = pad_sequences(sequences=X, padding='post', truncating='post', value=0.0, dtype='float32')
现在,我想告诉LSTM忽略这些填充元素。官方方法是使用mask_zero=True
嵌入层。但是,这里没有嵌入层。如何通知LSTM屏蔽零元素?
如@Today在注释中所建议,您可以使用Masking
层。在这里,我添加了一个玩具问题。
# lstm autoencoder recreate sequence
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM, Masking
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
from keras.utils import plot_model
# define input sequence
sequence = array([[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
[0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
[0.3, 0.4, 0.5, 0.6]])
# make sure to use dtype='float32' in padding otherwise with floating points
sequence = pad_sequences(sequence, padding='post', dtype='float32')
# reshape input into [samples, timesteps, features]
n_obs = len(sequence)
n_in = 9
sequence = sequence.reshape((n_obs, n_in, 1))
# define model
model = Sequential()
model.add(Masking(mask_value=0, input_shape=(n_in, 1)))
model.add(LSTM(100, activation='relu', input_shape=(n_in,1) ))
model.add(RepeatVector(n_in))
model.add(LSTM(100, activation='relu', return_sequences=True))
model.add(TimeDistributed(Dense(1)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(sequence, sequence, epochs=300, verbose=0)
plot_model(model, show_shapes=True, to_file='reconstruct_lstm_autoencoder.png')
# demonstrate recreation
yhat = model.predict(sequence, verbose=0)
print(yhat[0,:,0])