I am trying to design an LSTM-CNN model using GloVe embeddings plus some dialogue-act information as text features. I pad at two levels: within each sentence, so all sentences have the same length, and at the document level, so all documents have the same number of sentences. How can I add masking at both levels so that the model knows which sentences are just padding rather than real sentences? Here is my code:
def create_model(conv_activation='sigmoid', kernel_size=4, filters=32, learning_rate=0.0001):
    EMBED_SIZE = 100
    BATCH_SIZE = 128
    max_conv = max_conversation_length      # max utterances per conversation (387)
    max_words = length_long_sentence        # max tokens per utterance
    total_features = input_size
    DA_tags_size = len(unique_tag_set)
    text_size = max_words
    DA_size = text_size + DA_tags_size
    ac_size = DA_size + max_acoustic_len

    all_input = Input(shape=(max_conv, total_features))
    # Split the packed feature vector into text, dialogue-act, and acoustic parts.
    text_input = tf.slice(all_input, [0, 0, 0], [-1, -1, text_size])
    tags_input = tf.slice(all_input, [0, 0, text_size], [-1, -1, DA_tags_size])
    acoustic_input = tf.slice(all_input, [0, 0, DA_size], [-1, -1, -1])

    text_emb = Embedding(vocab_size, EMBED_SIZE, weights=[embedding_matrix],
                         input_length=max_words, name='embedding',
                         trainable=False, mask_zero=True)(text_input)
    hidden_vectors = TimeDistributed(Bidirectional(LSTM(units=128, return_sequences=False)),
                                     name='utterance_lstm')(text_emb)
    combined = Concatenate(axis=-1)([hidden_vectors, tags_input])
    conv_0 = Conv1D(filters=filters, kernel_size=kernel_size,
                    padding='same', activation='relu')(combined)
    maxpool_1 = GlobalMaxPooling1D()(conv_0)
    out = Dense(32, activation='relu')(maxpool_1)
    output = Dense(units=1, kernel_regularizer=l2(0.01))(out)
    output = Activation(conv_activation)(output)

    model = Model(inputs=all_input, outputs=output)
    opt = Adam(learning_rate=learning_rate)  # `opt` was previously undefined
    model.compile(optimizer=opt,
                  loss=tf.keras.losses.BinaryCrossentropy(),
                  metrics=['accuracy'])
    print(model.summary())
    return model
Here 387 is the maximum number of utterances. How can we pass a mask into the LSTM layer so that it knows how many of the 387 utterances are padding, ignores them, and propagates the same mask to the subsequent layers?
According to the Keras guidelines, you can do it like this:
inputs = keras.Input(shape=(None,), dtype="int32")
x = layers.Embedding(input_dim=5000, output_dim=16, mask_zero=True)(inputs)
outputs = layers.LSTM(32)(x)
model = keras.Model(inputs, outputs)
This passes the masking information on to the next layer internally; you do not need to specify it manually.
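For the two-level case in your model, the word-level mask from `mask_zero=True` is consumed by the inner LSTM, but you still need a sentence-level mask over padded utterances. One common trick: an LSTM whose entire input is masked returns its (zero) initial state, so padded sentences come out of the utterance LSTM as all-zero vectors, and a `Masking(mask_value=0.0)` layer can then flag them for the document-level layers. Here is a minimal self-contained sketch of that idea (the shapes and layer sizes are placeholders, not your actual values; it assumes padded documents are filled with all-zero sentences):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model, layers

max_conv, max_words, vocab, embed = 5, 10, 50, 8  # toy sizes for illustration

inp = layers.Input(shape=(max_conv, max_words))   # (batch, sentences, tokens)
# Word-level mask: mask_zero marks padded token positions inside each sentence.
emb = layers.Embedding(vocab, embed, mask_zero=True)(inp)
# TimeDistributed applies the LSTM per sentence; the word-level mask is
# consumed here. A fully padded (all-zero) sentence yields a zero vector.
sent_vec = layers.TimeDistributed(layers.LSTM(16))(emb)
# Sentence-level mask: flag the all-zero vectors produced by padded sentences.
# (A real sentence producing an exactly-zero vector is possible in principle
# but extremely unlikely.)
sent_vec = layers.Masking(mask_value=0.0)(sent_vec)
doc_vec = layers.LSTM(16)(sent_vec)               # consumes the sentence mask
model = Model(inp, doc_vec)
```

Note that mask-consuming layers such as LSTM respect this propagated mask, but `Conv1D` and `GlobalMaxPooling1D` in your pipeline simply ignore it, so padded utterance vectors still enter the convolution; zero vectors are usually harmless there, but it is worth being aware of.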