Incorrect predictions from a TensorFlow spelling-correction model

Question (votes: 0, answers: 1)

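I trained a TensorFlow model for spelling correction. I trained it for more than 60 epochs and reached an accuracy of about 82.2% with a loss of 0.3032. When I try to use the model for prediction, it does not produce any correct predictions. The model was trained on more than 100,000 sentences, about 20% of which contain spelling errors. The output is binary data (not one-hot encoded). Even on samples taken from the training data, the model fails to predict correctly. The model is as follows: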

import tensorflow as tf
from tensorflow.keras import initializers, models
from tensorflow.keras.layers import Bidirectional, Dense, Embedding, Input, LSTM

def create_model():
  tf.random.set_seed(42)

  # input_layer = Input(shape=(x_train.shape[1], 1), name='input_layer')  # input layer for use without embedding
  input_layer = Input(shape=(x_train.shape[1],), name='input_layer')  # input layer for use with embedding
  # Frozen embedding initialised from the precomputed embed_matrix
  sp_embedding_layer = Embedding(len(char2idx), EMBEDDING_DIM,
                                 embeddings_initializer=initializers.Constant(embed_matrix),
                                 trainable=False)(input_layer)

  x = Bidirectional(LSTM(1024, return_sequences=False))(sp_embedding_layer)
  x = Dense(1920, activation='relu', name='Dense_1')(x)
  x = Dense(2048, activation='relu', name='Dense_2')(x)
  x = Dense(2048, activation='relu', name='Dense_3')(x)
  x = Dense(2048, activation='relu', name='Dense_4')(x)
  x = Dense(2048, activation='relu', name='Dense_5')(x)
  x = Dense(1024, activation='relu', name='Dense_6')(x)
  x = Dense(1024, activation='relu', name='Dense_7')(x)
  # One sigmoid unit per output bit
  x = Dense(y_train.shape[1], name='Output', activation='sigmoid')(x)
  model = models.Model(inputs=input_layer, outputs=x)

  return model

The optimizer and callbacks are as follows:

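from tensorflow.keras import optimizers
from tensorflow.keras.callbacks import ReduceLROnPlateau

sf = 4  # checkpoint frequency, passed to EpochModelCheckpoint below

spb = x_train.shape[0]/BATCH_SIZE  # steps (batches) per epoch
sf_epoch = spb * sf  # not used by the callbacks below

myoptimizer = optimizers.Adam(learning_rate=0.000001)

filepath = "/content/drive/MyDrive/NLP/Models/SCModels/weights.{epoch:02d}.tf"
# EpochModelCheckpoint is not a built-in Keras callback; its definition is not shown in the question
checkpoint = EpochModelCheckpoint(filepath, frequency=sf, save_weights_only=True)
# ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True,
#                              mode='max',save_freq = sf)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, verbose=1,
                              patience=2)

callbacks_list = [checkpoint, reduce_lr]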
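A side note on the `spb` and `sf_epoch` values above: the division produces a float, while the `save_freq` argument of the built-in `ModelCheckpoint` expects `'epoch'` or an integer number of batches. A minimal sketch of computing such an integer follows, assuming the last partial batch counts as a step. The helper name `batches_between_saves` and the sample numbers are made up for illustration and do not come from the question.

```python
import math


def batches_between_saves(num_samples, batch_size, epochs_between_saves):
    """Integer number of batches that make up `epochs_between_saves` epochs."""
    # Round up so that a final partial batch still counts as one step.
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs_between_saves


# Hypothetical numbers: 100,000 sentences, batch size 64, save every 4 epochs.
print(batches_between_saves(100_000, 64, 4))
```

The result could be passed as `save_freq` if the commented-out `ModelCheckpoint` were used instead of the custom callback.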

Training code:

history = n_model.fit(x_train,y_train,validation_data=(x_val,y_val),epochs=60,
                    callbacks=callbacks_list,batch_size=BATCH_SIZE)

The prediction code is as follows:

test_model = create_model()
test_model.load_weights(selected_model)
# compile() is not required for predict(); the optimizer settings do not affect inference
test_optimizer = optimizers.Adam(learning_rate=0.0983)
test_model.compile(optimizer=test_optimizer, loss=tf.keras.losses.BinaryCrossentropy(),
                   metrics=['accuracy'])

print('input:', sample_sent)
# Encode the sentence and add a batch dimension of 1
model_input = np.array(encode(sample_sent, MAX_SENT_LEN+10)).reshape((1, -1))
answer = test_model.predict(model_input)
# Threshold each sigmoid output at 0.5 to obtain bits
answer_dec = np.where(answer < 0.5, 0, 1)
print(answer_dec[0].tolist())
print(decode(recover_int_seq(answer_dec[0].tolist())))

Why are the predictions inaccurate? I would appreciate any help. Thank you.
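One thing that may be worth checking here: an aggregate accuracy score on a multi-bit binary output does not have to mean that any whole output vector is correct. How Keras resolves the string `'accuracy'` can depend on the Keras version and the output shape, so the reported 82.2% may not measure whole-sentence correctness at all. The sketch below uses made-up toy bit vectors (not data from the question) and two hypothetical helper functions to show how a per-bit accuracy can be high while the exact-match rate is zero.

```python
def bit_accuracy(y_true, y_pred):
    """Fraction of individual bits that match across all rows."""
    total = sum(len(row) for row in y_true)
    correct = sum(
        t == p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return correct / total


def exact_match_rate(y_true, y_pred):
    """Fraction of rows in which every bit matches."""
    matches = sum(true_row == pred_row for true_row, pred_row in zip(y_true, y_pred))
    return matches / len(y_true)


# Toy targets and thresholded predictions; each prediction has one wrong bit.
y_true = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 1, 0, 0],
]
y_pred = [
    [1, 0, 1, 1, 0, 0, 1, 1],
    [0, 1, 1, 0, 1, 0, 1, 1],
    [1, 1, 0, 1, 1, 1, 0, 0],
]

print("per-bit accuracy:", bit_accuracy(y_true, y_pred))
print("exact-match rate:", exact_match_rate(y_true, y_pred))
```

If the decoded sentence depends on every bit being right, a single flipped bit per sample is enough to make every decoded output wrong, so an exact-match measure on held-out data may be a more informative check than the training-time metric.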

python tensorflow nlp
1 answer
0 votes

In my opinion, you don't really need that many Dense layers. Consider replacing some of the Dense layers with LayerNormalization and Dropout layers.
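A minimal sketch of what that suggestion might look like, assuming a trainable embedding in place of the frozen `embed_matrix` from the question and placeholder layer sizes that have not been tuned. The function name `create_slim_model` and all of its parameters are hypothetical; this only illustrates swapping several Dense layers for LayerNormalization and Dropout and is not a verified fix for the prediction problem.

```python
import numpy as np
from tensorflow.keras import layers, models


def create_slim_model(seq_len, vocab_size, embedding_dim, n_outputs,
                      lstm_units=256, dense_units=512, dropout_rate=0.3):
    """Hypothetical slimmer variant: two Dense blocks instead of seven."""
    inputs = layers.Input(shape=(seq_len,), name="input_layer")
    # Assumption: a trainable embedding, unlike the frozen one in the question.
    x = layers.Embedding(vocab_size, embedding_dim)(inputs)
    x = layers.Bidirectional(layers.LSTM(lstm_units))(x)
    x = layers.LayerNormalization()(x)
    x = layers.Dense(dense_units, activation="relu")(x)
    x = layers.Dropout(dropout_rate)(x)
    x = layers.Dense(dense_units, activation="relu")(x)
    x = layers.LayerNormalization()(x)
    x = layers.Dropout(dropout_rate)(x)
    # One sigmoid unit per output bit, as in the original model.
    outputs = layers.Dense(n_outputs, activation="sigmoid", name="Output")(x)
    return models.Model(inputs=inputs, outputs=outputs)


# Made-up dimensions, only to show that the model builds and runs.
model = create_slim_model(seq_len=60, vocab_size=40, embedding_dim=16, n_outputs=128)
dummy_batch = np.zeros((1, 60), dtype="int32")
print(model(dummy_batch).shape)
```

Dropout is only active during training, so calling the model for inference as above uses the full network.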
