How do I feed word embeddings as input to an RNN?

Votes: 0 · Answers: 1

I am trying to do word prediction with a basic RNN. I need to feed input to the RNN cell, and I am trying the code below:

X_input = tf.placeholder(tf.int32, shape = (None, sequence_length, 1))
Y_target = tf.placeholder(tf.int32, shape = (None, sequence_length, 1))

tfWe = tf.Variable(tf.random_uniform((V, embedding_dim)))
W1 = tf.Variable(np.random.randn(hidden_layer_size, label).astype(np.float32))
b = tf.Variable(np.zeros(label).astype(np.float32))
rnn = GRUCell(num_units = hidden_layer_size, activation = tf.nn.relu)

x = tf.nn.embedding_lookup(tfWe, X_input)
x = tf.unstack(x, sequence_length, 1)
output, states = tf.nn.dynamic_rnn(rnn, x, dtype = tf.float32)
output = tf.transpose(output, (1,0,2))
output = tf.reshape(output, (sequence_length*num_samples,hidden_layer_size))

I get the error `ValueError: Layer gru_cell_2 expects 1 inputs, but it received 39 input tensors.` I think this error is caused by the embedding, because it does not produce a tensor with the dimensions that a GRUCell can accept. So, how do I feed input to the GRU cell?

tensorflow nlp deep-learning lstm rnn
1 Answer

0 votes

The way you initialize X_input is probably wrong. The extra dimension is causing the problem. If you remove it, there is no need to use unstack. The following code should work:

X_input = tf.placeholder(tf.int32, shape = (None, sequence_length))
Y_target = tf.placeholder(tf.int32, shape = (None, sequence_length))

tfWe = tf.Variable(tf.random_uniform((V, embedding_dim)))
W1 = tf.Variable(np.random.randn(hidden_layer_size, label).astype(np.float32))
b = tf.Variable(np.zeros(label).astype(np.float32))
rnn = tf.contrib.rnn.GRUCell(num_units = hidden_layer_size, activation = tf.nn.relu)

x = tf.nn.embedding_lookup(tfWe, X_input)
output, states = tf.nn.dynamic_rnn(rnn, x, dtype = tf.float32)
##shape of output here is (None,sequence_length,hidden_layer_size)
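To see why dropping the extra dimension fixes things, here is a minimal NumPy sketch (with made-up sizes) of what `tf.nn.embedding_lookup` produces in each case; the lookup appends the embedding dimension to the shape of the index tensor:

```python
import numpy as np

V, embedding_dim, sequence_length, batch = 10, 4, 7, 3
tfWe = np.random.rand(V, embedding_dim)  # embedding table, shape (V, embedding_dim)

# Indices shaped (batch, sequence_length): lookup yields a rank-3 tensor,
# exactly what dynamic_rnn expects: (batch, time, features).
ids_2d = np.random.randint(0, V, size=(batch, sequence_length))
x_2d = tfWe[ids_2d]
print(x_2d.shape)   # (3, 7, 4)

# Indices shaped (batch, sequence_length, 1): lookup yields a rank-4 tensor,
# which dynamic_rnn cannot consume directly.
ids_3d = ids_2d[..., np.newaxis]
x_3d = tfWe[ids_3d]
print(x_3d.shape)   # (3, 7, 1, 4)
```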

But if you really do need that extra dimension, then you need a small change to the unstack call. You are currently unstacking along axis=1 into sequence_length tensors, which also does not seem right. Do this instead:

X_input = tf.placeholder(tf.int32, shape = (None, sequence_length, 1))
Y_target = tf.placeholder(tf.int32, shape = (None, sequence_length, 1))

tfWe = tf.Variable(tf.random_uniform((V, embedding_dim)))
W1 = tf.Variable(np.random.randn(hidden_layer_size, label).astype(np.float32))
b = tf.Variable(np.zeros(label).astype(np.float32))
rnn = tf.contrib.rnn.GRUCell(num_units = hidden_layer_size, activation = tf.nn.relu)

x = tf.nn.embedding_lookup(tfWe, X_input)
x = tf.unstack(x, 1, 2)
output, states = tf.nn.dynamic_rnn(rnn, x[0], dtype = tf.float32)
##shape of output here is again same (None,sequence_length,hidden_layer_size)
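For intuition, `tf.unstack(x, 1, 2)` splits along axis 2 into a list containing a single tensor, so taking `x[0]` has the same effect on shapes as squeezing that axis. A NumPy sketch (with made-up sizes) of the equivalent shape transformation:

```python
import numpy as np

batch, sequence_length, embedding_dim = 3, 7, 4
x = np.random.rand(batch, sequence_length, 1, embedding_dim)  # rank-4 embedded input

# Splitting into one slice along axis 2 and dropping that axis
# mimics tf.unstack(x, 1, 2) followed by taking x[0]:
parts = [np.squeeze(s, axis=2) for s in np.split(x, 1, axis=2)]
print(len(parts), parts[0].shape)   # 1 (3, 7, 4)
```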

Finally, if you really do need to unstack it into sequence_length tensors, then replace unstack with tf.map_fn() and do the following:

X_input = tf.placeholder(tf.int32, shape = (None, sequence_length, 1))
Y_target = tf.placeholder(tf.int32, shape = (None, sequence_length, 1))

tfWe = tf.Variable(tf.random_uniform((V, embedding_dim)))
W1 = tf.Variable(np.random.randn(hidden_layer_size, label).astype(np.float32))
b = tf.Variable(np.zeros(label).astype(np.float32))
rnn = tf.contrib.rnn.GRUCell(num_units = hidden_layer_size, activation = tf.nn.relu)

x = tf.nn.embedding_lookup(tfWe, X_input)
x = tf.transpose(x,[1,0,2,3])
##tf.map_fn maps over the first dimension only, so we make seq_len the first dimension by transposing

output,states = tf.map_fn(lambda x: tf.nn.dynamic_rnn(rnn,x,dtype=tf.float32),x,dtype=(tf.float32, tf.float32))
##shape of output here is (sequence_length,None,1,hidden_layer_size)

Caution: note the shape of output in each solution, and be careful about which shape you actually want.

Edit:

To answer your question about when to use which kind of input:

Suppose you have 25 sentences, each 15 words long, and you split them into 5 batches of size 5. Also suppose you are using 50-dimensional word embeddings (say, from word2vec). Then your input shape is (batch_size=5, time_step=15, features=50). In this case you do not need unstacking or any kind of mapping.

Next, suppose you have 30 documents, each containing 25 sentences, each sentence 15 words long, and you split the documents into 6 batches of size 5. Again, suppose you are using 50-dimensional word embeddings; your input now has an extra dimension. Here batch_size=5, time_step=15, and features=50, but what about the number of sentences? Your input is now (batch_size=5, num_sentences=25, time_step=15, features=50), which is an invalid shape for any kind of RNN. In that case you need to unstack it along the sentence dimension to produce 25 tensors, each of shape (5, 15, 50). To do this, I used tf.map_fn.
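The document scenario above can be sketched in NumPy (made-up sizes, and a zero linear map `fake_rnn` as a stand-in for the actual RNN cell, which is not part of the original answer): transpose so the sentence axis comes first, then map a per-sentence function over it, mirroring the tf.transpose + tf.map_fn combination.

```python
import numpy as np

batch, num_sentences, time_step, features = 5, 25, 15, 50
hidden = 8
x = np.random.rand(batch, num_sentences, time_step, features)

# Move the sentence axis first so we can map over it, mirroring
# tf.transpose(x, [1, 0, 2, 3]) followed by tf.map_fn in the answer.
x_t = np.transpose(x, (1, 0, 2, 3))   # (25, 5, 15, 50)

# Hypothetical stand-in for dynamic_rnn: any function mapping
# (batch, time_step, features) -> (batch, time_step, hidden).
def fake_rnn(sentence_batch):
    W = np.zeros((features, hidden))
    return sentence_batch @ W

# Apply the per-sentence function to each of the 25 sentence slices.
outputs = np.stack([fake_rnn(s) for s in x_t])
print(outputs.shape)   # (25, 5, 15, 8)
```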
