In TensorFlow, what is the difference between the returned 'output' and the 'h' in the state tuple (c, h) of an LSTMCell?

Question

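I have searched many tutorials, blogs, guides, and the official TensorFlow documentation to understand this. For example, consider the following lines: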

lstm = tf.nn.rnn_cell.LSTMCell(512)
output, state_tuple = lstm(current_input, last_state_tuple)

Now, if I unpack the state,

last_cell_memory, last_hidden_state = state_tuple

output and last_hidden_state have exactly the same shape, [batch_size, 512]. Can the two be used interchangeably? That is, can I do this:

last_state_tuple = last_cell_memory, output

and then feed last_state_tuple back into lstm?

Answer 1

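Jacques's answer is correct, but it leaves out an important point: the state of an LSTM layer is almost always equal to the output. The difference matters when the chain of LSTM cells is long and the input sequences do not all have the same length (and are therefore padded). That is when you should distinguish between the state and the output.

See the runnable example in my answer on a similar question (it uses BasicRNNCell, but you will get the same result with LSTMCell).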
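
The padded case can be sketched in plain NumPy. This is only an illustration, not TensorFlow code: `rnn_step` and `run_padded` are hypothetical helpers, and the masking rule (copy the state through and zero the outputs once a sequence has ended) is assumed to mirror what the `sequence_length` argument of `tf.nn.dynamic_rnn` is documented to do.

```python
import numpy as np

def rnn_step(x, h, W, U, b):
    # Basic tanh RNN cell: the new state doubles as the step output.
    return np.tanh(x @ W + h @ U + b)

def run_padded(inputs, lengths, W, U, b):
    # inputs: [batch, time, features]; lengths: number of valid steps per sequence.
    batch, time, _ = inputs.shape
    units = U.shape[0]
    h = np.zeros((batch, units))
    outputs = np.zeros((batch, time, units))
    for t in range(time):
        h_new = rnn_step(inputs[:, t], h, W, U, b)
        valid = (t < lengths)[:, None]               # [batch, 1] mask
        h = np.where(valid, h_new, h)                # keep the old state on padded steps
        outputs[:, t] = np.where(valid, h_new, 0.0)  # zero the output on padded steps
    return outputs, h

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
U = rng.normal(size=(4, 4))
b = np.zeros(4)
inputs = rng.normal(size=(2, 5, 3))
lengths = np.array([5, 2])  # the second sequence is padded after step 2

outputs, final_state = run_padded(inputs, lengths, W, U, b)

# Full-length sequence: the last output equals the final state.
full_matches = np.allclose(outputs[0, -1], final_state[0])
# Padded sequence: the last output is zero, but the final state
# still holds the output of the last valid step.
padded_last_output_is_zero = np.allclose(outputs[1, -1], 0.0)
padded_state_matches_last_valid = np.allclose(outputs[1, 1], final_state[1])
```

For the padded sequence, reading the last time step of the outputs gives zeros, while the final state still carries the result of the last valid step. That is the situation in which the two must not be confused.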

Answer 2

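Yes, the second element of the state is the same as the output.

From https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/LSTMStateTuple

Stores two elements: (c, h), in that order. Where c is the hidden state and h is the output.

You can also verify this experimentally: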

import tensorflow as tf  # TensorFlow 1.x API (placeholders and sessions)
from numpy import random as rng

lstm = tf.nn.rnn_cell.LSTMCell(10)
# Placeholders for the input, the cell state c, and the hidden state h.
inp = tf.placeholder(tf.float32, shape=(1, 10))
stt = tf.placeholder(tf.float32, shape=(1, 10))
hdd = tf.placeholder(tf.float32, shape=(1, 10))
# out has the structure (output, (new_c, new_h)).
out = lstm(inp, (stt, hdd))
sess = tf.InteractiveSession()
init = tf.global_variables_initializer()
sess.run(init)
a = rng.randn(1, 10)
b = rng.randn(1, 10)
c = rng.randn(1, 10)
output = sess.run(out, {inp: a, stt: b, hdd: c})
# The returned output equals the h element of the returned state.
assert (output[0] == output[1][1]).all()
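
Why the two coincide can be seen from the standard LSTM update equations. The following NumPy sketch is an illustration only, not TensorFlow's implementation: `lstm_step` is a hypothetical helper, and the single fused kernel, the gate order (i, j, f, o), and the `forget_bias` default are assumptions of this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, c_prev, h_prev, W, b, forget_bias=1.0):
    # One kernel applied to [x, h_prev], then split into four gate pre-activations.
    z = np.concatenate([x, h_prev], axis=1) @ W + b
    i, j, f, o = np.split(z, 4, axis=1)
    # New cell memory: keep part of the old memory, add gated new content.
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
    # New hidden state, which is also what the cell returns as its output.
    h = sigmoid(o) * np.tanh(c)
    return h, (c, h)

rng = np.random.default_rng(0)
units = 10
x = rng.normal(size=(1, units))
c0 = rng.normal(size=(1, units))
h0 = rng.normal(size=(1, units))
W = rng.normal(size=(2 * units, 4 * units))
b = np.zeros(4 * units)

output, (c1, h1) = lstm_step(x, c0, h0, W, b)
same = np.array_equal(output, h1)

# Feeding (c1, output) back in is therefore the same as feeding (c1, h1).
next_a, _ = lstm_step(x, c1, output, W, b)
next_b, _ = lstm_step(x, c1, h1, W, b)
```

In this sketch the cell returns the same array as both the output and the second element of the state, so for a single step rebuilding the state tuple from `last_cell_memory` and `output` changes nothing.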
