LSTM for generating a line of poetry from a single word? Help with input preparation and model creation

X = []
Y = []

for line in document:
    words = line.split()
    if len(words) > 1:  # Skip lines with one word or fewer
        # First word is the input; index 0 doubles as the unknown-word id
        input_sequence = [word_to_index.get(words[0], 0)]
        # Remaining words (at most max_sequence_length - 1 of them) are the output
        output_sequence = [word_to_index.get(word, 0) for word in words[1:max_sequence_length]]
        # Pad shorter sequences with zeros
        output_sequence += [0] * (max_sequence_length - 1 - len(output_sequence))
        X.append(input_sequence)
        Y.append(output_sequence)

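Is this the right approach? I just want my model to create a poem of fewer than 10-12 words for each element of the list of strings shown, and I am trying to use NLP to do it. word_to_index simply assigns a number to every word in the document. How can I achieve this? When I proceed this way, the loss calculation raises a dimension error.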

Are there any further suggestions? How should I proceed to get a poem generator?
The dataset looks like this:
The X and Y training arrays have these shapes:

This gives the error:

 File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1377, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1360, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1349, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1127, in train_step
        loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py", line 1185, in compute_loss
        return self.compiled_loss(
    File "/usr/local/lib/python3.10/dist-packages/keras/src/engine/compile_utils.py", line 277, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/losses.py", line 143, in __call__
        losses = call_fn(y_true, y_pred)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/losses.py", line 270, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "/usr/local/lib/python3.10/dist-packages/keras/src/losses.py", line 2221, in categorical_crossentropy
        return backend.categorical_crossentropy(
    File "/usr/local/lib/python3.10/dist-packages/keras/src/backend.py", line 5575, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

    ValueError: Shapes (None, 16) and (None, 1, 16) are incompatible
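
The last line of the traceback says the loss received targets of shape (None, 16) and predictions of shape (None, 1, 16). The following is a minimal NumPy sketch of the shapes a per-position cross-entropy needs. It assumes that 16 is the padded target length and that the targets are integer word ids; the question does not show the model, so both are assumptions. The sizes and the `sparse_cross_entropy` helper are hypothetical and are not part of Keras.

```python
import numpy as np

batch, target_len, vocab_size = 4, 16, 50  # hypothetical sizes

rng = np.random.default_rng(0)

# Targets: one integer word id per output position, like the padded rows of Y.
y_true = rng.integers(0, vocab_size, size=(batch, target_len))

# Predictions: one probability distribution over the vocabulary per position.
logits = rng.normal(size=(batch, target_len, vocab_size))
y_pred = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)


def sparse_cross_entropy(y_true, y_pred):
    """Mean negative log-probability assigned to the true word ids."""
    # Predictions need one more axis (the vocabulary) than the targets.
    if y_pred.shape[:-1] != y_true.shape:
        raise ValueError(f"Shapes {y_true.shape} and {y_pred.shape} are incompatible")
    picked = np.take_along_axis(y_pred, y_true[..., None], axis=-1).squeeze(-1)
    return float(-np.log(picked).mean())


loss = sparse_cross_entropy(y_true, y_pred)
```

Under these assumptions the prediction needs the shape (batch, 16, vocab_size). A prediction of shape (batch, 1, 16) has only one time step, so it cannot line up with 16 target positions.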

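I just want my model to take a single word as input and produce a one-line poem with a 5-7-5 syllable count, or, for now, just a reasonable one-line poem of about 10 to 15 words. How should I approach this?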

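I tried the approach above, but I get the shape error. I find it hard to fix because I think I have already done something wrong. Any help would be appreciated!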

nlp lstm word-embedding data-preprocessing
1 Answer

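Could you check the shapes of your input and target data? The error points to a dimension mismatch: the loss function receives targets of shape (None, 16) but model output of shape (None, 1, 16). The model expects its data in a particular shape, so reshaping the arrays, or changing the output layers so that the prediction matches the targets, may help. If the error persists, please post it again. Thanks.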
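
One way to make the shapes agree is sketched below, assuming TensorFlow's bundled Keras. This is not the asker's model, which the question does not show. The layer sizes, `vocab_size`, `target_len`, and the `build_model` helper are all hypothetical. The sketch encodes the single input word, repeats that encoding once per output position, and predicts a vocabulary distribution at each position. Because the targets are integer ids, it uses `sparse_categorical_crossentropy` rather than `categorical_crossentropy`.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab_size = 50   # assumed: number of word ids, with 0 reserved for padding
target_len = 16   # assumed: padded length of each row of Y


def build_model(vocab_size, target_len, embed_dim=32, units=64):
    """One word id in, target_len vocabulary distributions out."""
    model = keras.Sequential([
        keras.Input(shape=(1,)),                     # a single word id
        layers.Embedding(vocab_size, embed_dim),     # (batch, 1, embed_dim)
        layers.LSTM(units),                          # (batch, units)
        layers.RepeatVector(target_len),             # (batch, target_len, units)
        layers.LSTM(units, return_sequences=True),   # (batch, target_len, units)
        layers.TimeDistributed(
            layers.Dense(vocab_size, activation="softmax")
        ),                                           # (batch, target_len, vocab_size)
    ])
    # Integer targets of shape (batch, target_len) pair with the sparse loss.
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model


model = build_model(vocab_size, target_len)

# Random stand-in data with the same layout as X and Y in the question.
rng = np.random.default_rng(0)
X = rng.integers(1, vocab_size, size=(8, 1))
Y = rng.integers(0, vocab_size, size=(8, target_len))

pred_shape = model.predict(X, verbose=0).shape
history = model.fit(X, Y, epochs=1, verbose=0)
```

With this layout the prediction has the shape (batch, target_len, vocab_size) and the targets have the shape (batch, target_len), so the loss can compare them position by position. A common alternative for text generation is to train on prefix-to-next-word pairs and generate one word at a time; which framing works better here would need to be tested on the actual dataset.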
