Error when training on WSL2 that does not occur on Windows

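I've been trying to build a simple language model with Python and TensorFlow, and I found that to use the GPU properly I need to run it under WSL, which is fine. But a new problem has come up. It throws this error: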

TypeError: Layer.__init__() takes 1 positional argument but 2 were given
which comes from this class:

class MyModel(tf.keras.Model):
  def __init__(self, vocab_size, embedding_dim, rnn_units):
    super().__init__(self)
    self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
    self.gru = tf.keras.layers.GRU(rnn_units,
                                   return_sequences=True,
                                   return_state=True)
    self.dense = tf.keras.layers.Dense(vocab_size)

  def call(self, inputs, states=None, return_state=False, training=False):
    x = inputs
    x = self.embedding(x, training=training)
    if states is None:
      states = self.gru.get_initial_state(x)
    x, states = self.gru(x, initial_state=states, training=training)
    x = self.dense(x, training=training)

    if return_state:
      return x, states
    else:
      return x
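
For context, the `TypeError` above can be reproduced without TensorFlow. The sketch below uses a hypothetical stand-in class, `Base`, whose `__init__` accepts keyword arguments only. That Keras's `Layer.__init__` behaves this way in the asker's version is an assumption inferred from the error message, not something confirmed in this thread. `super()` already binds `self`, so writing `super().__init__(self)` hands the instance over a second time as a positional argument.

```python
class Base:
    """Hypothetical stand-in for a base class whose __init__ takes keyword arguments only."""

    def __init__(self, *, name=None):
        self.name = name


class Broken(Base):
    def __init__(self):
        # super() already binds self, so this passes self a second time.
        super().__init__(self)


class Fixed(Base):
    def __init__(self):
        # No explicit self: the bound method supplies it.
        super().__init__()


def init_error(cls):
    """Return the TypeError message raised by cls(), or None if construction succeeds."""
    try:
        cls()
    except TypeError as exc:
        return str(exc)
    return None


print(init_error(Broken))
print(init_error(Fixed))
```

A base class that still accepts one optional positional argument would swallow the extra `self` silently, which may be why the same line can pass in one environment and fail in another.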

If I change the line

super().__init__(self)
to
super().__init__()
it fixes that problem but creates another one. Now it throws this error:
{{function_node __wrapped__Pack_N_2_device_/job:localhost/replica:0/task:0/device:GPU:0}} Shapes of all inputs must match: values[0].shape = [256,100,256] != values[1].shape = [] [Op:Pack] name:

I'd appreciate help from anyone with more than 20 minutes of TF experience.

python linux tensorflow windows-subsystem-for-linux
1 Answer
0 votes

I just checked your code on Google Cloud, and it seems to work. Could the problem be with your input?

import tensorflow as tf

class MyModel(tf.keras.Model):
  def __init__(self, vocab_size, embedding_dim, rnn_units):
    super().__init__(self)
    self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
    self.gru = tf.keras.layers.GRU(rnn_units,
                                   return_sequences=True,
                                   return_state=True)
    self.dense = tf.keras.layers.Dense(vocab_size)

  def call(self, inputs, states=None, return_state=False, training=False):
    x = inputs
    x = self.embedding(x, training=training)
    if states is None:
      states = self.gru.get_initial_state(x)
    x, states = self.gru(x, initial_state=states, training=training)
    x = self.dense(x, training=training)

    if return_state:
      return x, states
    else:
      return x

batch_size = 433
vocab_size = 10000
sentence_length = 40

with tf.device(tf.DeviceSpec(device_type="GPU")):
  tt = tf.random.uniform(shape=(batch_size, sentence_length), minval=0, maxval=vocab_size, dtype=tf.int32)
  model = MyModel(vocab_size=vocab_size, embedding_dim=256, rnn_units=16)
  ff = model.call(tt)

ff
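
One possible explanation for the difference between environments, not verified against the asker's install: newer TensorFlow releases ship Keras 3, where `Layer.__init__` takes keyword arguments only and `GRU.get_initial_state` expects a batch size, not the input tensor. That would fit both error messages in the question. The sketch below is one way to sidestep both under that assumption: it calls `super().__init__()` and passes `initial_state=None` on the first step, so the GRU starts from zeros on its own. The small sizes and the names `tokens`, `logits`, and `state` are illustrative, not taken from the question.

```python
import tensorflow as tf


class MyModel(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, rnn_units):
        # No positional arguments: super() already binds self.
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(rnn_units,
                                       return_sequences=True,
                                       return_state=True)
        self.dense = tf.keras.layers.Dense(vocab_size)

    def call(self, inputs, states=None, return_state=False, training=False):
        x = self.embedding(inputs)
        # With initial_state=None the GRU falls back to a zero state,
        # so get_initial_state is never called here.
        x, states = self.gru(x, initial_state=states, training=training)
        x = self.dense(x)
        return (x, states) if return_state else x


vocab_size = 100
tokens = tf.random.uniform(shape=(4, 10), minval=0, maxval=vocab_size, dtype=tf.int32)
model = MyModel(vocab_size=vocab_size, embedding_dim=8, rnn_units=16)

# Call the model object, not model.call, so Keras can build the layers first.
logits, state = model(tokens, return_state=True)
print(logits.shape, state.shape)
```

The returned `state` can be passed back in as `states=state` on the next call, which keeps the step-by-step generation pattern from the original `call` signature.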