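ValueError: Mismatch between expected batch size and model output batch size. Output shape = (16, 6248), expected output shape = shape (1, 6248)
The last Dense layer of my model is defined with output shape (None, 6248).
I can't understand where the number 16 in this output comes from. None of my code contains the number 16.
Description: I am training on a TPU v3 in Kaggle. If I use a GPU instead, I don't get any error when I run prediction with "model.predict(...)".
I created the model with the Keras functional API.
Please help me understand this. Thanks! Here is my model: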
embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=max_sequence_len - 1,
                            trainable=False)
sequence_1_input = Input(shape=(max_sequence_len - 1,), dtype='int32')
embedded_sequences_1 = embedding_layer(sequence_1_input)
activations = Bidirectional(LSTM(num_lstm, dropout=rate_drop_lstm, recurrent_dropout=rate_drop_lstm, return_sequences=True))(embedded_sequences_1)
activations = LSTM(num_lstm, dropout=rate_drop_lstm, recurrent_dropout=rate_drop_lstm)(activations)
merged = Dense(num_dense, activation=act)(activations)
merged = Dropout(rate_drop_dense)(merged)
merged = BatchNormalization()(merged)
preds = Dense(vocab_size, activation='softmax')(merged)
model = Model(inputs=sequence_1_input, outputs=preds)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
Here is my data:
train = predictors[0:10240]
test = label[0:10240]
dataset = tf.data.Dataset.from_tensor_slices((train, test))
dataset = dataset.shuffle(200).batch(1024).repeat()
dataset = dataset.prefetch(AUTO)
# Here is the shape of the dataset:
<PrefetchDataset shapes: ((None, 87), (None, 6248)), types: (tf.int32, tf.float32)>
Fitting the model:
model.fit(dataset, epochs=100, steps_per_epoch=10240//1024, verbose=1)
Summary
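Just want to mention that I ran into a similar error.
It makes me believe there must be a bug in TensorFlow. (I also get "16" as the first dimension of the output shape.)
I passed a "batch_size" of "1" to the predict function, just to be sure.
Sadly, that had no effect at all.
Maybe it receives 16 batches from the TPU?
That could leave self.results in an unexpected state at this location: https://github.com/tensorflow/tensorflow/blob/e5bf8de410005de06a7ff5393fafdf832ef1d4ad/tensorflow/python/keras/engine/training_utils.py#L346
But at the moment I don't know how to fix it without patching TensorFlow itself.
Edit:
This is really hard to debug... Currently I think the error is somewhere around here:
because it somehow processes the output shape based on "num_replicas_in_sync".
But I can't really resolve it without debugging.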
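To make the "num_replicas_in_sync" idea concrete: under a distribution strategy the global batch is divided across the replicas. The following is only a plain-Python sketch of that arithmetic, not TensorFlow code. The helper name split_global_batch is made up for illustration, and the replica count of 8 is an assumption (a single TPU v3-8 board).

```python
def split_global_batch(global_batch_size, num_replicas):
    """Return (rows per replica, leftover rows) for an even split."""
    return divmod(global_batch_size, num_replicas)


NUM_REPLICAS = 8  # assumption: one TPU v3-8 board, i.e. 8 cores

# Training batch from the question: divides evenly across the replicas.
print(split_global_batch(1024, NUM_REPLICAS))

# Prediction with batch_size=1: a single row cannot be split 8 ways,
# so the framework has to treat it specially.
print(split_global_batch(1, NUM_REPLICAS))
```

A batch that does not divide evenly by the replica count is exactly the kind of case where the output handling could plausibly go wrong.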
Some debug info from pdb:
(Pdb) b /opt/conda/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:262
Breakpoint 7 at /opt/conda/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py:262
(Pdb) c
> /opt/conda/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py(262)_aggregate_predict_results()
-> nested_outs = batch_outs[i * num_replicas:i * num_replicas + num_replicas]
(Pdb) print(len(batch_outs))
8
(Pdb) print(batch_outs[0].shape)
(2, 7, 355, 235, 1)
The "2" at the beginning looks wrong to me, and I have no idea why it occurs. (It later turned into 16, caused by multiple duplicates in "batch_outs".)
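To illustrate how a leading 2 per replica could end up as 16, here is a small pure-Python sketch modelled on the slicing expression shown in the pdb output. It is not the actual TensorFlow implementation. The helper names group_by_output and concat_batches, the 8 replicas, and the toy rows are all assumptions made for illustration, with nested lists standing in for arrays.

```python
def group_by_output(batch_outs, num_replicas):
    """Split a flat list of per-replica results into one group per model output."""
    num_outputs = len(batch_outs) // num_replicas
    return [batch_outs[i * num_replicas:i * num_replicas + num_replicas]
            for i in range(num_outputs)]


def concat_batches(group):
    """Concatenate per-replica batches along the batch axis."""
    merged = []
    for replica_batch in group:
        merged.extend(replica_batch)
    return merged


NUM_REPLICAS = 8   # assumption: 8 TPU cores
row = [0.1, 0.9]   # toy stand-in for one prediction row

# Every replica hands back a batch of 2 rows, as in the pdb output above.
batch_outs = [[row, row] for _ in range(NUM_REPLICAS)]

groups = group_by_output(batch_outs, NUM_REPLICAS)
merged = concat_batches(groups[0])
print(len(merged))  # 8 replicas * 2 rows each = 16
```

If every replica returns a copy of the same rows, concatenating them gives a batch dimension of 16 even though only one sample was requested.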
Edit again (28.04.2020):
The problem seems to occur earlier:
(Pdb) b /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py:128
Breakpoint 3 at /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py:128
(Pdb) c
> /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py(128)run_one_epoch()
-> batch_outs = execution_function(iterator)
(Pdb) n
> /opt/conda/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py(159)run_one_epoch()
-> if mode != ModeKeys.PREDICT:
(Pdb) print(len(batch_outs))
16
(Pdb) print(batch_outs[14].shape)
(2, 7, 355, 235, 1)
(Pdb) np.array_equal(batch_outs[14],batch_outs[15])
True
Edit again (29.04.2020):
As a workaround, the following seems to work:
import tensorflow as tf
from tensorflow.python.tpu import device_assignment as device_assignment_lib

tf.keras.backend.set_floatx('float32')
tpu = tf.distribute.cluster_resolver.TPUClusterResolver()  # TPU detection
tf.config.experimental_connect_to_cluster(tpu)
topology = tf.tpu.experimental.initialize_tpu_system(tpu)
# Restrict the strategy to a single TPU core.
device_assignment = device_assignment_lib.DeviceAssignment(
    topology, core_assignment=device_assignment_lib.SINGLE_CORE_ASSIGNMENT)
strategy = tf.distribute.experimental.TPUStrategy(tpu, device_assignment)
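Of course, this is not an optimal solution, because you then use only one core. At least it confirms my guess that TensorFlow does not correctly filter the results coming from all the cores.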
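Another option that might be worth trying, and that keeps all cores in use, is to pad the prediction input up to a multiple of the replica count and trim the result afterwards. This is an assumption and has not been verified against this bug. The helper pad_to_multiple and the replica count of 8 are hypothetical, and the commented-out predict call only shows where the helper would fit.

```python
def pad_to_multiple(rows, multiple):
    """Repeat the last row until len(rows) is a multiple of `multiple`.

    Returns the padded list and the original length, so the caller can
    trim the predictions back to the real samples.
    """
    if not rows:
        raise ValueError("need at least one row to pad")
    remainder = len(rows) % multiple
    if remainder == 0:
        return list(rows), len(rows)
    padding = [rows[-1]] * (multiple - remainder)
    return list(rows) + padding, len(rows)


NUM_REPLICAS = 8  # assumption: 8 TPU cores

samples = [[1, 2, 3]]  # a single toy input sequence
padded, original_len = pad_to_multiple(samples, NUM_REPLICAS)
print(len(padded), original_len)

# Hypothetical usage with the model from the question:
# predictions = model.predict(np.array(padded))[:original_len]
```

The padded rows are thrown away after prediction, so they cost a little compute but do not change the results for the real samples.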