在tensorflow的`BERT`中使用`keras.Model.fit`时维度不匹配

Question

我按照Fine-tuning BERT的说明使用我自己的数据集构建模型（它有点大，大于20G），然后采取步骤重新cdoe我的数据并从

tf_record

文件加载它们。我创建的

training_dataset

与说明中的签名相同

training_dataset.element_spec

({'input_word_ids': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None), 
'input_mask': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None), 
'input_type_ids': TensorSpec(shape=(32, 1024), dtype=tf.int32, name=None)}, 
TensorSpec(shape=(32,), dtype=tf.int32, name=None))

其中

batch_size

为 32，

max_seq_length

为 1024。正如说明所示，

The resulting tf.data.Datasets return (features, labels) pairs, as expected by keras.Model.fit

似乎一切都按预期工作，（虽然指令没有显示如何使用

training_dataset

）但是，以下代码

bert_classifier.fit(
    x = training_dataset, 
    validation_data=test_dataset, # has the same signature just as training_dataset
    batch_size=32,
    epochs=epochs,
    verbose=1,
)

遇到一个对我来说似乎很奇怪的错误，

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/captain/project/dataload/train.py", line 81, in <module>
    verbose=1,
  File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1100, in fit
    tmp_logs = self.train_function(iterator)
  File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 871, in _call
    self._initialize(args, kwds, add_initializers_to=initializers)
  File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 726, in _initialize
    *args, **kwds))
  File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 2969, in _get_concrete_function_internal_garbage_collected
    graph_function, _ = self._maybe_define_function(args, kwargs)
  File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3361, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/function.py", line 3206, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 990, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py", line 634, in wrapped_fn
    out = weak_wrapped_fn().__wrapped__(*args, **kwds)
  File "/home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py", line 977, in wrapper
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:

    /home/captain/.local/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py:805 train_function  *
        return step_function(self, iterator)
    /home/captain/.local/lib/python3.7/site-packages/official/nlp/keras_nlp/layers/position_embedding.py:88 call  *
        return tf.broadcast_to(position_embeddings, input_shape)
    /home/captain/.local/lib/python3.7/site-packages/tensorflow/python/ops/gen_array_ops.py:845 broadcast_to  **
        "BroadcastTo", input=input, shape=shape, name=name)
    /home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:750 _apply_op_helper
        attrs=attr_protos, op_def=op_def)
    /home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/func_graph.py:592 _create_op_internal
        compute_device)
    /home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:3536 _create_op_internal
        op_def=op_def)
    /home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:2016 __init__
        control_input_ops, op_def)
    /home/captain/.local/lib/python3.7/site-packages/tensorflow/python/framework/ops.py:1856 _create_c_op
        raise ValueError(str(e))

    ValueError: Dimensions must be equal, but are 512 and 1024 for '{{node bert_classifier/bert_encoder_1/position_embedding/BroadcastTo}} = 
BroadcastTo[T=DT_FLOAT, Tidx=DT_INT32](bert_classifier/bert_encoder_1/position_embedding/strided_slice_1, bert_classifier/bert_encoder_1/position_embedding/Shape)' 
with input shapes: [512,768], [3] and with input tensors computed as partial shapes: input[1] = [32,1024,768].

与 512 无关，我的代码中也没有使用 512。那么我的代码哪里出了问题以及如何修复？

Answer 1

他们基于从

bert_classifier

 加载的

bert_config_file

 创建了

bert_config.json

bert_classifier, bert_encoder = bert.bert_models.classifier_model(bert_config, num_labels=2)

bert_config.json

{
'attention_probs_dropout_prob': 0.1,
 'hidden_act': 'gelu',
 'hidden_dropout_prob': 0.1,
 'hidden_size': 768,
 'initializer_range': 0.02,
 'intermediate_size': 3072,
 'max_position_embeddings': 512,
 'num_attention_heads': 12,
 'num_hidden_layers': 12,
 'type_vocab_size': 2,
 'vocab_size': 30522
}

根据此配置，

hidden_size

为 768，

max_position_embeddings

为 512，因此用于输入 BERT 模型的输入数据必须与描述的形状相同。它解释了您遇到形状不匹配问题的原因。

因此，要使其工作，您必须将用于创建张量输入的所有行从

更改为

。

在tensorflow的`BERT`中使用`keras.Model.fit`时维度不匹配

问题描述投票：0回答：1

1个回答

最新问题

在tensorflow的`BERT`中使用`keras.Model.fit`时维度不匹配

问题描述 投票：0回答：1

1个回答

最新问题

问题描述投票：0回答：1