My IndoBERT model raises ValueError: The model did not return a loss from the inputs, only the following keys: last_hidden_state


I am trying to fine-tune an IndoBERT model on my dataset. When I used BERT base-uncased it worked, but when I switch to IndoBERT it fails with:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[367], line 1
----> 1 trainer.train()

File ~\AppData\Local\anaconda3\Lib\site-packages\transformers\trainer.py:1539, in Trainer.train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
   1537         hf_hub_utils.enable_progress_bars()
   1538 else:
-> 1539     return inner_training_loop(
   1540         args=args,
   1541         resume_from_checkpoint=resume_from_checkpoint,
   1542         trial=trial,
   1543         ignore_keys_for_eval=ignore_keys_for_eval,
   1544     )

File ~\AppData\Local\anaconda3\Lib\site-packages\transformers\trainer.py:1869, in Trainer._inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
   1866     self.control = self.callback_handler.on_step_begin(args, self.state, self.control)
   1868 with self.accelerator.accumulate(model):
-> 1869     tr_loss_step = self.training_step(model, inputs)
   1871 if (
   1872     args.logging_nan_inf_filter
   1873     and not is_torch_tpu_available()
   1874     and (torch.isnan(tr_loss_step) or torch.isinf(tr_loss_step))
   1875 ):
   1876     # if loss is nan or inf simply add the average of previous logged losses
   1877     tr_loss += tr_loss / (1 + self.state.global_step - self._globalstep_last_logged)

File ~\AppData\Local\anaconda3\Lib\site-packages\transformers\trainer.py:2772, in Trainer.training_step(self, model, inputs)
   2769     return loss_mb.reduce_mean().detach().to(self.args.device)
   2771 with self.compute_loss_context_manager():
-> 2772     loss = self.compute_loss(model, inputs)
   2774 if self.args.n_gpu > 1:
   2775     loss = loss.mean()  # mean() to average on multi-gpu parallel training

File ~\AppData\Local\anaconda3\Lib\site-packages\transformers\trainer.py:2813, in Trainer.compute_loss(self, model, inputs, return_outputs)
   2811 else:
   2812     if isinstance(outputs, dict) and "loss" not in outputs:
-> 2813         raise ValueError(
   2814             "The model did not return a loss from the inputs, only the following keys: "
   2815             f"{','.join(outputs.keys())}. For reference, the inputs it received are {','.join(inputs.keys())}."
   2816         )
   2817     # We don't use .loss here since the model may return tuples instead of ModelOutput.
   2818     loss = outputs["loss"] if isinstance(outputs, dict) else outputs[0]

ValueError: The model did not return a loss from the inputs, only the following keys: last_hidden_state,pooler_output. For reference, the inputs it received are input_ids,token_type_ids,attention_mask.

I have a feeling this happens because the tokenizer works differently, but I'm not sure. I'm a beginner in this field, so any advice would be very helpful.

My code

from transformers import AutoTokenizer, AutoModel
import numpy as np

tokenizer = AutoTokenizer.from_pretrained("indolem/indobert-base-uncased")
def tokenize_and_align_labels(examples, label_all_tokens=True):
    """
    Function to tokenize and align labels with respect to the tokens. This function is specifically designed for
    Named Entity Recognition (NER) tasks where alignment of the labels is necessary after tokenization.

    Parameters:
    examples (dict): A dictionary containing the words and the corresponding NER tags.
                     - "text": list of words in a sentence.
                     - "labels": list of corresponding entity tag ids for each word.

    label_all_tokens (bool): A flag to indicate whether all tokens should have labels.
                             If False, only the first token of a word will have a label,
                             the other tokens (subwords) corresponding to the same word will be assigned -100.

    Returns:
    tokenized_inputs (dict): A dictionary containing the tokenized inputs and the corresponding labels aligned with the tokens.
    """
    tokenized_inputs = tokenizer(examples["text"], truncation=True, is_split_into_words=True)
    labels = []
    for i, label in enumerate(examples["labels"]):
        word_ids = tokenized_inputs.word_ids(batch_index=i)
        # word_ids() => Return a list mapping the tokens
        # to their actual word in the initial sentence.
        # It Returns a list indicating the word corresponding to each token.
        previous_word_idx = None
        label_ids = []
        # Special tokens like `[CLS]` and `[SEP]` are mapped to None by word_ids().
        # We need to set their label to -100 so they are automatically ignored in the loss function.
        for word_idx in word_ids:
            if word_idx is None:
                # set -100 as the label for these special tokens
                label_ids.append(-100)
            # For the other tokens in a word, we set the label to either the current label or -100, depending on
            # the label_all_tokens flag.
            elif word_idx != previous_word_idx:
                # if current word_idx is != prev then its the most regular case
                # and add the corresponding token
                label_ids.append(label[word_idx])
            else:
                # subsequent sub-word tokens of the same word: keep the word's label
                # only if label_all_tokens is True, otherwise mask them with -100
                label_ids.append(label[word_idx] if label_all_tokens else -100)

            previous_word_idx = word_idx
        labels.append(label_ids)
    tokenized_inputs["labels"] = labels
    return tokenized_inputs
model = AutoModel.from_pretrained("indolem/indobert-base-uncased",num_labels=7)
from transformers import TrainingArguments, Trainer
args = TrainingArguments(
    "test-ner",
    evaluation_strategy="epoch",
    learning_rate=2e-2,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=1,
    weight_decay=0.1,
)
def compute_metrics(eval_preds):
    """
    Function to compute the evaluation metrics for Named Entity Recognition (NER) tasks.
    The function computes precision, recall, F1 score and accuracy.

    Parameters:
    eval_preds (tuple): A tuple containing the predicted logits and the true labels.

    Returns:
    A dictionary containing the precision, recall, F1 score and accuracy.
    """
    pred_logits, labels = eval_preds

    pred_logits = np.argmax(pred_logits, axis=2)
    # the logits and the probabilities are in the same order,
    # so we don’t need to apply the softmax

    # We remove all the values where the label is -100
    predictions = [
        [label_list[pred] for (pred, lab) in zip(prediction, label) if lab != -100]
        for prediction, label in zip(pred_logits, labels)
    ]

    true_labels = [
        [label_list[lab] for (pred, lab) in zip(prediction, label) if lab != -100]
        for prediction, label in zip(pred_logits, labels)
    ]
    results = metric.compute(predictions=predictions, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["valid"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)

trainer.train()
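
The code above also references tokenized_datasets, data_collator, label_list, and metric, which are not shown. A minimal sketch of how they would typically be built for this setup (the raw_datasets name and the tag set here are illustrative assumptions):

from transformers import DataCollatorForTokenClassification
import evaluate

# Assumed name: raw_datasets is the DatasetDict shown under "My dataset" below.
tokenized_datasets = raw_datasets.map(tokenize_and_align_labels, batched=True)

# Pads input_ids, attention_mask and the "labels" column together per batch.
data_collator = DataCollatorForTokenClassification(tokenizer)

# Illustrative 7-tag scheme to match num_labels=7; the real tag names come from the data.
label_list = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]

# seqeval computes the entity-level precision/recall/F1 used in compute_metrics.
metric = evaluate.load("seqeval")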

My dataset

DatasetDict({
    train: Dataset({
        features: ['text', 'labels'],
        num_rows: 91
    })
    test: Dataset({
        features: ['text', 'labels'],
        num_rows: 12
    })
    valid: Dataset({
        features: ['text', 'labels'],
        num_rows: 11
    })
})
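
Because tokenize_and_align_labels calls the tokenizer with is_split_into_words=True, each row presumably stores "text" as a list of words and "labels" as one tag id per word. An illustrative row (the words and tag ids here are made up):

example = {
    "text": ["Joko", "Widodo", "lahir", "di", "Surakarta"],  # pre-split words
    "labels": [1, 2, 0, 0, 5],                               # one tag id per word, e.g. B-PER, I-PER, O, O, B-LOC
}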

Full code

huggingface-transformers bert-language-model named-entity-recognition
1 Answer

It looks like you want to do token classification (NER), but the model you are loading is only the base model, which does not return a loss because it has no task-specific head that could compute one (code).

You should load the weights with AutoModelForTokenClassification instead of AutoModel so that the model computes the appropriate loss for this task.

from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained("indolem/indobert-base-uncased", num_labels=7)
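
As a quick sanity check, here is a minimal sketch (using the same checkpoint) showing that the token-classification head returns a loss as soon as labels are passed:

import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("indolem/indobert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained("indolem/indobert-base-uncased", num_labels=7)

enc = tokenizer(["contoh", "kalimat"], is_split_into_words=True, return_tensors="pt")
# Dummy per-token labels just for the check; in training they come from tokenize_and_align_labels,
# where -100 marks positions to ignore in the loss.
labels = torch.zeros_like(enc["input_ids"])

outputs = model(**enc, labels=labels)
print(outputs.loss)  # a scalar tensor, so Trainer.compute_loss no longer raises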