Evaluating a fine-tuned zero-shot model with HuggingFace


I am fine-tuning the HuggingFace facebook/bart-large-mnli model to suit my needs, using the following parameters:

from transformers import BartForSequenceClassification, Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir=model_directory,      # output directory
    num_train_epochs=30,              # total number of training epochs
    per_device_train_batch_size=1,  # batch size per device during training - 16 - Don't go over 1, it's out of memory
    per_device_eval_batch_size=2,   # batch size for evaluation - 64 - Don't go over 2, it's out of memory
    warmup_steps=500,                 # number of warmup steps for learning rate scheduler - 500
    weight_decay=0.01,               # strength of weight decay
)

model = BartForSequenceClassification.from_pretrained("facebook/bart-large-mnli")

trainer = Trainer(
    model=model,                          # the instantiated 🤗 Transformers model to be trained
    args=training_args,                   # training arguments, defined above
    compute_metrics=compute_metrics,      # a function to compute the metrics
    train_dataset=train_dataset,          # training dataset
    eval_dataset=test_dataset             # evaluation dataset
)

# Train the trainer
trainer.train()

The compute_metrics function I use is:

import numpy as np
from datasets import Dataset, load_metric
from transformers import EvalPrediction

def compute_metrics(p: EvalPrediction):
  metric_acc = load_metric("accuracy")
  preds = p.predictions[0] if isinstance(p.predictions, tuple) else p.predictions
  preds = np.argmax(preds, axis=1)
  result = {}
  result["accuracy"] = metric_acc.compute(predictions=preds, references=p.label_ids)["accuracy"]
  return result

However, no matter how much training or test data I use, or how many epochs I train for, trainer.evaluate() always reports an accuracy of 0.5.
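An accuracy of exactly 0.5 that never moves is what a model scores on a balanced two-class test set when it predicts the same class for every example. Note also that facebook/bart-large-mnli has three output labels (contradiction, neutral, entailment), so the dataset's label ids need to line up with the classification head. One way to check for a constant prediction is to count how often each class wins the argmax. The following is only a sketch: the helper name `predicted_class_counts` and the example logits are made up for illustration and are not part of the original code.

```python
import numpy as np

def predicted_class_counts(logits, num_labels):
    """Count how often each class index is the argmax of the logits."""
    preds = np.argmax(np.asarray(logits), axis=1)
    # minlength keeps a zero entry for classes that are never predicted
    return np.bincount(preds, minlength=num_labels)

# Hypothetical logits for four examples and three classes
example_logits = np.array([
    [0.1, 0.2, 2.0],
    [0.3, 0.1, 1.5],
    [0.0, 0.4, 0.9],
    [0.2, 0.2, 3.1],
])
counts = predicted_class_counts(example_logits, num_labels=3)
print(counts)  # here every example falls into class index 2
```

With the Trainer above, the logits could presumably be taken from trainer.predict(test_dataset).predictions (using the first element if it is a tuple, as compute_metrics already does). If all counts land in one class, the model is not separating the classes at all, and the label mapping and learning rate are worth checking before adding more data or epochs.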

My questions are:

  1. How can I improve it?
  2. How can I implement other evaluation metrics, for example the F1 score?
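For the second question, compute_metrics can return any dictionary of named values, so further metrics can be added next to accuracy. The following is a minimal sketch that computes accuracy and macro-averaged F1 with NumPy only. The helper name `accuracy_and_macro_f1` and the example data are assumptions for illustration, and macro averaging is just one possible choice.

```python
import numpy as np

def accuracy_and_macro_f1(logits, labels):
    """Return accuracy and macro-averaged F1 for argmax predictions."""
    preds = np.argmax(np.asarray(logits), axis=1)
    labels = np.asarray(labels)
    f1_per_class = []
    # Average over every class that occurs in the labels or the predictions
    for c in np.union1d(labels, preds):
        tp = np.sum((preds == c) & (labels == c))
        fp = np.sum((preds == c) & (labels != c))
        fn = np.sum((preds != c) & (labels == c))
        denom = 2 * tp + fp + fn
        f1_per_class.append(2 * tp / denom if denom > 0 else 0.0)
    return {
        "accuracy": float(np.mean(preds == labels)),
        "f1": float(np.mean(f1_per_class)),
    }

def compute_metrics(p):
    # p is expected to expose .predictions and .label_ids, like EvalPrediction
    logits = p.predictions[0] if isinstance(p.predictions, tuple) else p.predictions
    return accuracy_and_macro_f1(logits, p.label_ids)

# Hypothetical logits and labels for four examples and two classes
result = accuracy_and_macro_f1(
    logits=[[2.0, 1.0], [0.0, 3.0], [1.0, 0.0], [0.0, 1.0]],
    labels=[0, 1, 1, 1],
)
print(result)
```

In this example three of the four predictions are correct, so the accuracy is 0.75, while the per-class F1 values are 2/3 and 0.8. The same numbers can be obtained with sklearn.metrics.f1_score(labels, preds, average="macro") if scikit-learn is installed.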
python deep-learning huggingface-transformers text-classification evaluation