使用 HuggingFace 训练器,如何显示训练损失与评估数据集的比较?

问题描述 投票:0回答:1

我正在跑步:

#original training script

trainer = transformers.Trainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=test_dataset, #turn on the eval dataset for comparisons
    args=transformers.TrainingArguments(
        num_train_epochs=2,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=1,
        warmup_ratio=0.05,
        max_steps=20,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=1,
        output_dir="outputs",
        optim="paged_adamw_8bit",
        lr_scheduler_type='cosine',
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!

我不是 100% 清楚,但我认为显示的损失是与训练数据集与评估数据集相比......

如何显示损失与评估(理想情况下还有训练集)?

我本以为添加 eval_dataset 就足够了......

language-model huggingface-trainer
1个回答
0
投票

您可以将compute_metrics函数添加到您的Trainer模块中

def compute_metrics(p):
    print(type(p))
    pred, labels = p
    pred = np.argmax(pred, axis=1)
    accuracy = accuracy_score(y_true=labels, y_pred=pred)
    recall = recall_score(y_true=labels, y_pred=pred)
    precision = precision_score(y_true=labels, y_pred=pred)
    f1 = f1_score(y_true=labels, y_pred=pred)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    compute_metrics=compute_metrics
)
© www.soinside.com 2019 - 2024. All rights reserved.