我正在跑步:
#original training script
trainer = transformers.Trainer(
model=model,
train_dataset=train_dataset,
eval_dataset=test_dataset, #turn on the eval dataset for comparisons
args=transformers.TrainingArguments(
num_train_epochs=2,
per_device_train_batch_size=1,
gradient_accumulation_steps=1,
warmup_ratio=0.05,
max_steps=20,
learning_rate=2e-4,
fp16=True,
logging_steps=1,
output_dir="outputs",
optim="paged_adamw_8bit",
lr_scheduler_type='cosine',
),
data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False # silence the warnings. Please re-enable for inference!
我不是 100% 清楚,但我认为显示的损失是与训练数据集与评估数据集相比......
如何显示损失与评估(理想情况下还有训练集)?
我本以为添加 eval_dataset 就足够了......
您可以将compute_metrics函数添加到您的Trainer模块中
def compute_metrics(p):
print(type(p))
pred, labels = p
pred = np.argmax(pred, axis=1)
accuracy = accuracy_score(y_true=labels, y_pred=pred)
recall = recall_score(y_true=labels, y_pred=pred)
precision = precision_score(y_true=labels, y_pred=pred)
f1 = f1_score(y_true=labels, y_pred=pred)
return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
trainer = Trainer(
model=model,
args=args,
train_dataset=train_dataset,
eval_dataset=val_dataset,
compute_metrics=compute_metrics
)