Getting predictions with DataCollatorForCompletionOnlyLM after fine-tuning Llama 2 with the SFT trainer

Problem description

I am fine-tuning Llama 2 with the SFT trainer, using LoRA together with quantization. My dataset consists of questions structured like this:

<s>[INST] 
<<SYS>> Please select the correct answer from the given multiple Options based on the given Context: <</SYS>>  
 Context: Abrasion is another type of mechanical weathering. With abrasion, one rock bumps against another rock. Gravity causes abrasion as a rock tumbles down a slope. Moving water causes abrasion as it moves rocks so that they bump against one another (Figure 9.3). Strong winds cause abrasion by blasting sand against rock surfaces.  
 Question: Gravity causes erosion by all of the following except  
 Options:(A) glaciers (B) moving air (C) flowing water (D) mass movement  
 Answer: [/INST] D </s>
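
For context, this is roughly how my training setup looks (a minimal sketch, assuming TRL's SFTTrainer and PEFT as of late 2023; the model id, hyperparameters, and the one-row train_dataset are placeholders rather than my actual code):

import torch
from datasets import Dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from trl import SFTTrainer

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder: any Llama-2 checkpoint

# 4-bit quantization (QLoRA-style), matching "LoRA together with quantization"
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # llama-2 has no pad token by default

model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# Placeholder one-row dataset in the prompt format shown above
train_dataset = Dataset.from_dict(
    {"text": ["<s>[INST] <<SYS>> ... <</SYS>> ... Answer: [/INST] D </s>"]}
)

trainer = SFTTrainer(
    model=model,
    train_dataset=train_dataset,
    dataset_text_field="text",
    peft_config=peft_config,
    tokenizer=tokenizer,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=1),
    # data_collator=collator would go here once the collator below works
)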

I am currently using DataCollatorForCompletionOnlyLM so that the loss is computed only on the predicted answer.

Given my instruction structure, should I provide the context, question, and options as the instruction template, like this?

instruction_template = "</SYS>>\n\n Context:"
response_template = "Answer: [/INST]"
collator = DataCollatorForCompletionOnlyLM(instruction_template=instruction_template, response_template=response_template, tokenizer=tokenizer, mlm=False)

Or, using only a response template:

response_template = "Answer: [/INST]"
collator = DataCollatorForCompletionOnlyLM(response_template=response_template, tokenizer=tokenizer, mlm=False)

I have tried several response templates, but I always get this error:

RuntimeError: Could not find response key [835, 4007, 22137, 29901] in token IDs tensor([ 1, 835, ...])
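
One thing I noticed while debugging: the ids in the error message are whatever key the collator is actually searching for, so decoding them shows which template string it expects (a quick check, using the same tokenizer as above):

# Decode the response-key ids from the error to see which template
# string the collator is actually looking for.
print(tokenizer.decode([835, 4007, 22137, 29901]))

If that does not decode to the intended template string, the collator was constructed with a different template than expected.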

Could you point me toward the correct template?

python pytorch nlp huggingface-transformers huggingface-trainer
1 Answer

Unlike most other tokenizers, the llama-2 tokenizer is context-dependent: the same substring can map to different token ids depending on what precedes it. For example:

sent-1: """### User: Hello\n\n### Assistant: Hi, how can I help you?"""
sent-2: "### Assistant:"

Here the token ids for ### Assistant: will differ between the two, because in sent-1 it is preceded by \n (context), while in sent-2 it is not.
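
You can verify this directly; a minimal check (the model id is an assumption, any llama-2 tokenizer will do):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# The same substring is tokenized differently with and without
# preceding context, so the two id lists below will not match:
print(tok.encode("### Assistant:", add_special_tokens=False))
print(tok.encode("\n### Assistant:", add_special_tokens=False))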

Solution: tokenize the response template with the same context it has in your training texts, and pass the resulting token ids to the collator. The Hugging Face TRL documentation explains this exact problem in detail.
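
Applied to the template in the question, that means encoding "Answer: [/INST]" together with some of its preceding context and passing token ids instead of a string, following the pattern from the TRL docs (a sketch; the exact context string and the [2:] slice that strips the context tokens depend on your data, so inspect the ids before relying on them):

from trl import DataCollatorForCompletionOnlyLM

# Encode the template WITH the context it has in the training texts
# (here assumed to be a preceding newline), then drop the leading
# context tokens so only the template's own ids remain.
response_template_with_context = "\n Answer: [/INST]"
response_template_ids = tokenizer.encode(response_template_with_context, add_special_tokens=False)[2:]

collator = DataCollatorForCompletionOnlyLM(response_template=response_template_ids, tokenizer=tokenizer, mlm=False)

Pass this collator to SFTTrainer via data_collator=collator, and keep packing disabled: completion-only masking does not work with packed sequences.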
