我正在使用 SFT 训练器微调 Llama2 并使用 Lora 进行量化。 我的数据集由结构如下的问题组成:
<s>[INST]
<<SYS>> Please select the correct answer from the given multiple Options based on the given Context: <</SYS>>
Context: Abrasion is another type of mechanical weathering. With abrasion, one rock bumps against another rock. Gravity causes abrasion as a rock tumbles down a slope. Moving water causes abrasion it moves rocks so that they bump against one another (Figure 9.3). Strong winds cause abrasion by blasting sand against rock surfaces.
Question: Gravity causes erosion by all of the following except \
Options:(A) glaciers (B) moving air (C) flowing water (D) mass movement
Answer: [/INST] D </s>
我目前正在使用 DataCollatorForCompletionOnlyLM 根据预测答案计算损失。 就我而言,这是指令结构:
我是否提供上下文、问题和选项作为说明模板?
instruction_template = "</SYS>>\n\n Context:" response_template = "Answer: [/INST]" collator = DataCollatorForCompletionOnlyLM(instruction_template=instruction_template, response_template=response_template, tokenizer=tokenizer, mlm=False)
或
response_template = "Answer: [/INST]" collator = DataCollatorForCompletionOnlyLM(response_template=response_template, tokenizer=tokenizer, mlm=False)
我尝试了多个响应模板,但总是收到错误: 运行时错误:在令牌 ID 张量([ 1, 835, ...])中找不到响应键 [835, 4007, 22137, 29901]
您能指导我找到正确的吗?
与大多数其他分词器不同,llama-2 分词器取决于上下文。例如-
sent-1: """### User: Hello\n\n### Assistant: Hi, how can I help you?"""
sent-2: "### Assistant:"
此处
### Assistant:
的 token-id 将有所不同,因为 sent-1 将 \n
作为上下文,而 sent-2 则没有。
解决方案: 使用相同上下文进行标记是最终解决方案。 这里 Huggigface 正在详细解释类似的问题。