Why does GPT-J's behavior change with small variations in the text?

Question

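I am using GPT-J without fine-tuning to perform entity extraction from text. To keep things simple, I am restricting my experiments to a limited set of entities (person, organization, location, date).

I tried various prompts, each providing 3-4 examples.

For example: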

Text: Helena Smith founded Core.ai 2 years ago. She is now the CEO and CTO of the company and is building a team of highly skilled developers in machine learning and natural language processing.
Person: Helena Smith
Position: CEO and CTO
Email: Not specified
Location: Not specified
Phone: Not specified

### 

Text: Julian([email protected]) works as a barista at the starbucks located in Boston.
Person: Julian
Position: Barista
Email: [email protected]
Location: Boston
Phone: Not specified

### 

Text: Maxime is a data scientist at Auto Dataset, and he's been working there since 2016 at the San Francisco office. He can be reached at [email protected].


Output:
Person: Maxime
Organization: AutoDataset
Location: San Francisco
Year: 2018

###

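In the example above, the output is identified correctly, but if I change the year in the last example to something else, the output for Year becomes None.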
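To isolate the effect of the year, the prompt could be assembled programmatically so that only that one detail varies between runs. The following is a minimal sketch under assumptions: `format_example` and `build_prompt` are hypothetical helpers, the field list is taken from the first two examples, and the texts are shortened for illustration.

```python
# Hypothetical helpers for building few-shot prompts that differ in one detail only.
FIELDS = ["Person", "Position", "Email", "Location", "Phone"]
SEPARATOR = "###"

def format_example(text, fields=None):
    """Render one block; when fields is None, only the Text line is emitted (the query)."""
    lines = [f"Text: {text}"]
    if fields is not None:
        for name in FIELDS:
            lines.append(f"{name}: {fields.get(name, 'Not specified')}")
    return "\n".join(lines)

def build_prompt(examples, query_text):
    """Join the solved examples and the unsolved query with the ### separator."""
    blocks = [format_example(text, fields) for text, fields in examples]
    blocks.append(format_example(query_text))
    return f"\n\n{SEPARATOR}\n\n".join(blocks) + "\n"

# Shortened, illustrative data (not the full prompt from the question).
examples = [
    ("Helena Smith founded Core.ai 2 years ago.",
     {"Person": "Helena Smith", "Position": "CEO and CTO"}),
]
template = "Maxime has been working at Auto Dataset since {year}."

# One prompt per year; everything except the year is identical.
prompts = {year: build_prompt(examples, template.format(year=year))
           for year in (2016, 2017)}
```

Feeding each entry of `prompts` to the model and comparing the outputs would show whether the change in the Year field is tied to the year itself rather than to some other difference in the prompt.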

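My understanding is that this change in output somehow comes from all the text the model saw during training: sometimes it identifies the context correctly and sometimes it does not. But I am not sure whether changing only the year has anything to do with context. Can someone better explain what might be happening?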

I tried changing the top_p and temperature parameters to see whether there was an obvious explanation, but I did not see one.
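One possible reason these parameters showed no effect: the `generate` call in the code further down does not set `do_sample=True`, and by default transformers then decodes greedily, picking the highest-scoring token at each step. The toy sketch below is not the transformers implementation. It uses made-up logits and a generic nucleus filter to illustrate that the greedy pick is the same for any positive temperature and any top_p.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by a positive temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most-probable tokens whose cumulative mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return [p if i in kept else 0.0 for i, p in enumerate(probs)]

def greedy_choice(logits, temperature, top_p):
    """Index of the most probable token after temperature scaling and top-p filtering."""
    probs = top_p_filter(softmax(logits, temperature), top_p)
    return max(range(len(probs)), key=lambda i: probs[i])

logits = [2.0, 1.5, 0.3, -1.0]  # made-up scores for four candidate tokens
choices = {greedy_choice(logits, t, p)
           for t in (0.1, 0.7, 1.0, 2.0)
           for p in (0.5, 0.9, 1.0)}
print(choices)  # {0}: the same token wins for every setting
```

Dividing by a positive temperature preserves the ordering of the logits, and the nucleus filter always keeps the top token, so only sampling can make these settings visible in the output.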

Here is the code I am using:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# "gpt-j-6B" is the model path used in this setup
tokenizer = AutoTokenizer.from_pretrained("gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained("gpt-j-6B", low_cpu_mem_usage=True)

# input_text holds the few-shot prompt shown above
inputs = tokenizer(input_text, return_tensors="pt")
input_ids = inputs["input_ids"]

# do_sample is left at its default here
output = model.generate(
    input_ids,
    attention_mask=inputs["attention_mask"],
    max_new_tokens=100,
    temperature=0,
    repetition_penalty=1.2,
    top_p=0.9,
    # stop when the "###" separator is generated
    eos_token_id=tokenizer.convert_tokens_to_ids("###"),
)

# print only the generated part by removing the prompt from the decoded text
print(tokenizer.decode(output[0]).replace(input_text, ''))
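Since the generated text follows a `Key: value` layout terminated by `###`, it could be parsed into a dictionary so that runs with different prompts are easy to compare. This is a minimal sketch: `parse_completion` is a hypothetical helper, not part of transformers, and the sample string is modeled on the output shown earlier.

```python
def parse_completion(completion, separator="###"):
    """Parse 'Key: value' lines that appear before the first separator into a dict."""
    body = completion.split(separator, 1)[0]
    fields = {}
    for line in body.splitlines():
        key, sep, value = line.partition(":")
        # Skip blank lines and header lines such as "Output:" that carry no value.
        if sep and key.strip() and value.strip():
            fields[key.strip()] = value.strip()
    return fields

# Sample completion modeled on the output in the question.
sample = (
    "\nOutput:\n"
    "Person: Maxime\n"
    "Organization: AutoDataset\n"
    "Location: San Francisco\n"
    "Year: 2018\n"
    "\n###"
)
parsed = parse_completion(sample)
print(parsed["Year"])
```

The completion itself could be obtained by decoding only `output[0][input_ids.shape[1]:]`, which avoids relying on the decoded string matching `input_text` character for character.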
