Issues with responses from Python using a Hugging Face LLM

Question

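So far, the first response is always the provided persona. After that, once the loop repeats, it gives random, jumbled responses and keeps going until it reaches the set limit. When it hits the limit, it acts as if that is where it was supposed to end. Example: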

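    You: hi rikka (my input)
    Rikka's Persona: a childish, immature girl
    You: hi rikka (my input)
    Rikka: You: hi rikka
    Rikka: hi
    Tony: I'm not here to fight with you
    Rikka: lol
    Tony: I'm here because you're being too mean to me
    Rikka: what
    Tony: you're being too mean to me
    Rikka: you're a bad person
    Tony: I'm not mad
    Tony: you're a bad person
    Tony: you're a bad person
    You: (prompts for input)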

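I should be getting a casual reply, as if I were chatting with a friend. I have tried setting response limits and using 5 different models (adjusting the code for each one), and I get the same result.

This is my current code. I'm new to this, so please forgive me if I'm doing something wrong.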

```python
from transformers import GPT2Tokenizer, GPTNeoForCausalLM

# Load tokenizer and model
tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-1.3B")
model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-1.3B")

# Set up persona and character name
character_name = "Rikka"
persona_description = "a childish, immature girl"

# Flag to indicate if persona has been introduced
persona_introduced = False

while True:
    # Get user input
    user_input = input("You: ")

    if not persona_introduced:
        # Introduce persona for the first user input
        print(f"{character_name}'s Persona: {persona_description}")
        persona_introduced = True
    else:
        # Use user input in dialogue history
        input_prompt = f"You: {user_input}\n{character_name}:"

        # Tokenize the input
        input_ids = tokenizer.encode(input_prompt, return_tensors="pt")

        # Generate the response
        output = model.generate(
            input_ids,
            max_length=100,
            pad_token_id=tokenizer.eos_token_id,
            num_return_sequences=1,
            do_sample=True,
            top_k=50,
            top_p=0.95,
            temperature=0.7,
        )

        # Decode and print the response
        response = tokenizer.decode(output[0], skip_special_tokens=True).strip()
        print(f"{character_name}: {response}")
```

Answer 1

I ran into the same challenge. You have to tune the right parameters to get a proper response. This is how I set mine up:

    # Tokenize the prompt and generate response
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output = model.generate(
        input_ids,
        pad_token_id=tokenizer.eos_token_id,
        max_new_tokens=40,
        num_beams=5,
        no_repeat_ngram_size=2,
        num_return_sequences=5,
        early_stopping=True,
        do_sample=True,
        top_k=0,
    )
    # Slice off the prompt tokens so only the newly generated text is decoded
    generated_text = tokenizer.decode(output[:, input_ids.shape[-1]:][0], skip_special_tokens=True)

You can find the parameters for gpt2 here: https://huggingface.co/docs/transformers/main_classes/text_generation

More reading to help you get started: https://huggingface.co/blog/how-to-generate
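Beyond the generation parameters, it may also help to put the persona into the prompt itself and to cut the reply off where the model starts writing the next speaker's turn. This is only a minimal sketch of that idea, not something taken from the question or from the library: `build_prompt`, `extract_reply` and the `SPEAKER_LABEL` pattern are hypothetical helpers, and the "single word followed by a colon" rule for spotting a speaker label is an assumption about the dialogue format.

```python
import re

# Assumption: a new speaker turn looks like one word followed by a colon,
# for example "Tony:" or "You:". This is a heuristic, not a library feature.
SPEAKER_LABEL = re.compile(r"\s*\w{1,20}:")


def build_prompt(persona_line, history, user_input, character_name="Rikka"):
    """Join the persona line, earlier turns and the new user turn into one prompt."""
    lines = [persona_line]
    lines.extend(history)
    lines.append(f"You: {user_input}")
    lines.append(f"{character_name}:")
    return "\n".join(lines)


def extract_reply(generated_text):
    """Keep the character's reply and drop any later turns the model invented."""
    kept = []
    for index, line in enumerate(generated_text.strip().splitlines()):
        # The first line is always kept; later lines that start with a
        # speaker label mark the point where the reply ends.
        if index > 0 and SPEAKER_LABEL.match(line):
            break
        kept.append(line)
    return "\n".join(kept).strip()
```

Here `generated_text` is meant to be the decoded text with the prompt tokens already sliced off, as in the snippet above. The string returned by `build_prompt` would be the `prompt` passed to `tokenizer.encode`, and the cleaned reply could be appended to `history` as `f"Rikka: {reply}"` before the next turn.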
