Llama 3 Instruct 8B hallucinates even though I'm using the correct prompt format


I'm running a meta-llama/Meta-Llama-3-8B-Instruct endpoint on AWS, and for some reason I can't get sensible output when prompting the model. Even when I send a simple prompt, it hallucinates. Can someone tell me what I'm doing wrong?

Example prompt:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. 
Please ensure that your responses are socially unbiased and positive in nature.
If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.<|eot_id|><|start_header_id|>user<|end_header_id|>

Please tell me a joke<|eot_id|><|start_header_id|>assistant<|end_header_id|>
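
For reference, this is the same string the model's own chat template renders; a minimal sketch using the transformers tokenizer (assumes access to the gated repo, and the system prompt is shortened here):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [
    # system prompt abbreviated from the full text above
    {"role": "system", "content": "You are a helpful, respectful and honest assistant."},
    {"role": "user", "content": "Please tell me a joke"},
]

# add_generation_prompt=True appends the trailing
# <|start_header_id|>assistant<|end_header_id|> header, as in the prompt above
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)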

The answer I get:

I'd be happy to share a joke with you! Here's one:

Why couldn't the bicycle stand up by itself?

Because it was two-tired!

I hope that brought a smile to your face! Do you want to hear another one?assistant

I'd be happy to share another one! Here's one:

What do you call a fake noodle?

An impasta!

I hope that one made you laugh! Do you want to hear another one?assistant

Here's another one:

Why don't scientists trust atoms?

Because they make up everything!

I hope that one was atom-ically funny! Do you want to hear another one?assistant

Here's another one:

Why don't eggs tell jokes?

Because they'd crack each other up!

I hope that one cracked you up! Do you want to hear another one?assistant 

(and it keeps going like this for a while).

Deployment:

This is how I deploy it:

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()

# environment for the Hugging Face TGI container
hub = {
    "HF_MODEL_ID": "meta-llama/Meta-Llama-3-8B-Instruct",
    "HF_AUTO_CAST_TYPE": "bf16",
    "HUGGING_FACE_HUB_TOKEN": "******",
}

llm_image = '763104351884.dkr.ecr.ap-southeast-2.amazonaws.com/huggingface-pytorch-tgi-inference:2.0.1-tgi1.0.3-gpu-py39-cu118-ubuntu20.04'
endpoint_name = 'data-science-llm-llama3-8b'

# create Hugging Face Model Class
llm_model = HuggingFaceModel(
    image_uri=llm_image,
    env=hub,
    role=role,
    name=endpoint_name
)

Model kwargs:

    model_kwargs: 
        temperature: 0.001
        do_sample: True
        max_new_tokens: 500
        typical_p: 0.2
        seed: 1
        use_cache: False 
        return_full_text: False
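
To show how these kwargs reach the endpoint, here is a minimal invocation sketch. The instance type is an assumption (the deploy call isn't shown in the original post), and use_cache is omitted since it isn't a standard TGI generate parameter:

# deploy (instance type is a guess; not shown in the original post)
llm = llm_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",
)

prompt = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    "Please tell me a joke<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

# TGI expects the sampling kwargs under "parameters"
response = llm.predict({
    "inputs": prompt,
    "parameters": {
        "temperature": 0.001,
        "do_sample": True,
        "max_new_tokens": 500,
        "typical_p": 0.2,
        "seed": 1,
        "return_full_text": False,
    },
})
print(response[0]["generated_text"])
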
Tags: python, amazon-sagemaker, llama
1 Answer

I had a similar issue where the model would try to keep refining its answer in a loop, perhaps as a result of training on "reason step by step"-type prompts?...

I never had this problem with Mixtral.

I believe it's related to the issue mentioned here, where the model is unable to stop generating: https://github.com/huggingface/text-generation-inference/issues/1781
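
If that's the cause, one workaround consistent with that thread is to pass Llama 3's end-of-turn token as an explicit stop sequence, since the tgi1.0.3 image in the question predates Llama 3 and doesn't treat <|eot_id|> as an EOS token. A minimal sketch with boto3 (endpoint name taken from the question):

import json
import boto3

smr = boto3.client("sagemaker-runtime")

prompt = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    "Please tell me a joke<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

response = smr.invoke_endpoint(
    EndpointName="data-science-llm-llama3-8b",
    ContentType="application/json",
    Body=json.dumps({
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": 500,
            "return_full_text": False,
            # stop at Llama 3's end-of-turn marker instead of running on
            # and generating the extra "assistant" turns seen above
            "stop": ["<|eot_id|>"],
        },
    }),
)
print(json.loads(response["Body"].read()))

Upgrading to a more recent TGI container should also fix this at the source, since newer versions pick up the stop token from the model's generation config on their own.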
