Quantizing to 4-bit and 8-bit - error with "quantization_config"

Question (0 votes, 1 answer)

I am using model = 'filipealmeida/Mistral-7B-Instruct-v0.1-sharded' and quantizing it to 4-bit with the following function.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

def load_quantized_model(model_name: str):
    """
    :param model_name: Name or path of the model to be loaded.
    :return: Loaded quantized model.
    """
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        load_in_4bit=True,
        torch_dtype=torch.bfloat16,
        quantization_config=bnb_config
    )

    return model

When I load the model, I get the following error message:

ValueError                                Traceback (most recent call last)
Cell In[12], line 1
----> 1 model = load_quantized_model(model_name)

Cell In[10], line 13
      2 """
      3 :param model_name: Name or path of the model to be loaded.
      4 :return: Loaded quantized model.
      5 """
      6 bnb_config = BitsAndBytesConfig(
      7     load_in_4bit=True,
      8     bnb_4bit_use_double_quant=True,
      9     bnb_4bit_quant_type="nf4",
     10     bnb_4bit_compute_dtype=torch.bfloat16
     11 )
---> 13 model = AutoModelForCausalLM.from_pretrained(
     14     model_name,
     15     load_in_4bit=True,
     16     torch_dtype=torch.bfloat16,
     17     quantization_config=bnb_config
     18 )
     20 return model

File ~/miniconda3/envs/peft/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py:563, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
...
   2981         )
   2983     # preparing BitsAndBytesConfig from kwargs
   2984     config_dict = {k: v for k, v in kwargs.items() if k in inspect.signature(BitsAndBytesConfig).parameters}

ValueError: You can't pass `load_in_4bit`or `load_in_8bit` as a kwarg when passing `quantization_config` argument at the same time.

I looked through _BaseAutoModelClass.from_pretrained but couldn't find where "8_bit" is being set. What do I need to do to load the model in 4-bit correctly?

I tried changing bnb_config to use 8-bit instead, but that didn't solve the problem.
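The ValueError itself points at the likely cause: 4-bit loading is already requested inside BitsAndBytesConfig, so repeating load_in_4bit=True as a from_pretrained kwarg creates the conflict. A minimal sketch of the corrected function, dropping only the duplicated kwarg (untested; for 8-bit, set load_in_8bit=True inside BitsAndBytesConfig instead):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

def load_quantized_model(model_name: str):
    """
    :param model_name: Name or path of the model to be loaded.
    :return: Loaded quantized model.
    """
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                       # quantization is requested here only
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16
    )

    # Do not repeat load_in_4bit here; quantization_config already carries it
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,
        quantization_config=bnb_config
    )

    return model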

Tags: gpu, local, large-language-model, quantization, 8-bit
1 Answer

0 votes

I ran into the same error. I fine-tuned on a dataset using AutoTrain, and now I want to load its adapter with the Mistral AI model, but I still can't do it. I can't run inference, and I get the same error: ValueError: You can't pass `load_in_4bit` or `load_in_8bit` as a kwarg when passing `quantization_config` argument at the same time.

If you find any solution, please let me know.
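For the adapter case described above, the same rule should apply: quantize the base model via quantization_config only, then attach the fine-tuned adapter with PEFT. A rough sketch under that assumption (the adapter path below is a placeholder, not from the original post):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load the quantized base model; no load_in_4bit/load_in_8bit kwarg here
base_model = AutoModelForCausalLM.from_pretrained(
    "filipealmeida/Mistral-7B-Instruct-v0.1-sharded",
    quantization_config=bnb_config
)

# Attach the AutoTrain-produced adapter on top of the quantized base
model = PeftModel.from_pretrained(base_model, "path/to/adapter")  # placeholder path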
