Quantizing to 4-bit and 8-bit - error with "quantization_config"

Question (0 votes, 1 answer)

I am using model = 'filipealmeida/Mistral-7B-Instruct-v0.1-sharded' and quantizing it to 4-bit with the following function.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

def load_quantized_model(model_name: str):
    """
    :param model_name: Name or path of the model to be loaded.
    :return: Loaded quantized model.
    """
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16
    )

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        load_in_4bit=True,
        torch_dtype=torch.bfloat16,
        quantization_config=bnb_config
    )

    return model

When I load the model, I get the following error message:

ValueError                                Traceback (most recent call last)
Cell In[12], line 1
----> 1 model = load_quantized_model(model_name)

Cell In[10], line 13
      2 """
      3 :param model_name: Name or path of the model to be loaded.
      4 :return: Loaded quantized model.
      5 """
      6 bnb_config = BitsAndBytesConfig(
      7     load_in_4bit=True,
      8     bnb_4bit_use_double_quant=True,
      9     bnb_4bit_quant_type="nf4",
     10     bnb_4bit_compute_dtype=torch.bfloat16
     11 )
---> 13 model = AutoModelForCausalLM.from_pretrained(
     14     model_name,
     15     load_in_4bit=True,
     16     torch_dtype=torch.bfloat16,
     17     quantization_config=bnb_config
     18 )
     20 return model

File ~/miniconda3/envs/peft/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py:563, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
...
   2981         )
   2983     # preparing BitsAndBytesConfig from kwargs
   2984     config_dict = {k: v for k, v in kwargs.items() if k in inspect.signature(BitsAndBytesConfig).parameters}

ValueError: You can't pass `load_in_4bit`or `load_in_8bit` as a kwarg when passing `quantization_config` argument at the same time.

I looked through _BaseAutoModelClass.from_pretrained but couldn't find where "8_bit" is being set. What do I need to do to load the model in 4-bit correctly?

I tried changing bnb_config to use 8-bit instead, but that didn't solve the problem.
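The ValueError itself points at the likely cause: 4-bit loading is already requested inside BitsAndBytesConfig, so repeating load_in_4bit=True as a from_pretrained kwarg creates the conflict. A minimal sketch of the corrected function, dropping only the duplicated kwarg (untested; for 8-bit, set load_in_8bit=True inside BitsAndBytesConfig instead):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

def load_quantized_model(model_name: str):
    """
    :param model_name: Name or path of the model to be loaded.
    :return: Loaded quantized model.
    """
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                       # quantization is requested here only
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16
    )

    # Do not repeat load_in_4bit here; quantization_config already carries it
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,
        quantization_config=bnb_config
    )

    return model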

Tags: gpu, local, large-language-model, quantization, 8-bit
1 Answer

0 votes

I ran into the same error. I fine-tuned on a dataset using AutoTrain, and now I want to load its adapter with the Mistral AI model, but I still can't do it. I can't run inference, and I get the same error: ValueError: You can't pass `load_in_4bit` or `load_in_8bit` as a kwarg when passing `quantization_config` argument at the same time.

If you find any solution, please let me know.
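For the adapter case described above, the same rule should apply: quantize the base model via quantization_config only, then attach the fine-tuned adapter with PEFT. A rough sketch under that assumption (the adapter path below is a placeholder, not from the original post):

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Load the quantized base model; no load_in_4bit/load_in_8bit kwarg here
base_model = AutoModelForCausalLM.from_pretrained(
    "filipealmeida/Mistral-7B-Instruct-v0.1-sharded",
    quantization_config=bnb_config
)

# Attach the AutoTrain-produced adapter on top of the quantized base
model = PeftModel.from_pretrained(base_model, "path/to/adapter")  # placeholder path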
