SageMaker complains that /opt/ml/model does not appear to have a file named config.json


I am deploying a Hugging Face model with a custom pipeline to SageMaker. My model.tar.gz is structured as follows:

├── added_tokens.json
├── code
│   ├── inference.py
│   ├── pipeline.py
│   └── requirements.txt
├── config.json
├── generation_config.json
├── model-00001-of-00002.safetensors
├── model-00002-of-00002.safetensors
├── model.safetensors.index.json
├── special_tokens_map.json
├── tokenizer_config.json
├── tokenizer.json
└── tokenizer.model
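
For context, a minimal sketch of how an archive with this layout can be produced ("my_model" is a placeholder for the local directory holding the files above):

import os
import tarfile

# Packaging sketch: arcname=name keeps every entry at the archive ROOT
# ("config.json", not "my_model/config.json"). SageMaker extracts
# model.tar.gz straight into /opt/ml/model, so config.json must not be
# nested inside a subdirectory.
model_dir = "my_model"
with tarfile.open("model.tar.gz", "w:gz") as tar:
    for name in os.listdir(model_dir):
        tar.add(os.path.join(model_dir, name), arcname=name)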

I deployed the model with:
from sagemaker.huggingface.model import HuggingFaceModel

hub = {
   'HF_TASK':'text-generation'
}
huggingface_model = HuggingFaceModel(
   env=hub, 
   model_data="s3://my_model_bucket/model.tar.gz",
   role=role,
   transformers_version="4.28",
   pytorch_version="2.0",
   py_version='py310',
)

# deploy the endpoint
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge",
)
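
The invocation was roughly the following (the exact prompt is illustrative; the Hugging Face text-generation handler expects an "inputs" key in the JSON payload):

# Minimal invocation sketch; the prompt string is a placeholder
response = predictor.predict({
    "inputs": "Hello, my name is",
})
print(response)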

However, when I invoke the model, this is the response I get:

{
  "code": 400,
  "type": "InternalServerException",
  "message": "/opt/ml/model does not appear to have a file named config.json. Checkout \u0027https://huggingface.co//opt/ml/model/None\u0027 for available files."
}

The corresponding error in the worker logs is:

W-model-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - OSError: /opt/ml/model does not appear to have a file named config.json. Checkout 'https://huggingface.co//opt/ml/model/None' for available files.

But config.json is clearly in my model directory. Here is my inference.py code:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from pipeline import MyCustomPipeline

def model_fn(model_dir):
    # SageMaker calls model_fn once at startup; model_dir is /opt/ml/model,
    # i.e. wherever model.tar.gz was extracted.
    print("Loading model from: " + model_dir)
    tokenizer = AutoTokenizer.from_pretrained(
        model_dir,
        local_files_only=True,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_dir,
        local_files_only=True,
        device_map="auto",
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
    )
    # Return the pipeline itself: whatever model_fn returns is handed back to
    # transform_fn as its first argument. (Assigning to a local variable named
    # `pipeline` would leave a module-level `pipeline = None` untouched.)
    return MyCustomPipeline(model, tokenizer)

def transform_fn(model, input_data, content_type, accept):
    # `model` is the object returned by model_fn, i.e. the custom pipeline
    return model(input_data)

What am I doing wrong here? I believe I followed all the necessary steps for deploying a Hugging Face model on SageMaker.

python huggingface-transformers amazon-sagemaker huggingface
1 Answer

I ran into the same error. It turned out that the S3 path I had referenced was wrong.

Please double-check that you are referencing the correct model artifact by extracting the archive once: config.json must sit at its top level, because SageMaker extracts model.tar.gz directly into /opt/ml/model.

# Check the tar file contents to see whether it matches the layout you showed
!tar -xvf "model.tar.gz"
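
You can also verify the artifact programmatically; a minimal sketch (bucket and key mirror the question; adjust to your own):

import tarfile
import boto3

# Confirm the S3 object you passed as model_data actually exists
s3 = boto3.client("s3")
s3.head_object(Bucket="my_model_bucket", Key="model.tar.gz")

# Confirm config.json appears at the TOP LEVEL of the archive; if every entry
# is prefixed with a directory name, SageMaker extracts it into
# /opt/ml/model/<that_dir>/ and the loader will not find config.json.
with tarfile.open("model.tar.gz", "r:gz") as tar:
    print(tar.getnames())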
