I'm trying to deploy a model to a SageMaker endpoint using a custom Dockerfile:
ARG REGION=us-east-1
FROM 763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:2.0.1-gpu-py310-cu118-ubuntu20.04-sagemaker
RUN pip install poetry
RUN poetry config virtualenvs.create false
WORKDIR /opt/
RUN poetry new code --name models
WORKDIR /opt/code/
RUN poetry add json-lines sagemaker-inference
ADD tuta models/tuta
ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/code
ENV SAGEMAKER_PROGRAM models/tuta/sm_inference.py
models/tuta contains several model files (layers, metrics, ...) as well as the sm_inference.py file:
from models.tuta.inference import TUTAForCTC
import json
import os

JSON_CONTENT_TYPE = 'application/json'


def model_fn(model_dir):
    print("loading the model!")
    model = TUTAForCTC(
        model_bin=os.path.join(model_dir, "tuta-ctc.bin"),
        model_config_path=os.path.join(model_dir, "config.json"),
    )
    print("model loaded!")
    return model


def predict_fn(data, model):
    print("predicting...")
    return {"response": data}
    # return model.predict(data['hier_table'], data['flat_table'], data['table_range'])


def input_fn(serialized_input_data, content_type=JSON_CONTENT_TYPE):
    print("reading input...")
    return json.loads(serialized_input_data)


def output_fn(prediction, content_type):
    return prediction
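The endpoint deploys successfully and shows the InService status, and a ping returns a 200 response. But as soon as I send a request, I get an error, and the ping response becomes 500.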
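To check whether the handler functions themselves are at fault, they can be exercised outside the container. Below is a minimal local sketch, not what actually runs in the endpoint. It calls the handlers in the order input_fn, predict_fn, output_fn. Assumptions, none of them from my real setup: the model is replaced by a placeholder object (predict_fn currently ignores it anyway), the sample payload values are made up, and output_fn is changed to return a JSON string instead of the raw dict, which may or may not be related to the error.

```python
import json

JSON_CONTENT_TYPE = "application/json"


def input_fn(serialized_input_data, content_type=JSON_CONTENT_TYPE):
    # Same as the handler in sm_inference.py: decode the JSON request body.
    return json.loads(serialized_input_data)


def predict_fn(data, model):
    # Same echo behaviour as the handler in sm_inference.py.
    return {"response": data}


def output_fn(prediction, content_type=JSON_CONTENT_TYPE):
    # Assumption: serialize to a JSON string rather than returning the dict.
    return json.dumps(prediction)


def run_chain(body, model):
    """Call the handlers in the order input_fn -> predict_fn -> output_fn."""
    data = input_fn(body, JSON_CONTENT_TYPE)
    prediction = predict_fn(data, model)
    return output_fn(prediction, JSON_CONTENT_TYPE)


# Hypothetical payload; the keys mirror the commented-out predict call.
request_body = json.dumps(
    {"hier_table": [], "flat_table": [], "table_range": "A1:B2"}
)

# Placeholder standing in for TUTAForCTC so no weights are needed.
placeholder_model = object()

response_body = run_chain(request_body, placeholder_model)
print(response_body)
```

If this runs cleanly on my machine but the endpoint still fails, the problem is more likely in the container setup (imports, paths, model loading) than in the handler logic.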