I'm trying to launch a SageMaker pipeline, but the container cannot find the .py script it is supposed to run.
Basic setup:
A script named `pipeline_test.py` that contains only `print("Hello World")`.
A launcher script that runs `pipeline_test.py` (code below).
The launcher script:
import json
import os
from sagemaker import image_uris
from sagemaker.processing import (
    Processor,
    ScriptProcessor,
    ProcessingInput,
    ProcessingOutput,
)
from sagemaker.session import Session
session = Session()
# placeholders: the real values are redacted
os.environ["IMAGE_URI"] = "<value>"
os.environ["ROLE"] = "<value>"
os.environ["BUCKET_NAME"] = "<value>"
# upload code to s3
code_input_path = (
    f"s3://{os.environ['BUCKET_NAME']}/pipeline_test/pipeline_test.py"
)
# output data in s3
data_output_path = f"s3://{os.environ['BUCKET_NAME']}/pipeline_test"
session.upload_data(
    bucket=os.environ["BUCKET_NAME"],
    key_prefix="pipeline_test",
    path="/home/sagemaker-user/data_science/pipeline_test.py",
)
# sagemaker container paths
container_base_path = "/opt/ml/processing"
def test_base_processor():
    # handle amazon sagemaker processing tasks
    processor = Processor(
        role=os.environ["ROLE"],
        image_uri=os.environ["IMAGE_URI"],
        instance_count=1,
        instance_type="ml.t3.medium",
        entrypoint=[
            "python",
            f"{container_base_path}/pipeline_test.py",  # I also tried /input/pipeline_test.py
            "--processor=base-processor",
        ],
    )
    processor.run(
        job_name="processor-test-5",
        outputs=[
            ProcessingOutput(
                source=f"{container_base_path}/output/result",
                destination=f"{data_output_path}/result",
                output_name="test_result",
            ),
        ],
    )
test_base_processor()
Unfortunately, the pipeline fails, and when I check the CloudWatch logs I see the following error:
python: can't open file '/opt/ml/processing/pipeline_test.py': [Errno 2] No such file or directory
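To narrow down an error like this, it can help to see which files actually exist under the container base path when the job starts. The following is only a minimal diagnostic sketch: `list_files` is a hypothetical helper, not part of the SageMaker SDK, and it assumes the snippet is somehow available inside the image (for example copied in by the Dockerfile), which the setup above does not do.

```python
import os


def list_files(root):
    """Return sorted paths of all files under root, relative to root."""
    found = []
    # os.walk silently yields nothing if root does not exist
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)


if __name__ == "__main__":
    # Assumption: same base path as container_base_path in the launcher script
    root = "/opt/ml/processing"
    if os.path.isdir(root):
        for path in list_files(root):
            print(path)
    else:
        print(f"{root} does not exist in this environment")
```

If `pipeline_test.py` does not appear in the listing, the script was never placed in the container, so the problem would not be the exact spelling of the path in `entrypoint`.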
Here is the Dockerfile:
# Use Python 3.10.14 base image
FROM --platform=linux/amd64 python:3.10.14 as build
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PYTHONIOENCODING=UTF-8 \
    LANG=C.UTF-8 \
    LC_ALL=C.UTF-8
# Install system dependencies
RUN apt-get update && \
    apt-get install -y \
    gcc \
    libpq-dev \
    libffi-dev
# Copy requirements.txt
COPY requirements.txt .
# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt
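Things I have checked:
`pipeline_test.py` is successfully pushed to S3.
I tried multiple variations of `container_base_path`: adding the `/processing/` directory, removing it, and so on.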
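The S3 check can be made explicit by comparing the key the upload should produce with the URI stored in `code_input_path`. This is a minimal sketch, assuming `Session.upload_data` stores a single file under `{key_prefix}/{filename}`; `expected_s3_uri` is a hypothetical helper and `my-bucket` is a made-up bucket name.

```python
import posixpath


def expected_s3_uri(bucket, key_prefix, local_path):
    """Build the S3 URI for a single file uploaded under key_prefix."""
    # Only the file name is kept; the local directory is not part of the key
    filename = posixpath.basename(local_path)
    return f"s3://{bucket}/{key_prefix}/{filename}"


# Hypothetical bucket name; the prefix and path are taken from the launcher script
uri = expected_s3_uri(
    "my-bucket",
    "pipeline_test",
    "/home/sagemaker-user/data_science/pipeline_test.py",
)
print(uri)  # s3://my-bucket/pipeline_test/pipeline_test.py
```

A matching URI only confirms that the file is in S3. It says nothing about whether the file is copied into the container.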