botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateModel operation: Could not access model data

Question (votes: 0, answers: 2)

I want to deploy an MLflow image that contains a machine learning model to an AWS SageMaker endpoint. I ran the following code, which I found in this blog post.

import mlflow.sagemaker as mfs

experiment_id = "<experiment-id>" # placeholder: ID of the MLflow experiment that holds the run
run_id = "<run-id>" # placeholder: the run you want to deploy - this run_id was saved when we trained our model
region = "us-east-1" # region of your account
aws_id = "XXXXXXXXXXX" # from the aws-cli output
arn = "arn:aws:iam::XXXXXXXXXXX:role/your-role" # execution role that SageMaker will assume
app_name = "iris-rf-1"
model_uri = "mlruns/%s/%s/artifacts/random-forest-model" % (experiment_id, run_id) # edit this path based on your working directory
image_url = aws_id + ".dkr.ecr." + region + ".amazonaws.com/mlflow-pyfunc:1.2.0" # change to your mlflow version

mfs.deploy(app_name=app_name, 
           model_uri=model_uri, 
           region_name=region, 
           mode="create",
           execution_role_arn=arn,
           image_url=image_url)

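However, I get the following error. I checked all the policies and permissions attached to the IAM role, and they all appear to satisfy what the error message complains about.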

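botocore.exceptions.ClientError: An error occurred (ValidationException) when calling the CreateModel operation: Could not access model data at https://s3.amazonaws.com/mlflow-sagemaker-us-east-1-xxx/mlflow-xgb-demo-model-eqktjeoit5mxhmjn-abpanw/model.tar.gz. Please ensure that the role "arn:aws:iam::xxx:role/mlflow-sagemaker-dev" exists and that its trust relationship policy allows the action "sts:AssumeRole" for the service principal "sagemaker.amazonaws.com". Also ensure that the role has "s3:GetObject" permissions and that the object is located in us-east-1.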
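The first condition in that message concerns the role's trust policy, not its permission policies, so it is easy to miss when reviewing attached policies. Below is a minimal sketch of how that condition could be checked. The helper names (`trust_policy_allows_sagemaker`, `fetch_trust_policy`) are hypothetical and not part of MLflow or boto3, and the check is simplified: it only looks at `Principal.Service` entries in Allow statements and ignores conditions and wildcard principals.

```python
SAGEMAKER_PRINCIPAL = "sagemaker.amazonaws.com"


def _as_list(value):
    """IAM policy fields may hold a single string or a list of strings."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def trust_policy_allows_sagemaker(policy):
    """Return True if an Allow statement lets SageMaker call sts:AssumeRole."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        services = _as_list(principal.get("Service")) if isinstance(principal, dict) else []
        actions = _as_list(stmt.get("Action"))
        if SAGEMAKER_PRINCIPAL in services and "sts:AssumeRole" in actions:
            return True
    return False


def fetch_trust_policy(role_name):
    """Read the role's trust policy from IAM (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the check above works without boto3 installed

    role = boto3.client("iam").get_role(RoleName=role_name)["Role"]
    return role["AssumeRolePolicyDocument"]
```

With credentials configured, `trust_policy_allows_sagemaker(fetch_trust_policy("mlflow-sagemaker-dev"))` would report whether the role named in the error message can be assumed by SageMaker.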

How can I fix this?

amazon-web-services amazon-sagemaker mlflow
2 Answers
1
vote

I found the root cause. I had to go to the "Trust relationships" section of the IAM role and add "sagemaker.amazonaws.com" as a service principal.
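The same change can be made programmatically instead of through the console. The following is a sketch only, assuming boto3 is installed and the caller is allowed to read and update the role. The names `with_sagemaker_principal` and `add_sagemaker_to_role` are hypothetical helpers. Because `update_assume_role_policy` replaces the whole trust policy, the sketch appends a statement to the existing document rather than overwriting it.

```python
import copy
import json

# Statement that lets the SageMaker service assume the role.
SAGEMAKER_STATEMENT = {
    "Effect": "Allow",
    "Principal": {"Service": "sagemaker.amazonaws.com"},
    "Action": "sts:AssumeRole",
}


def with_sagemaker_principal(policy):
    """Return a copy of the trust policy with the SageMaker statement appended."""
    updated = copy.deepcopy(policy)  # leave the caller's document untouched
    statements = updated.get("Statement", [])
    if isinstance(statements, dict):  # normalize a single unwrapped statement
        statements = [statements]
    if SAGEMAKER_STATEMENT not in statements:
        statements.append(copy.deepcopy(SAGEMAKER_STATEMENT))
    updated["Statement"] = statements
    updated.setdefault("Version", "2012-10-17")
    return updated


def add_sagemaker_to_role(role_name):
    """Apply the change in IAM (needs boto3, credentials, and IAM write access)."""
    import boto3  # imported lazily so the pure helper above runs without boto3

    iam = boto3.client("iam")
    current = iam.get_role(RoleName=role_name)["Role"]["AssumeRolePolicyDocument"]
    iam.update_assume_role_policy(
        RoleName=role_name,
        PolicyDocument=json.dumps(with_sagemaker_principal(current)),
    )
```

For the role in the question this would be called as `add_sagemaker_to_role("mlflow-sagemaker-dev")`. Running it a second time leaves the policy unchanged, since the statement is only appended when it is not already present.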


0
votes

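Just open the CloudFormation service and check the stack name; that should resolve the error. This error occurs when you give the stack a name that differs from the project name. By default, the commands in the README.md file assume the stack is named after the project.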
