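I am trying to automate the deployment of a SageMaker multi-model endpoint with the AWS CDK in Python (I assume the same would apply when writing the CloudFormation template directly in JSON/YAML), but the deployment fails when the SageMaker model is created.
Here is part of the CloudFormation template produced by the cdk synth command: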
Resources:
  smmodelexecutionrole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Statement:
          - Action: sts:AssumeRole
            Effect: Allow
            Principal:
              Service: sagemaker.amazonaws.com
        Version: "2012-10-17"
      Policies:
        - PolicyDocument:
            Statement:
              - Action: s3:GetObject
                Effect: Allow
                Resource:
                  Fn::Join:
                    - ""
                    - - "arn:"
                      - Ref: AWS::Partition
                      - :s3:::<bucket_name>/deploy_multi_model_artifact/*
            Version: "2012-10-17"
          PolicyName: policy_s3
        - PolicyDocument:
            Statement:
              - Action: ecr:*
                Effect: Allow
                Resource:
                  Fn::Join:
                    - ""
                    - - "arn:"
                      - Ref: AWS::Partition
                      - ":ecr:"
                      - Ref: AWS::Region
                      - ":"
                      - Ref: AWS::AccountId
                      - :repository/<my_ecr_repository>
            Version: "2012-10-17"
          PolicyName: policy_ecr
    Metadata:
      aws:cdk:path: <omitted>
  smmodel:
    Type: AWS::SageMaker::Model
    Properties:
      ExecutionRoleArn:
        Fn::GetAtt:
          - smmodelexecutionrole
          - Arn
      Containers:
        - Image: xxxxxxxxxxxx.dkr.ecr.<my_aws_region>.amazonaws.com/<my_ecr_repository>/multi-model:latest
          Mode: MultiModel
          ModelDataUrl: s3://<bucket_name>/deploy_multi_model_artifact/
      ModelName: MyModel
    Metadata:
      aws:cdk:path: <omitted>
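To make it easier to see exactly which resources the two inline policies cover, here is a minimal sketch that resolves the Fn::Join and Ref intrinsics from the template by hand. The resolve helper and the PSEUDO_PARAMETERS mapping are hypothetical names introduced for illustration, and the partition value "aws" is an assumption (the standard commercial partition); the other values are the same placeholders used in the template.

```python
# Hypothetical stand-ins for the CloudFormation pseudo-parameters.
# "aws" is an assumed partition; the rest are the placeholders from the template.
PSEUDO_PARAMETERS = {
    "AWS::Partition": "aws",
    "AWS::Region": "<my_aws_region>",
    "AWS::AccountId": "xxxxxxxxxxxx",
}


def resolve(node):
    """Resolve Ref (pseudo-parameters only) and Fn::Join in a template fragment."""
    if isinstance(node, dict):
        if "Ref" in node:
            return PSEUDO_PARAMETERS[node["Ref"]]
        if "Fn::Join" in node:
            delimiter, parts = node["Fn::Join"]
            return delimiter.join(resolve(part) for part in parts)
        raise ValueError(f"unsupported intrinsic: {list(node)}")
    # Plain strings are returned unchanged.
    return node


# Resource of policy_s3, copied from the template.
s3_resource = {
    "Fn::Join": [
        "",
        [
            "arn:",
            {"Ref": "AWS::Partition"},
            ":s3:::<bucket_name>/deploy_multi_model_artifact/*",
        ],
    ]
}

# Resource of policy_ecr, copied from the template.
ecr_resource = {
    "Fn::Join": [
        "",
        [
            "arn:",
            {"Ref": "AWS::Partition"},
            ":ecr:",
            {"Ref": "AWS::Region"},
            ":",
            {"Ref": "AWS::AccountId"},
            ":repository/<my_ecr_repository>",
        ],
    ]
}

print(resolve(s3_resource))
print(resolve(ecr_resource))
```

Under these assumptions, the S3 statement grants s3:GetObject only on objects below the deploy_multi_model_artifact/ prefix, and the ECR statement is scoped to the single repository.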
When I run cdk deploy in the terminal, the following error occurs:
3/6 | 7:56:58 PM | CREATE_FAILED | AWS::SageMaker::Model | sm_model (smmodel)
Could not access model data at s3://<bucket_name>/deploy_multi_model_artifact/.
Please ensure that the role "arn:aws:iam::xxxxxxxxxxxx:role/<my_role>" exists
and that its trust relationship policy allows the action "sts:AssumeRole" for the service principal "sagemaker.amazonaws.com".
Also ensure that the role has "s3:GetObject" permissions and that the object is located in <my_aws_region>.
(Service: AmazonSageMaker; Status Code: 400; Error Code: ValidationException; Request ID: xxxxx)
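What I have tried:
To test whether this is an IAM role problem, I replaced MultiModel with SingleModel and replaced s3://<bucket_name>/deploy_multi_model_artifact/ with s3://<bucket_name>/deploy_multi_model_artifact/one_of_my_artifacts.tar.gz. With those two changes the model is created successfully. That makes me suspect the problem is not related to IAM, contrary to what the error message says (but I may be wrong!).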
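As a compact view of that experiment, the sketch below compares the two container definitions and lists the fields that differ. The variable names are hypothetical and the values are the placeholders from the template; this only illustrates the comparison and is not the actual CDK code.

```python
# Container definition that fails, as it appears in the synthesized template.
multi_model_container = {
    "Image": "xxxxxxxxxxxx.dkr.ecr.<my_aws_region>.amazonaws.com/<my_ecr_repository>/multi-model:latest",
    "Mode": "MultiModel",
    # In MultiModel mode, ModelDataUrl points at an S3 prefix holding several artifacts.
    "ModelDataUrl": "s3://<bucket_name>/deploy_multi_model_artifact/",
}

# Variant that deploys successfully: same image, single artifact.
single_model_container = {
    **multi_model_container,
    "Mode": "SingleModel",
    # In SingleModel mode, ModelDataUrl points at one model archive.
    "ModelDataUrl": "s3://<bucket_name>/deploy_multi_model_artifact/one_of_my_artifacts.tar.gz",
}

# Collect every field whose value changed between the two variants.
changed = {
    key: (multi_model_container[key], single_model_container[key])
    for key in multi_model_container
    if multi_model_container[key] != single_model_container[key]
}

for key, (failing, working) in changed.items():
    print(f"{key}: {failing!r} -> {working!r}")
```

The image, execution role, and bucket are identical in both variants. Only the mode and the shape of the S3 location (prefix versus single object) change.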
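So I would like to know where the problem lies. What makes this even more confusing is that I have already deployed this multi-model endpoint with boto3 without any problem.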
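For comparison, a boto3 call that mirrors the template might look roughly like the sketch below. This is an assumption about the shape of such a call, not the exact script that was used: build_create_model_request and create_model are hypothetical helpers, and the arguments are the placeholders from the template. The underlying API is the SageMaker client's create_model method.

```python
def build_create_model_request(role_arn, image_uri, model_data_prefix, model_name="MyModel"):
    """Build keyword arguments for SageMaker CreateModel that mirror the template."""
    return {
        "ModelName": model_name,
        "ExecutionRoleArn": role_arn,
        "Containers": [
            {
                "Image": image_uri,
                "Mode": "MultiModel",
                "ModelDataUrl": model_data_prefix,
            }
        ],
    }


def create_model(request):
    """Send the request to SageMaker (needs boto3 and valid AWS credentials)."""
    # Imported lazily so the request can be built and inspected without boto3.
    import boto3

    client = boto3.client("sagemaker")
    return client.create_model(**request)


request = build_create_model_request(
    role_arn="arn:aws:iam::xxxxxxxxxxxx:role/<my_role>",
    image_uri="xxxxxxxxxxxx.dkr.ecr.<my_aws_region>.amazonaws.com/<my_ecr_repository>/multi-model:latest",
    model_data_prefix="s3://<bucket_name>/deploy_multi_model_artifact/",
)
print(request)
# create_model(request)  # uncomment to actually call the API
```

If the role passed to a working boto3 call is not the role created by the stack, comparing the policies attached to the two roles may be a useful check.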
Any help would be greatly appreciated!