I am trying to deploy a model through AWS SageMaker using SKLearn, and I get this error:
---------------------------------------------------------------------------
ClientError Traceback (most recent call last)
<ipython-input-145-29a1d3175b01> in <module>
----> 1 deployment = model.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/estimator.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, use_compiled_model, wait, model_name, kms_key, data_capture_config, tags, serverless_inference_config, async_inference_config, **kwargs)
1254 kms_key=kms_key,
1255 data_capture_config=data_capture_config,
-> 1256 serverless_inference_config=serverless_inference_config,
1257 async_inference_config=async_inference_config,
1258 )
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, serializer, deserializer, accelerator_type, endpoint_name, tags, kms_key, wait, data_capture_config, async_inference_config, serverless_inference_config, **kwargs)
1001 self._base_name = "-".join((self._base_name, compiled_model_suffix))
1002
-> 1003 self._create_sagemaker_model(
1004 instance_type, accelerator_type, tags, serverless_inference_config
1005 )
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/model.py in _create_sagemaker_model(self, instance_type, accelerator_type, tags, serverless_inference_config)
548 container_def,
549 vpc_config=self.vpc_config,
--> 550 enable_network_isolation=enable_network_isolation,
551 tags=tags,
552 )
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/session.py in create_model(self, name, role, container_defs, vpc_config, enable_network_isolation, primary_container, tags)
2670
2671 try:
-> 2672 self.sagemaker_client.create_model(**create_model_request)
2673 except ClientError as e:
2674 error_code = e.response["Error"]["Code"]
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
413 "%s() only accepts keyword arguments." % py_operation_name)
414 # The "self" in this scope is referring to the BaseClient.
--> 415 return self._make_api_call(operation_name, kwargs)
416
417 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
743 error_code = parsed_response.get("Error", {}).get("Code")
744 error_class = self.exceptions.from_code(error_code)
--> 745 raise error_class(parsed_response, operation_name)
746 else:
747 return parsed_response
ClientError: An error occurred (ValidationException) when calling the CreateModel operation: Could not find model data at s3://sagemaker-us-east-2-978433479050/sagemaker-scikit-learn-2022-04-28-22-33-14-817/output/model.tar.gz.
The code I am running is:
from sagemaker import Session, get_execution_role
from sagemaker.sklearn.estimator import SKLearn
sagemaker_session = Session()
role = get_execution_role()
train_input = sagemaker_session.upload_data("TSLA.csv")
model = SKLearn(entry_point='lr.py',
                train_instance_type='ml.m4.xlarge',
                role=role, framework_version='0.23-1',
                sagemaker_session=sagemaker_session)
model.fit({'train': train_input})
deployment = model.deploy(initial_instance_count=1, instance_type="ml.m4.xlarge")
train_input is: s3://sagemaker-us-east-2-978433479050/data/TSLA.csv
The training job completed, but for some reason the model was not deployed.
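The error indicates that the model artifact from your training job was not captured correctly. Run
model.model_data  # `model` is the estimator you trained
This prints the S3 path where the model artifact (model.tar.gz) is expected, so you can check whether it was actually created there.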
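As a rough sketch of that check, the existence of the artifact can be tested with an S3 HEAD request. The helper names `split_s3_uri` and `artifact_exists` are made up for illustration; the only AWS call assumed is the S3 client's `head_object`, and the client is passed in as an argument so the sketch does not depend on how your session is configured.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def artifact_exists(s3_client, uri):
    """Return True if the object at `uri` exists, False if S3 reports it missing.

    `s3_client` is expected to behave like boto3.client("s3").
    Errors other than "not found" (for example access denied) are re-raised.
    """
    bucket, key = split_s3_uri(uri)
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
    except s3_client.exceptions.ClientError as err:
        if err.response["Error"]["Code"] in ("404", "NoSuchKey", "NotFound"):
            return False
        raise
    return True


# Hypothetical usage in the notebook from the question:
#   import boto3
#   artifact_exists(boto3.client("s3"), model.model_data)
```

If this returns False for the path in the error message, nothing was uploaded to that location, and the problem lies in the training step rather than in `deploy`.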
Here is an example of training and deploying an sklearn model: https://github.com/RamVegiraju/SageMaker-Deployment/tree/master/RealTime/Script-Mode/Sklearn/Regression
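If the artifact is missing, one possible cause (an assumption, since lr.py is not shown in the question) is that the training script never writes the fitted model into the directory SageMaker packages into model.tar.gz. In script mode that directory is exposed through the `SM_MODEL_DIR` environment variable, and the serving container loads the model back through a `model_fn` defined in the same script. The sketch below shows only that save/load part; `parse_args` and `save_model` are illustrative names, and `pickle` is used just to keep the example dependency-free (SageMaker examples more commonly use `joblib.dump` and `joblib.load`).

```python
import argparse
import os
import pickle


def parse_args(argv=None):
    """Read the directories SageMaker passes to a script-mode training job."""
    parser = argparse.ArgumentParser()
    # Everything written to SM_MODEL_DIR is packaged into model.tar.gz.
    parser.add_argument("--model-dir",
                        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    # Local copy of the 'train' channel passed to fit({'train': ...}).
    parser.add_argument("--train",
                        default=os.environ.get("SM_CHANNEL_TRAIN",
                                               "/opt/ml/input/data/train"))
    args, _ = parser.parse_known_args(argv)
    return args


def save_model(model, model_dir):
    """Serialize the fitted model into the directory SageMaker uploads."""
    os.makedirs(model_dir, exist_ok=True)
    path = os.path.join(model_dir, "model.pkl")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return path


def model_fn(model_dir):
    """Called by the serving container to load the model for inference."""
    with open(os.path.join(model_dir, "model.pkl"), "rb") as f:
        return pickle.load(f)


# In the training script, after fitting the estimator on the data in
# args.train, the last step would be:
#   save_model(fitted_estimator, parse_args().model_dir)
```

The file name `model.pkl` is arbitrary; what matters is that `save_model` and `model_fn` agree on it and that the file lands in the model directory before the script exits.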