FileNotFoundError when trying to upload a local file to S3 via SageMaker


Full disclosure: I'm fairly new to the AWS world. As the title says, I'm trying to upload a folder from my local machine to an Amazon S3 bucket via JupyterLab in SageMaker Studio. I can do this manually by clicking the upload icon in JupyterLab, but I'd like to be able to do it with the following code:

import sagemaker
from sagemaker.tuner import (
    IntegerParameter,
    CategoricalParameter,
    ContinuousParameter,
    HyperparameterTuner,
)

sagemaker_session = sagemaker.Session()
region = sagemaker_session.boto_region_name
bucket = sagemaker_session.default_bucket()
prefix = "sagemaker/my-first-proj"
role = sagemaker.get_execution_role()

local_dir = "/Users/tomi/DevProjects/WeThePeople/datasets"
inputs = sagemaker_session.upload_data(path=local_dir, bucket=bucket, key_prefix=prefix)

This is the error I get when I run the code block above:

FileNotFoundError                         Traceback (most recent call last)
Cell In[2], line 2
      1 local_dir = "/Users/tomi/DevProjects/WeThePeople/datasets"
----> 2 inputs = sagemaker_session.upload_data(path=local_dir, bucket=bucket, key_prefix=prefix)
      3 print("input spec (in this case, just an S3 path): {}".format(inputs))

File /opt/conda/lib/python3.10/site-packages/sagemaker/session.py:400, in Session.upload_data(self, path, bucket, key_prefix, extra_args)
    397     s3 = self.s3_resource
    399 for local_path, s3_key in files:
--> 400     s3.Object(bucket, s3_key).upload_file(local_path, ExtraArgs=extra_args)
    402 s3_uri = "s3://{}/{}".format(bucket, key_prefix)
    403 # If a specific file was used as input (instead of a directory), we return the full S3 key
    404 # of the uploaded object. This prevents unintentionally using other files under the same
    405 # prefix during training.

File /opt/conda/lib/python3.10/site-packages/boto3/s3/inject.py:318, in object_upload_file(self, Filename, ExtraArgs, Callback, Config)
    287 def object_upload_file(
    288     self, Filename, ExtraArgs=None, Callback=None, Config=None
    289 ):
    290     """Upload a file to an S3 object.
    291 
    292     Usage::
   (...)
    316         transfer.
    317     """
--> 318     return self.meta.client.upload_file(
    319         Filename=Filename,
    320         Bucket=self.bucket_name,
    321         Key=self.key,
    322         ExtraArgs=ExtraArgs,
    323         Callback=Callback,
    324         Config=Config,
    325     )

File /opt/conda/lib/python3.10/site-packages/boto3/s3/inject.py:143, in upload_file(self, Filename, Bucket, Key, ExtraArgs, Callback, Config)
    108 """Upload a file to an S3 object.
    109 
    110 Usage::
   (...)
    140     transfer.
    141 """
    142 with S3Transfer(self, Config) as transfer:
--> 143     return transfer.upload_file(
    144         filename=Filename,
    145         bucket=Bucket,
    146         key=Key,
    147         extra_args=ExtraArgs,
    148         callback=Callback,
    149     )

File /opt/conda/lib/python3.10/site-packages/boto3/s3/transfer.py:292, in S3Transfer.upload_file(self, filename, bucket, key, callback, extra_args)
    288 future = self._manager.upload(
    289     filename, bucket, key, extra_args, subscribers
    290 )
    291 try:
--> 292     future.result()
    293 # If a client error was raised, add the backwards compatibility layer
    294 # that raises a S3UploadFailedError. These specific errors were only
    295 # ever thrown for upload_parts but now can be thrown for any related
    296 # client error.
    297 except ClientError as e:

File /opt/conda/lib/python3.10/site-packages/s3transfer/futures.py:103, in TransferFuture.result(self)
     98 def result(self):
     99     try:
    100         # Usually the result() method blocks until the transfer is done,
    101         # however if a KeyboardInterrupt is raised we want want to exit
    102         # out of this and propagate the exception.
--> 103         return self._coordinator.result()
    104     except KeyboardInterrupt as e:
    105         self.cancel()

File /opt/conda/lib/python3.10/site-packages/s3transfer/futures.py:266, in TransferCoordinator.result(self)
    263 # Once done waiting, raise an exception if present or return the
    264 # final result.
    265 if self._exception:
--> 266     raise self._exception
    267 return self._result

File /opt/conda/lib/python3.10/site-packages/s3transfer/tasks.py:269, in SubmissionTask._main(self, transfer_future, **kwargs)
    265     self._transfer_coordinator.set_status_to_running()
    267     # Call the submit method to start submitting tasks to execute the
    268     # transfer.
--> 269     self._submit(transfer_future=transfer_future, **kwargs)
    270 except BaseException as e:
    271     # If there was an exception raised during the submission of task
    272     # there is a chance that the final task that signals if a transfer
   (...)
    281 
    282     # Set the exception, that caused the process to fail.
    283     self._log_and_set_exception(e)

File /opt/conda/lib/python3.10/site-packages/s3transfer/upload.py:591, in UploadSubmissionTask._submit(self, client, config, osutil, request_executor, transfer_future, bandwidth_limiter)
    589 # Determine the size if it was not provided
    590 if transfer_future.meta.size is None:
--> 591     upload_input_manager.provide_transfer_size(transfer_future)
    593 # Do a multipart upload if needed, otherwise do a regular put object.
    594 if not upload_input_manager.requires_multipart_upload(
    595     transfer_future, config
    596 ):

File /opt/conda/lib/python3.10/site-packages/s3transfer/upload.py:244, in UploadFilenameInputManager.provide_transfer_size(self, transfer_future)
    242 def provide_transfer_size(self, transfer_future):
    243     transfer_future.meta.provide_transfer_size(
--> 244         self._osutil.get_file_size(transfer_future.meta.call_args.fileobj)
    245     )

File /opt/conda/lib/python3.10/site-packages/s3transfer/utils.py:247, in OSUtils.get_file_size(self, filename)
    246 def get_file_size(self, filename):
--> 247     return os.path.getsize(filename)

File /opt/conda/lib/python3.10/genericpath.py:50, in getsize(filename)
     48 def getsize(filename):
     49     """Return the size of a file, reported by os.stat()."""
---> 50     return os.stat(filename).st_size

FileNotFoundError: [Errno 2] No such file or directory: '/Users/tomi/DevProjects/WeThePeople/datasets'

But I'm sure this path exists on my machine. If I open a terminal, I can clearly access the directory:

>>> (WeThePeople) tomi@MacBook-Pro-4 datasets % pwd
/Users/tomi/DevProjects/WeThePeople/datasets

I thought this might be an IAM permissions issue on AWS, but the user profile I use with SageMaker has both of these policies attached:

AmazonS3FullAccess
AmazonSageMakerFullAccess

Not sure whether that's relevant, but I wanted to mention it.

The question is: what could be causing this, and how do I fix it? Could it be some other permission setting? Is there anything else I haven't checked?

python amazon-web-services amazon-s3 amazon-sagemaker
1 Answer

Your Studio notebook is not the same machine as your local computer.

Looking at the command you shared:

>>> (WeThePeople) tomi@MacBook-Pro-4 datasets % pwd
/Users/tomi/DevProjects/WeThePeople/datasets

That appears to be your Mac. Studio cannot access files on your local Mac. If you want to use `upload_data`, first upload the files into Studio using the upload button you mentioned, and then reference their paths inside Studio to upload them from SageMaker Studio to S3.
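As a minimal sketch of that check, assuming the notebook runs on the Studio instance: a path under `/Users/...` refers to your Mac's filesystem and will not exist there, which is exactly what `os.stat` (and therefore `upload_data`) reports. The helper below is hypothetical, just a fail-fast guard you could add before calling `upload_data`:

```python
import os

def resolve_upload_path(path):
    """Fail fast with a clear message when the path is missing on this machine."""
    if not os.path.exists(path):
        raise FileNotFoundError(
            f"{path!r} does not exist on this (Studio) instance; "
            "upload it via the Studio file browser first"
        )
    return path

# A Mac-local path is not visible from the remote Studio instance:
print(os.path.exists("/Users/tomi/DevProjects/WeThePeople/datasets"))  # False there
```

Once the folder has been uploaded into Studio (say, as a `datasets/` directory next to the notebook; that location is illustrative), the original call works with the instance-local path: `inputs = sagemaker_session.upload_data(path=resolve_upload_path("datasets"), bucket=bucket, key_prefix=prefix)`.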
