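I'm writing a Python script that uploads large files (5GB+) to an S3 bucket using presigned URLs. I have a JavaScript version of this code, so I believe the logic and the endpoints are valid.
For each part of the file, I get a presigned multipart upload URL and then try to issue a PUT request to that URL: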

    offset = 0
    part_number = 0
    with open(file_path, 'rb') as f:
        while offset < file_size_bytes:
            # Get a presigned URL for this chunk
            get_multipart_upload_url_params = {
                "partNumber": part_number,
                "uploadId": upload_id,
                "Key": file_key,
            }
            get_multipart_upload_url_response = requests.get(GET_MULTIPART_UPLOAD_URL_ENDPOINT, params=get_multipart_upload_url_params)
            if 'uploadURL' not in get_multipart_upload_url_response.json():
                print("Error: Upload Part URL not found in response")
                sys.exit(1)
            chunk_upload_url = get_multipart_upload_url_response.json()['uploadURL']
            # Upload the chunk
            remaining_bytes = file_size_bytes - offset
            chunk_size = min(MAX_CHUNK_SIZE, remaining_bytes)
            chunk = f.read(chunk_size)
            if not chunk:
                break
            response = requests.put(chunk_upload_url, data=chunk)
            ...

When requests.put executes, I get an error like this:
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='bucket-name.s3.amazonaws.com', port=443): Max retries exceeded with url: [PRESIGNED URL REDACTED] (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:2426)')))
Even more confusingly, my single-part upload implementation works fine against the same interface:

    # Get presigned upload URL
    upload_response = requests.get(SINGLE_PART_UPLOAD_API_ENDPOINT, params={
        'filename': filename,
    }).json()
    if 'uploadURL' not in upload_response or 'Key' not in upload_response:
        print("Error: Upload URL or file key not found in response")
        sys.exit(1)
    upload_url = upload_response['uploadURL']
    file_key = upload_response['Key']
    # Upload the file using requests
    print(f"Uploading: {file_path}")
    with open(file_path, 'rb') as f:
        response = requests.put(upload_url, data=f, headers={"Content-Type": "application/octet-stream"})
    ...

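Some things I've tried:
The problem was that partNumber is 1-indexed (S3 accepts part numbers from 1 to 10,000), and I had initialized the part number to 0. I'm leaving this post up in the hope that it helps someone else in the future.
Reference: https://docs.aws.amazon.com/AmazonS3/latest/API/API_UploadPart.html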
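
For anyone hitting the same thing, here is a minimal sketch of the chunking loop with part numbers starting at 1. It is not my exact script: `iter_parts`, `upload_parts`, `get_part_url` and `put_chunk` are hypothetical names I made up for illustration. In a real script, `get_part_url` would wrap the `requests.get` call to the presigned-URL endpoint and `put_chunk` would wrap `requests.put` and return the `ETag` response header. They are passed in as callables here so the numbering logic can be checked without touching the network.

```python
import io
from typing import BinaryIO, Callable, Dict, Iterator, List, Tuple

# S3 accepts part numbers from 1 to 10,000 inclusive.
MAX_PARTS = 10_000


def iter_parts(file_size_bytes: int, max_chunk_size: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (part_number, offset, length) for each chunk, numbering from 1."""
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    part_number = 1  # 1-indexed: starting at 0 is what broke the original loop
    offset = 0
    while offset < file_size_bytes:
        if part_number > MAX_PARTS:
            raise ValueError("file needs more than 10,000 parts; use a larger chunk size")
        length = min(max_chunk_size, file_size_bytes - offset)
        yield part_number, offset, length
        offset += length
        part_number += 1


def upload_parts(
    f: BinaryIO,
    file_size_bytes: int,
    max_chunk_size: int,
    get_part_url: Callable[[int], str],      # hypothetical: returns a presigned URL for a part
    put_chunk: Callable[[str, bytes], str],  # hypothetical: PUTs the bytes, returns the ETag
) -> List[Dict[str, object]]:
    """Upload every chunk and collect the part list used to complete the upload."""
    completed = []
    for part_number, offset, length in iter_parts(file_size_bytes, max_chunk_size):
        f.seek(offset)
        chunk = f.read(length)
        url = get_part_url(part_number)
        etag = put_chunk(url, chunk)
        completed.append({"PartNumber": part_number, "ETag": etag})
    return completed
```

With a 25-byte file and a 10-byte chunk size, `iter_parts(25, 10)` produces parts 1, 2 and 3 with lengths 10, 10 and 5. The list returned by `upload_parts` has the PartNumber/ETag shape that the complete-multipart-upload step expects, assuming your backend forwards it as-is.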