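I have a Python script that downloads an image from a URL and uploads it to AWS S3. The script works perfectly on my local machine. However, when I deploy and run the same script on an AWS EC2 instance, I get a ReadTimeout error.
The error I receive is: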
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='www.net-a-porter.com', port=443): Read timed out. (read timeout=100)
Here is the relevant part of my code:
import requests
import tempfile
import os

def upload_image_to_s3_from_url(self, image_url, filename, download_timeout=120):
    """
    Downloads an image from the given URL to a temporary file and uploads it to AWS S3,
    then returns the S3 file URL.
    """
    try:
        headers = {
            "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.96 Safari/537.36",
            'Accept': 'image/avif,image/webp,image/apng,image/*,*/*;q=0.8'
        }
        # Request the image
        response = requests.get(image_url, timeout=download_timeout, stream=True, headers=headers)
        response.raise_for_status()
        # Determine the content type
        content_type = response.headers.get('Content-Type', 'image/jpeg')  # Default to image/jpeg
        # Create a temporary file
        with tempfile.NamedTemporaryFile(delete=False) as tmp_file:
            # Write the response content to the temporary file
            for chunk in response.iter_content(chunk_size=8192):
                tmp_file.write(chunk)
        # Now that we have the image locally, upload it to S3 with the correct content type
        file_url = self.upload_image_to_s3(tmp_file.name, filename, content_type)
        # Optionally, delete the temporary file here if you set delete=False
        os.unlink(tmp_file.name)
        return file_url
    except requests.RequestException as e:
        raise Exception(f"Failed to download or upload image. Error: {e}")

# Example URL causing issues
image_url = "https://www.net-a-porter.com/variants/images/1647597326276381/in/w1365_a3-4_q60.jpg"
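The problem occurs when trying to download images from www.net-a-porter.com. The timeout is set to 120 seconds, which I believe should be sufficient.
What I have tried so far:
User-Agent
Any insights or suggestions on how to resolve this issue would be greatly appreciated.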
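Testing shows that the web server responds when a specific set of headers is added. I'm not sure whether this behavior is intentional. I changed the User-Agent and added a few extra headers, as shown below, to see whether it would get a response: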
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
    'Accept': 'image/avif,image/webp,image/apng,image/*,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive'
}
Could you give it a try?
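For reference, here is a minimal sketch of how those headers could be wired into the download step. The helper name `fetch_image`, the `BROWSER_HEADERS` constant, and the 10-second connect timeout are assumptions made for illustration and are not part of the original code. It also passes `timeout` as a `(connect, read)` tuple, since in requests the read timeout limits how long the client waits between bytes from the server rather than the total download time.

```python
import requests

# Header set copied from the answer above. Whether the server needs every one
# of these is an assumption; it has not been confirmed against the site.
# Note: requests only decodes "br" responses if a Brotli package is installed.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept": "image/avif,image/webp,image/apng,image/*,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Accept-Encoding": "gzip, deflate, br",
    "Connection": "keep-alive",
}


def fetch_image(image_url, session=None, timeout=(10, 120)):
    """Hypothetical helper: download image_url and return (data, content_type).

    timeout is a (connect, read) tuple in seconds, as accepted by requests.
    """
    # Reuse a caller-supplied session if given, otherwise create one.
    session = session or requests.Session()
    response = session.get(
        image_url, headers=BROWSER_HEADERS, timeout=timeout, stream=True
    )
    response.raise_for_status()
    # Fall back to image/jpeg when the server sends no Content-Type.
    content_type = response.headers.get("Content-Type", "image/jpeg")
    # Read the body in chunks and join them into a single bytes object.
    data = b"".join(response.iter_content(chunk_size=8192))
    return data, content_type
```

Calling `data, content_type = fetch_image(image_url)` on the EC2 instance should show quickly whether the extra headers make a difference there. If the request still times out, the cause may lie elsewhere, for example the server treating the instance's IP range differently.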