Requests - unable to download an xls file from a link; it hangs or times out?


I am trying to download a file from the following link:

url = "https://nsearchives.nseindia.com/content/fo/qtyfreeze.xls"

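I have tried setting timeout, stream, headers, and allow_redirects. When I do not set a timeout, the request never completes. When I do set timeout, I always get a timeout error. I can download the file in Firefox and Chrome without any problem.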
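One way to narrow down a symptom like this is to give requests separate connect and read timeouts and report which one fires. The following is a minimal sketch, not a fix: the helper names describe_failure and probe, and the default timeout values, are hypothetical and chosen only for illustration.

```python
import requests


def describe_failure(exc):
    """Map a requests exception to a short diagnosis string."""
    # ConnectTimeout inherits from both ConnectionError and Timeout,
    # so it has to be checked before either of them.
    if isinstance(exc, requests.exceptions.ConnectTimeout):
        return "connect timeout: timed out while trying to connect to the server"
    if isinstance(exc, requests.exceptions.ReadTimeout):
        return "read timeout: connected, but the server sent no data in time"
    if isinstance(exc, requests.exceptions.ConnectionError):
        return "connection error: the connection failed or was dropped"
    return f"other error: {exc!r}"


def probe(url, headers=None, connect_timeout=5.0, read_timeout=30.0):
    """Request only the response headers and report how the attempt ended."""
    try:
        # timeout=(connect, read) sets the two limits independently.
        # stream=True returns as soon as the headers arrive, so the body
        # is not downloaded here.
        with requests.get(
            url,
            headers=headers,
            stream=True,
            allow_redirects=True,
            timeout=(connect_timeout, read_timeout),
        ) as resp:
            return f"ok: HTTP {resp.status_code}"
    except requests.exceptions.RequestException as exc:
        return describe_failure(exc)
```

Calling probe(url, headers=headers) with the values from the question would show whether the connection itself stalls or whether the server accepts the connection and then stays silent; a read timeout would suggest the server is ignoring the request rather than the network being unreachable.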

The code I am using:

import requests as req

headers = {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8,application/vnd.ms-excel",
        "Accept-Encoding": "gzip, deflate"}

url = "https://nsearchives.nseindia.com/content/fo/qtyfreeze.xls"

# and variations of the following
resp = req.get(url, timeout=30, headers=headers)

# tried urllib
#urllib.request.urlretrieve(url,"/home/data/freeze_quantity.xls")

python python-requests
1 Answer

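Here is what I changed (I could not pinpoint the exact error, so try it my way):

  1. Timeout: increased from 30 seconds to 60 seconds.
  2. Streaming: added stream=True so that large files are streamed.
  3. Chunked file writes: used iter_content to write the response body in chunks.
  4. Error handling: improved error handling with response.raise_for_status() and a try-except block.

Try this: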

import requests

url = "https://nsearchives.nseindia.com/content/fo/qtyfreeze.xls"

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8,application/vnd.ms-excel',
    'Accept-Encoding': 'gzip, deflate'
}

try:
    with requests.get(url, headers=headers, stream=True, timeout=60) as response:
        response.raise_for_status()
        with open('freeze_quantity.xls', 'wb') as file:
            for chunk in response.iter_content(chunk_size=8192):
                file.write(chunk)
    print("File downloaded successfully.")
except requests.exceptions.RequestException as e:
    print(f"Error downloading file: {e}")
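If a timeout or connection error occurs partway through, the loop above can leave a truncated freeze_quantity.xls on disk, and a server may also return an HTML error page with status 200. A hedged sketch of one way to guard against both follows: write to a temporary file, rename it only on success, and check the file signature. The helper names are hypothetical, and the signature check assumes the server returns a legacy binary .xls workbook (an OLE2 compound file), which the source does not confirm.

```python
import os
import tempfile

import requests

# Signature of the OLE2 compound file format used by legacy .xls workbooks.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def save_chunks_atomically(chunks, dest_path):
    """Write byte chunks to a temp file, then rename it onto dest_path.

    Returns the number of bytes written. On any error the partial
    temp file is removed and dest_path is left untouched.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    written = 0
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                if chunk:  # skip empty chunks
                    tmp.write(chunk)
                    written += len(chunk)
        os.replace(tmp_path, dest_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return written


def looks_like_xls(path):
    """Return True if the file starts with the OLE2 signature."""
    with open(path, "rb") as fh:
        return fh.read(len(OLE2_MAGIC)) == OLE2_MAGIC


def download_xls(url, dest_path, headers=None, timeout=(10, 60)):
    """Stream url to dest_path and verify that it looks like an .xls file."""
    with requests.get(url, headers=headers, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        written = save_chunks_atomically(
            response.iter_content(chunk_size=8192), dest_path
        )
    if not looks_like_xls(dest_path):
        raise ValueError(f"{dest_path} does not start with the OLE2 signature")
    return written
```

Usage would be download_xls(url, 'freeze_quantity.xls', headers=headers) inside the same try-except block. This does not address the timeout itself; it only makes a failed or bogus download easier to detect.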