I'm trying to download a file from the following link:
url = "https://nsearchives.nseindia.com/content/fo/qtyfreeze.xls"
I've tried setting timeout, stream, headers, and allow_redirects. When I don't set a timeout, the response never completes. When I do set timeout, I always get a timeout error. I can download the file in Firefox and Chrome without any problem.
The code I'm using:
import requests as req
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8,application/vnd.ms-excel',
    'Accept-Encoding': 'gzip, deflate'
}
url = "https://nsearchives.nseindia.com/content/fo/qtyfreeze.xls"
# and variations of the following
resp = req.get(url, timeout=30, headers=headers)
# tried urllib
#urllib.request.urlretrieve(url,"/home/data/freeze_quantity.xls")
Something has changed on the server side (I can't pinpoint the exact error, so try it my way).
Try this:
import requests

url = "https://nsearchives.nseindia.com/content/fo/qtyfreeze.xls"
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8,application/vnd.ms-excel',
    'Accept-Encoding': 'gzip, deflate'
}

try:
    # Stream the response and write it to disk in chunks instead of
    # buffering the whole body in memory.
    with requests.get(url, headers=headers, stream=True, timeout=60) as response:
        response.raise_for_status()
        with open('freeze_quantity.xls', 'wb') as file:
            for chunk in response.iter_content(chunk_size=8192):
                file.write(chunk)
    print("File downloaded successfully.")
except requests.exceptions.RequestException as e:
    print(f"Error downloading file: {e}")
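If a plain GET with browser-like headers still hangs, one common explanation is that the NSE servers expect the cookies a browser picks up when visiting the main site first. A sketch of that approach, using a `requests.Session` so the cookies from an initial visit to `https://www.nseindia.com` are reused for the archive download (the priming URL and the helper name `download_with_session` are my assumptions, not something from the original post):

```python
import requests

# Browser-like headers; same idea as in the snippets above.
HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) '
                  'AppleWebKit/537.36 (KHTML, like Gecko) '
                  'Chrome/120.0.0.0 Safari/537.36',
    'Accept': '*/*',
    'Accept-Encoding': 'gzip, deflate',
}

def download_with_session(url, dest, timeout=60):
    """Download url to dest, priming cookies from the NSE home page first."""
    with requests.Session() as session:
        session.headers.update(HEADERS)
        # Assumption: visiting the home page sets the cookies the
        # archive host checks for; the session stores and resends them.
        session.get("https://www.nseindia.com", timeout=timeout)
        with session.get(url, stream=True, timeout=timeout) as response:
            response.raise_for_status()
            with open(dest, 'wb') as fh:
                for chunk in response.iter_content(chunk_size=8192):
                    fh.write(chunk)

if __name__ == "__main__":
    download_with_session(
        "https://nsearchives.nseindia.com/content/fo/qtyfreeze.xls",
        "freeze_quantity.xls",
    )
```

This is only a sketch; whether the cookie priming is actually required for this endpoint would need to be verified against the live server.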