Python HTTP request with additional headers returns a 403 error on a cloud server but works fine on my machine


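To summarize the problem I found and need help with:

  • I wrote a Python program that sends a GET request to https://bx.in.th/api/pairing/
  • The program runs fine on my machine (Mac OS X).
  • Once it runs on a Digital Ocean Ubuntu droplet, it throws an HTTP 403 Forbidden error.
  • I spent a day researching, and most answers suggest modifying the headers. I tried all of them without success.

Some of the links/references I went through.

Here is simplified source code that reproduces the problem: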

import urllib.request
import json

url = 'https://bx.in.th/api/pairing/'

headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.64 Safari/537.11',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.3',
    'Accept-Encoding': 'none',
    'Accept-Language': 'en-US,en;q=0.5',
    'Connection': 'keep-alive'
}

request = urllib.request.Request(url, headers=headers)

response = urllib.request.urlopen(request)

print(response.read())
print()
print(response.getheaders())

The correct output should be:

b'{"1":{"pairing_id":1,"primary_currency":"THB","secondary_currency":"BTC"},"21":{"pairing_id":21,"primary_currency":"THB","secondary_currency":"ETH"},"22":{"pairing_id":22,"primary_currency":"THB","secondary_currency":"DAS"},"23":{"pairing_id":23,"primary_currency":"THB","secondary_currency":"REP"},"20":{"pairing_id":20,"primary_currency":"BTC","secondary_currency":"ETH"},"4":{"pairing_id":4,"primary_currency":"BTC","secondary_currency":"DOG"},"6":{"pairing_id":6,"primary_currency":"BTC","secondary_currency":"FTC"},"24":{"pairing_id":24,"primary_currency":"THB","secondary_currency":"GNO"},"13":{"pairing_id":13,"primary_currency":"BTC","secondary_currency":"HYP"},"2":{"pairing_id":2,"primary_currency":"BTC","secondary_currency":"LTC"},"3":{"pairing_id":3,"primary_currency":"BTC","secondary_currency":"NMC"},"26":{"pairing_id":26,"primary_currency":"THB","secondary_currency":"OMG"},"14":{"pairing_id":14,"primary_currency":"BTC","secondary_currency":"PND"},"5":{"pairing_id":5,"primary_currency":"BTC","secondary_currency":"PPC"},"19":{"pairing_id":19,"primary_currency":"BTC","secondary_currency":"QRK"},"15":{"pairing_id":15,"primary_currency":"BTC","secondary_currency":"XCN"},"7":{"pairing_id":7,"primary_currency":"BTC","secondary_currency":"XPM"},"17":{"pairing_id":17,"primary_currency":"BTC","secondary_currency":"XPY"},"25":{"pairing_id":25,"primary_currency":"THB","secondary_currency":"XRP"},"8":{"pairing_id":8,"primary_currency":"BTC","secondary_currency":"ZEC"}}'

[('Date', 'Sun, 13 Aug 2017 09:27:02 GMT'), ('Content-Type', 'text/javascript'), ('Content-Length', '1485'), ('Connection', 'close'), ('Set-Cookie', '__cfduid=d51c37ea835bae4a0c892e91f34f7bc131502616422; expires=Mon, 13-Aug-18 09:27:02 GMT; path=/; domain=.bx.in.th; HttpOnly'), ('Cache-Control', 'max-age=86400'), ('Expires', 'Mon, 14 Aug 2017 09:27:02 GMT'), ('Strict-Transport-Security', 'max-age=0'), ('X-Content-Type-Options', 'nosniff'), ('Server', 'cloudflare-nginx'), ('CF-RAY', '38daa2e36e0a836b-BKK')]

The error I get when running the source code on the droplet:

Traceback (most recent call last):
  File "api-call.py", line 17, in <module>
    response = urllib.request.urlopen(request)
  File "/usr/lib/python3.5/urllib/request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 472, in open
    response = meth(req, response)
  File "/usr/lib/python3.5/urllib/request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.5/urllib/request.py", line 510, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
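
The traceback only shows the status line. To see what the server actually sends back with the 403, the error can be caught and inspected. The following is a minimal sketch, not a fix: the helper names `summarize_http_error` and `fetch_or_explain` are made up for illustration, and it only assumes that the error response carries headers and a body worth reading (for example a `Server` header like the one in the successful output above).

```python
import urllib.error
import urllib.request


def summarize_http_error(err, body_limit=300):
    """Collect the status, Server header, and the start of the body from an HTTPError."""
    body = err.read(body_limit)
    return {
        "status": err.code,
        "server": err.headers.get("Server"),
        "body_start": body.decode("utf-8", errors="replace"),
    }


def fetch_or_explain(request):
    """Return the response body, or a summary dict if the server answers with an HTTP error."""
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.read()
    except urllib.error.HTTPError as err:
        return summarize_http_error(err)
```

Calling `fetch_or_explain(request)` with the `request` object built above would return the JSON bytes on success, or a small dictionary describing the 403 response on failure.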

Thanks!

python server httprequest http-status-code-403
2 Answers
1
vote

You have to use a strong proxy such as Luminati. I was also getting a 403 status, but the request works through a Luminati proxy.
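
For the `urllib.request` code in the question, routing the call through a proxy could look roughly like the sketch below. It uses the standard library's `ProxyHandler` and `build_opener`; the proxy URL is a hypothetical placeholder (not a real Luminati endpoint), and the helper names `build_proxy_opener` and `fetch_via_proxy` are invented for this example.

```python
import urllib.request

# Hypothetical placeholder; substitute the proxy URL your provider gives you.
PROXY_URL = "http://user:password@proxy.example.com:8080"


def build_proxy_opener(proxy_url):
    """Return an opener that routes both HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, proxy_url, headers=None, timeout=10):
    """Fetch url through the proxy and return the raw response body as bytes."""
    opener = build_proxy_opener(proxy_url)
    request = urllib.request.Request(url, headers=headers or {})
    with opener.open(request, timeout=timeout) as response:
        return response.read()
```

With a working proxy, `fetch_via_proxy('https://bx.in.th/api/pairing/', PROXY_URL, headers)` would take the place of the `urlopen` call in the question.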


0
votes

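I had a similar problem on Digital Ocean.

The solution is to sign up for a proxy service and use it. Note: Luminati has since been renamed Bright Data (brightdata.com).

Example below.

I suggest using Python's requests module and setting up your call like this: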

import requests

proxies = {'http': 'http://brd-customer-hl_234567a0-zone-isp:[email protected]:22225',
           'https': 'http://brd-customer-hl_234567a0-zone-isp:[email protected]:22225'}
url = 'https://bx.in.th/api/pairing/'
headers = {'User-Agent': 'Mozilla/5.0 etc'}
r = requests.get(url, headers=headers, proxies=proxies, timeout=10)

r.status_code # should be 200, not 403

Use

r.text
r.json()

to read the API data from the response object.

Strictly speaking, you only need the https proxy for this example, but it is best to include both.
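
The reason is that the keys of the proxies dictionary refer to the scheme of the target URL, not of the proxy itself. The sketch below is a deliberately simplified illustration of that lookup, not the actual implementation in requests (which also honors more specific keys such as a scheme plus host, and an `all` fallback). The function name `pick_proxy` and the proxy URLs are hypothetical.

```python
from urllib.parse import urlsplit


def pick_proxy(url, proxies):
    """Return the proxy entry whose key matches the target URL's scheme, or None."""
    return proxies.get(urlsplit(url).scheme)


# Hypothetical proxy URLs, made distinct only so the lookup result is visible.
proxies = {
    "http": "http://proxy.example.com:8080",
    "https": "http://proxy.example.com:8443",
}

# The API URL uses https, so only the 'https' entry is consulted.
chosen = pick_proxy("https://bx.in.th/api/pairing/", proxies)
print(chosen)
```

Keeping the `http` entry costs nothing and covers the case where a request is later made to a plain `http://` URL.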
