Since Google turned up nothing on the error "http.client.HTTPException: got more than 100 headers", I'm asking here.
>>> import http.client as h
>>> conn = h.HTTPConnection("www.coursefinders.com")
>>> conn.request("HEAD","/")
>>> conn.getresponse();
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.4/http/client.py", line 1148, in getresponse
response.begin()
File "/usr/lib/python3.4/http/client.py", line 376, in begin
self.headers = self.msg = parse_headers(self.fp)
File "/usr/lib/python3.4/http/client.py", line 267, in parse_headers
raise HTTPException("got more than %d headers" % _MAXHEADERS)
http.client.HTTPException: got more than 100 headers
What does this exception mean, and how should I handle this kind of error properly? The site works fine in a browser.
Here is a solution that doesn't involve editing the library's .py file:
import httplib # or http.client if you're on Python 3
httplib._MAXHEADERS = 1000
Just put this at the top of your code, before any request is made.
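If you'd rather not change the limit for the whole process, the same override can be scoped to a block. This is only a sketch for Python 3: `max_headers` is a name I made up, and `_MAXHEADERS` is a private attribute of `http.client`, so it may change between versions. It is also a module-level global, so this is not thread-safe.

```python
import contextlib
import http.client


@contextlib.contextmanager
def max_headers(limit):
    """Temporarily override http.client's private header limit (hypothetical helper)."""
    previous = http.client._MAXHEADERS
    http.client._MAXHEADERS = limit
    try:
        yield
    finally:
        # Always restore the old limit, even if the request raised.
        http.client._MAXHEADERS = previous
```

Wrap the `conn.request(...)` / `conn.getresponse()` calls in `with max_headers(1000):` and the default applies again once the block exits.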
Change the value of "_MAXHEADERS" in C:\Python27\Lib\httplib.py to 1000 or 10000.
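I was going to suggest using
requests
, but it is built on http.client (through urllib3) and fails for the same reason. To check whether the problem lies in the library or in the server, I tried a telnet session, which produced something like: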
Trying 91.250.81.121...
Connected to www.coursefinders.com.
Escape character is '^]'.
HEAD / HTTP/1.1
HTTP/1.1 200 OK
Date: Mon, 14 Apr 2014 08:35:54 GMT
Server: Apache/2.2.16 (Debian)
X-Powered-By: PHP/5.3.3-7+squeeze19
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Set-Cookie: PHPSESSID=2bnr4dpa4e90r2lmbv01smu1b6; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: c_id=496cc5d32486ac8d944e971ad6ec9eb3649ab23cs%3A3%3A%22235%22%3B; expires=Tue, 15-Apr-2014 08:35:54 GMT; path=/
Set-Cookie: login=-1; path=/
Set-Cookie: wc=1; expires=Thu, 09-Apr-2015 08:35:54 GMT
Set-Cookie: login=-1; path=/
Set-Cookie: login=-1; path=/
[... Many Set-Cookie commands omitted ...]
Set-Cookie: login=-1; path=/
Cache-Control: max-age=1, private, must-revalidate
Vary: Accept-Encoding
Connection: close
Content-Type: text/html; charset=utf-8
Connection closed by foreign host.
So it looks like their server is misconfigured and is spewing out a large number of redundant Set-Cookie headers.
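The limit can be reproduced without touching the real site by feeding a canned response to `http.client.HTTPResponse`. This is a rough sketch for Python 3: `FakeSocket`, `build_response` and `count_headers` are names invented for the illustration, and the canned bytes only imitate the repeated `Set-Cookie: login=-1; path=/` lines from the telnet output above.

```python
import http.client
import io


class FakeSocket:
    """Hypothetical stand-in for a socket: serves canned bytes via makefile()."""

    def __init__(self, payload):
        self._payload = payload

    def makefile(self, *args, **kwargs):
        return io.BytesIO(self._payload)


def build_response(cookie_count):
    """Build a raw HTTP response carrying `cookie_count` Set-Cookie headers."""
    status_line = b"HTTP/1.1 200 OK\r\n"
    cookies = b"Set-Cookie: login=-1; path=/\r\n" * cookie_count
    return status_line + cookies + b"\r\n"


def count_headers(raw_response):
    """Parse raw response bytes and return the number of header fields."""
    response = http.client.HTTPResponse(FakeSocket(raw_response), method="HEAD")
    response.begin()  # raises http.client.HTTPException past the header limit
    return len(response.getheaders())
```

With the default limit, `count_headers(build_response(50))` parses fine, while `count_headers(build_response(150))` raises the same `HTTPException` as in the question.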
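There doesn't seem to be any way to configure
httplib
to accept a larger number of headers. I tried looking for an alternative HTTP library that isn't implemented on top of httplib
, but had no luck.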
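Failing that, the error can at least be caught so that one misbehaving server doesn't take the whole script down. A minimal sketch for Python 3, where `head_status` is a hypothetical helper name and returning `None` is just one possible way to signal "unparsable response":

```python
import http.client


def head_status(host, path="/", timeout=10):
    """Return the status code of a HEAD request, or None if parsing fails."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    except http.client.HTTPException as exc:
        # Raised, among other cases, when the server sends too many headers.
        print("Could not parse response from %s: %s" % (host, exc))
        return None
    finally:
        conn.close()
```

Note that `HTTPException` is the base class for most `http.client` errors, so this also swallows things like a malformed status line; narrow it down if you need to tell those cases apart.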
On OS X, I added this to my code:
import httplib as http_client
Then I debugged the script to find out where the library was being loaded from. In my case, it was
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py
Then I edited the limit in that file, as described in Felix's post:
sudo vim /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py
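A quicker way to find the file than stepping through a debugger is to ask the module itself. This is a small sketch for Python 3 (`module_path` is just an illustrative name); on Python 2 the same `__file__` attribute exists on `httplib`, though it may point at the compiled .pyc rather than the .py source.

```python
import http.client


def module_path(module):
    """Return the path of the file a module was loaded from."""
    return module.__file__


# Shows which copy of the library the interpreter is actually using.
print(module_path(http.client))
```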