I exported my cookies from Chrome using the cookies.txt Chrome extension. With curl I can easily replay the request using the exported cookies. A command like
curl -b mycookie.txt -c mycookie.txt $url
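works perfectly.
However, when I try to load the session from mycookie.txt with requests, it does not work. After a long debugging session I found that requests does not send the session cookies (those whose expiry value in the cookie file is 0), even though I loaded them, expired ones included, with the following code: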
cj = cookielib.MozillaCookieJar('mycookie.txt')
cj.load(ignore_discard=True, ignore_expires=True)
print cj #it shows the session cookie already loaded
s = requests.Session()
s.cookies = cj
r = s.get(url, headers=myheaders, timeout=10, allow_redirects=False)
print r.request.headers #this clearly shows the request didn't send the session cookie
How can I make this work?
This happened to me as well. Let's take a closer look at requests/models.py:
def prepare(
    ...
    self.prepare_cookies(cookies)  # [1]

def prepare_cookies(self, cookies):
    ...
    if isinstance(cookies, cookielib.CookieJar):
        self._cookies = cookies
    else:
        ...
    cookie_header = get_cookie_header(self._cookies, self)  # [2]
    ...
Next, in requests/cookies.py:
def get_cookie_header(jar, request):
    ...
    jar.add_cookie_header(r)  # [3]
    return r.get_new_headers().get("Cookie")
And then in http/cookiejar.py:
def add_cookie_header(self, request):
    ...
    cookies = self._cookies_for_request(request)  # [4]
    ...

def _cookies_for_request(self, request):
    """Return a list of cookies to be returned to server."""
    cookies = []
    for domain in self._cookies.keys():
        cookies.extend(self._cookies_for_domain(domain, request))  # [5]
    return cookies

def _cookies_for_domain(self, domain, request):
    cookies = []
    if not self._policy.domain_return_ok(domain, request):
        return []
    _debug("Checking %s for cookies to return", domain)
    cookies_by_path = self._cookies[domain]
    for path in cookies_by_path.keys():
        if not self._policy.path_return_ok(path, request):
            continue
        cookies_by_name = cookies_by_path[path]
        for cookie in cookies_by_name.values():
            if not self._policy.return_ok(cookie, request):  # [6] return_ok() is called once per cookie line
                _debug("   not returning cookie")
                continue
            cookies.append(cookie)
    return cookies
def return_ok(self, cookie, request):
    ...
    for n in "version", "verifiability", "secure", "expires", "port", "domain":
        fn_name = "return_ok_"+n
        fn = getattr(self, fn_name)
        if not fn(cookie, request):  # [7] This ends up calling return_ok_domain()
            return False
    return True

def return_ok_domain(self, cookie, request):
    req_host, erhn = eff_request_host(request)  # [8] eff_request_host() appends .local
    domain = cookie.domain
    if domain and not domain.startswith("."):
        dotdomain = "." + domain
    else:
        dotdomain = domain
    ...
    # [10] `.localhost.local` does not end with `.localhost`, so the check fails:
    if cookie.version == 0 and not ("."+erhn).endswith(dotdomain):
        _debug("   request-host %s does not match Netscape cookie domain "
               "%s", req_host, domain)
        return False
    return True
IPV4_RE = re.compile(r"\.\d+$", re.ASCII)

def eff_request_host(request):
    """Return a tuple (request-host, effective request-host name).
    As defined by RFC 2965, except both are lowercased.
    """
    erhn = req_host = request_host(request)
    if req_host.find(".") == -1 and not IPV4_RE.search(req_host):
        erhn = req_host + ".local"  # [9] Problem here
    return req_host, erhn
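This means that a host name without a dot, such as localhost, is turned into localhost.local. For example, if the entry in my cookie file uses the domain localhost, the cookie is not sent because the domains do not match: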
localhost FALSE / FALSE 1712121230 token xxxxx
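The host rewriting behind this mismatch can be checked directly. The sketch below calls eff_request_host(), an undocumented module-level helper in http.cookiejar (Python 3), on a few example URLs; effective_host and the port numbers are made up for illustration.

```python
import urllib.request
from http.cookiejar import eff_request_host  # undocumented helper, quoted above


def effective_host(url):
    """Return the (request-host, effective request-host) pair for a URL."""
    return eff_request_host(urllib.request.Request(url))


# Hypothetical URLs: a dotless host, an IPv4 address and a dotted host name.
for url in ("http://localhost:8000/", "http://127.0.0.1:8000/", "http://example.com/"):
    print(url, "->", effective_host(url))
```

Only the dotless host name gets the .local suffix; the IP address and the dotted name come back unchanged.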
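The cookie is sent once .local is appended to the domain (note that for plain-http localhost only lines whose fourth column, the secure flag, is FALSE are selected):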
localhost.local FALSE / FALSE 1712121230 token xxxxx
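To compare the two cookie lines without a server, something like the following could be used. It is a minimal sketch using only the Python 3 standard library: cookie_header_for is a hypothetical helper, the port is arbitrary, and the expiry is set one hour ahead (an assumption, since a timestamp that has already passed would be filtered out by the expires check regardless of the domain).

```python
import os
import tempfile
import time
import urllib.request
from http.cookiejar import MozillaCookieJar


def cookie_header_for(cookie_domain, url):
    """Load a one-line cookies.txt for cookie_domain and return the Cookie
    header the jar would attach to a request for url (None if nothing is sent)."""
    expires = int(time.time()) + 3600  # assumed future expiry so the cookie is still valid
    line = "\t".join([cookie_domain, "FALSE", "/", "FALSE", str(expires), "token", "xxxxx"])
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("# Netscape HTTP Cookie File\n" + line + "\n")
        path = f.name
    try:
        jar = MozillaCookieJar(path)
        jar.load(ignore_discard=True, ignore_expires=True)
    finally:
        os.remove(path)
    request = urllib.request.Request(url)
    jar.add_cookie_header(request)  # same entry point as step [3] above
    return request.get_header("Cookie")


print(cookie_header_for("localhost", "http://localhost:8000/"))
print(cookie_header_for("localhost.local", "http://localhost:8000/"))
```

If the running Python still contains the check marked [10], the first call should reproduce the mismatch, while the second call returns the token cookie.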
Alternatively, I can use the IP address in both places, e.g.
s.get('http://127.0.0.1:<port>/...')
and
127.0.0.1 FALSE / FALSE 1712121230 token xxxxx
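Instead of editing the cookie file by hand, the .local suffix could also be applied after loading. The following is only a sketch of that idea, not something requests or http.cookiejar provides: with_local_suffix is a hypothetical helper, and the hand-built cookie stands in for one loaded from the file.

```python
import copy
import urllib.request
from http.cookiejar import Cookie, CookieJar


def with_local_suffix(jar):
    """Return a new CookieJar in which dotless cookie domains get '.local' appended."""
    fixed = CookieJar()
    for cookie in jar:
        clone = copy.copy(cookie)  # leave the original jar untouched
        if "." not in clone.domain:  # skips IP addresses and dotted host names
            clone.domain = clone.domain + ".local"
        fixed.set_cookie(clone)
    return fixed


# Stand-in for the "localhost" line of the cookie file (session cookie, no expiry).
cookie = Cookie(
    version=0, name="token", value="xxxxx",
    port=None, port_specified=False,
    domain="localhost", domain_specified=False, domain_initial_dot=False,
    path="/", path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={},
)
jar = CookieJar()
jar.set_cookie(cookie)

request = urllib.request.Request("http://localhost:8000/")  # hypothetical port
with_local_suffix(jar).add_cookie_header(request)
header = request.get_header("Cookie")
print(header)
```

The resulting jar can be assigned to s.cookies in the same way as cj in the question.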