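I am trying to scrape a page on www.roblox.com that requires a login. I have already done this using the .ROBLOSECURITY cookie; however, that cookie changes every few days. I would like to log in through the login form using Python instead. The form and what I have so far are below. I do not want to use any additional libraries such as mechanize or requests.
The form: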
<form action="/newlogin" id="loginForm" method="post" novalidate="novalidate" _lpchecked="1">
<div id="loginarea" class="divider-bottom" data-is-captcha-on="False">
<div id="leftArea">
<div id="loginPanel">
<table id="logintable">
<tbody>
<tr id="username">
<td><label class="form-label" for="Username">Username:</label></td>
<td><input class="text-box text-box-medium valid" data-val="true" data-val-required="The Username field is required." id="Username" name="Username" type="text" value="" autocomplete="off" aria-required="true" aria-invalid="false" style="cursor: auto; background-image: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR4nGP6zwAAAgcBApocMXEAAAAASUVORK5CYII=);"></td>
</tr>
<tr id="password">
<td><label class="form-label" for="Password">Password:</label></td>
<td><input class="text-box text-box-medium" data-val="true" data-val-required="The Password field is required." id="Password" name="Password" type="password" autocomplete="off" style="cursor: auto; background-image: url(data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR4nGP6zwAAAgcBApocMXEAAAAASUVORK5CYII=);"></td>
</tr>
</tbody>
</table>
<div>
</div>
<div>
<div id="forgotPasswordPanel">
<a class="text-link" href="/Login/ResetPasswordRequest.aspx" target="_blank">Forgot your password?</a>
</div>
<div id="signInButtonPanel" data-use-apiproxy-signin="False" data-sign-on-api-path="https://api.roblox.com/login/v1">
<a roblox-js-onclick="" class="btn-medium btn-neutral">Sign In</a>
<a roblox-js-oncancel="" class="btn-medium btn-negative">Cancel</a>
</div>
<div class="clearFloats">
</div>
</div>
<span id="fb-root">
<div id="SplashPageConnect" class="fbSplashPageConnect">
<a class="facebook-login" href="/Facebook/SignIn?returnTo=/home" ref="form-facebook">
<span class="left"></span>
<span class="middle">Login with Facebook<span>Login with Facebook</span></span>
<span class="right"></span>
</a>
</div>
</span>
</div>
</div>
<div id="rightArea" class="divider-left">
<div id="signUpPanel" class="FrontPageLoginBox">
<p class="text">Not a member?</p>
<h2>Sign Up to Build & Make Friends</h2>
<a roblox-js-onsignup="" class="btn-medium btn-primary">Sign Up</a>
</div>
</div>
</div>
<input id="ReturnUrl" name="ReturnUrl" type="hidden" value="">
</form>
What I have so far:
import cookielib
import urllib
import urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
urllib2.install_opener(opener)
authentication_url = 'http://www.roblox.com/newlogin'
payload = {
    'ReturnUrl' : 'http://www.roblox.com/home',
    'Username' : 'usernamehere',
    'Password' : 'passwordhere'
}
data = urllib.urlencode(payload)
req = urllib2.Request(authentication_url, data)
resp = urllib2.urlopen(req)
contents = resp.read()
print contents
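What is wrong with my code? When I print contents, all I see is the login page.
PS: The login page is HTTPS.
OP's solution.
I finished the script myself; the code is below: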
import cookielib
import urllib
import urllib2
cj = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
urllib2.install_opener(opener)
authentication_url = 'https://www.roblox.com/newlogin'
payload = {
    'username' : 'YourUsernameHere',
    'password' : 'YourPasswordHere',
    '' : 'Log In',
}
data = urllib.urlencode(payload)
req = urllib2.Request(authentication_url, data)
resp = urllib2.urlopen(req)
PageYouWantToOpen = urllib2.urlopen("http://www.roblox.com/develop").read()
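For readers on Python 3, where cookielib and urllib2 have become http.cookiejar and urllib.request, the script above might be ported roughly as follows. This is only a sketch using the standard library: the helper names make_opener, build_login_request, and has_cookie are hypothetical, the form fields are copied from the script above, and whether the site still accepts this login flow is not verified here.

```python
import http.cookiejar
import urllib.parse
import urllib.request

LOGIN_URL = "https://www.roblox.com/newlogin"  # taken from the script above


def make_opener(jar):
    """Build an opener that stores and resends cookies via `jar`."""
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-agent", "Mozilla/5.0")]
    return opener


def build_login_request(username, password, url=LOGIN_URL):
    """Return a POST request carrying the same fields as the script above."""
    payload = {"username": username, "password": password, "": "Log In"}
    # In Python 3 the request body must be bytes, not str.
    data = urllib.parse.urlencode(payload).encode("ascii")
    return urllib.request.Request(url, data=data)


def has_cookie(jar, name):
    """Report whether the jar currently holds a cookie with this name."""
    return any(cookie.name == name for cookie in jar)


# Hypothetical usage (performs real network requests, so it is left commented out):
# jar = http.cookiejar.CookieJar()
# opener = make_opener(jar)
# opener.open(build_login_request("YourUsernameHere", "YourPasswordHere"))
# if has_cookie(jar, ".ROBLOSECURITY"):
#     page = opener.open("http://www.roblox.com/develop").read()
```

Checking the cookie jar for .ROBLOSECURITY after the POST is one way to tell a successful login from a response that simply returns the login page again.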
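A few weeks ago I wrote this class while doing some web scraping and automated tab opening using only urllib.request. It might help you, or at least point you in the right direction.
import urllib.request

class Log_in:
    def __init__(self, loginURL, username, password):
        self.loginURL = loginURL
        self.username = username
        self.password = password

    def log_in_to_site(self):
        # realm=None only works as a catch-all with HTTPPasswordMgrWithDefaultRealm;
        # the handler's default password manager would never match it.
        password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        password_mgr.add_password(realm=None,
                                  uri=self.loginURL,
                                  user=self.username,
                                  passwd=self.password)
        auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
        # Install globally so later urllib.request.urlopen() calls use it.
        opener = urllib.request.build_opener(auth_handler)
        urllib.request.install_opener(opener)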
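Note that HTTPBasicAuthHandler only comes into play when a server answers with a 401 Basic challenge; it does not submit an HTML form. As a rough illustration of what that handler sends, the header value can be computed by hand as below. The function name basic_auth_header is hypothetical, and whether the site in the question accepts Basic authentication at all is not established in this thread.

```python
import base64


def basic_auth_header(username, password):
    """Return the Authorization header value used by HTTP Basic authentication."""
    # Credentials are joined with a colon, then base64-encoded (not encrypted).
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")
```

Because the credentials are only base64-encoded, this scheme should be used over HTTPS.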