Python 3 web scraping: can't log in to a page. Is the timestamp the problem?

Question (votes: 0, answers: 1)

I've just started learning web scraping with Python 3, and I want to log in to this site: https://dienynas.tamo.lt/Prisijungimas/Login

The required form data is: UserName: username, Password: (omitted), IsMobileUser: false, ReturnUrl: '' (empty), RequireCaptcha: false, Timestamp: 2020-03-31 14:11:21, SToken: 17a48bd154307fe36dcadc6359681609f4799034ad5cade3e1b31864f25fe12f

Here is my code:

from bs4 import BeautifulSoup
import requests
from lxml import html
from datetime import datetime

data = {'UserName': 'username',
           'Password': 'password',
           'IsMobileUser': 'false',
           'ReturnUrl': '',
           'RequireCaptcha': 'false'
           }

login_url = 'https://dienynas.tamo.lt/Prisijungimas/Login'
url = 'https://dienynas.tamo.lt/Pranesimai'

with requests.Session() as s:
    r = s.get(login_url)
    soup = BeautifulSoup(r.content, "lxml")
    AUTH_TOKEN = soup.select_one("input[name=SToken]")["value"]
    now = datetime.now()
    data['Timestamp'] = f'{now.year}-{now.month}-{now.day} {now.hour}:{now.minute}:{now.second}'
    data["SToken"] = AUTH_TOKEN
    r = s.post(login_url, data=data)
    r = s.get(url)
    print(r.text)

With this I can't log in to the page. I think I'm building the timestamp incorrectly? Please help :)
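One visible difference from the sample form data is zero padding: the f-string above writes single-digit fields without a leading zero, while the sample timestamp (2020-03-31 14:11:21) uses fixed-width fields. The sketch below only illustrates that formatting difference; whether the server actually rejects the unpadded form is an assumption, and the function names are made up for illustration.

```python
from datetime import datetime


def naive_stamp(now):
    # Same formatting as the question's f-string: fields are not zero-padded.
    return f'{now.year}-{now.month}-{now.day} {now.hour}:{now.minute}:{now.second}'


def padded_stamp(now):
    # strftime pads month, day, hour, minute and second to two digits.
    return now.strftime('%Y-%m-%d %H:%M:%S')


# A fixed sample moment so the difference is easy to see.
sample = datetime(2020, 3, 31, 9, 5, 7)
print(naive_stamp(sample))   # 2020-3-31 9:5:7
print(padded_stamp(sample))  # 2020-03-31 09:05:07
```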

Edit: Today I made some changes, because I found that most of the data I need is in hidden inputs, so:

data = {'UserName': 'username',
        'Password': 'password',
        }

AUTH_TOKEN = soup.find("input", {'name': "SToken"}).get("value")
Timestamp = soup.find("input", {'name': "Timestamp"}).get("value")
IsMobileUser = soup.find("input", {'name': "IsMobileUser"}).get("value")
RequireCaptcha = soup.find("input", {'name': "RequireCaptcha"}).get("value")
ReturnUrl = soup.find("input", {'name': "ReturnUrl"}).get("value")

and added them to the data dictionary. I also tried setting headers:

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36'}
r = s.post(login_url, data=data, headers=headers)

But none of that helped. Is there perhaps a way to find out why I can't log in?
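One way to narrow this down is to inspect what the POST actually returned instead of going straight to the next page. The sketch below is a minimal helper, assuming that a failed login re-renders the form with its Password input; that behaviour is a guess about this site, and `login_form_present` is a hypothetical name. It uses the standard library's html.parser rather than BeautifulSoup so that it runs without extra packages.

```python
from html.parser import HTMLParser


class _InputNames(HTMLParser):
    """Collect the name attribute of every <input> tag in a page."""

    def __init__(self):
        super().__init__()
        self.names = set()

    def handle_starttag(self, tag, attrs):
        if tag == 'input':
            name = dict(attrs).get('name')
            if name:
                self.names.add(name)


def login_form_present(page_html, field='Password'):
    # True if the page still contains the given login field,
    # which suggests the server sent the login form back.
    parser = _InputNames()
    parser.feed(page_html)
    return field in parser.names


# Possible use right after the POST in the question's session:
#   r = s.post(login_url, data=data)
#   print(r.status_code, r.url, r.history)
#   print('login form still shown:', login_form_present(r.text))
```

If the form is still shown, comparing the posted fields against the ones the browser sends (visible in the browser's network tab) is usually the next step.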

python-3.x web-scraping beautifulsoup request timestamp
1 answer
0 votes

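I agree with you. It looks like you are not sending the correct timestamp. The site has a hidden input for it, so you can either scrape it and send it back just like the token, or generate the same timestamp yourself in the time zone the site uses.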

from bs4 import BeautifulSoup
import requests
from lxml import html
from datetime import datetime
from pytz import timezone


data = {'UserName': 'username',
           'Password': 'password',
           'IsMobileUser': 'false',
           'ReturnUrl': '',
           'RequireCaptcha': 'false'
           }

login_url = 'https://dienynas.tamo.lt/Prisijungimas/Login'
url = 'https://dienynas.tamo.lt/Pranesimai'

with requests.Session() as s:
    r = s.get(login_url)
    soup = BeautifulSoup(r.content, "lxml")
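    # Scrape the hidden fields that the login form expects
    AUTH_TOKEN = soup.find("input", {'name': "SToken"}).get("value")
    Timestamp = soup.find("input", {'name': "Timestamp"}).get("value")  # e.g. 2020-03-31 15:36:37
    # 'Etc/GMT-3' means UTC+3 (the sign in Etc/GMT zone names is inverted)
    now = datetime.now(timezone('Etc/GMT-3'))
    # strftime zero-pads every field, matching the format of the site's value
    data['Timestamp'] = now.strftime('%Y-%m-%d %H:%M:%S')  # e.g. 2020-03-31 15:36:36
    print('Timestamp from website', Timestamp)
    print('Timestamp from python', data['Timestamp'])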
    data["SToken"] = AUTH_TOKEN
    r = s.post(login_url, data=data)
    r = s.get(url)
    print(r.text)
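The first option (sending back what the page provides) can be generalised so that no hidden field is typed by hand. The sketch below collects every hidden input from the login page and adds the credentials on top. It is an illustration under assumptions: `HiddenInputs`, `build_payload` and the sample HTML are made up for this example, and it uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages.

```python
from html.parser import HTMLParser


class HiddenInputs(HTMLParser):
    """Collect name -> value for every <input type="hidden"> tag."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        is_hidden = (attr.get('type') or '').lower() == 'hidden'
        if tag == 'input' and is_hidden and attr.get('name'):
            # A missing value attribute is sent as an empty string.
            self.fields[attr['name']] = attr.get('value') or ''


def build_payload(page_html, username, password):
    # Start from the hidden fields, then add the credentials.
    parser = HiddenInputs()
    parser.feed(page_html)
    payload = dict(parser.fields)
    payload['UserName'] = username
    payload['Password'] = password
    return payload


# Hypothetical markup standing in for the real login page.
sample_html = '''
<form>
  <input type="hidden" name="SToken" value="abc123">
  <input type="hidden" name="Timestamp" value="2020-03-31 15:36:37">
  <input type="hidden" name="ReturnUrl" value="">
  <input type="text" name="UserName">
</form>
'''
print(build_payload(sample_html, 'username', 'password'))
```

In the session above this would replace the hand-built dictionary, for example `data = build_payload(r.text, 'username', 'password')` after the initial `s.get(login_url)`, so the timestamp and token always come from the same page load.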