Bypassing an invisible reCAPTCHA with requests

Question (votes: 0, answers: 2)

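I'm trying to scrape a website with requests so that my bot is more efficient. While stepping through the flow in the browser and watching the GET and POST requests in the network monitor, I noticed an invisible reCAPTCHA that stops me from continuing the process with the requests library.

I tried to reproduce the exact GET and POST requests I saw in the network monitor, including using a Session object and keeping the cookies I received, but the reCAPTCHA is (I think) what stops the interaction from actually working. I looked for a third-party API, but as far as I understand it only works when you actually open a web driver or when the payload contains a key you have to supply, which isn't needed in this case because the key is empty. I also saw some solutions that use Selenium, but I'm trying to avoid it because it takes more time and I need this to be as efficient as possible.

Here is the code I'm using to attempt the first step: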

import requests
from pypasser import reCaptchaV3
    
headers = {
    'Accept' : 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
    'Accept-Encoding' : 'gzip, deflate, br',
    'Accept-Language':'en-US,en;q=0.5',
    'Connection':'keep-alive',
    'Host':'agendamentosonline.mne.gov.pt',
    'Sec-Fetch-Dest':'document',
    'Sec-Fetch-Mode':'navigate',
    'Sec-Fetch-Site':'none',
    'Sec-Fetch-User':'?1',
    'Upgrade-Insecure-Requests':'1',
    'User-Agent':'what my browser provides me'
}
    
# A Session keeps cookies between requests automatically.
session = requests.Session()
resp = session.get('https://agendamentosonline.mne.gov.pt/AgendamentosOnline/app/scheduleAppointmentForm.jsf', headers=headers)
# Copy of the session cookies (redundant, since the session already sends them).
cookies = requests.utils.cookiejar_from_dict(requests.utils.dict_from_cookiejar(session.cookies))
# Anchor URLs copied from the network monitor: one with size=invisible, one with size=normal.
recaptcha_response_1 = reCaptchaV3('https://www.recaptcha.net/recaptcha/api2/anchor?ar=1&k=6LcKSTIfAAAAAEAxSrS8rAFiDux_eX2DBPnCXGkR&co=aHR0cHM6Ly9hZ2VuZGFtZW50b3NvbmxpbmUubW5lLmdvdi5wdDo0NDM.&hl=en&v=pn3ro1xnhf4yB8qmnrhh9iD2&size=invisible&cb=1ozwitjudoci')
recaptcha_response_2 = reCaptchaV3('https://www.recaptcha.net/recaptcha/api2/anchor?ar=1&k=6Ld7jDcgAAAAAJTt2PDioLNT69IwwEPvlhqa94K7&co=aHR0cHM6Ly9hZ2VuZGFtZW50b3NvbmxpbmUubW5lLmdvdi5wdDo0NDM.&hl=pt-PT&v=pn3ro1xnhf4yB8qmnrhh9iD2&size=normal&cb=ans33lqyh70u')
print(recaptcha_response_1)
print(recaptcha_response_2)

payload_1 = {
        'javax.faces.partial.ajax': 'true',
        'javax.faces.source': 'scheduleForm:tabViewId:dataNascimento',
        'javax.faces.partial.execute': 'scheduleForm:tabViewId:dataNascimento',
        'javax.faces.behavior.event': 'blur',
        'javax.faces.partial.event': 'blur',
        'scheduleForm': 'scheduleForm',
        'javax.faces.ViewState': '-7004941669221712425:6117221410156700201',  # hard-coded from a browser session; JSF normally issues a new value per page load
        'scheduleForm:tabViewId:ccnum': '****',
        'scheduleForm:tabViewId:dataNascimento_input': '***',
        'scheduleForm:tabViewId_activeIndex': '0',
        'g-recaptcha-response': '',
        'scheduleForm:respV2': ''
}

resp = session.post('https://agendamentosonline.mne.gov.pt/AgendamentosOnline/app/scheduleAppointmentForm.jsf', headers=headers, data=payload_1, cookies=cookies)

print(resp.text)

It does give me a response, just not the one I get when the flow works in the browser.
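One possible cause of the different response, separate from the reCAPTCHA, is the hard-coded `javax.faces.ViewState` value in `payload_1`: JSF pages usually embed a fresh value in every rendered form, so a value copied from an earlier browser session may be rejected. Below is a minimal sketch of reading it from the GET response instead. It assumes the page contains a standard hidden input named `javax.faces.ViewState` (not verified against this site), and `extract_view_state` is a hypothetical helper name, not part of requests.

```python
from html.parser import HTMLParser


class _ViewStateParser(HTMLParser):
    """Collects the value of the first <input name="javax.faces.ViewState">."""

    def __init__(self):
        super().__init__()
        self.view_state = None

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <input ... />.
        attr_map = dict(attrs)
        if (
            self.view_state is None
            and tag == "input"
            and attr_map.get("name") == "javax.faces.ViewState"
        ):
            self.view_state = attr_map.get("value")


def extract_view_state(html):
    """Return the ViewState value found in `html`, or None if it is absent."""
    parser = _ViewStateParser()
    parser.feed(html)
    return parser.view_state


# Illustrative markup only; the real page may differ.
sample = (
    '<form id="scheduleForm">'
    '<input type="hidden" name="javax.faces.ViewState" value="123:456" />'
    "</form>"
)
print(extract_view_state(sample))
```

In the script above this would be called as `extract_view_state(resp.text)` right after the GET, and the result stored in `payload_1['javax.faces.ViewState']` before the POST.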

python web-scraping python-requests bots recaptcha
2 Answers
0
votes

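Hi! I read your question and see that you're using a Python script. Have you tried the capmonster-python module?

Take a look at that module: if the CAPTCHA is getting in the way of your script, use it to solve the CAPTCHA and then run the rest of the script.

Thanks for reading.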


-2
votes

It is actually entirely possible to do this:

import requests

url = "https://www.recaptcha.net/recaptcha/api2/anchor?ar=1&k=6Lc4FowfAAAAAPiXf6lyEPAe_84kesFIGdJUdRTe&co=aHR0cHM6Ly94dXNoYW54aWFuZy5jb206NDQz&hl=en&v=pn3ro1xnhf4yB8qmnrhh9iD2&size=invisible&cb=v0hsolko5hl1"
response = requests.get(url)  # fetch the anchor page that carries the recaptcha-token value
token = response.text.partition('id="recaptcha-token" value="')[-1].partition('">')[0]  # take the text between the marker and the closing '">'
post = requests.post("https://www.google.com/recaptcha/api2/reload?k=6Lc4FowfAAAAAPiXf6lyEPAe_84kesFIGdJUdRTe",data=f"v=UFwvoDBMjc8LiYc1DKXiAomK&reason=q&c={token}&k=6Lc4FowfAAAAAPiXf6lyEPAe_84kesFIGdJUdRTe&co=aHR0cHM6Ly9hY2NvdW50cy5zcG90aWZ5LmNvbTo0NDM.&hl=en&size=invisible&chr=%5B61%2C36%2C84%5D&vh=7349404152&bg=!d3GgcVjIAAX6VNIG-kc72sZkL7AELV23BHEg3iiH1gcAAADHVwAAAAttAQecCimhpOYJBHsHw4TnDQnJAUU1KJWxkMVvr9kGAhPbfpEnRsIzZxDoK8WNA4Xk_jX6YLNl5cj97gy8xe0qj2UogYjr5xxWaD7OHCEWXqDqFHo9zQkvm1Jr-3PhDQbPfdz_WeOLnRGfdAlF7f6kTVJj8r_mdAx3g-11hZ4fXQpAMZ0qWUVIHOx4N86v_InW_G-9vhB6bzY_Xg1rQvsjsor6h9-BUi6cUZMmvYAn78v7JLBPZSdpWYD285rwy35stcDw5cYF8ruxzI_IqsNA6NAZWA4k1n-PuM5pxQDzLsrkD5oXB839hlcFKldmlsFx074KtmlcmvUVrD2O4Q8hNqjNTjRDSRqfZzcChfvqRKsx1DyhuXnz5dYAAR1ASd47CXlwOBdU6gCke1gRtLFtfSBMDvrhg7jK3uVK3jM-0q66IZwyUZHosVS0tI6DRRdXK6owLFJZi3lLnBzbASXdOaUGQHrzDFjQAbU76NE-neuga0bExaNraPqN0wYUBz1D0IPJ5kYLPQNArW2Z9a-to_yP1Oo7IDJlty6h9jTS7D32mQinK7JskejX_kPrchsfCCmNTQVmzSVnky-3WK4okaHXDKe9EHTTD2q5yQMRKNhHiHhifVwZ7fbBuaAP36m2qOxSrjrz6Em_HkCqAQb5GArJCVq7w04FuDxW7AGg086NIlp5QkST-AkJjCxn9BnV_5_37x_K8-vkMgIgND-pOet9HMf-4yrI7QrI8odYI9mmdEHwlSbKyWBkfnWFhTTL096dlgvDsNIiPoYIqdSjRAc1hkb2ToGqHTKD9VsfpcH79bBg_reEC7EK2ifubkSRhmz_LGhlN5wRTr_DuhjO0_pH8-TGKDCLlQJQ-jWC97z979drLh97I6wuXroq3xwfymQ9iDs2glksEExM78hbfVQsdRhLiUtDFkYxjinFEb325zUQJR6xT-yXNcLfLwDLWwfL6nuLS14IXKmENSW-6OIkXXJyDhgUKW6B1Reyll5b1s9A7OpsAtQH6H5rQzm6422zF12dO9JODF4UErQv43JpQu_wYG-VRUwGWcHvcrG9vp1c8mXjNfxoE2Ok0tTNXQXLr1DacKk4mG2YZF7X8xkPjDqW1XH6w8kde64MoCbMlc2u4yv9x-44P2XXMCDppFLsRLekxW00kAXP__rWpvNoEtt4PbI_Y_d9-lSLd6WDQ5mZObuIdo6BAS095B443-CNTb_4IAp9-4puXY_WU1cbkvt0hsV9iFjkcbsjAw1xmwBZoVg1ukewp4kPWL-oVVlGJYuQm_7AvAjZ6nRIRv_f7KebJQr-bY6wD3asqUzEZ8DHOLUJeScIFtDTFzAg9SxkP1dYde7y9umqn3a_3OyFR6iulqy-c0LoULRNh1DXG4KXKaabC4f7cixdSWPazY58wiic3ysahAsbaFGv_LzwFCy7uP0M7zKiwadGSOH_gaROuLTHbRnbEvPAgaa3zyP9mFPNhy_AsgKOAy5iA4l9qaBiVXwrWpXVyuQgsliVcmpeLSrMg9fpbb7LGcLv9dz5LxUetPIDUndRuJnW6xCyNakiVQMy6vF9l9qkEoHRjua7sPWZnJC29zjHdTgEVy5SOKinYGOBs1GlCFSSyIjBWix
BWXH83hCjdd3TDJjQsNtDsRMr8mVlyiEqKkIttz1-2mV2ZhA6FmJ3Ldm-tnlQN5iaIM40HKbbrHDuDKWhdWXsouO2BfLJxDDvu--e251eTYOuFIHRQzCUg-y4LddffPqdFpemjCsJC1xHTx2DQXZTdQTv9n0FK9GTwRizJxnYqn5lDoXZv4MtG4tSZFjx8U9KnNBApFcDXXfFeymVWT5miPlimr9zSBRGEdzCAv-NctpVXrSwd3Yzpsj_eGFT91owgeAzjOnPFWMod2XCZbEywubJI-0QsHFxwiGbsnXtV2fXQOzdpdQKVcynj7gQhIJHQogB0M4achR6TmT-7dRvNkffa17qyRqpoIXhbpbvC9cgQG4VQaYjlhpeiWNHM7uTW1-5cdkZOcVqxsU5c1fzMpv77BuRYm--EJpFZSfsihySlvcGWVnz1qS-deD4gGUa8un8j3v0-YAu4llS6vC9OpCb-khnh7SgHk-a19cLD9m6mXVu9EJlV2gbdMcKouobkIljeKBT6ivhkemTe_peKfgDjgFSJfJ7Hxey2LR7nG1YW2FVv6kAOPRfoHNUf2OEvUHcvZ00jc7nJZTfrt-8nmutFD9C59MQ5HWvtIK5XobaAxyunaZon6iZxiFFRDo2o-xl6TwCuHYvmVWl6mAr5kn5QDlclIKc6hrIq8osYCcukWMZhu7L9wsyVMy1WC2GhXdWlTZnaJjqLtGBsxaTCbzND4nZ0zGEsGMX180J-y1PQ3EY3nP0e4ToqO8rXPi6lZ4GmGTpm0XypZ0jkf1xnU1FacQhmpVmIKru8kbjjChfywMM2exkn3E7CINxQS77i81vn3c8fWcdvKQ9lVProo60Yzea7RpjOdnfk9T4CcjV-J941093qAttWyknhB661xBQCzOXFB0euSb8Jn-J_5tSgX4NE1AyNXQEA5wk6km6tT3UUyK3yTEn8oynK_FZz_p4W4BGy_sCUm0IG43ioT-17L2CoQAzk2ZE5g4eh7jkASVHBeXREbMWtB4YdO-gPwxIrWVVOiN57jSDi5yM08wgBqKeAVYLHXFFuUG7konyayI7tTwxYjN0j7T9nGR2Jh1wmA-q99D4tbsM2AvWIWn9j3g83JBF4nqzS4lt72WUpL3kAdbOz2xwRKaWLFaEsaM9jQeg2ijJpTNqRlKxtXneWqjkca5JZCZEmGCbplWJAEARNOEVHWd00dc2dCt8KEHiBiAP86L6loq_QvD-kbLd1bd9S8FqCMVFRcOwOOBvUEBm1D-mJiA2KWBJ9T87kcAQmLRQxrTuGHMojr9cBtKz-2afsMXRPoCPmRc-dDwiYOXUgdERgEH6lifStYMTZcjS66GGA-0UccFdY2yAl7TG6b3lDa-lbTwSJHESj_UrH3neTMf2U8Z6rFWsTIHa5XfQ8nFacgUyokFLtzxGH57QQqRUc0bfqEouu_o8S0galOM1p2uaZqrrdvAbq31i-xMU5CqW0_WVG3REfA6SY0CJLXOs8mzwGFgZJEpr374MMRL6JEUu7qd_jib4P9-O8pvKFk7tfPTccXWq12b1gj7SsA6sdeffMMG1gpD-kYGud8ghD6x9sevkZ-IRveRZQmUCqXvT6rl-YOfyBTDsv2vpqD1kXxGSNV206XBFw6bFQB583TBhFWfm3p6nc1s3p-KY4oIMR1l6Z5Ccfh7CWv7EYNkbjwfsrk1PXoI38vy4cT8ttz49TQ5WSPSBgeZuAKUlX0Hml2C2xtis_a3YABvB4UsJK65Rg7hQCWLAlX8HYLeVYiUiqh31LE5JUPayYiC0nxQADw7A6-6hFtJqQz84LrxMw7a-Q59R0VwCqWGfCebmRh_BVlg", headers={
    "content-type": "application/x-www-form-urlencoded"
})  # POST the token to get the "rresp" token
print(post.text)  # parse the token out of this response with json or similar

# then make your own request

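reCAPTCHA v3 is bad, and this is a way to bypass it without paying a cent. The example is based on the demo site https://xushanxiang.com/demo/recaptcha/; you can adapt it to your own site by replacing the reCAPTCHA site key (k=). I hope this helps. The bypass is generic and works on any website. It does not work against reCAPTCHA Enterprise, however; for that you need a solver such as 2captcha.com.

Cheers!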
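When adapting the snippet, it may be worth checking the `co` parameter as well as `k`. In the URLs in this answer it appears to be the page origin, base64-encoded with `.` in place of the `=` padding. Decoding the two values shows that the anchor URL carries the demo site's origin while the POST body carries a different one. A small sketch of the decoding, where `decode_co` is a hypothetical helper name:

```python
import base64


def decode_co(co):
    """Decode a `co` query value back to the origin string it encodes."""
    stripped = co.rstrip(".")  # '.' stands in for '=' padding in these URLs
    padded = stripped + "=" * (-len(stripped) % 4)
    return base64.urlsafe_b64decode(padded).decode("ascii")


# `co` from the anchor URL in this answer
print(decode_co("aHR0cHM6Ly94dXNoYW54aWFuZy5jb206NDQz"))
# -> https://xushanxiang.com:443

# `co` from the POST body in this answer
print(decode_co("aHR0cHM6Ly9hY2NvdW50cy5zcG90aWZ5LmNvbTo0NDM."))
# -> https://accounts.spotify.com:443
```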
