How do I download a PDF file with Selenium?


I am trying to download a PDF from the following site: http://esaj.tjsp.jus.br/cjsg/getArquivo.do?conversationId=&cdAcordao=16548741

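The first thing I did was add a time.sleep so that I can solve the reCAPTCHA by hand (I am not sure whether that step can be automated). After that, I expected the download to start. However, nothing happens.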

Here is my code:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import time

if __name__ == "__main__":

    url = "http://esaj.tjsp.jus.br/cjsg/getArquivo.do?conversationId=&cdAcordao=16548741"
    service = Service(executable_path='./chromedriver.exe')
    options = webdriver.ChromeOptions()
    
    options.add_experimental_option('prefs', {
         'download.default_directory': r'C:\Users\....',  # raw string avoids the invalid "\." escape
         'download.prompt_for_download': False,
         "download.directory_upgrade": True,
         'plugins.always_open_pdf_externally': True})
    
    driver = webdriver.Chrome(service=service, options=options)
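    try:
        driver.get(url)
        print('Acesso')

        # Pause so the reCAPTCHA can be solved by hand in the browser window.
        time.sleep(180)
        print('Sucesso!')

        # Note: implicitly_wait() only sets the timeout used for element lookups;
        # it does not pause the script or wait for a download to finish.
        driver.implicitly_wait(10)
        print('Download concluido...')

    except Exception as e:
        print(f"Ocorreu um erro: {e}")
    finally:
        # Close the browser exactly once, whether or not an error occurred.
        driver.quit()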
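
One thing that may be worth checking in the code above: driver.implicitly_wait(10) only configures how long element lookups may take, so the script reaches driver.quit() without ever checking whether a file arrived. Below is a minimal sketch of a helper that polls the download folder instead. It assumes Chrome's usual behaviour of writing partial downloads with a ".crdownload" suffix; the names wait_for_new_file and download_dir are hypothetical and not part of Selenium. It does not address whether the site actually serves the PDF once the reCAPTCHA is solved.

```python
import os
import time


def wait_for_new_file(directory, before, timeout=60.0, poll_interval=0.5, suffix=".pdf"):
    """Poll `directory` until a new file ending in `suffix` appears.

    `before` is the set of file names that existed before the download was
    triggered. Returns the full path of the new file, or None on timeout.
    Assumption: the browser marks unfinished downloads with ".crdownload".
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = set(os.listdir(directory))
        # Any ".crdownload" file means a download is still being written.
        in_progress = [name for name in current if name.endswith(".crdownload")]
        finished = sorted(
            name for name in current - before if name.lower().endswith(suffix)
        )
        if finished and not in_progress:
            return os.path.join(directory, finished[0])
        time.sleep(poll_interval)
    return None


# Possible usage inside the script (download_dir is whatever folder was set in
# 'download.default_directory'):
#
#     before = set(os.listdir(download_dir))
#     driver.get(url)
#     path = wait_for_new_file(download_dir, before, timeout=180)
#     print(path or "no PDF appeared before the timeout")
```

Taking the snapshot before driver.get(url) means a file that finishes quickly is still detected, and the 180-second timeout can replace the fixed time.sleep(180), since the loop returns as soon as the PDF shows up.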

python selenium-webdriver web-scraping web-crawler