Learning to scrape with Selenium and Python

Question (votes: 0, answers: 3)

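I'm learning to use Selenium, but I can't connect to the site 'http://www.festo.com/cat/it_it/products_VUVG_S?CurrentPartNo=8043720'.

The site's content does not load.

I want to learn how to connect to this site so that I can request images and data.

My code is simple because I'm still learning. I've been looking for a way to establish the connection, without success: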

import time

from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

# Override the user agent sent by Firefox
ff_profile = FirefoxProfile()
ff_profile.set_preference("general.useragent.override", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.85 Safari/537.36")
driver = webdriver.Firefox(firefox_profile=ff_profile)
driver.get('http://www.festo.com/cat/it_it/products_VUVG_S?CurrentPartNo=8043720')
time.sleep(5)  # fixed pause to give the page time to load
campo_busca = driver.find_elements_by_id('of132')
print(campo_busca)
python selenium iframe web-scraping webdriverwait
3 answers
0
votes

Try this. For more information, see here.

import time

from selenium import webdriver
from selenium.webdriver.firefox.options import Options as FirefoxOptions

FIREFOX_DRIVER_PATH = "your_geckodriver_path"
firefox_options = FirefoxOptions()
firefox_options.headless = True

# set options as required for Firefox
firefox_options.add_argument("--no-sandbox")
firefox_options.add_argument("--disable-setuid-sandbox")
firefox_options.add_argument('--disable-dev-shm-usage')
firefox_options.add_argument("--window-size=1920,1080")
driver = webdriver.Firefox(firefox_options=firefox_options, executable_path=FIREFOX_DRIVER_PATH)
driver.get('http://www.festo.com/cat/it_it/products_VUVG_S?CurrentPartNo=8043720')
time.sleep(5)

campo_busca = driver.find_elements_by_id('of132')

print(campo_busca)

0
votes

Download the driver from this link and put it in a folder, then copy its full path and paste it below:

import time

from selenium import webdriver
from selenium.webdriver.firefox.options import Options as FirefoxOptions

FIREFOX_DRIVER_PATH = "driver_path"

firefox_options = FirefoxOptions()

# headless hides the browser window; set to False or comment out to see the GUI
firefox_options.headless = True

driver = webdriver.Firefox(firefox_options=firefox_options, executable_path=FIREFOX_DRIVER_PATH)
driver.get('http://www.festo.com/cat/it_it/products_VUVG_S?CurrentPartNo=8043720')
time.sleep(3)

campo_busca = driver.find_elements_by_id('of132')

print(campo_busca)

0
votes

Since the desired element is inside an <iframe>, to extract the src attribute of the desired element you must:
