Python 3 + Selenium cannot find the provided XPath

Question (votes: 0, answers: 4)

I am using Python 3 and Selenium to fetch some image links from a website, as follows:

import sys
import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.proxy import Proxy, ProxyType

chrome_options = Options()  
chrome_options.add_argument("--headless")

driver = webdriver.Chrome(chrome_options=chrome_options)
driver.get('https://www.sky.com/tv-guide/20200605/4101-1/107/Efe2-364')

link_xpath = '/html/body/main/div/div[2]/div[2]/div/div/div[2]/div/div[2]/div[1]/div/div/div[2]/div/img'

link_path = driver.find_element_by_xpath(link_xpath).text
print(link_path)

driver.quit()

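When you open that URL, you can see the image in question in the middle of the page. In Google Chrome, right-clicking the image and choosing Inspect opens Chrome DevTools, where you can right-click the element itself and copy the image's XPath.

Everything looks fine to me, but when I run the code above I get the following error: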

Traceback (most recent call last):
  File "G:\folder\folder\testfilepy", line 16, in <module>
    link_path = driver.find_element_by_xpath(link_xpath).text
  File "G:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "G:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 978, in find_element
    'value': value})['value']
  File "G:\Python36\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "G:\Python36\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/main/div/div[2]/div[2]/div/div/div[2]/div/div[2]/div[1]/div/div/div[2]/div/img"}
  (Session info: headless chrome=83.0.4103.61)

Can anyone tell me why Selenium cannot find the provided XPath?

python selenium xpath css-selectors webdriverwait
4 Answers
1
vote
To extract the src attribute of the image, use WebDriverWait with visibility_of_element_located(), together with either of the following locator strategies:

  • Using CSS_SELECTOR

    options = webdriver.ChromeOptions()
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    options.add_argument('--headless')
    options.add_argument('--window-size=1920,1080')
    driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
    driver.get('https://www.sky.com/tv-guide/20200605/4101-1/107/Efe2-364')
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.o-layout__item div.c-bezel.programme-content__image>img"))).get_attribute("src"))

  • Using XPATH

    options = webdriver.ChromeOptions()
    options.add_experimental_option("excludeSwitches", ["enable-automation"])
    options.add_experimental_option('useAutomationExtension', False)
    options.add_argument('--headless')
    options.add_argument('--window-size=1920,1080')
    driver = webdriver.Chrome(options=options, executable_path=r'C:\WebDrivers\chromedriver.exe')
    driver.get('https://www.sky.com/tv-guide/20200605/4101-1/107/Efe2-364')
    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='o-layout__item']//div[@class='c-bezel programme-content__image']/img"))).get_attribute("src"))

  • Console output:

    https://images.metadata.sky.com/pd-image/251eeec2-acb3-4733-891b-60f10f2cc28c/16-9/640

  • Note: you have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC

  • Reference

    You can find some detailed discussions of NoSuchElementException in:


    1
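    vote
    Your XPath is correct, but avoid absolute paths: they break as soon as the page layout changes. Try this relative XPath instead: //div[@class="c-bezel programme-content__image"]//img

    To get the value you are after, use .get_attribute("src") rather than .text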
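    A minimal sketch of why a positional path is fragile, using only the standard library. The markup is invented for illustration and is not the real page; xml.etree.ElementTree supports only a subset of XPath and rejects a leading "/", so the positional path is written relative to the root element. Only the class name is taken from the answer.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for the page: the image sits inside a div with a known class.
ORIGINAL = """<html><body><main>
  <div><div class="c-bezel programme-content__image"><img src="a.jpg"/></div></div>
</main></body></html>"""

# The same page after the site inserts one extra element ahead of the content.
CHANGED = """<html><body><main>
  <div class="banner"/>
  <div><div class="c-bezel programme-content__image"><img src="a.jpg"/></div></div>
</main></body></html>"""

# Positional path: depends on the exact nesting and order of the divs.
POSITIONAL = "./body/main/div[1]/div/img"
# Attribute-based path: depends only on the class of the image's container.
BY_CLASS = ".//div[@class='c-bezel programme-content__image']//img"


def find_src(markup, path):
    """Return the src of the first element matching path, or None if nothing matches."""
    root = ET.fromstring(markup)
    img = root.find(path)
    return None if img is None else img.get("src")


print(find_src(ORIGINAL, POSITIONAL))  # a.jpg
print(find_src(CHANGED, POSITIONAL))   # None: div[1] is now the banner
print(find_src(CHANGED, BY_CLASS))     # a.jpg
```

    The positional path stops matching after a single unrelated element is added, while the class-based path keeps working.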

    driver.get('https://www.sky.com/tv-guide/20200605/4101-1/107/Efe2-364')
    element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, '//div[@class="c-bezel programme-content__image"]//img')))
    print(element.get_attribute("src"))
    driver.quit()

    Or, better, use a CSS selector. This should be faster:

    element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.c-bezel.programme-content__image > img')))
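    On the advice above to read the src attribute rather than .text: an img element carries its URL in an attribute and has no text content, so asking for its text yields nothing useful. The sketch below illustrates this with the standard library's html.parser standing in for the browser DOM; the markup and the example.com URL are made up for illustration.

```python
from html.parser import HTMLParser


class ImgInspector(HTMLParser):
    """Records the attributes of every <img> tag and any non-blank text nodes."""

    def __init__(self):
        super().__init__()
        self.img_attrs = []  # one dict of attributes per <img> tag
        self.text = []       # non-blank text found anywhere in the markup

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.img_attrs.append(dict(attrs))

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


# Invented markup; only the class name comes from the answer above.
MARKUP = (
    '<div class="c-bezel programme-content__image">'
    '<img src="https://example.com/poster.jpg" alt="Poster">'
    "</div>"
)

inspector = ImgInspector()
inspector.feed(MARKUP)
inspector.close()

print(inspector.text)                 # []
print(inspector.img_attrs[0]["src"])  # https://example.com/poster.jpg
```

    The URL is reachable only through the attribute, which is the value that get_attribute("src") reads in Selenium.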


    0
    votes
    When working in headless mode, it is usually a good idea to increase the window size. Add this line to your options:

    chrome_options.add_argument('window-size=1920x1080')


    0
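    votes
    Your XPath appears to be correct. The element cannot be located because you forgot to handle the cookie prompt. Try it yourself: pause the driver for a few seconds and click to accept all cookies, and you will then see your element. There are several ways to handle cookies. I was able to locate the element with an XPath of my own, which is cleaner: I reached the element from its nearest parent.

    Hope this helps.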
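    The cookie-handling idea can be sketched as follows. This is an illustration under assumptions, not the answerer's code: the answer does not give the consent button's locator, so consent_xpath is a hypothetical parameter you would have to fill in after inspecting the page. The helper polls in the same spirit as an explicit wait, and it relies only on driver.find_elements(by, value), is_displayed(), click() and get_attribute(), so it needs no Selenium import itself.

```python
import time

XPATH = "xpath"  # same string value as selenium.webdriver.common.by.By.XPATH


def wait_for_first(driver, by, value, timeout=10.0, poll=0.5):
    """Poll driver.find_elements until a displayed match appears; return it or None."""
    deadline = time.monotonic() + timeout
    while True:
        for element in driver.find_elements(by, value):
            if element.is_displayed():
                return element
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


def image_src_after_consent(driver, consent_xpath, image_xpath, timeout=10.0):
    """Click the consent button if one shows up, then return the image's src (or None)."""
    # consent_xpath is hypothetical: it depends on the site's cookie banner.
    consent = wait_for_first(driver, XPATH, consent_xpath, timeout)
    if consent is not None:
        consent.click()
    image = wait_for_first(driver, XPATH, image_xpath, timeout)
    return None if image is None else image.get_attribute("src")
```

    With a real driver, the call might look like image_src_after_consent(driver, consent_xpath, '//div[@class="c-bezel programme-content__image"]//img'), where the image XPath is the one suggested in an earlier answer. If no banner appears, the function only loses the timeout before moving on to the image.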
