Every time I try to scrape a page's HTML, I get the same error:
selenium.common.exceptions.InvalidSelectorException: Message: invalid selector: An invalid or illegal selector was specified
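I know Selenium changed the way you select objects by class name, so I use the following code:
from selenium import webdriver
from selenium.webdriver.common.by import By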
driver = webdriver.Chrome("C:\\Users\\PC\\Downloads\\chromedriver_win32\\chromedriver.exe")
url = "https://www.oddsportal.com/football/england/premier-league/"
driver.get(url)
element1 = driver.find_elements(By.CLASS_NAME, "flex items-center gap-1 my-1 align-center w-[100%]")
print(element1)
driver.quit()
I also tried using "class name" instead of
By.CLASS_NAME
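You can use the CSS selector below. The spaces in the class attribute separate multiple classes, so each class gets its own leading dot, and the [, % and ] characters in the last class have to be backslash-escaped to be valid CSS:
driver.find_elements(By.CSS_SELECTOR, r".flex.items-center.gap-1.my-1.align-center.w-\[100\%\]")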
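If you need to do this conversion for more than one element, it can be scripted. The following is only a minimal sketch: class_attr_to_css is a hypothetical helper name (not part of Selenium), and it assumes that no class name starts with a digit, since that case needs a different kind of CSS escape.

```python
import re


def class_attr_to_css(class_attr: str) -> str:
    """Turn a space-separated class attribute into a compound CSS selector.

    Hypothetical helper, not a Selenium API. Assumes no class name starts
    with a digit (that would need a hex escape, which is not handled here).
    """
    def escape(name: str) -> str:
        # Backslash-escape every character that is not a letter, digit, '_' or '-'.
        return re.sub(r"([^A-Za-z0-9_-])", r"\\\1", name)

    # Each class gets a leading dot; joining them without spaces means
    # "one element that has all of these classes".
    return "".join("." + escape(name) for name in class_attr.split())


selector = class_attr_to_css("flex items-center gap-1 my-1 align-center w-[100%]")
print(selector)
```

The resulting string is meant to be passed to By.CSS_SELECTOR, not to By.CLASS_NAME.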
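This error message...
selenium.common.exceptions.InvalidSelectorException: Message: invalid selector: An invalid or illegal selector was specified
...means that the selector you passed is not valid. By.CLASS_NAME accepts a single class name, not a space-separated list of classes.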
Ideally, to extract the desired text you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:
Using CSS_SELECTOR:
driver.get("https://www.oddsportal.com/football/england/premier-league/")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#onetrust-accept-btn-handler"))).click()
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div.flex.items-center.gap-1.my-1.align-center > div > div"))).text)
Using XPATH:
driver.get("https://www.oddsportal.com/football/england/premier-league/")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#onetrust-accept-btn-handler"))).click()
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//div[contains(@class, 'flex items-center gap-1 my-1 align-center')]/div/div"))).text)
Console output:
Nottingham
1
:
2
Newcastle
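The .text value comes back as one multi-line string, as the output above shows. If you want the teams and the score as separate values, something like the sketch below may help. parse_match_text is a hypothetical helper, and it assumes the five-line layout seen above (home team, goals, ":", goals, away team); the real page may not always follow it.

```python
def parse_match_text(text: str) -> dict:
    """Split the multi-line element text into teams and goals.

    Hypothetical helper. Assumes exactly five non-empty lines:
    home team, home goals, ':', away goals, away team.
    """
    # Drop blank lines and surrounding whitespace.
    parts = [line.strip() for line in text.splitlines() if line.strip()]
    if len(parts) != 5 or parts[2] != ":":
        raise ValueError(f"unexpected layout: {parts!r}")
    home, home_goals, _, away_goals, away = parts
    return {
        "home": home,
        "home_goals": int(home_goals),
        "away": away,
        "away_goals": int(away_goals),
    }


sample = "Nottingham\n1\n:\n2\nNewcastle"
print(parse_match_text(sample))
# {'home': 'Nottingham', 'home_goals': 1, 'away': 'Newcastle', 'away_goals': 2}
```

Raising ValueError on an unexpected layout is a deliberate choice here, so that a change in the page markup is noticed instead of silently producing wrong scores.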
Note: you have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python.