I'm trying to scrape data from a URL, but I get an error when I use the find_elements() method. Here is my code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
import time
#opening the browser
PATH = "C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(PATH)
driver.get("https://www.bestbuy.com/")
#sending a request through the search bar
search = driver.find_element(By.NAME, 'st')
search.send_keys("dell xps")
search.send_keys(Keys.RETURN)
driver.maximize_window() # For maximizing window
driver.implicitly_wait(20) # gives an implicit wait for 20 seconds
#clicking a desired link by the class name
# driver.find_element(By.CLASS_NAME,'sku-title').click()
# paths:
# sku-title #class name for href links
# //*[@id="main-results"]/ol/li[1] #link class
# //*[@id="shop-sku-list-item-23022673"]/div/div/div[1]/div[3]/div[1]/div[1]/span[2] class name: sku-value #model
# //*[@id="pricing-price-23825371"]/div/div/div/div/div[1]/div/div[1]/div class name:priceView-hero-price priceView-customer-price #price
links = driver.find_elements(By.CLASS_NAME, 'sku-title')
for link in links:
    # href = link.find_element(By.XPATH, './/*[@id="main-results"]/ol/li[1]').text
    model = link.find_element(By.CLASS_NAME, 'sku-value').text
    # price = link.find_element(By.XPATH, './/*[@id="pricing-price-23825371"]/div/div/div/div/div[1]/div/div[1]/div').text
    print(model)
time.sleep(200)
driver.quit()
I expect to get the data (price, model, and link) as text.
To scrape the model numbers, you need to induce WebDriverWait for visibility_of_all_elements_located(). You can use the following locator strategy:
Code block:
driver.get("https://www.bestbuy.com/")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "a.us-link > img"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input.search-input"))).send_keys("dell xps")
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[title='submit search']"))).click()
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//span[@class='attribute-title' and text()='Model:']//following::span[1]")))])
driver.quit()
Console output:
['XPS9320-7523BLK-PUS', 'XPS9520-9195SLV-PUS', 'XPS9320-7585SLV-PUS', 'XPS9520-7171SLV-PUS', 'XPS9520-7294WHT-PUS', 'BBY-K2PKKFX', 'BBY-46J4FFX', 'XPS9720-7218PLT-PUS', 'BBY-W8PYKFX']
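For intuition about what the explicit wait in the code above does: WebDriverWait(...).until(...) calls the given condition repeatedly until it returns something truthy or the timeout expires. The following is a simplified pure-Python stand-in for that polling loop, not Selenium's actual implementation. The names wait_until, WaitTimeout and make_slow_page are hypothetical and exist only for this illustration.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical stand-in for Selenium's TimeoutException."""


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call condition() until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # e.g. a non-empty list of elements
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)


def make_slow_page(ready_after):
    """Simulate a page whose results only show up after a few polls."""
    calls = {"n": 0}

    def find_models():
        calls["n"] += 1
        # Empty list (falsy) until the simulated page is "rendered".
        return ["XPS9320-7523BLK-PUS"] if calls["n"] >= ready_after else []

    return find_models
```

Calling wait_until(make_slow_page(3), timeout=2, poll=0.01) keeps polling through the two empty results and returns the list on the third call. Selenium's real WebDriverWait polls every 0.5 seconds by default and also ignores NoSuchElementException while it is polling.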
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC