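I am trying to click every link inside the ListNews div on the website below (chinalaborwatch).
From what I have read, the following should work, but it clicks only one link and then stops.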
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome(executable_path=r"C:\webdrivers\chromedriver.exe")
driver.get("http://www.chinalaborwatch.org/news")
# Waits for the single element at this XPath to become clickable, then clicks it once
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, '/html/body/form/div[5]/div/div[2]'))).click()
What am I missing?
Thanks!
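Your wait returns a single element, so only that one element gets clicked. Instead, you can collect the list of URLs first, then visit each of them and scrape the data you want. Reading every href before navigating also avoids stale element references, since elements located on the news page are no longer valid once the browser loads another page.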
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("http://www.chinalaborwatch.org/news")

# Each child div of .ListNews holds one news item.
# find_elements(By..., ...) replaces find_elements_by_css_selector, which Selenium 4.3 removed.
element_list = driver.find_elements(By.CSS_SELECTOR, '#form1 > div:nth-child(5) > div > div.ListNews > div')
# Collect all the URLs up front, before any navigation
url_list = [element.find_element(By.TAG_NAME, 'a').get_attribute('href') for element in element_list]

for url in url_list:
    driver.get(url)  # load each article page in turn
    # The rest is up to you: scrape the text you want here.
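If you want to reuse the "collect first, then visit" pattern, it could be wrapped in two small helpers along these lines. This is only a sketch: `collect_hrefs`, `visit_all`, and the `scrape` callback are hypothetical names, not part of Selenium, and the code assumes only that each element has `get_attribute` and that the driver has `get`, as Selenium's objects do.

```python
from typing import Any, Callable, Dict, Iterable, List


def collect_hrefs(elements: Iterable[Any]) -> List[str]:
    """Return the unique, non-empty href values of the given link elements.

    Order of first appearance is preserved. Each element only needs a
    get_attribute(name) method, which Selenium WebElements provide.
    """
    seen = set()
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        if href and href not in seen:
            seen.add(href)
            hrefs.append(href)
    return hrefs


def visit_all(driver: Any, urls: Iterable[str],
              scrape: Callable[[Any], Any]) -> Dict[str, Any]:
    """Load each URL with driver.get and store whatever scrape(driver) returns."""
    results = {}
    for url in urls:
        driver.get(url)
        # scrape is a caller-supplied (hypothetical) function that reads
        # the data it needs from the page the driver has just loaded.
        results[url] = scrape(driver)
    return results
```

With a real driver you would pass anchor elements to `collect_hrefs`, for example the `a` elements found inside each item of `element_list`, and then call something like `visit_all(driver, urls, lambda d: d.title)` to record each page title. Skipping duplicate and empty hrefs is a design choice of this sketch, not something the site is known to require.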