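I am using Selenium to extract user reviews from IMDb, but when I try to extract the rating from each review I get this error:
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".rating-other-user-rating span"}
The HTML around the element is: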
<div>
  <div class="inline-rating">
    <div class="ipl-ratings-bar">
      <span class="rating-other-user-rating">
        <svg class="ipl-icon ipl-star-icon " xmlns="http://www.w3.org/2000/svg" fill="#000000" height="24" viewBox="0 0 24 24" width="24">
          <path d="M0 0h24v24H0z" fill="none"></path>
          <path d="M12 17.27L18.18 21l-1.64-7.03L22 9.24l-7.19-.61L12 2 9.19 8.63 2 9.24l5.46 4.73L5.82 21z"></path>
          <path d="M0 0h24v24H0z" fill="none"></path>
        </svg>
        <span>4</span>
        <span class="point-scale">/10</span>
      </span>
    </div>
  </div>
  <a href="/review/rw3316476/?ref_=tt_urv" class="title"> The Most Over-Rated Show I Have Ever Seen</a>
</div>
The code I tried is:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

service = Service(executable_path="chromedriver.exe")
driver = webdriver.Chrome(service=service)
url = 'https://m.imdb.com/title/tt0903747/reviews?ref_=tt_urv'
driver.get(url)
review_elements = driver.find_elements(By.CLASS_NAME, 'review-container')
for review_element in review_elements:
    # I tried to locate the rating (4 in this case) with each of the following lines, but I get the error
    rating = review_element.find_element(By.CLASS_NAME, 'inline-rating')
    rating = review_element.find_element(By.CLASS_NAME, 'ipl-ratings-bar')
    rating = review_element.find_element(By.CSS_SELECTOR, '.rating-other-user-rating span')
    rating = review_element.find_element(By.CLASS_NAME, 'rating-other-user-rating')
This is strange, because I had no trouble selecting the title, text, date, and username.
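The problem is that one review in the middle of the page has no rating. So when you loop over all the review-container elements, find_element raises NoSuchElementException as soon as it reaches a container without a rating.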
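To check which reviews lack a rating, a lookup that does not raise can help: find_elements (plural) returns an empty list when nothing matches. The following is only a minimal sketch. The helper name rating_text_or_none and the FakeSpan/FakeReview stand-ins are hypothetical and exist only so the example runs without a browser; with real Selenium you would pass each review-container WebElement and use By.CSS_SELECTOR, whose value is the string "css selector".

```python
CSS_SELECTOR = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def rating_text_or_none(review_element):
    """Return the rating text of one review container, or None if it has no rating."""
    # find_elements (plural) returns an empty list instead of raising
    matches = review_element.find_elements(CSS_SELECTOR, "span.rating-other-user-rating")
    return matches[0].text if matches else None


# Hypothetical stand-ins (not Selenium classes) so the sketch runs without a browser
class FakeSpan:
    def __init__(self, text):
        self.text = text


class FakeReview:
    def __init__(self, rating=None):
        self._spans = [FakeSpan(rating)] if rating is not None else []

    def find_elements(self, by, value):
        return list(self._spans)


reviews = [FakeReview("10/10"), FakeReview(), FakeReview("4/10")]
ratings = [rating_text_or_none(r) for r in reviews]
print(ratings)  # ['10/10', None, '4/10']
```

The None entry marks the review without a rating, which is the container that makes find_element fail.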
I fixed the code and added a try-except to handle the missing rating.
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
url = 'https://m.imdb.com/title/tt0903747/reviews?ref_=tt_urv'
driver.get(url)
driver.maximize_window()
wait = WebDriverWait(driver, 10)
review_elements = wait.until(EC.visibility_of_all_elements_located((By.CLASS_NAME, 'review-container')))
for review_element in review_elements:
    try:
        rating = review_element.find_element(By.CSS_SELECTOR, 'span.rating-other-user-rating')
        print(rating.text)
    except NoSuchElementException:
        # no rating, ignore
        pass
Output:
10/10
10/10
10/10
10/10
10/10
10/10
10/10
10/10
10/10
10/10
7/10
5/10
10/10
10/10
10/10
10/10
10/10
10/10
10/10
10/10
6/10
5/10
10/10
10/10
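The printed values are strings such as 7/10. If you need numbers, for example to average the scores, something like the following could convert them. This is a sketch only: parse_rating is a hypothetical helper, not part of Selenium, and it assumes the text always has the form "<digits>/10" as in the output above.

```python
def parse_rating(text):
    """Convert text such as '7/10' to the integer 7; return None for missing or unexpected input."""
    if not text:
        return None
    # Split "7/10" into the score ("7") and the scale ("10")
    score, _, scale = text.strip().partition("/")
    if not score.isdecimal() or scale != "10":
        return None
    return int(score)


samples = ["10/10", "7/10", None, "5/10"]
parsed = [parse_rating(s) for s in samples]
print(parsed)  # [10, 7, None, 5]
```

Returning None for a missing rating, rather than skipping the review, keeps the ratings aligned with the titles and usernames collected from the same containers.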