Selenium Python - The element I'm looking for can't be found, even though it exists on Yahoo Finance


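I'm working on a school project, and I need to get analyst price target estimates for certain stocks from the Yahoo Finance website (using this site is mandatory).

When I tried Beautiful Soup, I couldn't scrape the data (I believe JavaScript modifies the page as it loads), so I switched to Selenium. However, when I try to get the element by XPath, it raises an error as if the element didn't exist. I'm using EC in case the element needs time to load, but that doesn't work. I also tried increasing the wait time to 2 minutes, so the timeout is not the problem.

Here is the code: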

from selenium import webdriver 
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument('--no-sandbox') 
chrome_options.add_argument("--headless")
chrome_options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36')
chrome_options.add_argument("window-size=1920,1080")

driver = webdriver.Chrome(options=chrome_options)
driver.get("https://finance.yahoo.com/quote/BBAJIOO.MX?.tsrc=fin-srch")
driver.delete_all_cookies()

WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '//*[@id="Col2-11-QuoteModule-Proxy"]/div/section/div')))

Does anyone know why this happens? How can I get these ratings?

The image below shows the ratings I need.

Here is a sample of the HTML:

<div aria-label="Low  60 Current  64.59 Average  69.25 High  76.8" class="Px(10px)">
    <div class="Pos(r) Pb(30px) H(1em)">
        <div class="Start(75%) W(25%) Pos(a) H(1em) Bdbc($seperatorColor) Bdbw(2px) Bdbs(d) T(30px)"></div>
        <div class="Pos(a) H(1em) Bdbc($seperatorColor) Bdbw(2px) Bdbs(s) T(30px) W(100%)"></div>
        <div class="Pos(a) D(ib) T(35px)" data-test="analyst-cur-tg" style="left: 27.3214%;">
            <div class="W(7px) H(7px) Bgc($linkActiveColor) Bdrs(50%) Z(1) B(-5px) Translate3d($half3dTranslate) Pos(r)"></div>
            <div class="Bgc($linkActiveColor) Start(0) T(5px) W(1px) H(17px) Z(0) Pos(r)"></div>
            <div class="Miw(100px) T(6px) C($linkActiveColor) Pos(r) Fz(s) Fw(500) D(ib) Ta(c) Translate3d($half3dTranslate)"><span>Current</span>&nbsp;<span>64.59</span></div>
        </div>
        <div class="Pos(a) D(ib) T(-1px)" data-test="analyst-avg-tg" style="left: 55.0595%;">
            <div class="Pos(r) T(5px) Miw(100px) Fz(s) Fw(500) D(ib) C($primaryColor)Ta(c) Translate3d($half3dTranslate)"><span>Average</span>&nbsp;<span>69.25</span></div>
            <div class="Pos(r) Bgc($tertiaryColor) W(1px) H(17px) Z(0) T(6px) Start(-1px)"></div>
            <div class="W(8px) H(8px) Bgc(t) Bd Bdc($seperatorColor) Bdrs(50%) Z(1) B(-6px) Pos(r) Translate3d($half3dTranslate)"></div>
        </div><span class="W(6px) H(6px) Bgc($tertiaryColor) Bdrs(50%) Z(0) B(-5px) Start(0) Pos(a) Translate3d($half153dTranslate)"></span><span class="W(6px) H(6px) Bgc($tertiaryColor) Bdrs(50%) Z(0) B(-5px) Pos(a) Translate3d($zero153dTranslate) Start(100%)"></span></div>
    <div class="Ov(a) Fz(xs) Mt(10px) C($tertiaryColor)">
        <div class="Pos(r) Fl(start) Fz(xs) C($tertiaryColor) "><span>Low</span>&nbsp;<span>60.00</span></div>
        <div class="Pos(r) Fl(end) Fz(xs) C($tertiaryColor) "><span>High</span>&nbsp;<span>76.80</span></div>
    </div>
</div>
python selenium-webdriver yahoo
1 Answer

I'm guessing the XPath you used was copied from the browser's developer tools, but in this case the element it targets is empty.

driver.delete_all_cookies() # modify below this line


l = driver.find_element(By.ID, 'app')
str_content = l.get_attribute("innerHTML")
# save the rendered HTML to a log file; the with block closes the file
with open("s.txt", "w", encoding='utf-8') as f:
    f.write(str_content)

l = driver.find_element(By.ID, 'Col2-11-QuoteModule-Proxy')
print(l.get_attribute("innerHTML")) # empty since <span></span>

Open the log file and note this 👇

if (window.performance) {window.performance.mark && window.performance.mark('Col2-11-QuoteModule');window.performance.measure && window.performance.measure('Col2-11-QuoteModuleDone','PageStart','Col2-11-QuoteModule');}

So the XPath matches nothing. That can mean two things: the element does not exist, or it does not exist yet 👇

# scroll to the bottom so the page renders the module (it appears to load on scroll)
driver.execute_script("window.scrollTo(0, document.body.scrollHeight)")

# wait until the module's content is present in the DOM
WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, '//*[@id="Col2-11-QuoteModule-Proxy"]/div/section/div')))

# define the XPath before it is used
xp = '//*[@id="Col2-11-QuoteModule-Proxy"]/div/section/div/div[1]/div[3]/div[3]/span[2]'
l = driver.find_element(By.XPATH, xp)
print(l.text) # the current value

print("you can continue project from here")

driver.quit()
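
The HTML sample in the question also carries all four numbers in the aria-label attribute of the outer div, so one alternative to a deep positional XPath may be to read that attribute before the driver is closed and parse it. This is only a minimal sketch, assuming the label keeps the "Name  number" format shown in the question; parse_price_targets is a hypothetical helper, not part of Selenium.

```python
import re


# Hypothetical helper: turns the aria-label text from the question's HTML
# sample into a dict of floats. Assumes plain "Name  number" pairs with no
# thousands separators, as in that sample.
def parse_price_targets(aria_label):
    pairs = re.findall(r"([A-Za-z]+)\s+(-?\d+(?:\.\d+)?)", aria_label)
    return {name.lower(): float(value) for name, value in pairs}


# The label string is copied from the HTML sample in the question.
label = "Low  60 Current  64.59 Average  69.25 High  76.8"
targets = parse_price_targets(label)
print(targets)
```

With Selenium, the label could probably be fetched after the wait with something like driver.find_element(By.CSS_SELECTOR, 'div[aria-label^="Low"]').get_attribute("aria-label"). That selector is an assumption based on the sample HTML and may need adjusting if the page changes.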

To be safe, it's good practice to quit the driver when you're done.
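
One way to make sure the driver is always closed, even when a lookup raises, is a try/finally block. The sketch below is only an illustration: run_with_driver and FakeDriver are hypothetical names, and FakeDriver stands in for a real WebDriver so the example runs without a browser.

```python
# Hypothetical helper: run some scraping work and always close the browser.
def run_with_driver(make_driver, work):
    driver = make_driver()
    try:
        return work(driver)
    finally:
        driver.quit()  # runs whether work() returned or raised


# Stand-in for a real WebDriver so the sketch runs without a browser.
class FakeDriver:
    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


fake = FakeDriver()
result = run_with_driver(lambda: fake, lambda d: "scraped")
print(result, fake.closed)
```

In the real script, make_driver could be lambda: webdriver.Chrome(options=chrome_options), and work would be a function containing the scroll, wait, and find_element steps.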
