Python Selenium: scraping a web page

Question (votes: 0, answers: 3)

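I want to use Selenium to get the data from the box under 'Stock Style - Weight' on 'https://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F00000NF9P&tab=3'.

This data is inside an iframe. I can switch to the iframe and click the "Weight" radio button, but I cannot get the nine numbers.

Below is my code: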

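from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F00000NF9P&tab=3")

# The chart lives inside the "portfolio" iframe, so switch into it first.
iframe = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, "//iframe[@id='portfolio']")))
driver.switch_to.frame(iframe)

# Select the "Weight" radio button.
element1 = driver.find_element(By.XPATH, '/html/body/div/sal-components-pillar-cards-process/div/div[2]/div/div[2]/div[2]')
element1.find_element(By.CSS_SELECTOR, "input[type='radio'][value='Weight']").click()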

I tried several selectors:

driver.find_element(By.XPATH, '*//div/div[2]/div/div[2]/div/svg/g/g[3]/g[2]/g[1]/text')
driver.find_element(By.CSS_SELECTOR, "mbc-chart-group> g.style-box-text-layer > g:nth-child(1)")

But I get the same error every time:

NoSuchElementException: no such element: Unable to locate element
python selenium web-scraping
3 Answers
1 vote

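The elements are inside svg and text tags. SVG elements belong to the SVG namespace, so a plain XPath step such as //svg does not match them. To access them, you need to use: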

//*[local-name()='svg'] or //*[name()='svg']
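To illustrate why the bare tag name fails, here is a small sketch using only Python's standard library. It is not Selenium code: the markup is a hypothetical, heavily simplified stand-in for the chart, and ElementTree does not support name() or local-name(), so the namespace is spelled out in the tag instead. The point it shows is the same, though: an element in the SVG namespace is not matched by its bare name.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical, simplified stand-in for the chart markup (not the real page).
markup = f"""
<div class="sal-stock-style__weight">
  <svg xmlns="{SVG_NS}" role="chart">
    <g class="style-box-text-layer">
      <text>15</text>
      <text>6</text>
      <text>4</text>
    </g>
  </svg>
</div>
""".strip()

root = ET.fromstring(markup)

# A bare tag name only matches elements in no namespace, so this finds nothing.
plain = root.findall(".//text")

# Qualifying the tag with the SVG namespace finds the three labels.
qualified = root.findall(f".//{{{SVG_NS}}}text")

print(len(plain))                   # 0
print([t.text for t in qualified])  # ['15', '6', '4']
```

In the browser, the name() and local-name() predicates play the role that the explicit namespace plays here.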

Reference link

The XPath for these numbers is:

//div[@class='sal-stock-style__weight']//*[name()='svg' and @role='chart']//*[name()='g' and @class='style-box-text-layer']//*[name()='text']
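The expression above is long, so it may help to see how it is assembled. The following is only a sketch: svg_step is a hypothetical helper (not part of Selenium) that builds one name()-based step, and the class and role values are the ones quoted in this answer.

```python
def svg_step(tag, attrs=None):
    """Build one XPath step that matches an SVG element by name() plus attributes."""
    conditions = [f"name()='{tag}'"]
    for key, value in (attrs or {}).items():
        conditions.append(f"@{key}='{value}'")
    return "//*[" + " and ".join(conditions) + "]"


# The outer div is ordinary HTML, so it can be addressed by its tag name.
xpath = (
    "//div[@class='sal-stock-style__weight']"
    + svg_step("svg", {"role": "chart"})
    + svg_step("g", {"class": "style-box-text-layer"})
    + svg_step("text")
)
print(xpath)
```

Only the steps inside the svg element need the name() form; the HTML part of the path stays as usual.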

Try the following and confirm:

# find_elements_by_xpath was removed in Selenium 4; use find_elements(By.XPATH, ...).
numbers = driver.find_elements(By.XPATH, "//div[@class='sal-stock-style__weight']//*[name()='svg' and @role='chart']//*[name()='g' and @class='style-box-text-layer']//*[name()='text']")
for num in numbers:
    print(num.text)
15
6
4
22
14
2
19
13
2
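The nine values printed above can be arranged as a 3x3 grid, like the style box itself. A minimal sketch, assuming the text elements come back in row-by-row order (the page does not confirm this); to_grid is a hypothetical helper name.

```python
def to_grid(texts, width=3):
    """Convert a flat list of numeric strings into rows of `width` integers."""
    values = [int(t.strip()) for t in texts]
    if len(values) % width != 0:
        raise ValueError(f"expected a multiple of {width} values, got {len(values)}")
    return [values[i:i + width] for i in range(0, len(values), width)]


# The strings printed in the answer above, in the order they were printed.
printed = ["15", "6", "4", "22", "14", "2", "19", "13", "2"]
grid = to_grid(printed)
for row in grid:
    print(row)
# [15, 6, 4]
# [22, 14, 2]
# [19, 13, 2]
```

With a live driver, the input would be [num.text for num in numbers] instead of the hard-coded list.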

1 vote

You need to add these two lines to click the accept-cookies button and the investor-type button:

WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//*[@id='onetrust-accept-btn-handler']"))).click()
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//*[@id='btn_professional']"))).click()

Full code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://www.morningstar.co.uk/uk/funds/snapshot/snapshot.aspx?id=F00000NF9P&tab=3")

WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//*[@id='onetrust-accept-btn-handler']"))).click()
WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//*[@id='btn_professional']"))).click()

iframe = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.XPATH, "//iframe[@id='portfolio']")))
driver.switch_to.frame(iframe)

# Select the "Weight" radio button (Selenium 4 locator API).
element1 = driver.find_element(By.XPATH, '/html/body/div/sal-components-pillar-cards-process/div/div[2]/div/div[2]/div[2]')
element1.find_element(By.CSS_SELECTOR, "input[type='radio'][value='Weight']").click()

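0 votes

I tried running the code above but got the error below. I also added driver.implicitly_wait(20) because I thought the problem was related to wait time, but it made no difference.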

NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/div/sal-components-pillar-cards-process/div/div[2]/div/div[2]/div[2]"}
  (Session info: chrome=122.0.6261.129)
Stacktrace:
    GetHandleVerifier [0x00EA8D03+51395]
    (No symbol) [0x00E15F61]
    (No symbol) [0x00CCE13A]
    (No symbol) [0x00D062BB]
    (No symbol) [0x00D063EB]
    (No symbol) [0x00D3C162]
    (No symbol) [0x00D23ED4]
    (No symbol) [0x00D3A570]
    (No symbol) [0x00D23C26]
    (No symbol) [0x00CFC629]
    (No symbol) [0x00CFD40D]
    GetHandleVerifier [0x012268D3+3712147]
    GetHandleVerifier [0x01265CBA+3971194]
    GetHandleVerifier [0x01260FA8+3951464]
    GetHandleVerifier [0x00F59D09+776393]
    (No symbol) [0x00E21734]
    (No symbol) [0x00E1C618]
    (No symbol) [0x00E1C7C9]
    (No symbol) [0x00E0DDF0]
    BaseThreadInitThunk [0x75267BA9+25]
    RtlInitializeExceptionChain [0x7765BD2B+107]
    RtlClearBits [0x7765BCAF+191]
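On the wait-time idea: an explicit wait such as WebDriverWait polls a condition until it holds or a timeout expires. The sketch below shows that polling pattern in plain Python; wait_until is a hypothetical helper, not Selenium's implementation. Note that no amount of waiting helps when the locator can never match, for example an absolute XPath that no longer fits the page structure or a search made from the wrong frame, so a longer wait alone may not remove this error.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for "is the element there yet?": succeeds on the third check.
attempts = []

def element_ready():
    attempts.append(1)
    return "found" if len(attempts) >= 3 else None

outcome = wait_until(element_ready, timeout=2, poll=0.01)
print(outcome, len(attempts))  # found 3
```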