I want to load an HTML page with Selenium.
Here is my Python code:
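from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome()
driver.maximize_window()
driver.get(url)
# page_source is read immediately after get() returns
soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()
# Write the prettified HTML to disk; the file is closed automatically
with open(r'<my_path>', 'w', encoding='utf-8') as text_file:
    text_file.write(soup.prettify())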
In the .txt file I saved, the part of the markup I am interested in is empty:
<div class="sh-share-analysis">
<h2>
Analysten-Meinung
</h2>
<div class="content sh-hidden" data="1">
<div class="bar sh-hidden">
</div>
<div class="value sh-hidden">
</div>
<div class="tab sh-hidden">
</div>
<div class="sh-link sh-hidden">
<a href="#">
<span>
Zur Detailanalyse
</span>
</a>
</div>
When I load the page in my browser, the same section looks like this in the developer tools:
<div class="sh-share-analysis">
<h2>Analysten-Meinung</h2>
<div class="content" data="1">
<div class="bar">
<div class="action-30" style="width: 49.5%;"></div><div class="action-20" style="width: 49.5%;"></div></div>
<div class="value">
<div class="action-30"><div class="bullet"></div>Positiv (1)</div><div class="action-20"><div class="bullet"></div>Neutral (1)</div></div>
<div class="tab">
<div><span>Ø Kursziel (geschätzt)</span><span>160,91 EUR</span></div><div><span>Differenz zum Kurs</span><span>+12,82 %</span></div></div>
<div class="sh-link sh-hidden">
<a href="https://wertpapiere.ing.de/Investieren/Aktie/Analyse/US0079031078" wtevent=""><span>Zur Detailanalyse</span></a>
</div>
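You can try waiting for the element and reading its text via XPATH (page_source, read right after get(), does not yet contain the dynamically loaded content):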
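from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
text = WebDriverWait(driver, 15).until(
    EC.presence_of_element_located((By.XPATH, '//div[@class="sh-share-analysis"]//div[@class="content"]')),
    'Timed out waiting for the analysis block').text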
The 15-second timeout gives the dynamic elements time to load; the wait returns as soon as the element is found.
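Putting the wait and the parsing together, here is a minimal sketch rather than a definitive solution. The names fetch_rendered_html, parse_analysis and LOADED_BLOCK_XPATH are hypothetical helpers made up for illustration. The XPath assumes, based only on the two snippets in the question, that the block's class is exactly "content" once the page script has removed "sh-hidden"; if the site behaves differently, the locator needs adjusting.

```python
from bs4 import BeautifulSoup

# Assumption: the class attribute is exactly "content" only after the page
# script has removed "sh-hidden" (inferred from the markup in the question).
LOADED_BLOCK_XPATH = '//div[@class="sh-share-analysis"]//div[@class="content"]'


def fetch_rendered_html(url, timeout=15):
    """Open url in Chrome and return page_source once the block is populated."""
    # Imported here so parse_analysis() can be reused without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Polls until the locator matches or the timeout expires.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.XPATH, LOADED_BLOCK_XPATH)),
            'Timed out waiting for the analysis block',
        )
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on timeout


def parse_analysis(html):
    """Return (ratings, rows) extracted from the sh-share-analysis block."""
    soup = BeautifulSoup(html, 'html.parser')
    # Each direct child of div.value holds one rating, e.g. "Positiv (1)".
    ratings = [
        div.get_text(strip=True)
        for div in soup.select('.sh-share-analysis div.value > div')
    ]
    # Each direct child of div.tab holds a label span and a value span.
    rows = {}
    for row in soup.select('.sh-share-analysis div.tab > div'):
        spans = row.find_all('span')
        if len(spans) == 2:
            rows[spans[0].get_text(strip=True)] = spans[1].get_text(strip=True)
    return ratings, rows
```

With the url variable from the question, this would be called as ratings, rows = parse_analysis(fetch_rendered_html(url)). Waiting on the class change rather than sleeping for a fixed time keeps the script as fast as the page allows.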