Scraping data from a refreshing JavaScript page in Python

Question

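I am trying to scrape some data from the website https://bloks.io/live. The first problem is that I cannot access the refreshing data in the table. My idea is to check whether the first column changes. If it does, I then have to check which account it belongs to, and so on. So I need to read the first column continuously, but it does not work.

I tried using a CSS selector, without success.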

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from webdriver_manager.chrome import ChromeDriverManager
from bs4 import BeautifulSoup


link = 'https://bloks.io/live'

driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
driver.get(link)

page = driver.page_source

tSoup = BeautifulSoup(driver.find_element(By.CSS_SELECTOR, '#info>tbody>tr:nth-child(2)').get_attribute('outerHTML'), 'html.parser')

This gives me a "no such element" error message. Can anyone help me?

Answer 1

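To better understand what is happening, I suggest you take a screenshot and save the HTML. If you do, you will see that the table has not been rendered yet at the moment you query it, so you need to wait for it (or sleep, although that is not the best solution).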

# save screen
driver.save_screenshot("webpage.png")

# save html
html_code = driver.page_source

with open("webpage.html", "w", encoding="utf-8") as file:
    file.write(html_code)

For more on sleeping and waiting, see https://selenium-python.readthedocs.io/waits.html
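
As a rough illustration of the waiting approach, the sketch below uses an explicit wait before reading the first column, then polls the table and reports values that were not there on the previous read. Several things here are assumptions rather than facts about the site: the row selector `#info > tbody > tr` is simply taken from the question and may not match the real page, the first column is assumed to be the first `td` of each row, and the helper names `changed_cells`, `read_first_column` and `watch_first_column` are hypothetical. It also assumes a recent Selenium version that can locate a Chrome driver on its own.

```python
import time


def changed_cells(previous, current):
    """Return the values in `current` that were not in `previous`, keeping order."""
    seen = set(previous)
    return [value for value in current if value not in seen]


def read_first_column(driver, row_selector, timeout=15):
    """Wait until the rows exist, then return the text of each row's first cell."""
    # Selenium is imported here so changed_cells() stays usable without it.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # Assumption: the first column is the first <td> of every matched row.
    cell_locator = (By.CSS_SELECTOR, f"{row_selector} > td:first-child")
    # Explicit wait: polls the DOM until matching cells exist,
    # or raises TimeoutException after `timeout` seconds.
    WebDriverWait(driver, timeout).until(
        EC.presence_of_all_elements_located(cell_locator)
    )
    return [cell.text for cell in driver.find_elements(*cell_locator)]


def watch_first_column(url, row_selector="#info > tbody > tr",
                       interval=1.0, max_polls=30):
    """Poll the table and print first-column values that newly appear."""
    from selenium import webdriver
    from selenium.common.exceptions import StaleElementReferenceException

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        previous = read_first_column(driver, row_selector)
        for _ in range(max_polls):
            time.sleep(interval)  # pause between polls, not a load wait
            try:
                current = read_first_column(driver, row_selector)
            except StaleElementReferenceException:
                # The table was re-rendered mid-read; retry on the next poll.
                continue
            for value in changed_cells(previous, current):
                print("new first-column value:", value)
            previous = current
    finally:
        driver.quit()
```

Calling `watch_first_column("https://bloks.io/live")` would open the page and poll it about 30 times. If the wait times out, the selector most likely does not match the rendered page, and the saved screenshot and HTML from above are the place to check it.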
