Problem scraping data from a dynamic JavaScript web page


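Below is my attempt at scraping a small dataset. I have run into a problem: no data is pulled, and when the preview is displayed in Docker, the table body does not show, only the column names. I have tried the following variation without success. I am still learning Python.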

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import pandas as pd
import time

# Create a new Chrome browser instance
s = Service('/usr/local/bin/chromedriver')
driver = webdriver.Chrome(service=s)

# Go to the webpage
driver.get('https://www.deaths-in-custody.project.uq.edu.au/record')

# Wait for the page to load
time.sleep(5)  # Adjust this delay as needed

# Find the search button element
search_button = driver.find_element('css selector', '.Search-button')

# Click the search button
search_button.click()

# Wait for the table to load
time.sleep(5)  # Adjust this delay as needed

# Find the table element
table = driver.find_element('css selector', 'table')

# Get the HTML content of the table
table_html = table.get_attribute('outerHTML')

# Use pandas to read the HTML table into a DataFrame
df = pd.read_html(table_html)[0]

# Print the DataFrame
print(df)

# Close the driver
driver.quit()
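
A table that shows only column names often means the HTML was captured before the JavaScript had filled in the rows, although the post alone does not confirm this is the cause here. The following is a minimal sketch, using only the standard library, of two helpers that could replace the fixed `time.sleep(5)`: one counts data rows in a captured table, the other polls until a condition holds. The names `count_data_rows` and `wait_until` are hypothetical, and the commented usage assumes the page renders its results into the first `table` element.

```python
import time
from html.parser import HTMLParser


class _RowCounter(HTMLParser):
    """Counts <tr> elements containing at least one <td> (data rows, not header rows)."""

    def __init__(self):
        super().__init__()
        self.data_rows = 0
        self._in_row = False
        self._row_has_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._in_row = True
            self._row_has_td = False
        elif tag == "td" and self._in_row:
            self._row_has_td = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._in_row:
            if self._row_has_td:
                self.data_rows += 1
            self._in_row = False


def count_data_rows(table_html):
    """Return the number of rows in table_html that hold <td> cells."""
    parser = _RowCounter()
    parser.feed(table_html)
    parser.close()
    return parser.data_rows


def wait_until(predicate, timeout=10.0, poll=0.5):
    """Call predicate() until it returns a truthy value; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Hypothetical usage with the `driver` from the script above:
# def table_rows():
#     html = driver.find_element('css selector', 'table').get_attribute('outerHTML')
#     return count_data_rows(html)
#
# wait_until(table_rows, timeout=20)  # raises TimeoutError if no rows ever appear
```

If the wait times out, the rows are probably not arriving in that `table` element at all (for example, they may sit inside an iframe or require search input first), which is a different problem from timing. Selenium also ships its own polling helper, `WebDriverWait` in `selenium.webdriver.support.ui`, which follows the same idea.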