How do I scrape the text that appears when hovering the mouse over an element?


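On the page https://www.cpubenchmark.net/cpu.php?cpu=Intel+Core+i9-11900K+%40+3.50GHz&id=3904 I am trying to scrape all of the tooltip information in the "Pricing History" section, that is, the CPU's price and the date for each point.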

from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
options = Options()
options.add_argument("start-maximized")
webdriver_service = Service()
driver = webdriver.Chrome(options=options, service=webdriver_service)

driver.get('https://www.cpubenchmark.net/cpu.php?cpu=Intel+Core+i9-11900K+%40+3.50GHz&id=3904')
element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//*[@id='placeholder']/div/canvas[2]")))

for el in element:       
    ActionChains(driver).move_to_element(el).perform()   
    mouseover = WebDriverWait(driver, 30).until(EC.visibility_of_element_located((By.SELECTOR, ".placeholder > div > div.canvasjs-chart-tooltip > div > span")))      
    print(mouseover.text)

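But the result is the error: 'WebElement' object is not iterable. Is there something I need to change? Or is there another good way to scrape the hover information for every price and date in the "Pricing History" section? Thank you for your help!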

python selenium-webdriver tooltip screen-scraping mousehover
1 Answer

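The error itself comes from `presence_of_element_located`, which returns a single `WebElement` rather than a list, so it cannot be used in a `for` loop. You do not need to hover at all, though: the chart's points are present in the page source as `dataArray.push({x: ..., y: ...})` calls. To get the time/price pairs from the chart into a pandas DataFrame, you can use the following example: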

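import re

import pandas as pd
import requests

url = (
    "https://www.cpubenchmark.net/cpu.php?cpu=Intel+Core+i9-11900K+%40+3.50GHz&id=3904"
)

# Download the raw page source.
html_text = requests.get(url).text

# Capture the x (timestamp) and y (price) of every dataArray.push({...}) call.
df = pd.DataFrame(
    re.findall(r"dataArray\.push\({x: (\d+), y: ([\d.]+)}", html_text),
    columns=["time", "price"],
)

# x is treated as milliseconds: convert to seconds, then to datetime.
# The price column stays as strings, exactly as captured by the regex.
df["time"] = pd.to_datetime(df["time"].astype(int) // 1000, unit="s")
print(df.tail())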

Prints:

                   time   price
236 2023-05-28 06:00:00  317.86
237 2023-05-29 06:00:00  319.43
238 2023-05-30 06:00:00  429.99
239 2023-05-31 06:00:00  314.64
240 2023-06-01 06:00:00   318.9
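
If you want to check the regular expression without requesting the site, here is a minimal sketch along the same lines that uses only the standard library. `parse_price_history` is a hypothetical helper name, and the sample string is not copied from the site: its two points are reconstructed from the last two rows printed above, assuming the timestamps are UTC milliseconds. It also assumes the page embeds each point as `dataArray.push({x: <timestamp>, y: <price>})`, as the pattern in the answer implies.

```python
import re
from datetime import datetime, timezone

# Same pattern as in the answer, with the braces escaped explicitly.
POINT_RE = re.compile(r"dataArray\.push\(\{x: (\d+), y: ([\d.]+)\}")


def parse_price_history(html_text):
    """Return a list of (UTC datetime, price as float) tuples."""
    points = []
    for ms, price in POINT_RE.findall(html_text):
        # Assumption: x is a Unix timestamp in milliseconds.
        when = datetime.fromtimestamp(int(ms) / 1000, tz=timezone.utc)
        points.append((when, float(price)))
    return points


# Illustrative sample that mimics the assumed page markup.
sample = """
dataArray.push({x: 1685512800000, y: 314.64});
dataArray.push({x: 1685599200000, y: 318.9});
"""

for when, price in parse_price_history(sample):
    print(when.isoformat(), price)
```

Unlike the pandas version, this converts the prices to `float`, which is usually what you want before sorting or plotting them.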