Scraping weather data from an archive

Question · Votes: 0 · Answers: 1

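I am trying to scrape weather data for 2020-2023 from the following link: https://www.windguru.cz/archive.php?id_spot=4910&id_model=3&date_from=2020-08-01&date_to=2023-08-01. The data I am particularly interested in is the wind direction, which the page shows as arrows. What I want, however, is the underlying angle used to draw each arrow. The site requires an email login to view this data, so I tried using selenium in Python for this: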

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Set every option before creating the driver; arguments added afterwards are ignored
options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")

# Create a single driver that actually uses the options above
driver = webdriver.Chrome(options=options)
driver.get('https://www.windguru.cz/archive.php?id_spot=4910&id_model=3&date_from=2020-08-01&date_to=2023-08-01')

# Wait up to 5 seconds for the page container to become visible
WebDriverWait(driver, 5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "#forecasts-page")))

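What do I need to add to this code to extract the information I want? I am inexperienced with programming, so much of this is new to me. It would be great if someone could complete the code or suggest an alternative way to extract the angle data.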

python web-scraping weather
1 answer
0 votes

This is just a basic Python wrapper for the Windguru Station JSON API. I don't have an account, so you will need to test this code yourself.

import requests

class WindguruAPIWrapper:
    def __init__(self, uid, password):
        self.base_url = "https://www.windguru.cz/int/wgsapi.php"
        self.uid = uid
        self.password = password
        self.default_params = {
            "uid": self.uid,
            "password": self.password,
            "format": "json"
        }

    def get_data_current(self):
        params = dict(self.default_params, q="station_data_current")
        response = requests.get(self.base_url, params=params)
        return response.json()

    def get_data(self, from_time=None, to_time=None, avg_minutes=10, vars_list=None):
        params = dict(self.default_params, q="station_data", avg_minutes=avg_minutes)
        if from_time:
            params["from"] = from_time
        if to_time:
            params["to"] = to_time
        if vars_list:
            params["vars"] = ",".join(vars_list)
        
        response = requests.get(self.base_url, params=params)
        return response.json()

    def get_data_csv(self, from_time=None, to_time=None, avg_minutes=10, vars_list=None):
        params = dict(self.default_params, q="station_data_csv", avg_minutes=avg_minutes)
        if from_time:
            params["from"] = from_time
        if to_time:
            params["to"] = to_time
        if vars_list:
            params["vars"] = ",".join(vars_list)
        
        response = requests.get(self.base_url, params=params)
        return response.text

    def get_data_last(self, hours=1, avg_minutes=10, back_hours=0, vars_list=None):
        # Forward avg_minutes as well; the method accepted it but never sent it
        params = dict(self.default_params, q="station_data_last", hours=hours,
                      avg_minutes=avg_minutes, back_hours=back_hours)
        if vars_list:
            params["vars"] = ",".join(vars_list)
        
        response = requests.get(self.base_url, params=params)
        return response.json()

    def get_data_last_csv(self, hours=1, avg_minutes=10, back_hours=0, vars_list=None):
        # Forward avg_minutes as well; the method accepted it but never sent it
        params = dict(self.default_params, q="station_data_last_csv", hours=hours,
                      avg_minutes=avg_minutes, back_hours=back_hours)
        if vars_list:
            params["vars"] = ",".join(vars_list)
        
        response = requests.get(self.base_url, params=params)
        return response.text

# Replace with your actual UID and password
uid = "YOUR_UID"
password = "YOUR_PASSWORD"

api = WindguruAPIWrapper(uid, password)
current_data = api.get_data_current()
print(current_data)

# You can use similar calls for other methods provided by the API wrapper
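
Since the question is specifically about wind direction angles, here is a minimal sketch of how the angles might be pulled out of a JSON response. It assumes, without having been verified against a real account, that the response is a dict of parallel lists keyed by variable name, and that the keys are called "datetime" and "wind_direction". The function name extract_wind_directions and the sample payload are hypothetical.

```python
def extract_wind_directions(payload):
    """Pair each timestamp with its wind direction in degrees.

    Assumes (unverified) that the response is a dict of parallel lists
    keyed by variable name, e.g. "datetime" and "wind_direction".
    """
    times = payload.get("datetime", [])
    directions = payload.get("wind_direction", [])
    # Skip samples where no direction was reported (None).
    return [(t, d) for t, d in zip(times, directions) if d is not None]


# Hypothetical payload shaped the way this sketch assumes; not real data.
sample = {
    "datetime": ["2023-08-01 12:00:00", "2023-08-01 12:10:00", "2023-08-01 12:20:00"],
    "wind_direction": [225, None, 240],
}

for timestamp, degrees in extract_wind_directions(sample):
    print(timestamp, degrees)
```

With the wrapper above, the payload would come from something like api.get_data(vars_list=["datetime", "wind_direction"]). Print the raw response first and adjust the key names if they turn out to be different.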

If you haven't already installed the requests library, remember to do so by running this in your terminal:

pip install requests
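
The question covers three years of data (2020-08-01 to 2023-08-01). If a single request for the whole period turns out to be too large or is rejected, one option is to split the period into smaller windows and call get_data once per window. The sketch below only generates the windows. The helper name date_windows and the 30-day window size are arbitrary choices for illustration, not something the API prescribes.

```python
from datetime import date, timedelta


def date_windows(start, end, days=30):
    """Yield (window_start, window_end) pairs covering start..end inclusive."""
    current = start
    while current <= end:
        # Clamp the last window so it never runs past the requested end date.
        window_end = min(current + timedelta(days=days - 1), end)
        yield current, window_end
        current = window_end + timedelta(days=1)


windows = list(date_windows(date(2020, 8, 1), date(2023, 8, 1)))
print(len(windows), windows[0], windows[-1])
```

Each pair could then be passed on as from_time=start.isoformat() and to_time=end.isoformat(). The date format the API actually expects for "from" and "to" is an assumption here, so check it against the Windguru documentation before relying on it.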
