How can I get JSON data from https://news.blizzard.com/en-us/diablo4, if that is possible?


Is it possible to get a JSON file from this URL? https://news.blizzard.com/en-us/diablo4

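I tried looking into fetch but did not find much. I also looked for anything else I could use, but I need a JSON file. I am not sure whether the URL has a JSON file behind it, but I would think so.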

python python-3.x web-scraping
1 Answer

Here is an example of how to fetch all articles using their REST JSON API:

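import requests
from bs4 import BeautifulSoup

# JSON endpoint that returns one page of article HTML plus paging metadata
json_url = "https://news.blizzard.com/en-us/blog/list?pageNum={page_num}&pageSize=30&community=diablo4"

page_num = 1
while True:
    response = requests.get(json_url.format(page_num=page_num), timeout=10)
    response.raise_for_status()  # fail loudly on HTTP errors
    data = response.json()

    # The "html" field holds the rendered article list for this page
    soup = BeautifulSoup(data["html"], "html.parser")
    for article in soup.select("article"):
        print(article.a.text)
        # parse other data
        # ...

    # Stop once the pages fetched so far cover every article
    if data["pageNum"] * data["pageSize"] >= data["totalCount"]:
        break

    page_num += 1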

Prints:

...

Diablo IV Quarterly Update—December 2020
Diablo IV Quarterly Update—September 2020
Diablo IV Quarterly Update—June 2020
Diablo IV Quarterly Update—February 2020
System Design in Diablo IV (Part II)
System Design in Diablo IV (Part I)
A Letter from our Game Director – BlizzCon 2019
Diablo IV Feature Overview
Diablo IV Unveiled
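
If it helps, the paging loop can be separated from the network call so the stop condition is easy to check offline. The following is only a sketch: the field names pageNum, pageSize and totalCount are taken from the answer's code, while iter_pages, fake_fetch and the max_pages safety cap are hypothetical names introduced here for illustration. It uses >= in the stop test so that a total that is an exact multiple of the page size does not cause one extra request.

```python
from typing import Callable, Iterator


def iter_pages(fetch: Callable[[int], dict], max_pages: int = 1000) -> Iterator[dict]:
    """Yield each page's decoded JSON until the last page is reached.

    `fetch` is any callable that takes a page number and returns the
    decoded JSON dict for that page (hypothetical interface).
    """
    for page_num in range(1, max_pages + 1):
        data = fetch(page_num)
        yield data
        # Stop once the pages seen so far cover every article.
        if data["pageNum"] * data["pageSize"] >= data["totalCount"]:
            break


# Stand-in for the real HTTP call, so the sketch runs without network access.
def fake_fetch(page_num: int) -> dict:
    return {"pageNum": page_num, "pageSize": 30, "totalCount": 65, "html": ""}


pages = list(iter_pages(fake_fetch))
print(len(pages))  # 3
```

With the real endpoint, the fetch argument could be something like lambda n: requests.get(json_url.format(page_num=n), timeout=10).json(), and each yielded data["html"] can then be parsed with BeautifulSoup as in the answer.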