How can I use BeautifulSoup to extract a single piece of data from a Wikipedia table?

Question  Votes: 2  Answers: 2

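I am trying to get the number of coronavirus cases currently confirmed in Colombia, for a website. I only need to display the case count, and I am using bs4. I know the basics of programming, but I don't know Python. This is what I have: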

import bs4

import requests

response = requests.get("https://es.wikipedia.org/wiki/Pandemia_de_enfermedad_por_coronavirus_de_2020_en_Colombia")

if response is not None:
    html = bs4.BeautifulSoup(response.text, 'html.parser')

    title = html.select(".infobox")[0].text
    paragraphs = html.select("tr")
    #for para in paragraphs:
        #print (para.text)

    # find_all returns a list of tags, so read .text from each cell
    mylist = html.find_all('td')
    for cell in mylist:
        print(cell.text)
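
To pull one value out of the infobox instead of printing every cell, one option is to find the row whose header matches a label and read the number from that row's cell. The following is only a minimal sketch: the sample HTML, the label "Casos confirmados", and the helper name `infobox_number` are assumptions for illustration, and the real page's markup should be inspected before relying on it.

```python
import re
from typing import Optional

from bs4 import BeautifulSoup

# Hypothetical stand-in for the article's infobox; the real markup may differ.
SAMPLE_HTML = """
<table class="infobox">
  <tr><th>País</th><td>Colombia</td></tr>
  <tr><th>Casos confirmados</th><td>4 881<sup>[1]</sup></td></tr>
  <tr><th>Muertes</th><td>225</td></tr>
</table>
"""


def infobox_number(html_text: str, label: str) -> Optional[int]:
    """Return the integer in the infobox row whose header contains `label`."""
    soup = BeautifulSoup(html_text, "html.parser")
    infobox = soup.select_one(".infobox")
    if infobox is None:
        return None
    for row in infobox.find_all("tr"):
        header = row.find("th")
        cell = row.find("td")
        if header is None or cell is None:
            continue
        if label.lower() in header.get_text().lower():
            # Drop footnote markers such as [1] before reading the digits.
            for sup in cell.find_all("sup"):
                sup.decompose()
            digits = re.sub(r"\D", "", cell.get_text())
            return int(digits) if digits else None
    return None


print(infobox_number(SAMPLE_HTML, "Casos confirmados"))
```

To run it against the live page, pass `response.text` from the request in place of `SAMPLE_HTML`. Returning `None` when the infobox or the label is missing keeps the caller from crashing if the page layout changes.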
python-3.x api beautifulsoup wikipedia
2 Answers
0
votes

I am trying to get how many coronavirus cases are currently confirmed in Colombia

There are many APIs that provide live data, so you don't need to scrape Wikipedia for this. Here is a Python example:

import requests

j = requests.get("https://pomber.github.io/covid19/timeseries.json").json()
# j['Colombia'] # full `timeseries` that you can import in pandas

# to get latest available date, use [-1]:
confirmed = j['Colombia'][-1]['confirmed']
deaths = j['Colombia'][-1]['deaths']
recovered = j['Colombia'][-1]['recovered']
# {'date': '2020-4-24', 'confirmed': 4881, 'deaths': 225, 'recovered': 1003}
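
The lookups above assume the country key exists and read only the last entry. As a rough sketch of how the same list-of-days structure could be handled a little more defensively, the helpers below return the latest entry and the day-over-day change. The names `latest` and `daily_increase` are hypothetical, and the sample values are made up except for the last row, which is the one quoted in the comment above.

```python
from typing import Any, Dict, List

Timeseries = Dict[str, List[Dict[str, Any]]]

# Sample in the same shape as the feed; the first row is invented for illustration.
SAMPLE: Timeseries = {
    "Colombia": [
        {"date": "2020-4-23", "confirmed": 4500, "deaths": 210, "recovered": 950},
        {"date": "2020-4-24", "confirmed": 4881, "deaths": 225, "recovered": 1003},
    ]
}


def latest(timeseries: Timeseries, country: str) -> Dict[str, Any]:
    """Return the most recent entry for `country`, or raise KeyError."""
    series = timeseries.get(country)
    if not series:
        raise KeyError(f"no data for {country!r}")
    return series[-1]


def daily_increase(timeseries: Timeseries, country: str, field: str = "confirmed") -> int:
    """Return the change in `field` between the last two entries."""
    series = timeseries.get(country) or []
    if len(series) < 2:
        raise ValueError("need at least two days of data")
    return int(series[-1][field]) - int(series[-2][field])


print(latest(SAMPLE, "Colombia")["confirmed"])
print(daily_increase(SAMPLE, "Colombia"))
```

With live data, the parsed JSON `j` from the request above would be passed in place of `SAMPLE`.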

As a side note, I call this virus convid19.


0
votes
Here is another example that uses an API instead of scraping Wikipedia, in this case the free covid19 API:

import requests


class Covid19ApiHelper:
    URL_API = 'https://api.covid19api.com/summary'

    def __init__(self):
        self._global_info = None
        self._countries = None

    def refresh(self):
        """Request data from the API and saves it"""
        response = requests.get(self.URL_API)
        data = response.json()
        self._global_info = data['Global']
        self._countries = {item['CountryCode']: item for item in data['Countries']}

    def get_global_info(self):
        return self._global_info

    def get_country_info(self, countryCode):
        """Returns the information by country using the standard two digit country code"""
        return self._countries[countryCode]


if __name__ == '__main__':
    covid_helper = Covid19ApiHelper()
    covid_helper.refresh()
    print(covid_helper.get_global_info())
    print(covid_helper.get_country_info('CO'))

Global output:

{'NewConfirmed': 86850, 'TotalConfirmed': 2894581, 'NewDeaths': 5839, 'TotalDeaths': 202795, 'NewRecovered': 27616, 'TotalRecovered': 815948}

Output for Colombia:

{'Country': 'Colombia', 'CountryCode': 'CO', 'Slug': 'colombia', 'NewConfirmed': 261, 'TotalConfirmed': 5142, 'NewDeaths': 8, 'TotalDeaths': 233, 'NewRecovered': 64, 'TotalRecovered': 1067, 'Date': '2020-04-26T09:16:56Z'}

Data source: https://covid19api.com/#details
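
Since the question only asks for the case count, a smaller variant may be enough. The sketch below works on a payload shaped like the outputs shown above and returns just `TotalConfirmed` for a country code, or `None` when the code is unknown. The function name `total_confirmed` and the trimmed sample payload are assumptions for illustration; the field names are taken from the outputs above.

```python
from typing import Any, Dict, Optional

# Trimmed sample in the shape of the summary outputs shown above.
SAMPLE_SUMMARY: Dict[str, Any] = {
    "Global": {"TotalConfirmed": 2894581, "TotalDeaths": 202795},
    "Countries": [
        {"Country": "Colombia", "CountryCode": "CO",
         "TotalConfirmed": 5142, "TotalDeaths": 233},
    ],
}


def total_confirmed(summary: Dict[str, Any], country_code: str) -> Optional[int]:
    """Return TotalConfirmed for a two-letter country code, or None if absent."""
    by_code = {item["CountryCode"]: item for item in summary.get("Countries", [])}
    country = by_code.get(country_code.upper())
    return None if country is None else country["TotalConfirmed"]


print(total_confirmed(SAMPLE_SUMMARY, "co"))
print(total_confirmed(SAMPLE_SUMMARY, "XX"))
```

Using `.get` instead of direct indexing means an unknown code yields `None` rather than a `KeyError`, which is easier to handle when the value is displayed on a website.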