Multiple table headers in a table, and how to scrape header data as table rows

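Question  Votes: 1  Answers: 1
I am trying to scrape data from a website, but the table holds its data in two places: the first 2-3 rows of data sit in the thead, and the rest sit in the tbody. I can easily extract data from one of them at a time, but whenever I try both together I get errors such as TypeError and AttributeError. By the way, I am using Python. Here is the code: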

    import requests
    from bs4 import BeautifulSoup
    import pandas as pd

    url = "https://www.worldometers.info/world-population/"
    r = requests.get(url)
    print(r)

    html = r.text
    soup = BeautifulSoup(html, 'html.parser')
    print(soup.title.text)
    print()
    print()

    live_data = soup.find_all('div', id='maincounter-wrap')
    print(live_data)
    for i in live_data:
        print(i.text)

    table_body = soup.find('thead')
    table_rows = table_body.find_all('tr')
    table_body_2 = soup.find('tbody')
    table_rows_2 = soup.find_all('tr')

    year_july1 = []
    population = []
    yearly_change_in_perchantage = []
    yearly_change = []
    median_age = []
    fertillity_rate = []
    density = []  # density (p\km**)
    urban_population_in_perchantage = []
    urban_population = []

    for tr in table_rows:
        td = tr.find_all('td')
        year_july1.append(td[0].text)
        population.append(td[1].text)
        yearly_change_in_perchantage.append(td[2].text)
        yearly_change.append(td[3].text)
        median_age.append(td[4].text)
        fertillity_rate.append(td[5].text)
        density.append(td[6].text)
        urban_population_in_perchantage.append(td[7].text)
        urban_population.append(td[8].text)

    for tr in table_rows_2:
        td = tr.find_all('td')
        year_july1.append(td[0].text)
        population.append(td[1].text)
        yearly_change_in_perchantage.append(td[2].text)
        yearly_change.append(td[3].text)
        median_age.append(td[4].text)
        fertillity_rate.append(td[5].text)
        density.append(td[6].text)
        urban_population_in_perchantage.append(td[7].text)
        urban_population.append(td[8].text)

    headers = ['year_july1', 'population', 'yearly_change_in_perchantage',
               'yearly_change', 'median_age', 'fertillity_rate', 'density',
               'urban_population_in_perchantage', 'urban_population']
    data_2 = pd.DataFrame(
        list(zip(year_july1, population, yearly_change_in_perchantage,
                 yearly_change, median_age, fertillity_rate, density,
                 urban_population_in_perchantage, urban_population)),
        columns=headers)
    print(data_2)
    data_2.to_csv("C:\\Users\\data_2.csv")

python web-scraping
1 Answer
0 votes
You can try the following code; it produces the required data. Let me know if you need any clarification:
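One way to handle this, offered as a minimal sketch and not necessarily the answerer's exact code: walk every tr of the table in document order, whether it sits in the thead or the tbody, and skip rows that contain no td cells. A row made up only of th cells yields an empty list from find_all('td'), so indexing td[0] on it fails; that is a likely source of the errors described in the question. The sample markup, its values, and the function name scrape_table are hypothetical and exist only for illustration; the real page's structure is an assumption.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with made-up values, mimicking a table whose first
# data rows sit inside <thead> and whose remaining rows sit inside <tbody>.
SAMPLE_HTML = """
<table>
  <thead>
    <tr><th>Year</th><th>Population</th></tr>
    <tr><td>2020</td><td>300</td></tr>
    <tr><td>2019</td><td>200</td></tr>
  </thead>
  <tbody>
    <tr><td>2018</td><td>100</td></tr>
  </tbody>
</table>
"""


def scrape_table(html):
    """Return (header, rows) for the first table found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    if table is None:
        # No table at all: return empty results instead of raising.
        return [], []

    header, rows = [], []
    # find_all("tr") on the table covers <thead> and <tbody> in one pass,
    # in document order.
    for tr in table.find_all("tr"):
        td_cells = tr.find_all("td")
        th_cells = tr.find_all("th")
        if td_cells:
            # A data row, wherever it lives in the table.
            rows.append([td.get_text(strip=True) for td in td_cells])
        elif th_cells and not header:
            # The first row made up only of <th> cells is the header.
            header = [th.get_text(strip=True) for th in th_cells]
    return header, rows


header, rows = scrape_table(SAMPLE_HTML)
print(header)
for row in rows:
    print(row)
```

For the real page, passing r.text from the question's code to scrape_table would be the equivalent call, assuming the table of interest is the first one on the page. The result can then go to pd.DataFrame(rows, columns=header), or the question's own headers list can be used for the column names.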