Problem with Assignment 3, Introduction to Data Science in Python course

Question  Votes: -3  Answers: 1

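I have a problem with Assignment 3 (https://github.com/AparaV/intro-to-data-science-with-python/tree/master/assignment-03). I want a result without "NaN", but the result I get contains "NaN" values. This is my first time learning a programming language. It would be great if someone could tell me what is wrong with my Python code. Pictures: [wanted result] [result using the code below]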

import pandas as pd
import numpy as np
import re
def answer_one():
    def energy():
        energy = pd.read_excel('Energy Indicators.xls',sheet_name='Energy')
        energy = energy.iloc[16:243]
        energy.drop(['Unnamed: 0','Unnamed: 1'], axis ='columns',inplace=True)
        energy.columns = ['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable']

        energy = energy.replace('...',np.nan)
        energy['Energy Supply'] = energy['Energy Supply']*1000000

        energy = energy.replace('Republic of Korea','South Korea')
        energy = energy.replace('United States of America','United States')
        energy = energy.replace('United Kingdom of Great Britain and Northern Ireland','United Kingdom')
        energy = energy.replace('China, Hong Kong Special Administrative Region','Hong Kong')

        energy['Country'] =  energy['Country'].str.replace(" \(.*\)","")

        energy = energy.reset_index()
        energy = energy[['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable']]
        return energy

    def GDP():
        GDP= pd.read_csv('world_bank.csv')
        s=(GDP.iloc[3].values)[:4].astype(str).tolist() + (GDP.iloc[3].values)[4:].astype(int).astype(str).tolist()
        GDP = GDP[4:]
        GDP.columns = s
        GDP = GDP[['Country Name','2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015']]
        GDP.columns = ['Country','2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015']

        GDP = GDP.replace('Korea, Rep.','South Korea')
        GDP = GDP.replace('Iran, Islamic Rep','Iran')
        GDP = GDP.replace('Hong Kong SAR, China','Hong Kong')
        return GDP

    def ScimEn():
        ScimEn = pd.read_excel('scimagojr-3.xlsx')
        return ScimEn

    e = energy() 
    g = GDP()
    s = ScimEn()

    df=pd.merge(e,g,how = 'outer',left_on='Country',right_on='Country')
    df=pd.merge(s,df,how='outer',left_on='Country',right_on='Country')
    df.sort_values(by=['Rank'], inplace = True)
    df.set_index('Country',inplace=True)
    res = df.head(15)
    return res
answer_one()


python
1 Answer
0 votes

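I would try doing the regular-expression replacement first, and only afterwards replace the country names. An exact-match replace such as 'United States of America' finds nothing while the raw name still has extra characters attached, so those countries fail to match in the merge and come out as NaN.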
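A minimal sketch of that ordering, not the original poster's exact fix. It assumes the raw 'Country' values in the energy sheet carry footnote digits or a parenthesised suffix; the sample rows and the numbers in 'Energy Supply' are made up for illustration and are not read from 'Energy Indicators.xls'.

```python
import pandas as pd

# Hypothetical sample rows imitating uncleaned country names (assumption:
# the real sheet contains names with footnote digits and "(...)" suffixes).
energy = pd.DataFrame({
    'Country': [
        'United States of America20',
        'Republic of Korea',
        'Iran (Islamic Republic of)',
        'China, Hong Kong Special Administrative Region3',
        'Japan10',
    ],
    'Energy Supply': [1.0, 2.0, 3.0, 4.0, 5.0],  # placeholder values
})

# Step 1: regex clean-up first. Remove a " (...)" suffix and any digits.
# regex=True is passed explicitly because newer pandas versions treat the
# pattern as a literal string by default.
energy['Country'] = (
    energy['Country']
    .str.replace(r'\s*\(.*\)', '', regex=True)
    .str.replace(r'\d+', '', regex=True)
    .str.strip()
)

# Step 2: only now rename, so the exact-match keys can actually match.
renames = {
    'Republic of Korea': 'South Korea',
    'United States of America': 'United States',
    'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom',
    'China, Hong Kong Special Administrative Region': 'Hong Kong',
}
energy['Country'] = energy['Country'].replace(renames)

print(energy['Country'].tolist())
```

With the two steps in the opposite order, 'United States of America20' would not equal the key 'United States of America', so it would keep its raw name and fail to line up with the other tables in the merge.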
