Joining time series by a common date in Python (DataFrame and Series/list problem)

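Noob here. Please forgive the formatting, I'm still learning. I'm trying to create a time series (a DataFrame, I think) made up of three columns: a date column, an inventory column, and a price column.

I've pulled two separate series (date and inventory; date and price), and I'd like to merge them so that I see three columns instead of two sets of two. Here is my code.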

import json
import numpy as np
import pandas as pd
from urllib.error import URLError, HTTPError
from urllib.request import urlopen

class EIAgov(object):
    def __init__(self, token, series):
        '''
        Purpose:
        Initialise the EIAgov class by requesting:
        - EIA token
        - id code(s) of the series to be downloaded

        Parameters:
        - token: string
        - series: string or list of strings
        '''
        self.token = token
        self.series = series

    '''
    def __repr__(self):
        return str(self.series)
    '''

    def Raw(self, ser):
        # Construct url
        url = 'http://api.eia.gov/series/?api_key=' + self.token + '&series_id=' + ser.upper()

        try:
            # URL request, URL opener, read content
            response = urlopen(url)
            raw_byte = response.read()
            raw_string = str(raw_byte, 'utf-8-sig')
            jso = json.loads(raw_string)
            return jso

        except HTTPError as e:
            print('HTTP error type.')
            print('Error code: ', e.code)

        except URLError as e:
            print('URL type error.')
            print('Reason: ', e.reason)

    def GetData(self):
        # Deal with the date series
        date_ = self.Raw(self.series[0])
        date_series = date_['series'][0]['data']
        endi = len(date_series)  # or len(date_['series'][0]['data'])
        date = []
        for i in range(endi):
            date.append(date_series[i][0])

        # Create dataframe
        df = pd.DataFrame(data=date)
        df.columns = ['Date']

        # Deal with data
        lenj = len(self.series)
        for j in range(lenj):
            data_ = self.Raw(self.series[j])
            data_series = data_['series'][0]['data']
            data = []
            endk = len(date_series)
            for k in range(endk):
                data.append(data_series[k][1])
            df[self.series[j]] = data

        return df

if __name__ == '__main__':
    tok = 'mytoken'

    # Natural Gas - Weekly Storage
    ngstor = ['NG.NW2_EPG0_SWO_R48_BCF.W']  # w/ several series at a time ['ELEC.REV.AL-ALL.M', 'ELEC.REV.AK-ALL.M', 'ELEC.REV.CA-ALL.M']
    stordata = EIAgov(tok, ngstor)
    print(stordata.GetData())

    # Natural Gas - Weekly Prices
    ngpx = ['NG.RNGC1.W']  # w/ several series at a time ['ELEC.REV.AL-ALL.M', 'ELEC.REV.AK-ALL.M', 'ELEC.REV.CA-ALL.M']
    pxdata = EIAgov(tok, ngpx)
    print(pxdata.GetData())

Note that 'mytoken' needs to be replaced with an eia.gov API key. I can get this to successfully print the two outputs... but to merge them, I tried adding this at the end:

joined_frame = pd.concat([ngstor,ngpx],axis = 1,sort = False)

print(joined_frame.GetData())

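But I get an error ("TypeError: cannot concatenate object of type '<class 'list'>'; only Series and DataFrame objs are valid"), because I apparently don't know the difference between a list and a Series.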

How can I merge these lists by date? Thanks a lot for any help. (Feel free to also point out anything I did badly when formatting the code in this post.)
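For reference, the error can be reproduced without an API key. The sketch below is an illustration, not part of the original post: it shows that ngstor and ngpx are plain Python lists of series IDs (the inputs to EIAgov), which is why pd.concat rejects them. The variable error_message is introduced here only for the demonstration.

```python
import pandas as pd

# The same ID lists as in the question: plain Python lists of strings.
ngstor = ['NG.NW2_EPG0_SWO_R48_BCF.W']
ngpx = ['NG.RNGC1.W']

# pd.concat only accepts Series/DataFrame objects, so lists raise TypeError.
try:
    pd.concat([ngstor, ngpx], axis=1, sort=False)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(type(ngstor))   # the object being passed is a list, not a DataFrame
print(error_message)
```

The DataFrames in this script are the values returned by GetData(), so those return values (not the ID lists) are what a concat or merge call would need to receive.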

python pandas list join series
1 Answer

If you want to work with them as DataFrames in the rest of your code, you can convert ngstor and ngpx to DataFrames as follows:

import pandas as pd

# I create two lists that look like yours
ngstor = [[1, 2], ["2020-04-03", "2020-05-07"]]
ngpx = [[3, 4], ["2020-04-03", "2020-05-07"]]
# I transform them to DataFrames
ngstor = pd.DataFrame({"value1": ngstor[0],
                       "date_col": ngstor[1]})
ngpx = pd.DataFrame({"value2": ngpx[0],
                     "date_col": ngpx[1]})

Then you can use either pandas.merge or pandas.concat:

# merge option
joined_framed = pd.merge(ngstor, ngpx, on="date_col",
                          how="outer")
# concat option
ngstor = ngstor.set_index("date_col")
ngpx = ngpx.set_index("date_col")
joined_framed = pd.concat([ngstor, ngpx], axis=1,
                          join="outer").reset_index()

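The result (shown here for the concat option; the merge option gives the same rows with value1 as the first column) will be: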

     date_col  value1  value2
0  2020-04-03       1       3
1  2020-05-07       2       4
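Applied to the code in the question, the frames to combine would be the ones returned by GetData(), each with a 'Date' column plus one column named after the series ID. The following is a minimal sketch under stated assumptions: stor_df and px_df are hypothetical stand-ins for stordata.GetData() and pxdata.GetData(), the numbers are made up for illustration, and the dates are assumed to arrive as 'YYYYMMDD' strings (adjust the format if your data differs).

```python
import pandas as pd

# Hypothetical stand-ins for stordata.GetData() and pxdata.GetData().
# All values are invented; the date format 'YYYYMMDD' is an assumption.
stor_df = pd.DataFrame({
    "Date": ["20200410", "20200403", "20200327"],
    "NG.NW2_EPG0_SWO_R48_BCF.W": [300, 200, 100],
})
px_df = pd.DataFrame({
    "Date": ["20200410", "20200403"],
    "NG.RNGC1.W": [1.5, 1.6],
})

# An outer merge keeps every date found in either frame;
# dates missing from one side are filled with NaN.
joined = pd.merge(stor_df, px_df, on="Date", how="outer")

# Parse the date strings so the rows can be sorted chronologically.
joined["Date"] = pd.to_datetime(joined["Date"], format="%Y%m%d")
joined = joined.sort_values("Date").reset_index(drop=True)

print(joined)
```

Using how="inner" instead would keep only the dates present in both series, which may be preferable if rows with a missing price or inventory value are not useful.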
