I have two DataFrames:
1)
2017 Hours
name Month
a January 199.25
February 203.25
March 220.75
April 203.50
May 242.50
June 261.25
July 278.50
August 227.75
September 160.75
October 213.50
November 230.75
December 159.75
2)
2018 Hours
name Month
a January 199.25
February 203.25
March 220.75
April 203.50
May 242.50
June 261.25
July 278.50
August 227.75
September 160.75
October 213.50
November 230.75
December 159.75
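I want to combine the two DataFrames into one for plotting. My goal is a simple line chart with hours on the y-axis and months on the x-axis, with one line for 2017 and another for 2018.
I would like a df that looks like this: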
Hours
name Month Year
a January 2017 199.25
February 2017 203.25
March 2017 220.75
April 2017 203.50
May 2017 242.50
June 2017 261.25
July 2017 278.50
August 2017 227.75
September 2017 160.75
October 2017 213.50
November 2017 230.75
December 2017 159.75
January 2018 199.25
February 2018 203.25
March 2018 220.75
April 2018 203.50
May 2018 242.50
June 2018 261.25
July 2018 278.50
August 2018 227.75
September 2018 160.75
October 2018 213.50
November 2018 230.75
December 2018 159.75
Any help would be greatly appreciated!
I think you first need to set the same column names in both DataFrames, then use concat with the keys parameter to distinguish the DataFrames, and reset_index to turn the MultiIndex levels into columns:
df1.columns = ['Hour']
df2.columns = ['Hour']
df = pd.concat([df1, df2], keys=(2017, 2018)).reset_index().rename(columns={'level_0':'Year'})
print (df)
Year name Month Hour
0 2017 a January 199.25
1 2017 a February 203.25
2 2017 a March 220.75
3 2017 a April 203.50
4 2017 a May 242.50
5 2017 a June 261.25
6 2017 a July 278.50
7 2017 a August 227.75
8 2017 a September 160.75
9 2017 a October 213.50
10 2017 a November 230.75
11 2017 a December 159.75
12 2018 a January 199.25
13 2018 a February 203.25
14 2018 a March 220.75
15 2018 a April 203.50
16 2018 a May 242.50
17 2018 a June 261.25
18 2018 a July 278.50
19 2018 a August 227.75
20 2018 a September 160.75
21 2018 a October 213.50
22 2018 a November 230.75
23 2018 a December 159.75
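The question does not show how the frames were built, so here is a minimal sketch for reproducing the steps above. It assumes (a guess from the printed output) that name and Month form a two-level row index and that each frame has a single hours column. The names `months`, `hours`, `parts` and `long_df` are only illustrative.

```python
import pandas as pd

months = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']
hours = [199.25, 203.25, 220.75, 203.50, 242.50, 261.25,
         278.50, 227.75, 160.75, 213.50, 230.75, 159.75]

# Assumption: (name, Month) is a two-level row index, as the printed frames suggest
idx = pd.MultiIndex.from_product([['a'], months], names=['name', 'Month'])
df1 = pd.DataFrame({'2017 Hours': hours}, index=idx)
df2 = pd.DataFrame({'2018 Hours': hours}, index=idx)

# Rename on copies so df1 and df2 keep their original column names
parts = [df1.rename(columns={'2017 Hours': 'Hour'}),
         df2.rename(columns={'2018 Hours': 'Hour'})]

# keys adds an outer index level; after reset_index it shows up as 'level_0'
long_df = (pd.concat(parts, keys=(2017, 2018))
             .reset_index()
             .rename(columns={'level_0': 'Year'}))
print(long_df.head())
```

Renaming on copies, rather than assigning to df1.columns, leaves the original frames usable for the other approaches in this thread.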
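For plotting, however, a wide layout with one column per year should work better (this uses the original column names, so run it before renaming the columns to 'Hour'):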
df = (pd.concat([df1['2017 Hours'], df2['2018 Hours']], keys=(2017, 2018), axis=1)
.reset_index(level=0, drop=True))
print (df)
2017 2018
Month
January 199.25 199.25
February 203.25 203.25
March 220.75 220.75
April 203.50 203.50
May 242.50 242.50
June 261.25 261.25
July 278.50 278.50
August 227.75 227.75
September 160.75 160.75
October 213.50 213.50
November 230.75 230.75
December 159.75 159.75
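Since the stated goal is a line chart, here is a hedged sketch of how the wide frame could be plotted. The sample frames are rebuilt under the assumption that name and Month form a MultiIndex. `plot_hours` is a hypothetical helper, not part of pandas, and it needs matplotlib installed.

```python
import pandas as pd

months = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']
hours = [199.25, 203.25, 220.75, 203.50, 242.50, 261.25,
         278.50, 227.75, 160.75, 213.50, 230.75, 159.75]

# Assumption: same index layout as in the question's printed frames
idx = pd.MultiIndex.from_product([['a'], months], names=['name', 'Month'])
df1 = pd.DataFrame({'2017 Hours': hours}, index=idx)
df2 = pd.DataFrame({'2018 Hours': hours}, index=idx)

# One column per year, months as the row index
wide = (pd.concat([df1['2017 Hours'], df2['2018 Hours']],
                  keys=(2017, 2018), axis=1)
          .reset_index(level=0, drop=True))


def plot_hours(frame):
    """Hypothetical helper: draw one line per column (year)."""
    ax = frame.plot(marker='o')
    ax.set_xlabel('Month')
    ax.set_ylabel('Hours')
    # Show every month on the x-axis instead of pandas' default subset of ticks
    ax.set_xticks(range(len(frame)))
    ax.set_xticklabels(frame.index, rotation=45)
    return ax
```

Calling `plot_hours(wide)` followed by `matplotlib.pyplot.show()` should give two lines, one per year. With the sample data above the lines overlap, because both years contain the same values.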
Convert the columns to a MultiIndex by splitting the names on the space, then swap the levels and stack:
df1.columns=df1.columns.str.split(' ',expand=True)
df1.swaplevel(0,1,axis=1).stack()
Out[946]:
Hours
name Month
a January 2017 199.25
February 2017 203.25
March 2017 220.75
df2.columns=df2.columns.str.split(' ',expand=True)
Then use concat:
pd.concat([df1.swaplevel(0,1,axis=1).stack(),df2.swaplevel(0,1,axis=1).stack()])
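Put together, the split/swaplevel/stack approach might look like the sketch below. It again assumes the (name, Month) MultiIndex layout. `to_long` is a hypothetical helper introduced only for illustration, and naming the new index level 'Year' is an addition to match the layout requested in the question. Note that the year labels end up as strings ('2017', '2018') because they come from the column names.

```python
import pandas as pd

months = ['January', 'February', 'March', 'April', 'May', 'June',
          'July', 'August', 'September', 'October', 'November', 'December']
hours = [199.25, 203.25, 220.75, 203.50, 242.50, 261.25,
         278.50, 227.75, 160.75, 213.50, 230.75, 159.75]

# Assumption: same index layout as in the question's printed frames
idx = pd.MultiIndex.from_product([['a'], months], names=['name', 'Month'])
df1 = pd.DataFrame({'2017 Hours': hours}, index=idx)
df2 = pd.DataFrame({'2018 Hours': hours}, index=idx)


def to_long(df):
    """Hypothetical helper: move the year from the column name into the index."""
    out = df.copy()  # leave the caller's frame untouched
    # '2017 Hours' -> ('2017', 'Hours'): the columns become a two-level MultiIndex
    out.columns = out.columns.str.split(' ', expand=True)
    # Put the year last so that stack() moves it into the row index
    return out.swaplevel(0, 1, axis=1).stack()


result = pd.concat([to_long(df1), to_long(df2)])
# The stacked level has no name; call it 'Year' as in the desired output
result.index = result.index.set_names('Year', level=2)
print(result.head())
```

Recent pandas releases may emit a FutureWarning from stack() about its new implementation. For a frame like this one, with no missing values, the result should be the same either way.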