I have a pandas DataFrame, as follows:
df
Prod ProdDesc tot avg qtr val_qtr
A Cyl 110 8.7 202301 12
A Cyl 110 8.7 202302 56.9
A Cyl 110 8.7 202303 9
A Cyl 110 8.7 202304 0
What I want is to stack/transpose the DataFrame. I used pandas melt:
df_tra = df.melt(id_vars=['Prod', 'ProdDesc'], var_name='Attrib', value_name='Value')
df_tra = df_tra.drop_duplicates()
So my output is:
df_tra
Prod ProdDesc Attrib Value
A Cyl tot 110
A Cyl avg 8.7
A Cyl qtr 202301
A Cyl qtr 202302
A Cyl qtr 202303
A Cyl qtr 202304
A Cyl val_qtr 12
A Cyl val_qtr 56.9
A Cyl val_qtr 9
A Cyl val_qtr 0
But the output I want is different. What I am after is the following:
df_actual_wanted
Prod ProdDesc Attrib Value
A Cyl tot 110
A Cyl avg 8.7
A Cyl 202301 12
A Cyl 202302 56.9
A Cyl 202303 9
A Cyl 202304 0
How can I achieve this?
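Use DataFrame.drop_duplicates and DataFrame.melt on the selected per-product columns (tot, avg), then use concat to join the result with another subset of columns (qtr, val_qtr) renamed via rename, and finally sort by both id columns if needed: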
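import pandas as pd

# tot/avg repeat on every row, so keep one row per product before melting
df1 = (df[['Prod','ProdDesc','tot','avg']]
        .drop_duplicates()
        .melt(id_vars=['Prod', 'ProdDesc'], var_name='Attrib', value_name='Value'))
# qtr/val_qtr are already in long form, so only the column names change
df2 = (df[['Prod','ProdDesc','qtr','val_qtr']]
        .rename(columns={'qtr':'Attrib','val_qtr':'Value'}))
out = pd.concat([df1, df2]).sort_values(['Prod','ProdDesc'], ignore_index=True)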
print (out)
Prod ProdDesc Attrib Value
0 A Cyl tot 110.0
1 A Cyl avg 8.7
2 A Cyl 202301 12.0
3 A Cyl 202302 56.9
4 A Cyl 202303 9.0
5 A Cyl 202304 0.0
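One detail of this result is worth noting: Attrib now holds both strings ('tot', 'avg') and integer quarter codes, so pandas stores it as an object column. The sketch below rebuilds the sample data from the question so it runs on its own, and optionally casts Attrib to str. The names summary and quarters are illustrative only, and the cast is an assumption about what is convenient downstream, not something the question asks for.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Prod": ["A"] * 4,
    "ProdDesc": ["Cyl"] * 4,
    "tot": [110] * 4,
    "avg": [8.7] * 4,
    "qtr": [202301, 202302, 202303, 202304],
    "val_qtr": [12, 56.9, 9, 0],
})

# Per-product columns: one row per product, then melt to long form.
summary = (df[["Prod", "ProdDesc", "tot", "avg"]]
           .drop_duplicates()
           .melt(id_vars=["Prod", "ProdDesc"], var_name="Attrib", value_name="Value"))

# Per-quarter columns: already long, only the names need to change.
quarters = (df[["Prod", "ProdDesc", "qtr", "val_qtr"]]
            .rename(columns={"qtr": "Attrib", "val_qtr": "Value"}))

out = pd.concat([summary, quarters], ignore_index=True)

# Optional (assumption): make Attrib uniformly string-typed.
out["Attrib"] = out["Attrib"].astype(str)
print(out)
```

With a single product the sort step is not needed, because concat already places the tot/avg rows ahead of the quarter rows.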
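If the DataFrame has a default index and the output must keep the same order as the original, change the solution to keep the index in melt (ignore_index=False) and sort by it with a stable sort: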
print (df)
Prod ProdDesc tot avg qtr val_qtr
0 A Cyl 110 8.70 202301 12.0
1 A Cyl 110 8.70 202302 56.9
2 A Cyl 110 8.70 202303 9.0
3 A Cyl 110 8.70 202304 0.0
4 B Cyl 133 8.76 202301 12.0
5 B Cyl 133 8.76 202302 56.9
6 B Cyl 133 8.76 202303 9.0
7 B Cyl 133 8.76 202304 0.0
8 A Cyl1 117 8.37 202301 12.0
9 A Cyl1 117 8.37 202302 56.9
10 A Cyl1 117 8.37 202303 9.0
11 A Cyl1 117 8.37 202304 0.0
df1 = (df[['Prod','ProdDesc','tot','avg']]
        .drop_duplicates()
        .melt(id_vars=['Prod', 'ProdDesc'],
              var_name='Attrib',
              value_name='Value',
              ignore_index=False))
df2 = (df[['Prod','ProdDesc','qtr','val_qtr']]
        .rename(columns={'qtr':'Attrib','val_qtr':'Value'}))
out = pd.concat([df1, df2]).sort_index(kind='stable', ignore_index=True)
print (out)
Prod ProdDesc Attrib Value
0 A Cyl tot 110.00
1 A Cyl avg 8.70
2 A Cyl 202301 12.00
3 A Cyl 202302 56.90
4 A Cyl 202303 9.00
5 A Cyl 202304 0.00
6 B Cyl tot 133.00
7 B Cyl avg 8.76
8 B Cyl 202301 12.00
9 B Cyl 202302 56.90
10 B Cyl 202303 9.00
11 B Cyl 202304 0.00
12 A Cyl1 tot 117.00
13 A Cyl1 avg 8.37
14 A Cyl1 202301 12.00
15 A Cyl1 202302 56.90
16 A Cyl1 202303 9.00
17 A Cyl1 202304 0.00
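The index-preserving variant can also be wrapped in a small function. This is only a sketch: the function name and its parameters are hypothetical, and it assumes a default RangeIndex, that the rows of each product are contiguous, and that the summary columns are constant within a product. The stable sort matters because the melted tot/avg rows share an index label with the first quarter row of their group, and concat order (summary first) is what puts them ahead of it.

```python
import pandas as pd

def stack_summary_and_detail(df, id_cols, summary_cols, key_col, value_col):
    """Hypothetical helper: per group, emit summary rows followed by key/value rows."""
    summary = (df[id_cols + summary_cols]
               .drop_duplicates()
               .melt(id_vars=id_cols, var_name="Attrib", value_name="Value",
                     ignore_index=False))  # keep the first-row index of each group
    detail = (df[id_cols + [key_col, value_col]]
              .rename(columns={key_col: "Attrib", value_col: "Value"}))
    # Stable sort keeps summary rows ahead of the detail row with the same index.
    return (pd.concat([summary, detail])
            .sort_index(kind="stable", ignore_index=True))

# Reduced version of the multi-product sample above (two quarters per product).
df = pd.DataFrame({
    "Prod": ["A", "A", "B", "B"],
    "ProdDesc": ["Cyl", "Cyl", "Cyl", "Cyl"],
    "tot": [110, 110, 133, 133],
    "avg": [8.7, 8.7, 8.76, 8.76],
    "qtr": [202301, 202302, 202301, 202302],
    "val_qtr": [12.0, 56.9, 12.0, 56.9],
})

out = stack_summary_and_detail(df, ["Prod", "ProdDesc"], ["tot", "avg"], "qtr", "val_qtr")
print(out)
```

If the index is not a default RangeIndex, calling reset_index(drop=True) on the input first should restore the assumption this sketch relies on.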