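I have a data frame like the one below. Categories can be nested to any depth using colon-separated strings. I want to sort it by amount in descending order, but still display it hierarchically, as shown further down.
My data frame: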
[category] [amount]
Household 1000
Household : Utilities 500
Living : Food 100
Transport 5000
Household : Rent 500
Household : Utilities : Water 300
Transport : Car 2000
Transport : Train 2500
Living 250
Household : Utilities : Electric 200
Living : Other 150
How can I sort it into this:
Transport 5000
Transport : Car 4900
Transport : Train 100
Household 1000
Household : Utilities 600
Household : Utilities : Water 400
Household : Utilities : Electric 200
Household : Rent 400
Living 250
Living : Other 150
Living : Food 100
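Note how the rows are sorted by amount, yet still stay within the hierarchy. (Note: the amounts of the colon-separated sub-levels add up to the amount of the root level.)
I have played around with something like the following, but it is not quite what I am after, and it is making my head hurt. Does anyone know a neat way to do this with pandas?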
dfs = dfs.sort_values(['amount', 'category'], ascending=[True, True])
Use:
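# Split each category path into its levels, e.g. 'Household : Utilities' -> ['Household', 'Utilities']
splitted = df['[category]'].str.split(' : ')
# Tuples compare level by level, so a parent row sorts right before its children
df['cat'] = splitted.apply(tuple)
# Give every row the amount of its root category
df['am'] = splitted.str[0].map(df.set_index('[category]')['[amount]'])
# Root amount descending first, then the category path in alphabetical order
df = df.sort_values(['am','cat'], ascending=[False, True])
df = df.drop(['am','cat'], axis=1)
print(df)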
[category] [amount]
3 Transport 5000
6 Transport : Car 2000
7 Transport : Train 2500
0 Household 1000
4 Household : Rent 500
1 Household : Utilities 500
9 Household : Utilities : Electric 200
5 Household : Utilities : Water 300
8 Living 250
2 Living : Food 100
10 Living : Other 150
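In the output above, the root categories are ordered by amount, but the rows inside each root come out alphabetically (Car before Train, Electric before Water). If the amount should decide the order at every level, one possible variation is sketched below. It is not part of the answer above: the helper `sort_key` and the lookup `amount_of` are names made up for this illustration, and the sketch assumes that every parent category appears as its own row and that category strings are unique. Ties between siblings are broken alphabetically.

```python
import pandas as pd

# Sample data copied from the question, using the same column names as the answer.
df = pd.DataFrame({
    '[category]': [
        'Household', 'Household : Utilities', 'Living : Food', 'Transport',
        'Household : Rent', 'Household : Utilities : Water', 'Transport : Car',
        'Transport : Train', 'Living', 'Household : Utilities : Electric',
        'Living : Other',
    ],
    '[amount]': [1000, 500, 100, 5000, 500, 300, 2000, 2500, 250, 200, 150],
})

# Hypothetical lookup: full category path -> amount.
# Assumes every ancestor path exists as a row; a missing one raises KeyError.
amount_of = dict(zip(df['[category]'], df['[amount]']))

def sort_key(category):
    """Build one (-amount, name) pair per level of the path.

    A parent's key is a prefix of each child's key, so the parent sorts
    directly before its children, and siblings sort by amount descending.
    """
    parts = category.split(' : ')
    key = []
    for depth in range(1, len(parts) + 1):
        prefix = ' : '.join(parts[:depth])
        key.append((-amount_of[prefix], parts[depth - 1]))
    return tuple(key)

# Order the index labels by the hierarchical key, then reorder the rows.
order = sorted(df.index, key=lambda i: sort_key(df.at[i, '[category]']))
result = df.loc[order]
print(result)
```

With the sample data this puts Transport : Train (2500) ahead of Transport : Car (2000) and Household : Utilities : Water (300) ahead of Household : Utilities : Electric (200), while each parent still sits directly above its own children.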