I have a dataframe that looks like this:
Location Account Y2019:MTD:January:Expense Y2019:MTD:January:Income Y2019:MTD:February:Expense
Madrid ABC 4354 56456 235423
Madrid XYX 769867 32556456 6785423
Rome ABC 434654 5214 235423
Rome XYX 632556456 46724423 46588
I want to reshape this df into the following shape:
Location Account Year_Month Expense Income
Madrid ABC Jan 2019 4354 56456
Madrid ABC Feb 2019 235423
Madrid XYX Jan 2019 769867 32556456
Madrid XYX Feb 2019 6785423
Rome ABC Jan 2019 434654 5214
Rome ABC Feb 2019 235423
Rome XYX Jan 2019 632556456 46724423
Rome XYX Feb 2019 46588
Could you help me extract the year and month string from the column names and melt the frame the way shown above?
I think you can rename the columns first and then use pd.wide_to_long:
import pandas as pd
def rename(col):  # "Y2019:MTD:January:Expense" -> "Expense_Jan 2019"; columns without ":" are kept as-is
    parts = col.split(":")
    return f"{parts[3]}_{parts[2][:3]} {parts[0][1:]}" if len(parts) > 1 else col
df.columns = [rename(c) for c in df.columns]
print(df.columns)
#Index(['Location', 'Account', 'Expense_Jan 2019', 'Income_Jan 2019', 'Expense_Feb 2019'], dtype='object')
print(pd.wide_to_long(df, stubnames=["Expense", "Income"], i=["Location", "Account"], j="Year_Month", sep="_", suffix=".*").reset_index())
#
Location Account Year_Month Expense Income
0 Madrid ABC Feb 2019 235423 NaN
1 Madrid ABC Jan 2019 4354 56456.0
2 Madrid XYX Feb 2019 6785423 NaN
3 Madrid XYX Jan 2019 769867 32556456.0
4 Rome ABC Feb 2019 235423 NaN
5 Rome ABC Jan 2019 434654 5214.0
6 Rome XYX Feb 2019 46588 NaN
7 Rome XYX Jan 2019 632556456 46724423.0
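
Note that the rows above come out with Feb before Jan within each group, whereas the question asks for Jan first. If you want to run the whole thing end to end and get chronological order, here is a minimal sketch. It rebuilds the sample data from the question, assumes every wide column has exactly four ":"-separated parts with English month names, and uses a helper `rename_column` and a temporary `_order` column, both of which are my own names and not part of pandas:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Location": ["Madrid", "Madrid", "Rome", "Rome"],
    "Account": ["ABC", "XYX", "ABC", "XYX"],
    "Y2019:MTD:January:Expense": [4354, 769867, 434654, 632556456],
    "Y2019:MTD:January:Income": [56456, 32556456, 5214, 46724423],
    "Y2019:MTD:February:Expense": [235423, 6785423, 235423, 46588],
})


def rename_column(col):
    """Turn 'Y2019:MTD:January:Expense' into 'Expense_Jan 2019'.

    Hypothetical helper: columns that do not have four ':'-separated
    parts (Location, Account) are returned unchanged.
    """
    parts = col.split(":")
    if len(parts) != 4:
        return col
    year, _, month, kind = parts
    return f"{kind}_{month[:3]} {year[1:]}"


df.columns = [rename_column(c) for c in df.columns]

# suffix=".*" allows non-numeric suffixes such as "Jan 2019".
long_df = pd.wide_to_long(
    df,
    stubnames=["Expense", "Income"],
    i=["Location", "Account"],
    j="Year_Month",
    sep="_",
    suffix=".*",
).reset_index()

# Sort by the parsed date rather than the string, so Jan comes before Feb.
long_df["_order"] = pd.to_datetime(long_df["Year_Month"], format="%b %Y")
long_df = (
    long_df.sort_values(["Location", "Account", "_order"])
    .drop(columns="_order")
    .reset_index(drop=True)
)
print(long_df)
```

Income stays NaN for February because the input has no February income column, which is also why that column is shown as float in the output above.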