import pandas as pd
data = {'groupId': [1, 1, 2], 'email': ['[email protected]', '[email protected]', '[email protected]'],
        'type': ['office', 'personal', 'personal'], 'name': ['santy', 'santy', 'will']}
df = pd.DataFrame(data)
I have a DataFrame like this:
groupId  email              type      name
1        [email protected]  office    santy
1        [email protected]  personal  santy
2        [email protected]  personal  will
I want to turn the rows into columns dynamically, with the number of columns depending on how many rows each group has:
groupId  email1             type1     email2             type2     name
1        [email protected]  office    [email protected]  personal  santy
2        [email protected]  personal  na                 na        will
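I know I can use set_index and unstack, but I am confused about how to name the columns and how to create as many columns as there are rows in a given group.
Is there an efficient way to do this?
You can do it like this: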
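# Number the rows within each group (1, 2, ...); the number becomes the column suffix
new_df = (df.assign(col=df.groupby('groupId').cumcount() + 1)
            .set_index(['groupId', 'col'])
            .unstack('col')                    # one set of columns per row number
            .sort_index(level=[1, 0], axis=1)  # order by row number, then column name
         )
# Flatten the (name, number) MultiIndex into names such as email1, type1
new_df.columns = [f'{x}{y}' for x, y in new_df.columns]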
Output:
         email1             type1     email2             type2
groupId
1        [email protected]  office    [email protected]  personal
2        [email protected]  personal  NaN                NaN
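Note that with the name column present in the sample data, the snippet above would also unstack it into name1 and name2, whereas the question asks for a single name column. One possible way to get that, sketched here under the assumption that name is constant within each groupId, is to include name in the index before unstacking. The email addresses below are hypothetical placeholders, since the real ones are redacted in the post, and the variable name wide is just for illustration.

```python
import pandas as pd

# Placeholder addresses (assumption): the originals are redacted in the post.
data = {
    'groupId': [1, 1, 2],
    'email': ['a@example.com', 'b@example.com', 'c@example.com'],
    'type': ['office', 'personal', 'personal'],
    'name': ['santy', 'santy', 'will'],
}
df = pd.DataFrame(data)

wide = (
    df.assign(col=df.groupby('groupId').cumcount() + 1)  # row number within each group
      .set_index(['groupId', 'name', 'col'])             # name stays an index key, so it is not unstacked
      .unstack('col')
      .sort_index(level=[1, 0], axis=1)                  # order by row number, then column name
)
# Flatten the (name, number) MultiIndex into email1, type1, email2, type2
wide.columns = [f'{x}{y}' for x, y in wide.columns]
# Bring groupId and name back as ordinary columns
wide = wide.reset_index()
print(wide)
```

Here name ends up right after groupId rather than last; if the order matters, the columns can be reselected afterwards. If name could differ within a group, this approach would split that group into several rows, so it only fits data where the assumption holds.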