How do I reshape data that has alternate IDs?


I am struggling to pivot and reshape some data. My data looks like this:

nickname:Nick    Gavin
nickname:Nick    job:Teacher
nickname:Nick    duties:teaching_math
nickname:Bob     Marcus
nickname:Bob     job:Musician
nickname:Bob     duties:plays_piano

I want to change it to:

Nick     Teacher     teaching_math
Gavin    Teacher     teaching_math
Bob      Musician    plays_piano
Marcus   Musician    plays_piano

Any help would be greatly appreciated!

pandas dataframe pivot python-3.5 reshape
1 Answer
import numpy as np

# keep only the nickname: drop the "nickname:" prefix from column 0
df[0] = df[0].str.split(':').str[-1]

# rows whose column 1 has no ':' hold the real name; copy the nickname
# into a temp column for those rows and leave NaN everywhere else
df['temp'] = np.where(~df[1].str.contains(':', regex=False), df[0], np.nan)

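# remove the "job:" / "duties:" prefix from column 1;
# an anchored regex removes only the prefix, whereas str.lstrip would
# strip any leading characters that belong to the given set
df[1] = df[1].str.replace(r'^(?:job|duties):\s*', '', regex=True).str.strip()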

# forward-fill across columns so that temp takes the job/duties value
# from column 1 wherever it is NaN
df = df.ffill(axis=1)

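# group by the nickname in column 0 and join each group's values with commas,
# stack the two joined columns into a single Series,
# split each joined string back into separate columns,
# and discard the old index
final = (df
         .groupby(0)
         .agg(lambda x: x.str.cat(sep=','))
         .stack()
         .str.split(',', expand=True)
         .reset_index(drop=True))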

final

    0          1           2
0   Marcus  Musician    plays_piano
1   Bob     Musician    plays_piano
2   Gavin   Teacher     teaching_math
3   Nick    Teacher     teaching_math
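
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same steps. It assumes the data arrives as a two-column DataFrame with integer column labels 0 and 1 (the question does not show how it was loaded), and the helper name `reshape_nicknames` is made up for illustration. The prefixes are removed with an anchored regex: `str.lstrip('duties:')` strips a set of characters rather than a prefix, so it would turn `duties:teaching_math` into `aching_math`.

```python
import pandas as pd

# Sample data rebuilt from the question. The two-column layout with
# integer column labels 0 and 1 is an assumption.
df = pd.DataFrame([
    ["nickname:Nick", "Gavin"],
    ["nickname:Nick", "job:Teacher"],
    ["nickname:Nick", "duties:teaching_math"],
    ["nickname:Bob", "Marcus"],
    ["nickname:Bob", "job:Musician"],
    ["nickname:Bob", "duties:plays_piano"],
])


def reshape_nicknames(df):
    """Hypothetical helper: one row per name/nickname with job and duties."""
    out = df.copy()

    # Column 0: keep only the text after "nickname:".
    out[0] = out[0].str.split(":").str[-1]

    # Rows whose column 1 has no ':' carry the real name.
    is_name = ~out[1].str.contains(":", regex=False)

    # temp holds the nickname on name rows and NaN elsewhere.
    out["temp"] = out[0].where(is_name)

    # Column 1: remove the "job:" / "duties:" prefix only.
    out[1] = out[1].str.replace(r"^(?:job|duties):\s*", "", regex=True).str.strip()

    # Fill the gaps in temp from column 1 (same effect here as
    # forward-filling across the columns 0, 1, temp).
    out["temp"] = out["temp"].fillna(out[1])

    # Join each nickname group with commas, stack both joined columns
    # into one Series, then split back into three columns.
    return (out
            .groupby(0)
            .agg(lambda x: x.str.cat(sep=","))
            .stack()
            .str.split(",", expand=True)
            .reset_index(drop=True))


final = reshape_nicknames(df)
print(final)
```

With the sample rows above, `final` should contain the same four rows as the table in the answer. Because `groupby` sorts its keys, Bob's rows come before Nick's; pass `sort=False` to `groupby` if the original order matters.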