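How can I create a "message" column that contains information from other columns? The DataFrame has already been sorted in various ways.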
df
   head      members
0  Abba As   Ally
1  Abba As   Apo
2  Abba As   Abba
3  Bella Bi  Bella
4  Bella Bi  Boo
5  Bella Bi  Brian
6  Abba As   Arra
7  Abba As   Alya
8  Abba As   Abba
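For anyone who wants to reproduce this, the sample frame can be built roughly as follows. This is only a transcription of the rows shown above; the column names head and members are taken from the table header.

```python
import pandas as pd

# Sample data transcribed from the table above: each head is followed by
# the members of that invitation, and the same head can appear in more
# than one block.
df = pd.DataFrame({
    "head": ["Abba As"] * 3 + ["Bella Bi"] * 3 + ["Abba As"] * 3,
    "members": ["Ally", "Apo", "Abba",
                "Bella", "Boo", "Brian",
                "Arra", "Alya", "Abba"],
})

print(df)
```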
Expected output
df
   head      message
0  Abba As   Hi Abba, we invite you, Ally and Apo. Please use "Abba As" when arriving.
1  Bella Bi  Hi Bella, we invite you, Boo and Brian. Please use "Bella Bi" when arriving.
2  Abba As   Hi Abba, we invite you, Arra and Alya. Please use "Abba As" when arriving.
I tried creating a first-name column:
df["head_first_name"] = df["head"].str.split(" ").str[0]
df.loc[df.head_first_name.isin(df.members)]
Use groupby.apply with a custom function:
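def message(g):
    # g.name is the (head, run label) group key; keep only the head's first name
    head = g.name[0].split()[0]
    # every member of this run except the head
    others = ' and '.join([m for m in g['members'] if m != head])
    return f'Hi {head}, we invite you, {others}. Please use "{g.name[0]}" when arriving.'

# label consecutive runs of the same head so that repeated heads stay separate
group = df['head'].ne(df['head'].shift()).cumsum()

out = (df.groupby(['head', group], sort=False)
         .apply(message)
         .droplevel(1)
         .reset_index(name='message')
      )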
Output:
   head      message
0  Abba As   Hi Abba, we invite you, Ally and Apo. Please use "Abba As" when arriving.
1  Bella Bi  Hi Bella, we invite you, Boo and Brian. Please use "Bella Bi" when arriving.
2  Abba As   Hi Abba, we invite you, Arra and Alya. Please use "Abba As" when arriving.
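The step that keeps the two separate Abba As blocks apart is the helper Series group. Comparing each head with the previous row marks where a new run starts, and the cumulative sum turns those marks into run labels. Here is a minimal sketch on the head column alone; the names starts and run_id are only illustrative and do not appear in the code above.

```python
import pandas as pd

# Only the head column of the sample data is needed to show the idea.
head = pd.Series(["Abba As"] * 3 + ["Bella Bi"] * 3 + ["Abba As"] * 3,
                 name="head")

# True wherever the value differs from the previous row. The first row is
# always True because shift() leaves a missing value there.
starts = head.ne(head.shift())

# Summing the booleans cumulatively gives one integer label per run.
run_id = starts.cumsum()

print(run_id.tolist())  # [1, 1, 1, 2, 2, 2, 3, 3, 3]
```

Grouping by head alone would merge rows 0-2 and 6-8 into a single group and produce one message for both invitations. Adding the run label as a second key is what yields three messages.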