Concatenate the values below into a new column


Consider this df:

    import pandas as pd
    from numpy import nan  # the bare nan values below need this import
    data = {'ID': [1071.0, 1072.0, nan, 1074.0, 1076.0, nan, nan, nan, 1077.0],
            'Name Type': ['Primary Name', 'Primary Name', 'Also Known As', 'Primary Name', 'Primary Name', 'Low Quality AKA', 'Low Quality AKA', 'Low Quality AKA', 'Primary Name'],
            'Surname': ['Brown', 'Red', 'R', 'Green', 'Purple', 'Pipi', 'Poopa', 'Peep', 'Orange']}
    df = pd.DataFrame(data)

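There are more columns that hold information in the Primary Name rows but are empty in the AKA rows. For each Primary Name, I need to concatenate the Surname values of the rows directly below it (the ones typed Low Quality AKA or Also Known As) into a new column, so that the resulting DataFrame has one row per Primary Name.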

pandas concatenation transformation
1 Answer

Taking this DataFrame, df, as input:

       ID        Name Type Surname
0  1071.0     Primary Name   Brown
1  1072.0     Primary Name     Red
2     NaN    Also Known As       R
3  1074.0     Primary Name   Green
4  1076.0     Primary Name  Purple
5     NaN  Low Quality AKA    Pipi
6     NaN  Low Quality AKA   Poopa
7     NaN  Low Quality AKA    Peep
8  1077.0     Primary Name  Orange

You can use this approach:

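import pandas as pd

# Mark the AKA rows: they are the ones without an ID of their own.
df_aka_filter = df["ID"].isna()

# Give every AKA row the ID of the primary row above it.
df["ID"] = df["ID"].ffill()

# Join the AKA surnames per ID. Only "Surname" is aggregated, so other
# columns that are empty (NaN) on AKA rows cannot break the string join.
df_aka = (
    df[df_aka_filter]
    .groupby("ID", as_index=False)["Surname"]
    .agg(";".join)
    .rename(columns={"Surname": "AKAs"})
)

# Keep only the primary rows and attach their AKAs (NaN where there are none).
df = pd.merge(df[~df_aka_filter], df_aka, on="ID", how="left")

Result:
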
       ID     Name Type Surname             AKAs
0  1071.0  Primary Name   Brown              NaN
1  1072.0  Primary Name     Red                R
2  1074.0  Primary Name   Green              NaN
3  1076.0  Primary Name  Purple  Pipi;Poopa;Peep
4  1077.0  Primary Name  Orange              NaN
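
If you would rather not overwrite the ID column in place, the same idea (flag the rows without an ID, forward-fill to find their owner, join, attach) can be wrapped in a small function. This is only a sketch: the helper name `collapse_akas`, its parameters, and the extra `Country` column are made up for illustration and stand in for the "more columns" mentioned in the question.

```python
import numpy as np
import pandas as pd


def collapse_akas(df, id_col="ID", name_col="Surname", out_col="AKAs", sep=";"):
    """Hypothetical helper: join each AKA row's name onto the primary row above it."""
    is_aka = df[id_col].isna()        # AKA rows have no ID of their own
    owner = df[id_col].ffill()        # ID of the nearest primary row above
    akas = (
        df.loc[is_aka, name_col]
        .groupby(owner[is_aka])       # group AKA names by their owner's ID
        .agg(sep.join)
        .rename(out_col)
    )
    primary = df.loc[~is_aka].copy()  # leave the caller's frame untouched
    primary[out_col] = primary[id_col].map(akas)  # NaN where no AKAs exist
    return primary.reset_index(drop=True)


df = pd.DataFrame({
    "ID": [1071.0, 1072.0, np.nan, 1074.0, 1076.0, np.nan, np.nan, np.nan, 1077.0],
    "Name Type": ["Primary Name", "Primary Name", "Also Known As", "Primary Name",
                  "Primary Name", "Low Quality AKA", "Low Quality AKA",
                  "Low Quality AKA", "Primary Name"],
    "Surname": ["Brown", "Red", "R", "Green", "Purple", "Pipi", "Poopa", "Peep", "Orange"],
    # Assumed extra column: filled on primary rows, empty on AKA rows.
    "Country": ["UK", "US", np.nan, "FR", "DE", np.nan, np.nan, np.nan, "ES"],
})

result = collapse_akas(df)
print(result)
```

Because only the Surname column is joined, the empty cells in the other columns of the AKA rows never reach the string join, and the primary rows keep all of their original columns.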