   Entity    name_b   number_b   name_s   number_s
0       1      Zyla  {1, 2, 3}    Zeela  {1, 2, 3}
1       1  Zyla 620        {1}     Zylo        {1}
2       1  Xyla 620        {2}    Xylaa        {2}
3       1  GCP Zyla     {4, 7}  Ann Zyl      {103}
4       1        Zy      {103}
I would like to get output as:
    Entity  number    name_b   name_s
0        1       1      Zyla    Zeela
1        1       1      Zyla     Zylo
2        1       2      Zyla    Zeela
3        1       2      Zyla    Xylaa
4        1       3      Zyla    Zeela
5        1       1  Zyla 620    Zeela
6        1       1  Zyla 620     Zylo
7        1       2  Xyla 620    Zeela
8        1       2  Xyla 620    Xylaa
9        1       4  GCP Zyla        -
10       1       7  GCP Zyla        -
11       1     103        Zy  Ann Zyl
I want to compare the name_b group with the name_s group based on number_b and number_s, and find the possible matches.
I'm assuming the values you have in number_b/number_s are sets:
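import pandas as pd

# df is the DataFrame from the question.
# One row per (name_b, number) pair; rows with missing values are dropped.
df1 = (
    df[["Entity", "name_b", "number_b"]]
    .explode("number_b")
    .dropna()
    .rename(columns={"number_b": "number"})
)

# Same for the name_s side.
df2 = (
    df[["Entity", "name_s", "number_s"]]
    .explode("number_s")
    .dropna()
    .rename(columns={"number_s": "number"})
)

# Outer merge on the shared number: every name_b is paired with every
# name_s that has the same number; unmatched rows get NaN.
out = pd.merge(
    df1,
    df2[["number", "name_s"]],
    on="number",
    how="outer",
)
print(out)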
Prints:
    Entity    name_b  number   name_s
0        1      Zyla       1    Zeela
1        1      Zyla       1     Zylo
2        1  Zyla 620       1    Zeela
3        1  Zyla 620       1     Zylo
4        1      Zyla       2    Zeela
5        1      Zyla       2    Xylaa
6        1  Xyla 620       2    Zeela
7        1  Xyla 620       2    Xylaa
8        1      Zyla       3    Zeela
9        1  GCP Zyla       4      NaN
10       1  GCP Zyla       7      NaN
11       1        Zy     103  Ann Zyl
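To get closer to the layout asked for in the question, the NaN placeholders can be replaced with "-" and the columns reordered. Here is a minimal, self-contained sketch of the whole pipeline. The way df is built below is my reconstruction of the sample data (sets in number_b/number_s, None for the empty cells in the last row), not the asker's actual code:

```python
import pandas as pd

# Assumed reconstruction of the sample frame from the question.
# None marks the empty name_s / number_s cells in the last row.
df = pd.DataFrame(
    {
        "Entity": [1, 1, 1, 1, 1],
        "name_b": ["Zyla", "Zyla 620", "Xyla 620", "GCP Zyla", "Zy"],
        "number_b": [{1, 2, 3}, {1}, {2}, {4, 7}, {103}],
        "name_s": ["Zeela", "Zylo", "Xylaa", "Ann Zyl", None],
        "number_s": [{1, 2, 3}, {1}, {2}, {103}, None],
    }
)

# One row per (name, number) pair on each side.
df1 = (
    df[["Entity", "name_b", "number_b"]]
    .explode("number_b")
    .dropna()
    .rename(columns={"number_b": "number"})
)
df2 = (
    df[["Entity", "name_s", "number_s"]]
    .explode("number_s")
    .dropna()
    .rename(columns={"number_s": "number"})
)

out = pd.merge(df1, df2[["number", "name_s"]], on="number", how="outer")

# Use "-" for numbers that have no name_s match, then reorder the columns.
out["name_s"] = out["name_s"].fillna("-")
out = out[["Entity", "number", "name_b", "name_s"]]
print(out)
```

The row order still follows the merge rather than the order shown in the question; if that matters, sort afterwards with sort_values on whichever columns you need.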