Left merge 2 dataframes on 2 columns, but with NaN only in the non-matching columns of the second dataframe


Input:

import pandas as pd

df1 = pd.DataFrame({"A": [1, 1, 2, 2, 3], "B": [1, 2, 1, 3, 1]})
df2 = pd.DataFrame({"C": [1, 1, 2, 2], "D": [1, 2, 1, 4]})

Expected output:

A  B  C  D
1  1  1  1
1  2  1  2
2  1  2  1
2  3  2
3  1

I tried the following:

joined = pd.merge(df1, df2, left_on=['A', 'B'], right_on=['C', 'D'], how='left')

The output I got is:

   A  B    C    D
0  1  1  1.0  1.0
1  1  2  1.0  2.0
2  2  1  2.0  1.0
3  2  3  NaN  NaN
4  3  1  NaN  NaN
1 Answer

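If you merge on both columns at once like that, you can't pick and choose: a row of df1 either matches a row of df2 on both keys, or it gets NaN in every df2 column. It looks like what you actually want is to merge each column separately. To tell duplicate values apart you could use the index in this case, but I'm assuming that only works by coincidence for this sample data, so I'll enumerate the duplicates with groupby(...).cumcount() instead: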

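cols_merged = []
# Merge each column pair on its own: A against C, then B against D.
for col1, col2 in ('A', 'C'), ('B', 'D'):
    col_merged = pd.merge(
        df1[col1],
        df2[col2],
        # Key = (value, n-th occurrence of that value), so duplicates pair up one-to-one.
        left_on=[col1, df1.groupby(col1).cumcount()],
        right_on=[col2, df2.groupby(col2).cumcount()],
        how='left',
        )[col2]
    cols_merged.append(col_merged)
# Put the merged C and D columns next to the original df1 columns.
joined = pd.concat([df1, *cols_merged], axis=1)
joined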
   A  B    C    D
0  1  1  1.0  1.0
1  1  2  1.0  2.0
2  2  1  2.0  1.0
3  2  3  2.0  NaN
4  3  1  NaN  NaN
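
To see how the enumeration tells duplicates apart, here is a small sketch (an illustration only, not part of the answer's code) that prints the (value, occurrence) keys used when merging B against D. The names b_occurrence, d_occurrence, left_keys and right_keys are hypothetical and introduced just for this inspection.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 1, 2, 2, 3], "B": [1, 2, 1, 3, 1]})
df2 = pd.DataFrame({"C": [1, 1, 2, 2], "D": [1, 2, 1, 4]})

# cumcount() numbers each occurrence of a value within its group, starting at 0,
# and returns the numbers in the original row order.
b_occurrence = df1.groupby("B").cumcount()
d_occurrence = df2.groupby("D").cumcount()

# Hypothetical helper frames, only for looking at the composite merge keys.
left_keys = pd.DataFrame({"B": df1["B"], "occurrence": b_occurrence})
right_keys = pd.DataFrame({"D": df2["D"], "occurrence": d_occurrence})

print(left_keys)
print(right_keys)
```

The value 1 appears three times in B (occurrences 0, 1 and 2) but only twice in D (occurrences 0 and 1), so the last row of df1 has no partner and ends up with NaN in D. The value 3 never appears in D at all, which gives the other NaN.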