import pandas as pd
df_a = pd.DataFrame({'Number': [1, 2, 3, 4, 5, 6, 7, 8],
                     'Column_A': ['C', 'E', 'G', 'L', 'E', 'N', 'P', 'R'],
                     'Column_B': ['D', 'F', 'H', 'M', 'Z', 'O', 'Q', 'S']})
df_b = pd.DataFrame({'Number': [1, 2, 3, 4, 5, 6],
                     'Column_C': ['A', 'E', 'L', 'H', 'C', 'Q'],
                     'Column_D': ['B', 'F', 'M', 'G', 'F', 'P']})
# Keep rows where Column_A is in Column_C and Column_B is in Column_D, or the other way round
mask = ((df_a['Column_A'].isin(df_b['Column_C']) & df_a['Column_B'].isin(df_b['Column_D']))
        | (df_a['Column_A'].isin(df_b['Column_D']) & df_a['Column_B'].isin(df_b['Column_C'])))
df_a[mask]
df_a
   Number Column_A Column_B
0       1        C        D
1       2        E        F
2       3        G        H
3       4        L        M
4       5        E        Z
5       6        N        O
6       7        P        Q
7       8        R        S
df_b
   Number Column_C Column_D
0       1        A        B
1       2        E        F
2       3        L        M
3       4        H        G
4       5        C        F
5       6        Q        P
df_a[mask]
   Number Column_A Column_B
1       2        E        F
2       3        G        H
3       4        L        M
6       7        P        Q
If there are more columns to mask, the condition becomes very long. Is there a better solution, using merge/join or something else?
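You can aggregate each row's columns into a frozenset (which is hashable, unlike a plain set, as isin needs for its lookup) and use the same logic:
mask = (df_a[['Column_A', 'Column_B']].agg(frozenset, axis=1)
        .isin(df_b[['Column_C', 'Column_D']].agg(frozenset, axis=1))
       )
out = df_a[mask]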
Output:
   Number Column_A Column_B
1       2        E        F
2       3        G        H
3       4        L        M
6       7        P        Q
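Since the question asks about having more columns, here is a minimal sketch of how the same idea might be wrapped in a reusable function. The helper name unordered_row_mask and its parameters are hypothetical, not part of pandas. It also swaps the set-based key for a sorted tuple, on the assumption that the values in a row are mutually sortable, so that repeated values within a row are not collapsed when there are more than two columns.

```python
import pandas as pd


def unordered_row_mask(left, left_cols, right, right_cols):
    """Hypothetical helper: True where a row of `left` holds the same values,
    in any order, as some row of `right` over the given columns."""
    # A sorted tuple is hashable and ignores column order, but keeps duplicates.
    right_keys = {
        tuple(sorted(row))
        for row in right[right_cols].itertuples(index=False, name=None)
    }
    hits = [
        tuple(sorted(row)) in right_keys
        for row in left[left_cols].itertuples(index=False, name=None)
    ]
    # Reuse the index of `left` so the result can be used as a boolean mask.
    return pd.Series(hits, index=left.index)


df_a = pd.DataFrame({'Number': [1, 2, 3, 4, 5, 6, 7, 8],
                     'Column_A': ['C', 'E', 'G', 'L', 'E', 'N', 'P', 'R'],
                     'Column_B': ['D', 'F', 'H', 'M', 'Z', 'O', 'Q', 'S']})
df_b = pd.DataFrame({'Number': [1, 2, 3, 4, 5, 6],
                     'Column_C': ['A', 'E', 'L', 'H', 'C', 'Q'],
                     'Column_D': ['B', 'F', 'M', 'G', 'F', 'P']})

mask = unordered_row_mask(df_a, ['Column_A', 'Column_B'],
                          df_b, ['Column_C', 'Column_D'])
out = df_a[mask]
print(out)
```

Note that this compares whole rows of df_b, which is stricter than the column-by-column isin condition in the question: a value pair only matches if both values occur together in a single row of df_b. On the sample data both approaches select the same rows.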