Suppose I have two DataFrames:
# df1
+-----------------------+
| Name_1 | Age | Location |
+-----------------------+
| A | 18 | UK |
| B | 19 | US |
+-----------------------+
# df2
+-------------------------+
| Name_2 | Age | Location |
+-------------------------+
| A | 18 | US |
| B | 19 | US |
+-------------------------+
How can I compare all elements and get a DataFrame of booleans indicating whether the corresponding values match?
The desired output is:
# desired
+-----------------------+
| Name | Age | Location|
+-----------------------+
| A | True | False |
| B | True | True |
+-----------------------+
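If both DataFrames have the same number of rows and the same column names, set the name column as the index with DataFrame.set_index and then compare (this snippet assumes the key column is called name in both frames):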
df11 = df1.set_index('name')
df22 = df2.set_index('name')
df = (df11 == df22).reset_index()
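A minimal, self-contained sketch of the same idea, assuming two hypothetical frames (`left` and `right`, not from the question) whose key column is called `name` in both. It also illustrates that `==` does not align the frames: if the row or column labels differ, pandas raises a `ValueError` rather than returning a result.

```python
import pandas as pd

# Hypothetical frames (assumption): the key column is called "name" in both.
left = pd.DataFrame({"name": ["A", "B"], "Age": [18, 19], "Location": ["UK", "US"]})
right = pd.DataFrame({"name": ["A", "B"], "Age": [18, 19], "Location": ["US", "US"]})

# Move the key into the index so that only the value columns are compared.
matches = (left.set_index("name") == right.set_index("name")).reset_index()
print(matches)

# A frame with one extra row, so its index labels no longer match `left`.
extra_row = pd.DataFrame({"name": ["C"], "Age": [20], "Location": ["FR"]})
extra = pd.concat([right, extra_row], ignore_index=True)

try:
    left.set_index("name") == extra.set_index("name")
    raised = False
except ValueError:
    # == only works on identically labeled frames; it does not align them.
    raised = True
print(raised)
```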
Edit: if only the index column has a different name:
df11 = df1.set_index('Name_1')
df22 = df2.set_index('Name_2')
df = (df11 == df22).reset_index()
print (df)
Name_1 Age Location
0 A True False
1 B True True
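For reference, the snippet above can be run end to end with the frames from the question rebuilt by hand. This is only a sketch; the `pd.DataFrame` constructors are my reconstruction of the tables shown in the question.

```python
import pandas as pd

# Reconstruction of the question's tables (assumed dtypes: int for Age, str otherwise).
df1 = pd.DataFrame({"Name_1": ["A", "B"], "Age": [18, 19], "Location": ["UK", "US"]})
df2 = pd.DataFrame({"Name_2": ["A", "B"], "Age": [18, 19], "Location": ["US", "US"]})

# The key columns have different names, so each frame gets its own set_index call.
df11 = df1.set_index("Name_1")
df22 = df2.set_index("Name_2")

# The index labels (A, B) and the column labels (Age, Location) are identical,
# so the element-wise comparison is allowed.
df = (df11 == df22).reset_index()
print(df)
```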
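If the other columns may also be named differently, but both frames have the same number of columns and the same index length, the column names must be made identical in both before comparing, for example by assigning df11.columns to df22.columns: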
print (df1)
Name_1 Age1 Location1
0 A 18 UK
1 B 19 US
print (df2)
Name_2 Age2 Location2
0 A 18 US
1 B 19 US
df11 = df1.set_index('Name_1')
df22 = df2.set_index('Name_2')
df22.columns = df11.columns
df = (df11 == df22).reset_index()
print (df)
Name_1 Age1 Location1
0 A True False
1 B True True
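The steps above could be wrapped in a small helper. This is a sketch under stated assumptions, not part of pandas: `compare_frames` is a hypothetical name, columns are matched purely by position, and the key values are assumed to be unique so that rows can be lined up by key with `reindex`.

```python
import pandas as pd

def compare_frames(left, right, left_key, right_key):
    """Hypothetical helper: element-wise equality of two frames keyed by a name column."""
    a = left.set_index(left_key)
    b = right.set_index(right_key)
    if a.shape != b.shape:
        raise ValueError("frames must have the same number of rows and columns")
    # Match columns by position; set_index returned a new frame, so `right` is untouched.
    b.columns = a.columns
    # Line rows up by key (assumes unique keys), so row order in `right` does not matter.
    b = b.reindex(a.index)
    return (a == b).reset_index()

df1 = pd.DataFrame({"Name_1": ["A", "B"], "Age1": [18, 19], "Location1": ["UK", "US"]})
df2 = pd.DataFrame({"Name_2": ["A", "B"], "Age2": [18, 19], "Location2": ["US", "US"]})
print(compare_frames(df1, df2, "Name_1", "Name_2"))
```

Matching columns by position is a design choice that mirrors `df22.columns = df11.columns`; it is only safe when both frames list their columns in the same order.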