Trying to join two pandas DataFrames but getting "ValueError: You are trying to merge on object and int64 columns"?

I have two pandas DataFrames: seren1 and bbox. I want to perform an inner join of them on a column named filepath.

seren1[["filepath", "label"]].join(bbox[["filepath", "label"]], on="filepath", how="inner", lsuffix='_caller', rsuffix='_other')

This raises an error:

ValueError                                Traceback (most recent call last)
<ipython-input-74-c001a7adc7cd> in <module>
----> 1 seren1[["filepath", "label"]].join(bbox[["filepath", "label"]], on="filepath", how="inner", lsuffix='_caller', rsuffix='_other')

/projects/community/py-data-science-stack/5.1.0/kp807/envs/fastai/lib/python3.7/site-packages/pandas/core/frame.py in join(self, other, on, how, lsuffix, rsuffix, sort)
   6822         # For SparseDataFrame's benefit
   6823         return self._join_compat(other, on=on, how=how, lsuffix=lsuffix,
-> 6824                                  rsuffix=rsuffix, sort=sort)
   6825
   6826     def _join_compat(self, other, on=None, how='left', lsuffix='', rsuffix='',

/projects/community/py-data-science-stack/5.1.0/kp807/envs/fastai/lib/python3.7/site-packages/pandas/core/frame.py in _join_compat(self, other, on, how, lsuffix, rsuffix, sort)
   6837             return merge(self, other, left_on=on, how=how,
   6838                          left_index=on is None, right_index=True,
-> 6839                          suffixes=(lsuffix, rsuffix), sort=sort)
   6840         else:
   6841             if on is not None:

/projects/community/py-data-science-stack/5.1.0/kp807/envs/fastai/lib/python3.7/site-packages/pandas/core/reshape/merge.py in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
     45                          right_index=right_index, sort=sort, suffixes=suffixes,
     46                          copy=copy, indicator=indicator,
---> 47                          validate=validate)
     48     return op.get_result()
     49

/projects/community/py-data-science-stack/5.1.0/kp807/envs/fastai/lib/python3.7/site-packages/pandas/core/reshape/merge.py in __init__(self, left, right, how, on, left_on, right_on, axis, left_index, right_index, sort, suffixes, copy, indicator, validate)
    531         # validate the merge keys dtypes. We may need to coerce
    532         # to avoid incompat dtypes
--> 533         self._maybe_coerce_merge_keys()
    534
    535         # If argument passed to validate,

/projects/community/py-data-science-stack/5.1.0/kp807/envs/fastai/lib/python3.7/site-packages/pandas/core/reshape/merge.py in _maybe_coerce_merge_keys(self)
    978                   (inferred_right in string_types and
    979                    inferred_left not in string_types)):
--> 980                 raise ValueError(msg)
    981
    982             # datetimelikes must match exactly

ValueError: You are trying to merge on object and int64 columns. If you wish to proceed you should use pd.concat

But if I take the two columns as arrays and intersect them into a Series:

import numpy as np
import pandas as pd

pd.Series(np.intersect1d(seren1["filepath"].values, bbox["filepath"].values))

it works fine:

0    S1/B04/B04_R1/S1_B04_R1_PICT0006
1    S1/B04/B04_R1/S1_B04_R1_PICT0007
2    S1/B04/B04_R1/S1_B04_R1_PICT0008
3    S1/B04/B04_R1/S1_B04_R1_PICT0013
4    S1/B04/B04_R1/S1_B04_R1_PICT0039
5    S1/B04/B04_R1/S1_B04_R1_PICT0040
6    S1/B04/B04_R1/S1_B04_R1_PICT0041
7    S1/B05/B05_R1/S1_B05_R1_PICT0056
......

Type checks:

seren1.dtypes
filepath     object
timestamp    object
label        object
dtype: object

bbox.dtypes
filepath    object
label       object
X            int64
Y            int64
W            int64
H            int64
dtype: object

all(seren1.filepath.apply(lambda x: isinstance(x, str)))
True

all(bbox.filepath.apply(lambda x: isinstance(x, str)))
True

What is going wrong?


python pandas
1 answer
I was able to resolve this error as follows:

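Suppose you are trying to join df2 to df1. For join to work, the column name "Column" must be the same in both DataFrames, and the DataFrame being joined (df2) must have "Column" set as its index with set_index. This is because DataFrame.join matches the caller's on column against the index of the other DataFrame, not against a column of the same name; with a default integer index on the other DataFrame, a string key is compared with int64 values, which produces the error above. So to join df2 to df1 on "Column", call set_index("Column") on df2 before passing it to join.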
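A minimal sketch of the fix described above, applied to the frames from the question. The sample rows below are made up for illustration; only the names seren1, bbox, filepath and label come from the question. The second option, using merge instead of join, is an alternative that is not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-ins for the question's seren1 and bbox frames.
seren1 = pd.DataFrame({
    "filepath": ["S1/B04/a", "S1/B04/b", "S1/B05/c"],
    "label": ["zebra", "lion", "empty"],
})
bbox = pd.DataFrame({
    "filepath": ["S1/B04/b", "S1/B05/c", "S1/B06/d"],
    "label": ["lion", "empty", "gazelle"],
})

# join() compares the caller's "filepath" column with bbox's *index*.
# bbox still has its default integer index here, hence the dtype mismatch.
print(bbox.index.dtype)

# Option 1: make "filepath" the index of the right-hand frame, then join.
joined = seren1[["filepath", "label"]].join(
    bbox[["filepath", "label"]].set_index("filepath"),
    on="filepath",
    how="inner",
    lsuffix="_caller",
    rsuffix="_other",
)

# Option 2: merge() matches column against column, so the index is left alone.
merged = seren1[["filepath", "label"]].merge(
    bbox[["filepath", "label"]],
    on="filepath",
    how="inner",
    suffixes=("_caller", "_other"),
)

print(joined)
print(merged)
```

Both options should keep only the rows whose filepath appears in both frames, with the overlapping label column split into label_caller and label_other.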
