KeyError when using the pandas merge function (instead of VLOOKUP) on two Excel files


I am trying to merge two Excel files in pandas.

import pandas as pd
import numpy as np
upload_raw = pd.read_excel(r'C:\Users\Desktop\Upload Raw Data.xlsx',
                     sheet_name = 'Upload',
                     header = 0,
                     index_col = 0,
                     )
mapping = pd.read_excel(r'C:\Users\Desktop\Mapping.xlsx',
                     sheet_name = 'Mapping',
                     header = 0,
                     index_col = 0,
                     )
display(upload_raw)
display(mapping)
upload_lookup=upload_raw.merge(mapping,on ='BRANCH',how = 'outer' )
display(upload_lookup)

I keep getting KeyError: 'BRANCH'. I checked that the values in the column are all text. The Mapping file has 3 columns, while the Upload file has about 4.

Upload raw data:

BRANCH  DEPT    CREAT_TS    RAF_IND
AA  &CR     2018-06-22-06.48.49.601000   
03  CUE 2018-06-22-11.43.29.859000   
90  T0L 2018-06-22-11.54.52.633000   

Mapping data:

BRANCH  UNIT    MASTER
03  MAS CoE
04  NAS ET
05  ET  ET

These lines stand out in the error message:

 # validate the merge keys dtypes. We may need to coerce

# work-around for merge_asof(right_index=True)
# duplicate columns & possible reduce dimensionality

How can I avoid this problem?

I even tried left_on = 'True', right_on = 'True'

and left_key = 'lkey', right_key = 'rkey'. I get an error saying 'rkey' was not found.

Regards, Ren.

python-3.x pandas
1 Answer

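The main difference seems to be that I did not set 'BRANCH' as the index: I left out index_col = 0, so 'BRANCH' stays a regular column in both frames.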
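
A minimal sketch of why the key can go missing, using small in-memory frames as hypothetical stand-ins for the two sheets (only two of the upload columns are reproduced). Passing index_col = 0 to read_excel moves the first column into the index, which set_index imitates here. Whether merge(on='BRANCH') then raises KeyError seems to depend on the pandas version, since newer releases can also resolve index level names, so the sketch only shows the column leaving df.columns and two ways to merge regardless.

```python
import pandas as pd

# Hypothetical stand-ins for the 'Upload' and 'Mapping' sheets.
upload_raw = pd.DataFrame({
    "BRANCH": ["AA", "03", "90"],
    "DEPT": ["&CR", "CUE", "T0L"],
})
mapping = pd.DataFrame({
    "BRANCH": ["03", "04", "05"],
    "UNIT": ["MAS", "NAS", "ET"],
    "MASTER": ["CoE", "ET", "ET"],
})

# Imitate read_excel(..., index_col=0): BRANCH becomes the index.
upload_idx = upload_raw.set_index("BRANCH")
mapping_idx = mapping.set_index("BRANCH")

# BRANCH is no longer listed among the columns.
branch_is_column = "BRANCH" in upload_idx.columns
print(branch_is_column)

# Option 1: turn the index back into a column, then merge on it.
fixed = upload_idx.reset_index().merge(
    mapping_idx.reset_index(), on="BRANCH", how="outer"
)

# Option 2: keep the index and tell merge to join on it explicitly.
by_index = upload_idx.merge(
    mapping_idx, left_index=True, right_index=True, how="outer"
)

print(fixed)
print(by_index)
```

Either option avoids relying on how a particular pandas version looks up the name given in on.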

Also, the mapping 'BRANCH' column was imported as int64 because the sample contains only digits, whereas the upload_raw 'BRANCH' column was imported as object.
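
One way to make the key types agree, sketched on hypothetical in-memory frames that mimic the imported values: cast both 'BRANCH' columns to strings before merging. This is a variation on casting only the mapping column to object, not the same call. read_excel also accepts a dtype argument (for example dtype={'BRANCH': str}) if you would rather fix the type at import time.

```python
import pandas as pd

# Hypothetical frames mimicking what read_excel returned in this case.
upload_raw = pd.DataFrame({
    "BRANCH": ["AA", 3, 90],      # mixed text and numbers -> object dtype
    "DEPT": ["&CR", "CUE", "T0L"],
})
mapping = pd.DataFrame({
    "BRANCH": [3, 4, 5],          # digits only -> integer dtype
    "UNIT": ["MAS", "NAS", "ET"],
    "MASTER": ["CoE", "ET", "ET"],
})

# Inspect the key dtypes before merging.
print(upload_raw["BRANCH"].dtype, mapping["BRANCH"].dtype)

# Normalise both keys to strings so 3 and '3' compare equal.
upload_raw["BRANCH"] = upload_raw["BRANCH"].astype(str)
mapping["BRANCH"] = mapping["BRANCH"].astype(str)

# Default how='inner': only keys present in both frames survive.
upload_lookup = upload_raw.merge(mapping, on="BRANCH")
print(upload_lookup)
```

Note that casting a number to a string does not bring back a leading zero such as '03'; that depends on how the cell is stored in the workbook.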

upload_raw = pd.read_excel('data/2018-09-03_data_mapping.xlsx',
                           sheet_name = 'Upload',
                           header = 0)
mapping = pd.read_excel(r'data/2018-09-03_data_mapping.xlsx',
                        sheet_name = 'Mapping',
                        header = 0)
print(upload_raw)

output:
    BRANCH  DEPT    CREAT_TS    RAF_IND
0   AA  &CR 2018-06-22-06.48.49.601000  NaN
1   3   CUE 2018-06-22-11.43.29.859000  NaN
2   90  T0L 2018-06-22-11.54.52.633000  NaN

mapping['BRANCH'] = mapping['BRANCH'].astype('object')

print(mapping)

output:
    BRANCH  UNIT    MASTER
0   3   MAS CoE
1   4   NAS ET
2   5   ET  ET

upload_lookup=pd.merge(left=upload_raw, right=mapping, on='BRANCH')

print(upload_lookup)

output:
    BRANCH  DEPT    CREAT_TS    RAF_IND UNIT    MASTER
0   3   CUE 2018-06-22-11.43.29.859000  NaN MAS CoE
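
The merge above uses the default how='inner', so only the matching row is returned. If the goal is VLOOKUP-like behaviour, where every upload row is kept and unmatched rows get empty lookup values, a left merge is closer. The sketch below assumes the keys are already strings and uses hypothetical in-memory frames; indicator and validate are optional merge parameters added here for checking, not part of the code above.

```python
import pandas as pd

# Hypothetical frames with string keys.
upload_raw = pd.DataFrame({
    "BRANCH": ["AA", "3", "90"],
    "DEPT": ["&CR", "CUE", "T0L"],
})
mapping = pd.DataFrame({
    "BRANCH": ["3", "4", "5"],
    "UNIT": ["MAS", "NAS", "ET"],
    "MASTER": ["CoE", "ET", "ET"],
})

upload_lookup = upload_raw.merge(
    mapping,
    on="BRANCH",
    how="left",               # keep every row of upload_raw, like VLOOKUP
    indicator=True,           # adds a '_merge' column: 'both' or 'left_only'
    validate="many_to_one",   # raises if mapping has duplicate BRANCH values
)

# Branch codes that found no match in the mapping table.
unmatched = upload_lookup.loc[
    upload_lookup["_merge"] == "left_only", "BRANCH"
].tolist()

print(upload_lookup)
print(unmatched)
```

Rows without a match keep their upload columns and get NaN in UNIT and MASTER, which makes missing mapping entries easy to spot.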