Merging multiple financial statements in Python, by position only


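I have multiple financial statements as DataFrames, stored in a list under each category. I want to merge all the statements within each category while keeping all the information and without repeating the values for the same year. To illustrate, I made an example in Excel of what I am trying to achieve: [image]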

Honestly, I have no idea how to approach this.

python pandas dataframe logic finance
1 Answer

I'm not sure about a general approach, but I tried to implement some code based on your screenshot:

import pandas as pd

data1 = {
    "Keys": ["Apple", "Grapes", "Banana"],
    "2023": [4, 3, 1],
    "2022": [2, 5, 2],
    "2021": [8, 7, 3],
}

data2 = {
    "Keys": ["Apple", "Orange", "Grapes", "Mandarine"],
    "2022": [2, 3, 5, 7],
    "2021": [8, 2, 7, 3],
    "2020": [5, 2, 4, 8],
}

data3 = {
    "Keys": ["Apple", "Orange", "Grapes", "Mandarine"],
    "2021": [8, 2, 7, 3],
    "2020": [5, 2, 4, 8],
    "2019": [3, 6, 4, 9],
}

data4 = {
    "Keys": [None, None],
    "": [None, None],
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df3 = pd.DataFrame(data3)
df4 = pd.DataFrame(data4)

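def try_merge(ls_dfs, key_col):
    # Outer-merge the frames one by one on key_col, collapsing overlapping columns.
    final_df = ls_dfs[0]
    for df in ls_dfs[1:]:
        # Columns present in both frames get the '_dup' suffix on the right-hand copy.
        final_df = pd.merge(final_df, df, on=key_col, how='outer', suffixes=('', '_dup'))
        # Iterate over a copy of the column list, since columns are dropped in the loop.
        for col in list(final_df.columns):
            if col.endswith('_dup'):
                base_col = col[:-len('_dup')]
                # Keep the existing value and fill any gaps from the duplicate column.
                final_df[base_col] = final_df[base_col].combine_first(final_df[col])
                final_df.drop(columns=[col], inplace=True)
    return final_df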

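refined_data = try_merge([df1, df2, df3, df4], "Keys")
# Drop rows and columns that are entirely empty (the placeholder frame df4 adds both).
refined_data = refined_data.dropna(axis=0, how="all").dropna(axis=1, how="all")
print(refined_data)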

This gives me almost the expected result in the final DataFrame.
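As a possible alternative to the merge loop, here is a rough sketch that indexes each frame by its key column and chains `DataFrame.combine_first`, so overlapping years are filled in rather than duplicated. The helper name `merge_statements` and the two small sample frames are made up for illustration. The sketch assumes that keys are unique within each frame, that at least one frame has data, and that the column labels are year strings that sort correctly as text.

```python
from functools import reduce

import pandas as pd


def merge_statements(frames, key_col):
    """Hypothetical helper: union of rows and year columns, first non-null value wins."""
    indexed = []
    for df in frames:
        # Drop rows with no key, then skip frames that end up empty.
        cleaned = df.dropna(subset=[key_col])
        if not cleaned.empty:
            indexed.append(cleaned.set_index(key_col))
    # combine_first keeps values from the left frame and fills gaps from the right.
    merged = reduce(lambda left, right: left.combine_first(right), indexed)
    # Newest year first, assuming the column labels are year strings.
    year_cols = sorted(merged.columns, reverse=True)
    return merged[year_cols].rename_axis(key_col).reset_index()


# Illustrative data only, not the frames from the question.
a = pd.DataFrame({"Keys": ["Apple", "Grapes"], "2023": [4, 3], "2022": [2, 5]})
b = pd.DataFrame({"Keys": ["Apple", "Orange"], "2022": [2, 3], "2021": [8, 2]})
result = merge_statements([a, b], "Keys")
print(result)
```

Note that values missing from every frame stay as NaN, so integer columns come back as floats. If the statements are grouped by category, the same helper could be called once per list of frames.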

Reference links I used:

https://www.w3schools.com/python/pandas/ref_df_merge.asp

https://pandas.pydata.org/docs/reference/api/pandas.merge.html
