pandas: loop through a list of DataFrames for comparison


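Requirement: each DataFrame in the list has the columns i_f, r_f1, r_f2 and r_f3. In the first DataFrame, i_f is compared with the threshold 0.95 (IDs with values greater than or equal to it pass), the first two r_f columns are compared with 0.3 (IDs with values less than or equal to it pass), and the third r_f is compared with 0.46 (again less than or equal). All four comparisons together determine the qualified IDs in the first DataFrame. The next DataFrame is handled the same way: i_f is compared with the same threshold 0.95, but for the first two r_f columns, IDs that qualified in the previous DataFrame are compared with 0.3333 and all other IDs with 0.3; for the third r_f, IDs that qualified in the previous DataFrame are compared with 0.49 and all others with 0.46. Together these again determine the qualified IDs in this DataFrame, and so on for every DataFrame in the list.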

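Below are an example list of DataFrames and my code, but I get the error "No axis named columns for object type Series". The expected output for the IDs in df1 is 1, 0, 1, and for the IDs in df2 it is 1, 1, 1, 0, 0.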

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'ID':[1,2,3],
                    'i_f':[0.967385562,0.869575345,1],
                    'r_f1':[0.18878187,0.327355797,0.100753051],
                    'r_f2':[0.047237449,0.056038276,0.189434048],
                    'r_f3':[0.095283998,0.2554309,0.368240321]})
df2 = pd.DataFrame({'ID':[1,2,3,4,5],
                    'i_f':[0.985,1,0.993297332,1,1],
                    'r_f1':[0.300009355,0.281788473,0.146077926,0.167329833,0.245227094],
                    'r_f2':[0.152293038,0.06668,0.196683885,0.321269411,0.02493159],
                    'r_f3':[0.111617815,0.042016,0.465285158,0.085330897,0.548370325]})
df_lst = [df1, df2]

threshold_if = 0.95
threshold_rf12_new = 0.3
threshold_rf12_current = 0.3333
threshold_rf3_new = 0.46
threshold_rf3_current = 0.49

threshold12 = {}
threshold3 = {}
for df in df_lst:
    m0 = df['i_f'].ge(threshold_if)
    m1 = df['r_f1'].le(df['ID'].map(threshold12).fillna(threshold_rf12_new).values, axis=0)
    m2 = df['r_f2'].le(df['ID'].map(threshold12).fillna(threshold_rf12_new).values, axis=0)
    m3 = df['r_f3'].le(df['ID'].map(threshold3).fillna(threshold_rf3_new).values, axis=0)
    res = (m0 & m1 & m2 & m3.all(axis='columns')).astype(int)
    df['f_f'] = res
    threshold12 = dict(zip(df['ID'], np.where(res, threshold_rf12_current, threshold_rf12_new)))
    threshold3 = dict(zip(df['ID'], np.where(res, threshold_rf3_current, threshold_rf3_new)))
    print(df)
pandas loops
1 Answer

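Replace this line:

res = (m0 & m1 & m2 & m3.all(axis='columns')).astype(int)

with:

res = m0 & m1 & m2 & m3

m3 is a Series, which has only one axis, so calling .all(axis='columns') on it raises the error. Combining the four masks with & is already element-wise, one result per row. This produces the expected output listed in the original question.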

   ID    f_f
0   1   True
1   2  False
2   3   True
   ID    f_f
0   1   True
1   2   True
2   3   True
3   4  False
4   5  False
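
To see why the original line fails, here is a minimal sketch using a made-up boolean Series (the names `mask`, `error_message` and `combined` are only for illustration and do not come from the question). It assumes a reasonably recent pandas version. A Series has a single axis, so requesting `axis='columns'` raises a `ValueError`, while `&` between two aligned Series is evaluated row by row.

```python
import pandas as pd

# Hypothetical boolean mask, standing in for m3 in the question.
mask = pd.Series([True, False, True])

# A Series only has axis 0 ("index"), so axis='columns' is rejected.
try:
    mask.all(axis='columns')
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Combining masks with & is already element-wise, so no .all() is needed.
combined = mask & pd.Series([True, True, False])
print(combined.tolist())
```

`.all(axis='columns')` would only make sense on a DataFrame of masks, where it reduces each row to one boolean.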

If you want them as integers, convert them to int right after this assignment:

df['f_f'] = res
df['f_f'] = df['f_f'].astype(int)
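
For reference, a possible end-to-end sketch that combines the corrected mask with the integer conversion. It is not the code from the question: instead of rebuilding two threshold dictionaries on every pass, it keeps a list of previously qualified IDs and uses `Series.isin` with `np.where` to pick the per-row limits. The function name `flag_qualified` and the upper-case constants are hypothetical, and the sketch assumes `ID` is the key used to carry results from one DataFrame to the next.

```python
import numpy as np
import pandas as pd

# Thresholds taken from the question.
THRESHOLD_IF = 0.95
THRESHOLD_RF12_NEW, THRESHOLD_RF12_CURRENT = 0.3, 0.3333
THRESHOLD_RF3_NEW, THRESHOLD_RF3_CURRENT = 0.46, 0.49


def flag_qualified(frames):
    """Add an integer f_f column to each frame, carrying qualified IDs forward."""
    qualified = []  # IDs that qualified in the previous frame (none at first)
    for df in frames:
        was_qualified = df['ID'].isin(qualified)
        # Previously qualified IDs get the relaxed limits, all others the default ones.
        limit12 = np.where(was_qualified, THRESHOLD_RF12_CURRENT, THRESHOLD_RF12_NEW)
        limit3 = np.where(was_qualified, THRESHOLD_RF3_CURRENT, THRESHOLD_RF3_NEW)
        res = (
            df['i_f'].ge(THRESHOLD_IF)
            & df['r_f1'].le(limit12)
            & df['r_f2'].le(limit12)
            & df['r_f3'].le(limit3)
        )
        df['f_f'] = res.astype(int)
        qualified = df.loc[res, 'ID'].tolist()
    return frames


# Sample data from the question.
df1 = pd.DataFrame({'ID': [1, 2, 3],
                    'i_f': [0.967385562, 0.869575345, 1],
                    'r_f1': [0.18878187, 0.327355797, 0.100753051],
                    'r_f2': [0.047237449, 0.056038276, 0.189434048],
                    'r_f3': [0.095283998, 0.2554309, 0.368240321]})
df2 = pd.DataFrame({'ID': [1, 2, 3, 4, 5],
                    'i_f': [0.985, 1, 0.993297332, 1, 1],
                    'r_f1': [0.300009355, 0.281788473, 0.146077926, 0.167329833, 0.245227094],
                    'r_f2': [0.152293038, 0.06668, 0.196683885, 0.321269411, 0.02493159],
                    'r_f3': [0.111617815, 0.042016, 0.465285158, 0.085330897, 0.548370325]})

flag_qualified([df1, df2])
print(df1['f_f'].tolist())  # [1, 0, 1]
print(df2['f_f'].tolist())  # [1, 1, 1, 0, 0]
```

Note that the function modifies the DataFrames in place, as the loop in the question does; pass copies if the originals must stay unchanged.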