pandas: looping over a list of DataFrames for comparison

Question

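Requirements: each DataFrame in the list has the columns i_f, r_f1, r_f2 and r_f3. In the first df, i_f is compared with the threshold 0.95 (IDs with i_f >= 0.95 pass), the first two r_f columns are compared with 0.3 (IDs with values <= 0.3 pass), and the third r_f is compared with 0.46 (IDs with values <= 0.46 pass). All comparisons together determine the qualified IDs in the first df. In the next df the same is done: i_f is compared with the same 0.95 threshold, but for the first two r_f columns, IDs that qualified in the previous df are compared with 0.3333 and all other IDs with 0.3; for the third r_f, IDs that qualified in the previous df are compared with 0.49 and all other IDs with 0.46. Again, all comparisons together determine the qualified IDs in this df, and so on for every df in the list.

Below are a sample list of dfs and my code, but I get the error "No axis named columns for object type Series". The expected output for the IDs in df1 is 1,0,1, and for the IDs in df2 it is 1,1,1,0,0.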

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'ID':[1,2,3],
                    'i_f':[0.967385562,0.869575345,1],
                    'r_f1':[0.18878187,0.327355797,0.100753051],
                    'r_f2':[0.047237449,0.056038276,0.189434048],
                    'r_f3':[0.095283998,0.2554309,0.368240321]})
df2 = pd.DataFrame({'ID':[1,2,3,4,5],
                    'i_f':[0.985,1,0.993297332,1,1],
                    'r_f1':[0.300009355,0.281788473,0.146077926,0.167329833,0.245227094],
                    'r_f2':[0.152293038,0.06668,0.196683885,0.321269411,0.02493159],
                    'r_f3':[0.111617815,0.042016,0.465285158,0.085330897,0.548370325]})
df_lst = [df1, df2]

threshold_if = 0.95
threshold_rf12_new = 0.3
threshold_rf12_current = 0.3333
threshold_rf3_new = 0.46
threshold_rf3_current = 0.49

threshold12 = {}
threshold3 = {}
for df in df_lst:
    m0 = df['i_f'].ge(threshold_if)
    m1 = df['r_f1'].le(df['ID'].map(threshold12).fillna(threshold_rf12_new).values, axis=0)
    m2 = df['r_f2'].le(df['ID'].map(threshold12).fillna(threshold_rf12_new).values, axis=0)
    m3 = df['r_f3'].le(df['ID'].map(threshold3).fillna(threshold_rf3_new).values, axis=0)
    res = (m0 & m1 & m2 & m3.all(axis='columns')).astype(int)
    df['f_f'] = res
    threshold12 = dict(zip(df['ID'], np.where(res, threshold_rf12_current, threshold_rf12_new)))
    threshold3 = dict(zip(df['ID'], np.where(res, threshold_rf3_current, threshold_rf3_new)))
    print(df)

Answer 1

Replace this line:

res = (m0 & m1 & m2 & m3.all(axis='columns')).astype(int)

with

res = m0 & m1 & m2 & m3

This produces the expected output listed in the original question.

   ID    f_f
0   1   True
1   2  False
2   3   True
   ID    f_f
0   1   True
1   2   True
2   3   True
3   4  False
4   5  False
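
As for why the original line fails: each of m0 to m3 is a boolean Series, which is one-dimensional and has only the index axis, so asking for axis='columns' cannot be satisfied. The following is a minimal sketch of that behaviour, using a made-up mask in place of m3 (the values are illustrative, not taken from the question's data):

```python
import pandas as pd

# Hypothetical boolean mask standing in for m3 from the question
m3 = pd.Series([True, False, True])

try:
    # A Series has no 'columns' axis, so this reduction is rejected
    m3.all(axis='columns')
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Element-wise AND of two masks keeps one result per row, which is what is needed here
other = pd.Series([True, True, False])
combined = other & m3
print(combined.tolist())
```

Combining the masks with & already gives one boolean per row, so no further reduction is needed.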

If you want integers instead of booleans, convert the column to int after this assignment:

df['f_f'] = res
df['f_f'] = df['f_f'].astype(int)
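
Putting the pieces together, the whole loop could look roughly like the sketch below. It is not the code from the question: instead of the two threshold dictionaries it keeps a set of the IDs that qualified in the previous frame, which should be equivalent under the rules described. The function name flag_qualified and its parameter names are assumptions made for this example.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3],
                    'i_f': [0.967385562, 0.869575345, 1],
                    'r_f1': [0.18878187, 0.327355797, 0.100753051],
                    'r_f2': [0.047237449, 0.056038276, 0.189434048],
                    'r_f3': [0.095283998, 0.2554309, 0.368240321]})
df2 = pd.DataFrame({'ID': [1, 2, 3, 4, 5],
                    'i_f': [0.985, 1, 0.993297332, 1, 1],
                    'r_f1': [0.300009355, 0.281788473, 0.146077926, 0.167329833, 0.245227094],
                    'r_f2': [0.152293038, 0.06668, 0.196683885, 0.321269411, 0.02493159],
                    'r_f3': [0.111617815, 0.042016, 0.465285158, 0.085330897, 0.548370325]})


def flag_qualified(dfs, t_if=0.95, t_rf12=(0.3, 0.3333), t_rf3=(0.46, 0.49)):
    """Add an integer 'f_f' column to each frame, in place.

    Each threshold pair is (limit for new IDs, limit for IDs that
    qualified in the previous frame). Hypothetical helper for illustration.
    """
    previously_ok = set()  # IDs that qualified in the previous frame
    for df in dfs:
        was_ok = df['ID'].isin(previously_ok)
        # Per-row limits: relaxed for previously qualified IDs, strict otherwise
        lim12 = np.where(was_ok, t_rf12[1], t_rf12[0])
        lim3 = np.where(was_ok, t_rf3[1], t_rf3[0])
        ok = (df['i_f'].ge(t_if)
              & df['r_f1'].le(lim12)
              & df['r_f2'].le(lim12)
              & df['r_f3'].le(lim3))
        df['f_f'] = ok.astype(int)
        previously_ok = set(df.loc[ok, 'ID'])
    return dfs


flag_qualified([df1, df2])
print(df1[['ID', 'f_f']])
print(df2[['ID', 'f_f']])
```

Note that the frames are modified in place; pass copies if the originals must stay untouched.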
