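Requirement: each DataFrame in a list has the columns i_f, r_f1, r_f2 and r_f3. In the first df, i_f is compared with a threshold of 0.95 (an ID passes if its value is greater than or equal to it), the first two r_f columns are compared with 0.3 (less than or equal passes), and the third r_f is compared with 0.46 (less than or equal passes). An ID is eligible in the first df only if it passes all of these comparisons. In the next df the same is done: i_f is compared with the same 0.95 threshold, but for the first two r_f columns, IDs that were eligible in the previous df are compared with 0.3333 and all other IDs with 0.3; for the third r_f, previously eligible IDs are compared with 0.49 and the others with 0.46. Together these comparisons again determine the eligible IDs in this df, and so on for every df in the list.
Below are a sample list of dfs and my code, but I get the error "No axis named columns for object type Series". The expected output is 1, 0, 1 for the IDs in df1 and 1, 1, 1, 0, 0 for the IDs in df2.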
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'ID':[1,2,3],
'i_f':[0.967385562,0.869575345,1],
'r_f1':[0.18878187,0.327355797,0.100753051],
'r_f2':[0.047237449,0.056038276,0.189434048],
'r_f3':[0.095283998,0.2554309,0.368240321]})
df2 = pd.DataFrame({'ID':[1,2,3,4,5],
'i_f':[0.985,1,0.993297332,1,1],
'r_f1':[0.300009355,0.281788473,0.146077926,0.167329833,0.245227094],
'r_f2':[0.152293038,0.06668,0.196683885,0.321269411,0.02493159],
'r_f3':[0.111617815,0.042016,0.465285158,0.085330897,0.548370325]})
df_lst = [df1, df2]
threshold_if = 0.95
threshold_rf12_new = 0.3
threshold_rf12_current = 0.3333
threshold_rf3_new = 0.46
threshold_rf3_current = 0.49
threshold12 = {}
threshold3 = {}
for df in df_lst:
    m0 = df['i_f'].ge(threshold_if)
    m1 = df['r_f1'].le(df['ID'].map(threshold12).fillna(threshold_rf12_new).values, axis=0)
    m2 = df['r_f2'].le(df['ID'].map(threshold12).fillna(threshold_rf12_new).values, axis=0)
    m3 = df['r_f3'].le(df['ID'].map(threshold3).fillna(threshold_rf3_new).values, axis=0)
    res = (m0 & m1 & m2 & m3.all(axis='columns')).astype(int)
    df['f_f'] = res
    threshold12 = dict(zip(df['ID'], np.where(res, threshold_rf12_current, threshold_rf12_new)))
    threshold3 = dict(zip(df['ID'], np.where(res, threshold_rf3_current, threshold_rf3_new)))
    print(df)
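Replace this line:
res = (m0 & m1 & m2 & m3.all(axis='columns')).astype(int)
with
res = m0 & m1 & m2 & m3
m3 is a Series, which has only one axis, so calling .all(axis='columns') on it raises the error. The & operator already combines the four masks row by row, so no reduction is needed. This produces the expected output listed in the original question: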
   ID    f_f
0   1   True
1   2  False
2   3   True

   ID    f_f
0   1   True
1   2   True
2   3   True
3   4  False
4   5  False
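As a small illustration of why the original line fails, here is a sketch that uses two made-up boolean masks (not the data from the question). It assumes a reasonably recent pandas; the exact wording of the exception message may vary between versions.

```python
import pandas as pd

# Two hypothetical row-wise masks, standing in for m0 and m3 from the question.
m0 = pd.Series([True, False, True])
m3 = pd.Series([True, True, False])

# A Series is one-dimensional, so it has no 'columns' axis to reduce over.
try:
    m3.all(axis='columns')
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Combining boolean Series with & is already element-wise: one result per row.
combined = m0 & m3

print(error_message)
print(combined.tolist())
```

Calling .all() without an axis would not help either: it collapses the whole mask to a single True or False, which is not what is wanted when each ID needs its own result.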
If you want the f_f values as integers, follow this assignment with a cast to int:
df['f_f'] = res
df['f_f'] = df['f_f'].astype(int)
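Putting the pieces together, the whole procedure could be sketched as below. This is only one possible arrangement, not the code from the question: the function name flag_eligible and its parameter names are made up for illustration, and it tracks the previously eligible IDs in a set rather than in the two threshold dictionaries.

```python
import numpy as np
import pandas as pd


def flag_eligible(frames, if_min=0.95, rf12_new=0.3, rf12_prev=0.3333,
                  rf3_new=0.46, rf3_prev=0.49):
    """Add an integer 'f_f' column to each frame, carrying eligibility forward."""
    prev_ok = set()  # IDs that were eligible in the previous frame
    for df in frames:
        was_ok = df['ID'].isin(prev_ok)
        # The relaxed limits apply only to IDs that were eligible last time.
        lim12 = np.where(was_ok, rf12_prev, rf12_new)
        lim3 = np.where(was_ok, rf3_prev, rf3_new)
        ok = (df['i_f'].ge(if_min)
              & df['r_f1'].le(lim12)
              & df['r_f2'].le(lim12)
              & df['r_f3'].le(lim3))
        df['f_f'] = ok.astype(int)
        prev_ok = set(df.loc[ok, 'ID'])
    return frames


df1 = pd.DataFrame({'ID': [1, 2, 3],
                    'i_f': [0.967385562, 0.869575345, 1],
                    'r_f1': [0.18878187, 0.327355797, 0.100753051],
                    'r_f2': [0.047237449, 0.056038276, 0.189434048],
                    'r_f3': [0.095283998, 0.2554309, 0.368240321]})
df2 = pd.DataFrame({'ID': [1, 2, 3, 4, 5],
                    'i_f': [0.985, 1, 0.993297332, 1, 1],
                    'r_f1': [0.300009355, 0.281788473, 0.146077926, 0.167329833, 0.245227094],
                    'r_f2': [0.152293038, 0.06668, 0.196683885, 0.321269411, 0.02493159],
                    'r_f3': [0.111617815, 0.042016, 0.465285158, 0.085330897, 0.548370325]})

flag_eligible([df1, df2])
print(df1['f_f'].tolist())  # [1, 0, 1]
print(df2['f_f'].tolist())  # [1, 1, 1, 0, 0]
```

In df2, ID 1 passes only because it was eligible in df1: its r_f1 of 0.300009355 exceeds 0.3 but is within the relaxed 0.3333 limit. The same holds for ID 3, whose r_f3 of 0.465285158 exceeds 0.46 but is within 0.49.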