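This is a classic AND/OR problem in set analysis, and I have been stuck on it for a long time.
I want to sum the amount for those IDs where:
type_of_entry is both Revenue and Expense, or both Revenue and Labor
revenue_type is 'CAF'
The expected IDs are shown in bold.
For example, ID 1 exists for both Revenue and Expense. Likewise, IDs 2 and 3 each exist for both Revenue and Labor.
Result -> amount = 55 (5 + 40 + 10)
I have tried the following set analysis, but it did not work:
I would appreciate any help.
Regards,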
Sagnik
Would you accept a Python solution as an answer?
import pandas as pd
from collections import defaultdict

df = pd.DataFrame([
    ['Expense', 1, 10, '-'],
    ['Labor', 2, 20, '-'],
    ['Labor', 3, 50, '-'],
    ['Revenue', 1, 5, 'CAF'],
    ['Revenue', 2, 30, 'NORM'],
    ['Revenue', 2, 40, 'CAF'],
    ['Revenue', 3, 10, 'CAF'],
    ['Revenue', 4, 20, 'NORM'],
    ['Revenue', 5, 30, 'CAF']
], columns=['type_of_entry', 'id', 'amount', 'revenue_type'])

# Keep only the rows whose revenue_type is 'CAF'.
series_caf = df[df['revenue_type'].eq('CAF')]
filter_id_list = series_caf['id'].to_list()  # 1, 2, 3, 5

result_amount = 0
dict_ok = defaultdict(list)

for cur_id in filter_id_list:
    # Check which entry types exist for this id anywhere in the data.
    is_revenue = len(df[(df.id == cur_id) & (df.type_of_entry == 'Revenue')]) > 0
    is_expense = len(df[(df.id == cur_id) & (df.type_of_entry == 'Expense')]) > 0
    is_labor = len(df[(df.id == cur_id) & (df.type_of_entry == 'Labor')]) > 0
    is_ok = (is_revenue and is_expense) or (is_revenue and is_labor)
    if is_ok:
        # Take the amount of the first CAF row for this id.
        cur_amount = series_caf[series_caf.id == cur_id].amount.values[0]
        result_amount += cur_amount
        dict_ok['id'].append(cur_id)
        dict_ok['amount'].append(cur_amount)
        # Encode the reason as three digits: Revenue, Expense, Labor (REL).
        dict_ok['ok_reason (REL)'].append(is_revenue*100+is_expense*10+is_labor)

df_result_info = pd.DataFrame.from_dict(dict_ok)
print(df_result_info)
print(result_amount)
Output
   id  amount  ok_reason (REL)
0   1       5              110
1   2      40              101
2   3      10              101
55
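The same selection can probably also be written without the loop. The following is only a minimal sketch under my own assumptions: it reuses the sample data from above, and the names ids_with_cost, mask and total_caf_amount are made up for illustration. One difference to be aware of: it sums every CAF revenue row of a qualifying id, whereas the loop takes only the first CAF row per id. On this sample data both approaches give the same total.

```python
import pandas as pd

# Same sample data as in the loop-based answer.
df = pd.DataFrame(
    [
        ["Expense", 1, 10, "-"],
        ["Labor", 2, 20, "-"],
        ["Labor", 3, 50, "-"],
        ["Revenue", 1, 5, "CAF"],
        ["Revenue", 2, 30, "NORM"],
        ["Revenue", 2, 40, "CAF"],
        ["Revenue", 3, 10, "CAF"],
        ["Revenue", 4, 20, "NORM"],
        ["Revenue", 5, 30, "CAF"],
    ],
    columns=["type_of_entry", "id", "amount", "revenue_type"],
)

# IDs that have at least one Expense or Labor row (hypothetical helper name).
ids_with_cost = set(df.loc[df["type_of_entry"].isin(["Expense", "Labor"]), "id"])

# Revenue rows of type CAF whose id also has an Expense or Labor row.
mask = (
    df["type_of_entry"].eq("Revenue")
    & df["revenue_type"].eq("CAF")
    & df["id"].isin(ids_with_cost)
)

total_caf_amount = int(df.loc[mask, "amount"].sum())
print(total_caf_amount)
```

The idea is the same as in the original question: first collect the ids that have an Expense or Labor entry, then restrict the CAF revenue rows to those ids and sum the amount.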