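I am trying to filter a DataFrame by conditions stored in a YAML file. There are more than 100 conditions to filter on; these are just a few of them.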
general_1:
  condition_1:
    'A': 1
    'B': 5
    'C': range(0, 8)
    'D': 21
  condition_2:
    'A': 1
    'B': 4
    'C': range(9, 200)
    'D': 22
  condition_3:
    'A': 1
    'B': 3
    'C': range(3, 200)
    'D': 22
  condition_4:
    'A': 1
    'B': 6
    'C': range(3, 200)
    'D': [21, 101, 102, 241, 242, 341, 342, 343, 344, 345, 346, 347, 348, 349, 351, 352, 353, 354, 355, 356, 357, 551, 552, 553, 554, 555, 556, 665, 667, 767, 861, 862]
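My goal is to match these conditions against the DataFrame and create a new column containing the result, so that I can flag the rows that do not match.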
import csv
import yaml

input_file = data_file
config_file = yaml_file

def filter_columns(df, yaml_file):
    with open(yaml_file) as f:
        config = yaml.safe_load(f)
    for row in df:
        if (row['A'] == config['general_1']['condition_1']['A'] and
                row['B'] == config['general_1']['condition_1']['B'] and
                row['C'] == config['general_1']['condition_1']['C'] and
                row['D'] in config['general_1']['condition_1']['D']):
            row['matched'] = 1
        elif (row['A'] == config['general_1']['condition_2']['A'] and
                row['B'] == config['general_1']['condition_2']['B'] and
                row['C'] == config['general_1']['condition_2']['C'] and
                row['D'] in config['general_1']['condition_2']['D']):
            row['matched'] = 1
        elif (row['A'] == config['general_1']['condition_3']['A'] and
                row['B'] == config['general_1']['condition_3']['B'] and
                row['C'] == config['general_1']['condition_3']['C'] and
                row['D'] in config['general_1']['condition_3']['D']):
            row['matched'] = 1
        else:
            row['matched'] = 0
    return df

with open(input_file, 'r') as f:
    reader = csv.DictReader(f)
    data = [row for row in reader]

filtered_data = filter_columns(data, config_file)
I don't know what I am doing wrong. The function does not create a new column with the results.
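
For reference, here is a minimal sketch of one way the matching could be made to work. It is not a definitive fix, and it rests on assumptions. Two things in the code above look like likely causes. First, csv.DictReader yields every value as a string, so row['A'] == 1 is never true. Second, yaml.safe_load reads a plain scalar such as range(0, 8) as the string "range(0, 8)", not as a Python range. The names CONFIG, RANGE_RE, allowed_values, row_matches and flag_rows are hypothetical. CONFIG stands in for what yaml.safe_load would return (with the long 'D' list shortened for brevity), and all columns are assumed to hold integers.

```python
import re

# Hypothetical stand-in for the dict that yaml.safe_load would return.
# The range(...) entries arrive as plain strings; the 'D' list of
# condition_4 is shortened here for brevity.
CONFIG = {
    "general_1": {
        "condition_1": {"A": 1, "B": 5, "C": "range(0, 8)", "D": 21},
        "condition_2": {"A": 1, "B": 4, "C": "range(9, 200)", "D": 22},
        "condition_3": {"A": 1, "B": 3, "C": "range(3, 200)", "D": 22},
        "condition_4": {"A": 1, "B": 6, "C": "range(3, 200)", "D": [21, 101, 102]},
    }
}

# Matches strings of the form "range(start, stop)" with integer bounds.
RANGE_RE = re.compile(r"range\(\s*(-?\d+)\s*,\s*(-?\d+)\s*\)")


def allowed_values(value):
    """Turn one config value into a container that supports `in`."""
    if isinstance(value, str):
        m = RANGE_RE.fullmatch(value.strip())
        if m:
            return range(int(m.group(1)), int(m.group(2)))
    if isinstance(value, (list, tuple, set, range)):
        return value
    return [value]  # a single scalar becomes a one-element list


def row_matches(row, condition):
    """Return True if every column named in the condition has an allowed value."""
    for column, value in condition.items():
        try:
            cell = int(row[column])  # csv.DictReader yields strings
        except (KeyError, TypeError, ValueError):
            return False
        if cell not in allowed_values(value):
            return False
    return True


def flag_rows(rows, conditions):
    """Add a 'matched' key (1 or 0) to each row dict and return the rows."""
    for row in rows:
        row["matched"] = int(any(row_matches(row, c) for c in conditions.values()))
    return rows


# Example rows shaped like the output of csv.DictReader (all values are strings).
rows = [
    {"A": "1", "B": "5", "C": "3", "D": "21"},    # satisfies condition_1
    {"A": "1", "B": "6", "C": "50", "D": "102"},  # satisfies condition_4
    {"A": "2", "B": "5", "C": "3", "D": "21"},    # satisfies none
]
flagged = flag_rows(rows, CONFIG["general_1"])
print([r["matched"] for r in flagged])  # prints [1, 1, 0]
```

Looping over conditions.values() rather than writing one if/elif branch per condition means the same function should cover all 100+ conditions without changes. In real use, CONFIG would presumably be replaced by the result of yaml.safe_load on the config file, and rows by the list built from csv.DictReader.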