How do I set values in a DataFrame given a variable number of conditions?

Question (votes: 1, answers: 3)
from itertools import product
import pandas as pd

animals = ["dogs", "cats"]
eyes = ['brown', 'blue', 'green']
height = ['short', 'average', 'tall']
a = [animals, eyes, height]
df = pd.DataFrame(list(product(*a)), columns=["animals", "eyes", "height"])
df['value'] = 1

Output:

   animals   eyes   height  value
0     dogs  brown    short      1
1     dogs  brown  average      1
2     dogs  brown     tall      1
3     dogs   blue    short      1
4     dogs   blue  average      1
5     dogs   blue     tall      1
6     dogs  green    short      1

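Question: how can I write a single function that, given one or more conditions, sets "value" to zero in every row that matches all of them?

Example: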

# This would change all the 1s into 0s for all dogs with blue eyes.
zero_out(df, [("animals", "dogs"), ("eyes", "blue")])

# This would change all the 1s into 0s for all tall animals.
zero_out(df, [("height", "tall")])

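What I have tried so far: I tried to do this with * unpacking, but had no luck, because I do not know how to use the unpacked variables to build multiple conditions. If I hard-code the number of conditions, combining them is easy: df[(condition1) & (condition2) & (condition3)] = 0

Also, and perhaps this is beyond the scope of the question: in a regular if statement, how can I use * unpacking to supply a variable number of conditions, that is, without hard-coding how many conditions the if statement has?

For example: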

if a > 0 and b > 4:
    ...
# Or...
if a > 0 and b > 4 and c < 2:
    ...

Thanks for your help.

python pandas iterable-unpacking
3 answers

1 vote

If I understand correctly, you are looking for the .query() method:

import pandas as pd
from itertools import product

animals = ["dogs", "cats"]
eyes = ['brown', 'blue', 'green']
height = ['short', 'average', 'tall']
a = [animals, eyes, height]
df = pd.DataFrame(list(product(*a)), columns=["animals", "eyes", "height"])
df['value'] = 1


def zero_out(df, lst):
    # Build a query string such as: animals == "dogs" & eyes == "blue"
    q = ' & '.join( '{} == "{}"'.format(col, val) for col, val in lst )
    # Zero the 'value' column on the matching rows (modifies df in place)
    df.loc[df.query(q).index, 'value'] = 0

zero_out(df, [("height", "tall")])
print(df)

Prints:

   animals   eyes   height  value
0     dogs  brown    short      1
1     dogs  brown  average      1
2     dogs  brown     tall      0
3     dogs   blue    short      1
4     dogs   blue  average      1
5     dogs   blue     tall      0
6     dogs  green    short      1
7     dogs  green  average      1
8     dogs  green     tall      0
9     cats  brown    short      1
10    cats  brown  average      1
11    cats  brown     tall      0
12    cats   blue    short      1
13    cats   blue  average      1
14    cats   blue     tall      0
15    cats  green    short      1
16    cats  green  average      1
17    cats  green     tall      0
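
The query string above wraps every value in double quotes, so it only matches string columns. A minimal sketch of one way around that, assuming the filter values are plain Python literals (str, int, float): format each value with `{!r}` so that strings stay quoted and numbers stay bare, and wrap column names in backticks so that names which are not valid identifiers still parse (backtick quoting in `query()` needs pandas 0.25 or later). The function name `zero_out_query` and the `demo` frame are illustrative, not part of the original answer.

```python
import pandas as pd


def zero_out_query(df, filters, out_column="value"):
    """Hypothetical variant of zero_out: set out_column to 0 where all filters match."""
    # {!r} inserts repr(val): 'dogs' stays quoted, 4 stays a bare number.
    # Backticks let query() handle column names that are not valid identifiers.
    q = " & ".join("`{}` == {!r}".format(col, val) for col, val in filters)
    df.loc[df.query(q).index, out_column] = 0
    return df


# Small illustrative frame with one string column and one numeric column.
demo = pd.DataFrame({
    "animals": ["dogs", "dogs", "cats"],
    "legs": [4, 3, 4],
    "value": [1, 1, 1],
})
zero_out_query(demo, [("animals", "dogs"), ("legs", 4)])
print(demo["value"].tolist())  # [0, 1, 1]
```

Only the first row is both "dogs" and four-legged, so only its value is zeroed. Values whose `repr` is not a valid literal inside a query string (NumPy scalars on recent NumPy versions, for instance) would need a different approach, such as building boolean masks directly.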

0 votes

You can also use this. It is more general than Andrej's approach because it does not assume that the filter values are strings.

import numpy as np

def zero_out(df, list_of_filters, out_column='value'):
    # Start from an all-True mask and narrow it with each (column, value) filter
    conds = np.ones(df.shape[0], dtype=bool)
    for col_name, val in list_of_filters:
        cond = df[col_name].eq(val)
        conds &= cond
    df.loc[conds, out_column] = 0
    return df

zero_out(df, [("animals", "dogs"), ("eyes", "blue")])

0 votes

You can try:

def zero_out(df, *args):
    df_temp = df.copy()
    # Filter the copy down one (column, value) pair at a time
    for arg in args:
        df_temp = df_temp[df_temp[arg[0]] == arg[1]].copy()
    # Use .loc with the surviving index labels; .iloc would treat them as positions
    df.loc[df_temp.index, df.columns[-1]] = 0
    return df

zero_out(df, ("animals", "dogs"), ("eyes", "blue"))

Result:

   animals   eyes   height  value
0     dogs  brown    short      1
1     dogs  brown  average      1
2     dogs  brown     tall      1
3     dogs   blue    short      0
4     dogs   blue  average      0
5     dogs   blue     tall      0
6     dogs  green    short      1
7     dogs  green  average      1
8     dogs  green     tall      1
9     cats  brown    short      1
10    cats  brown  average      1
11    cats  brown     tall      1
12    cats   blue    short      1
13    cats   blue  average      1
14    cats   blue     tall      1
15    cats  green    short      1
16    cats  green  average      1
17    cats  green     tall      1
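
None of the answers above addresses the side question about plain if statements. A minimal sketch, assuming the conditions are collected in a list: the built-in `all()` stands in for a hard-coded chain of `and`, and `functools.reduce` with `operator.and_` does the same job for a chain of `&` over boolean masks. The names `conditions`, `filters`, `masks` and `demo` are illustrative only.

```python
from functools import reduce
import operator

import pandas as pd

# Plain Python: all() replaces a hard-coded chain of `and`.
a, b, c = 1, 5, 0
conditions = [a > 0, b > 4, c < 2]  # any number of boolean conditions
if all(conditions):
    print("all conditions hold")

# pandas: reduce() with operator.and_ replaces a hard-coded chain of `&`.
demo = pd.DataFrame({
    "animals": ["dogs", "dogs", "cats"],
    "eyes": ["blue", "brown", "blue"],
    "value": [1, 1, 1],
})
filters = [("animals", "dogs"), ("eyes", "blue")]
masks = [demo[col] == val for col, val in filters]  # one boolean Series per filter
combined = reduce(operator.and_, masks)  # mask1 & mask2 & ...
demo.loc[combined, "value"] = 0
print(demo["value"].tolist())  # [0, 1, 1]
```

Two caveats with this sketch: a list of conditions is evaluated eagerly, so unlike a chained `and` it does not short-circuit, and `reduce` raises a `TypeError` on an empty list unless an initial value is supplied.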