How to optimize this value assignment using pandas

Question · Votes: 0 · Answers: 2

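I have a DataFrame in pandas with a column 'register' that is either 0 or some positive number. I want to create a new column 'working' that is 1 if that row, or any of the 7 rows before it, has a nonzero value in 'register', and 0 otherwise. I tried iterating over the rows, but because the DataFrame is large, this is very slow. Here is my code: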

df['working'] = 0
for i in range(len(df['register'])):
    if df['register'][i] != 0 or \
        (i>1 and df['register'][i-1] != 0) or\
        (i>2 and df['register'][i-2] != 0) or\
        (i>3 and df['register'][i-3] != 0) or\
        (i>4 and df['register'][i-4] != 0) or\
        (i>5 and df['register'][i-5] != 0) or\
        (i>6 and df['register'][i-6] != 0):
        df['working'][i] = 1
    else:
        df['working'][i] = 0

I also tried a different approach, which looks like this:

df['working']=df['register'].apply(lambda x: 1 if x!=0 or x.shift(1)!=0 or x.shift(2)!=0 or x.shift(3)!=0 or x.shift(4)!=0 or x.shift(5)!=0 or x.shift(6)!=0 else 0)

But I got:

AttributeError: 'float' object has no attribute 'shift'

Is there a better way to do this using pandas?

Thanks in advance.

python pandas dataframe
2 Answers
1
vote

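This should work. You will probably want to pass min_periods=1 to rolling so that the first rows are not NaN, and set the window size to the number of rows you want to cover (the current row plus the preceding ones):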

df['working'] = df['register'].ne(0).rolling(6).sum().gt(0)
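
As a minimal sketch of the rolling approach on toy data: the sample values below are made up, and the window of 8 (the current row plus the 7 preceding rows) is an assumption taken from the question text. Use 7 instead to match the loop, which only looks back 6 rows.

```python
import pandas as pd

# Toy data (assumed for illustration): 0 or a positive number per row.
df = pd.DataFrame({"register": [0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0]})

# Window of 8 = current row plus the 7 preceding rows (assumption).
window = 8

# 1 where 'register' is nonzero, 0 elsewhere.
nonzero = df["register"].ne(0).astype(int)

# min_periods=1 lets the first rows use whatever history exists
# instead of yielding NaN until a full window is available.
# The rolling max is 1 as soon as any row in the window is nonzero.
df["working"] = nonzero.rolling(window, min_periods=1).max().astype(int)

print(df)
```

Taking the rolling max rather than the sum followed by gt(0) gives the same flag here, and the final astype(int) stores it as 0/1 rather than True/False, which is what the question asks for.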

1
vote

Try:

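import numpy as np

conditional_values = [1]
# min_periods=1 avoids NaN (and therefore 0) in the first 7 rows
condition = [df['register'].rolling(8, min_periods=1).sum() > 0]
df['working'] = np.select(condition, conditional_values, default=0)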

You can supply additional conditions and their corresponding values:

condition = [condition 1, condition 2, ......, condition n]
conditional_values = [value 1, value 2, ........, value n]
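
To make the multi-condition form concrete, here is a small sketch. The data, the window of 3, the threshold of 10, and the output values 2 and 1 are all made up for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd

# Toy data (assumed for illustration).
df = pd.DataFrame({"register": [0, 0, 3, 0, 0, 0, 12, 0]})

# Rolling sum over the current row and the 2 preceding rows
# (a short window chosen only to keep the example readable).
recent = df["register"].rolling(3, min_periods=1).sum()

# np.select checks the conditions in order; for each row the first
# condition that is True decides the value, otherwise `default` is used.
conditions = [recent > 10, recent > 0]
conditional_values = [2, 1]

df["working"] = np.select(conditions, conditional_values, default=0)

print(df)
```

Because the first matching condition wins, the stricter test (recent > 10) has to come before the looser one (recent > 0). Otherwise every nonzero row would be assigned 1.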