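I'm moving my data analysis from Excel to Python, but I can't find the equivalent Python code to use on my DataFrame. To calculate the "Rolling Sum" column in Excel, I would use the formula IF(C3 = FALSE,0,(1 + D2)) (for the table below). In this example, the "> 20" column is TRUE whenever the "Amount" value is greater than 20. For each TRUE row the formula adds 1 to the Rolling Sum value in the row above; for each FALSE row it resets to 0.
This is my attempt at creating the Rolling Sum column in Python: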
def f(row):
    if row['> 20'] == False:
        val = 0
    else:
        #getting stuck here as to how to add to the row above, shift(1) is incorrect
        val = 1 + shift(1)
    return val
df['Rolling Sum'] = df.apply(f, axis=1)
+-------+--------+-------+-------------+
| Event | Amount | > 20  | Rolling Sum |
+-------+--------+-------+-------------+
|     1 |      7 | FALSE |             |
|     2 |     25 | TRUE  |           1 |
|     3 |     28 | TRUE  |           2 |
|     4 |      3 | FALSE |           0 |
|     5 |     30 | TRUE  |           1 |
|     6 |     35 | TRUE  |           2 |
|     7 |     40 | TRUE  |           3 |
|     8 |      6 | FALSE |           0 |
+-------+--------+-------+-------------+
Try the following:
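# Assumes the default integer index (0, 1, 2, ...) and a boolean '> 20' column.
for index, row in df.iterrows():
    if row['> 20']:
        # Add 1 to the count in the row above; the first row has no predecessor, so start from 0.
        prev = df.loc[index - 1, 'Rolling Sum'] if index > 0 else 0
        df.loc[index, 'Rolling Sum'] = prev + 1
    else:
        # Reset the count whenever the condition is False.
        df.loc[index, 'Rolling Sum'] = 0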
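The loop works because it reads the previous row's result straight from the DataFrame, which `apply(f, axis=1)` cannot do since it only sees one row at a time. If the frame is large, a vectorized version may be worth trying. The following is only a sketch, not the method from the answer above: it rebuilds the example table, assumes the "> 20" column holds real booleans, and uses a helper name `block` that is purely illustrative.

```python
import pandas as pd

# Rebuild the example table from the question.
df = pd.DataFrame({
    "Event": range(1, 9),
    "Amount": [7, 25, 28, 3, 30, 35, 40, 6],
})
df["> 20"] = df["Amount"] > 20

# Every False row starts a new block; the cumulative sum of the
# negated flags gives each block its own label.
block = (~df["> 20"]).cumsum()

# Within each block, a cumulative sum of the flags counts the
# consecutive True rows, and the leading False row contributes 0.
df["Rolling Sum"] = df["> 20"].astype(int).groupby(block).cumsum()

print(df)
```

On the example data this should give 0 for the first row and otherwise match the Rolling Sum column in the table. Unlike the loop, it does not depend on the index being 0, 1, 2, ...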