Compute a cumulative sum while the value in another column stays the same

Question    Votes: 1    Answers: 2

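For the df below, I want to compute the cumulative sum of the Inst_Dist column and store it in Cumu_Dist, for as long as the value of WDir_Deg stays the same. Whenever the value in WDir_Deg changes, the cumulative sum has to restart.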

So the first table below should become the second:

index | WDir_Deg | Inst_Dist | Cumu_Dist
0     | 289      | 20        | NaN
1     | 285      | 17        | NaN
2     | 285      | 19        | NaN
3     | 287      | 19        | NaN
4     | 289      | 10        | NaN

index | WDir_Deg | Inst_Dist | Cumu_Dist
0     | 289      | 20        | 20
1     | 285      | 17        | 17
2     | 285      | 19        | 36
3     | 287      | 19        | 19
4     | 289      | 10        | 10
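
To reproduce the example, the two tables above can be set up roughly as follows. This is a minimal sketch that assumes the frame has only the three columns shown; the name `expected` is introduced here for illustration and is not part of the original question.

```python
import pandas as pd

# Values copied from the tables above; Cumu_Dist starts out empty (NaN).
df = pd.DataFrame({
    "WDir_Deg": [289, 285, 285, 287, 289],
    "Inst_Dist": [20, 17, 19, 19, 10],
})
df["Cumu_Dist"] = float("nan")

# Desired Cumu_Dist values from the second table, kept for checking a solution.
expected = [20, 17, 36, 19, 10]
```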

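My non-idiomatic (and extremely slow) Python code is shown below. I would really appreciate it if someone could show me how to make it faster and more idiomatic.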

prev_angle = -1
curr_cumu_dist = 0
for curr_ind in df.index:
    curr_angle = df.loc[curr_ind, 'WDir_Deg']
    if prev_angle == curr_angle:
        curr_cumu_dist += df.loc[curr_ind, 'Inst_Dist']
        df.loc[curr_ind, 'Cumu_Dist'] = curr_cumu_dist
    else:
        prev_angle = curr_angle
        curr_cumu_dist = df.loc[curr_ind, 'Inst_Dist']
        df.loc[curr_ind, 'Cumu_Dist'] = curr_cumu_dist

Tags: pandas, cumulative-sum

2 Answers

0
votes

Use a helper Series: compare the WDir_Deg column with its shifted self using ne (not equal), take the cumsum to label consecutive groups, and pass the result to DataFrameGroupBy.cumsum:

s = df['WDir_Deg'].ne(df['WDir_Deg'].shift()).cumsum()
df['Cumu_Dist'] = df.groupby(s)['Inst_Dist'].cumsum()
print (df)
   WDir_Deg  Inst_Dist  Cumu_Dist
0       289         20         20
1       285         17         17
2       285         19         36
3       287         19         19
4       289         10         10

Details:

print (s)
0    1
1    2
2    2
3    3
4    4
Name: WDir_Deg, dtype: int32

0
votes

It is a little tricky. Referring to this question/answer:

Pandas groupby cumulative sum

I came up with this solution:

df['Cumu_Dist'] = df.groupby('WDir_Deg').Inst_Dist.cumsum()

which returns

   index  WDir_Deg  Inst_Dist  Cumu_Dist
0      0       285         17         17
1      1       285         19         36
2      2       287         19         19
3      3       289         20         20

This uses version [missing in the source].
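
The two approaches in this thread group rows differently, and a short side-by-side run may make that clearer. The sketch below uses the question's sample data; the variable names `changed`, `run_id`, `by_run` and `by_value` are illustrative and do not appear in the original answers. Grouping on the run label restarts the sum at every change of WDir_Deg, while grouping on the column value itself pools all rows with the same value, even when they are not adjacent.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "WDir_Deg": [289, 285, 285, 287, 289],
    "Inst_Dist": [20, 17, 19, 19, 10],
})

# True wherever the direction differs from the previous row.
# Row 0 is always True because shift() leaves NaN in the first position.
changed = df["WDir_Deg"].ne(df["WDir_Deg"].shift())

# Running count of changes: one integer label per consecutive run.
run_id = changed.cumsum()

# Cumulative sum that restarts at every new run of equal values.
by_run = df.groupby(run_id)["Inst_Dist"].cumsum()

# Cumulative sum per distinct value, regardless of row position.
by_value = df.groupby("WDir_Deg")["Inst_Dist"].cumsum()

print(run_id.tolist())    # [1, 2, 2, 3, 4]
print(by_run.tolist())    # [20, 17, 36, 19, 10]
print(by_value.tolist())  # [20, 17, 36, 19, 30]
```

The last row is where the results diverge: 289 appears in rows 0 and 4, so grouping by value continues the earlier sum (30), whereas grouping by run gives 10, which matches the output requested in the question.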
