pandas DataFrame: group consecutive duplicates and sum their values

Question

In a pandas DataFrame, I can't figure out how to use groupby() to sum the values in a column over runs of consecutive duplicates.

Suppose I have the following DataFrame df:

index   type    value
  0    profit    11     
  1    profit    10
  2    loss      -5
  3    profit    50
  4    profit    15
  5    loss     -30
  6    loss     -25
  7    loss     -10

What I want is:

index   type    grand
  0    profit    21  # total of 11 + 10 = 21
  1    loss      -5  # unchanged: this row has no consecutive duplicate
  2    profit    65  # total of 50 + 15 = 65
  3    loss     -65  # total of -30 - 25 - 10 = -65

What I tried:

df['grand'] = df.groupby(df['type'].ne(df['type'].shift()).cumsum()).cumcount()

But this gives me a count of the consecutive duplicates instead of their sum.

I also tried several solutions that iterate over the rows, but they all failed.

Answer 1

Instead of .cumcount(), aggregate each group with a sum:

out = df.groupby((df["type"] != df["type"].shift()).cumsum()).agg(
    {"type": "first", "value": "sum"}
)
print(out.reset_index(drop=True))

This prints:

     type  value
0  profit     21
1    loss     -5
2  profit     65
3    loss    -65
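
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question, exposes the intermediate grouping key, and uses named aggregation so the summed column comes out as grand, as the question asked. The names starts_new_run and run_id are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "type": ["profit", "profit", "loss", "profit", "profit", "loss", "loss", "loss"],
        "value": [11, 10, -5, 50, 15, -30, -25, -10],
    }
)

# True wherever the type differs from the previous row, i.e. a new run starts.
# The first row compares against NaN, so it always starts a run.
starts_new_run = df["type"].ne(df["type"].shift())

# The cumulative sum of those flags gives each run its own id:
# 1, 1, 2, 3, 3, 4, 4, 4 for the data above.
# (run_id is a hypothetical name chosen for this sketch.)
run_id = starts_new_run.cumsum().rename("run_id")

# Named aggregation: keep the run's type and sum its values into "grand".
out = (
    df.groupby(run_id)
    .agg(type=("type", "first"), grand=("value", "sum"))
    .reset_index(drop=True)
)
print(out)
```

The point of the key is that a plain groupby("type") would merge all profit rows and all loss rows into two groups, whereas grouping by the run id keeps separate runs apart even when they share the same type.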
