I have the following DataFrame:
Class | Received | Issued |
---|---|---|
FD | 10 | 0 |
FD | 0 | 2 |
RM | 5 | 0 |
RM | 0 | 3 |
FD | 0 | 2 |
PM | 5 | 0 |
PM | 1 | 0 |
RM | 1 | 0 |
FD | 4 | 0 |
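To make the question reproducible, the table above could be built as follows. This is only a minimal sketch: the column names Class, Received, and Issued are taken from the code in the answers, and the construction itself is an assumption about how the data was created.

```python
import pandas as pd

# Sample data copied from the table above; the index is the default 0..8
df = pd.DataFrame(
    {
        "Class": ["FD", "FD", "RM", "RM", "FD", "PM", "PM", "RM", "FD"],
        "Received": [10, 0, 5, 0, 0, 5, 1, 1, 4],
        "Issued": [0, 2, 0, 3, 2, 0, 0, 0, 0],
    }
)
print(df)
```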
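I need the following DataFrame:
Class | Received | Issued | Remaining Quantity |
---|---|---|---|
FD | 10 | 0 | 10 |
FD | 0 | 2 | 8 |
RM | 5 | 0 | 5 |
RM | 0 | 3 | 2 |
FD | 0 | 2 | 6 |
PM | 5 | 0 | 5 |
PM | 1 | 0 | 6 |
RM | 1 | 0 | 3 |
FD | 4 | 0 | 10 |
The Remaining Quantity column is the running total (cumsum()) of Received minus Issued, computed separately for each Class. I have tried several approaches but could not get it to work.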
df['Remaining Quantity'] = df.groupby('Class').apply(lambda x: x['Received'].cumsum() - x['Issued'].cumsum()).reset_index(level = 0, drop=True)
Output:
Class Received Issued Remaining Quantity
0 FD 10 0 10
1 FD 0 2 8
2 RM 5 0 5
3 RM 0 3 2
4 FD 0 2 6
5 PM 5 0 5
6 PM 1 0 6
7 RM 1 0 3
8 FD 4 0 10
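The same running balance can also be computed without apply, which avoids calling a lambda once per group. A minimal sketch, assuming the sample data and column names shown above; the variable name net is only illustrative.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    {
        "Class": ["FD", "FD", "RM", "RM", "FD", "PM", "PM", "RM", "FD"],
        "Received": [10, 0, 5, 0, 0, 5, 1, 1, 4],
        "Issued": [0, 2, 0, 3, 2, 0, 0, 0, 0],
    }
)

# Net change per row: quantity received minus quantity issued
net = df["Received"] - df["Issued"]

# Running total of the net change within each Class, kept in original row order
df["Remaining Quantity"] = net.groupby(df["Class"]).cumsum()
print(df)
```

Because a cumulative sum is linear, cumsum(Received) - cumsum(Issued) equals cumsum(Received - Issued), so this should give the same values as the apply version.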
Another possible solution:
df["Remaining Quantity"] = (
    df.eval("tmp=Received-Issued").groupby("Class")["tmp"].cumsum()
)
print(df)
Output:
Class Received Issued Remaining Quantity
0 FD 10 0 10
1 FD 0 2 8
2 RM 5 0 5
3 RM 0 3 2
4 FD 0 2 6
5 PM 5 0 5
6 PM 1 0 6
7 RM 1 0 3
8 FD 4 0 10
Another solution:
# The walrus operator (:=) requires Python 3.8 or later
df["Remaining Quantity"] = (g := df.groupby("Class").cumsum())["Received"] - g["Issued"]
print(df)
Prints:
Class Received Issued Remaining Quantity
0 FD 10 0 10
1 FD 0 2 8
2 RM 5 0 5
3 RM 0 3 2
4 FD 0 2 6
5 PM 5 0 5
6 PM 1 0 6
7 RM 1 0 3
8 FD 4 0 10
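If the frame might contain other columns, it may be safer to select Received and Issued explicitly before calling cumsum. The following is a sketch under that assumption, written without the walrus operator so that it also runs on older Python versions; the variable name g mirrors the answer above.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    {
        "Class": ["FD", "FD", "RM", "RM", "FD", "PM", "PM", "RM", "FD"],
        "Received": [10, 0, 5, 0, 0, 5, 1, 1, 4],
        "Issued": [0, 2, 0, 3, 2, 0, 0, 0, 0],
    }
)

# Per-Class running totals of just the two quantity columns
g = df.groupby("Class")[["Received", "Issued"]].cumsum()

# Remaining stock = everything received so far minus everything issued so far
df["Remaining Quantity"] = g["Received"] - g["Issued"]
print(df)
```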
One approach is to use .stack to compute the running difference, then assign the values back by index.
df['Remaining Quantity'] = df.assign(
    Issued=df['Issued'] * -1).set_index('Class', append=True)\
    .stack().groupby(level=1).cumsum().unstack(-1).droplevel(1, 0)['Issued']
print(df)
Class Received Issued Remaining Quantity
0 FD 10 0 10
1 FD 0 2 8
2 RM 5 0 5
3 RM 0 3 2
4 FD 0 2 6
5 PM 5 0 5
6 PM 1 0 6
7 RM 1 0 3
8 FD 4 0 10
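The chained expression above is dense, so here is the same idea broken into named steps. This is a sketch for illustration only: the intermediate names (signed, long, running, wide) are not from the answer, and it assumes stack() returns values row by row with Received before Issued, as it does for this sample data.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    {
        "Class": ["FD", "FD", "RM", "RM", "FD", "PM", "PM", "RM", "FD"],
        "Received": [10, 0, 5, 0, 0, 5, 1, 1, 4],
        "Issued": [0, 2, 0, 3, 2, 0, 0, 0, 0],
    }
)

# 1. Make issues negative and move Class into the index: (row, Class)
signed = df.assign(Issued=-df["Issued"]).set_index("Class", append=True)

# 2. Reshape to one long Series indexed by (row, Class, column name)
long = signed.stack()

# 3. Running total within each Class (index level 1), across both columns
running = long.groupby(level=1).cumsum()

# 4. Back to wide form; the Issued column holds the balance after each full row
wide = running.unstack(-1)

# 5. Drop the Class level so the result aligns with df's original index
df["Remaining Quantity"] = wide.droplevel(1, axis=0)["Issued"]
print(df)
```

The Issued column is the one to keep because, within each row, the Received value is added first and the negated Issued value second, so the running total after the Issued entry is the balance for that row.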