I have this DataFrame:
import pandas as pd
import math
from pandas import Timestamp
Date = [Timestamp('2024-03-16 23:59:42'), Timestamp('2024-03-16 23:59:42'), Timestamp('2024-03-16 23:59:44'), Timestamp('2024-03-16 23:59:44'), Timestamp('2024-03-16 23:59:44'), Timestamp('2024-03-16 23:59:47'), Timestamp('2024-03-16 23:59:48'), Timestamp('2024-03-16 23:59:48'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49'), Timestamp('2024-03-16 23:59:49')]
Price = [0.6729, 0.6728, 0.6728, 0.6728, 0.6728, 0.673, 0.6728, 0.6729, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6728, 0.6729, 0.6728]
Side = [-1, -1, -1, 1, -1, 1, -1, 1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 1, -1]
Amount = [1579.2963000000002, 7.400799999999999, 6.728, 177.61919999999998, 797.2679999999999, 33650.0, 131.196, 48.448800000000006, 0.6728, 0.6728, 0.6728, 6.728, 0.6728, 1.3456, 0.6728, 0.6728, 0.6728, 0.6728, 0.6729, 0.6728]
buy = [math.nan, math.nan, math.nan, 177.61919999999998, math.nan, 33650.0, math.nan, 48.448800000000006, math.nan, math.nan, math.nan, math.nan, math.nan, math.nan, math.nan, math.nan, math.nan, math.nan, 49.121700000000004, math.nan]
df = pd.DataFrame({
'Date':Date,
'Price':Price,
'Side':Side,
'Amount':Amount,
'buy':buy
})
print(df)
I computed the buy column with:
df['buy'] = df[df['Side'] == 1].groupby([df['Date'].dt.floor('H'), 'Price'])['Amount'].cumsum()
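But I want the buy column to hold 0 instead of NaN when that price has not yet appeared in the group, and otherwise the previous value of the cumulative sum.
The resulting buy column should be: [0, 0, 0, 177.6192, 177.6192, 33650, 177.6192, 48.4488, 177.6192, ...]
How can I achieve this?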
You can use reindex, ffill, and fillna:
df['buy'] = (df[df['Side'] == 1].groupby([df['Date'].dt.floor('H'), 'Price'])['Amount'].cumsum()
.reindex(df.index).ffill().fillna(0)
)
Or in two steps:
df['buy'] = df[df['Side'] == 1].groupby([df['Date'].dt.floor('H'), 'Price'])['Amount'].cumsum()
df['buy'] = df['buy'].ffill().fillna(0)
Output:
                   Date   Price  Side      Amount         buy
0   2024-03-16 23:59:42  0.6729    -1   1579.2963      0.0000
1   2024-03-16 23:59:42  0.6728    -1      7.4008      0.0000
2   2024-03-16 23:59:44  0.6728    -1      6.7280      0.0000
3   2024-03-16 23:59:44  0.6728     1    177.6192    177.6192
4   2024-03-16 23:59:44  0.6728    -1    797.2680    177.6192
5   2024-03-16 23:59:47  0.6730     1  33650.0000  33650.0000
6   2024-03-16 23:59:48  0.6728    -1    131.1960  33650.0000
7   2024-03-16 23:59:48  0.6729     1     48.4488     48.4488
8   2024-03-16 23:59:49  0.6728    -1      0.6728     48.4488
9   2024-03-16 23:59:49  0.6728    -1      0.6728     48.4488
10  2024-03-16 23:59:49  0.6728    -1      0.6728     48.4488
11  2024-03-16 23:59:49  0.6728    -1      6.7280     48.4488
12  2024-03-16 23:59:49  0.6728    -1      0.6728     48.4488
13  2024-03-16 23:59:49  0.6728    -1      1.3456     48.4488
14  2024-03-16 23:59:49  0.6728    -1      0.6728     48.4488
15  2024-03-16 23:59:49  0.6728    -1      0.6728     48.4488
16  2024-03-16 23:59:49  0.6728    -1      0.6728     48.4488
17  2024-03-16 23:59:49  0.6728    -1      0.6728     48.4488
18  2024-03-16 23:59:49  0.6729     1      0.6729     49.1217
19  2024-03-16 23:59:49  0.6728    -1      0.6728     49.1217
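Note that a plain ffill carries the last value across prices: row 6 (price 0.6728) gets 33650 here, while the expected list in the question has 177.6192 there. If the running total should stay within each (hour, price) group, one possible approach is to zero out the non-buy amounts and take the cumulative sum over all rows of each group. The following is a minimal sketch under that assumption; it uses only the first nine rows, with Amount rounded to the displayed values, and the names hour and buy_amount are illustrative.

```python
import pandas as pd

# First nine rows of the question's data (Amount rounded as displayed above).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2024-03-16 23:59:42", "2024-03-16 23:59:42", "2024-03-16 23:59:44",
        "2024-03-16 23:59:44", "2024-03-16 23:59:44", "2024-03-16 23:59:47",
        "2024-03-16 23:59:48", "2024-03-16 23:59:48", "2024-03-16 23:59:49",
    ]),
    "Price": [0.6729, 0.6728, 0.6728, 0.6728, 0.6728, 0.673, 0.6728, 0.6729, 0.6728],
    "Side": [-1, -1, -1, 1, -1, 1, -1, 1, -1],
    "Amount": [1579.2963, 7.4008, 6.728, 177.6192, 797.268, 33650.0,
               131.196, 48.4488, 0.6728],
})

# Hourly bucket; "60min" is used as the frequency string for the hour floor.
hour = df["Date"].dt.floor("60min")

# Keep Amount only on buy rows (Side == 1); every other row contributes 0.
buy_amount = df["Amount"].where(df["Side"] == 1, 0.0)

# Cumulative sum within each (hour, price) group, computed over all rows,
# so non-buy rows repeat the group's latest running total (or 0 before any buy).
df["buy"] = buy_amount.groupby([hour, df["Price"]]).cumsum()

print(df)
```

On these nine rows the buy column comes out as 0, 0, 0, 177.6192, 177.6192, 33650, 177.6192, 48.4488, 177.6192, which matches the list given in the question. Because every row takes part in the cumulative sum, no reindex, ffill, or fillna step is needed.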