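Note: this is a follow-up to my previous question, see Dataframe cell to be locked and used for a running balance calculation conditional of result on another cell on same row
Please also note: I will treat this as a bounty question and immediately award 50 points to whoever answers it correctly below.
Say I have the following dataframe: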
import pandas as pd
df = pd.DataFrame()
df['E'] = ('SIT','SCLOSE', 'SHODL', 'SHODL', 'SHODL', 'SHODL', 'SHODL', 'SHODL','SHODL','SCLOSE_BUY','BCLOSE_SELL', 'BHODL', 'BHODL', 'BHODL', 'BHODL', 'BHODL', 'BHODL','BUY','SIT','SIT')
df['F'] = (0.00,1.00,10.00, 5.00,6.00,-6.00, 6.00, 2.00,10.00,10.00,-8.00,33.00,-15.00,6.00,-1.00,5.00,10.00,0.00,0.00,0.00)
df.loc[19, 'G'] = 100.0000
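For column G, starting at 100, the same rules as in my previous question apply: if a BUY or SELL occurs in column E, the corresponding balance in column G is locked and used continuously as the base amount for calculating the percentage increase/decrease from column F on each row of the running balance, until a BCLOSE or SCLOSE appears in column E.
I explained the rules in the previous question. What is new here is that SCLOSE_BUY means the SELL was closed and a BUY was opened, and vice versa for BCLOSE_SELL. A BCLOSE, SCLOSE, SCLOSE_BUY or BCLOSE_SELL row becomes the last row of a running balance calculation, and its balance is used as the base the next time a BUY or SELL appears.
FYI, Andy L. successfully answered my previous question as shown below, but that answer cannot handle the new case where BCLOSE_SELL and SCLOSE_BUY occur one after the other: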
df1 = df[::-1]
s = df1.B.isin(['BCLOSE','SCLOSE']).shift(fill_value=False).cumsum()
grps = df1.groupby(s)
init_val = 100
l = []
for _, grp in grps:
    s = grp.C * 0.01 * init_val
    s.iloc[0] = init_val
    s = s.cumsum()
    init_val = s.iloc[-1]
    l.append(s)
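The answer above does not solve the problem I face in real life, where instead of BCLOSE I receive BCLOSE_SELL. That essentially flips a BUY into a SELL (i.e. I close the BUY and open a SELL), and the balance on that row becomes the base amount for the rows that follow.
If the rows then continue as SHODL, I can adapt the code so the running balance is calculated correctly. But if I subsequently receive SCLOSE_BUY (as on row 9 of the dataframe), I need that row to close the SELL and reopen a BUY, and the balance on that row also becomes my new base.
I understand all of this sounds confusing, so the column below, added to the dataframe above, shows what the result should look like.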
df['G'] = (191.62,191.62,190.19,175.89,168.74,160.16,168.74,160.16,157.3,143,130,138,105,120,114,115,110,100,100,100)
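To make the rule concrete, here is a small sketch of how a few of those values are derived. It reads the frame bottom-up (row 19 is the start) and reflects my understanding of the rules stated above; the variable names are only for illustration.

```python
# Rows are read bottom-up: row 19 holds the starting balance of 100.
base = 100.0                        # balance locked at the BUY on row 17
row16 = base + 10.00 / 100 * base   # BHODL, F = 10 -> 110
row15 = row16 + 5.00 / 100 * base   # BHODL, F = 5  -> 115 (5% of the locked 100)

# Row 10 (BCLOSE_SELL) ends that run at 130, which becomes the new base.
base = 130.0
row9 = base + 10.00 / 100 * base    # SCLOSE_BUY, F = 10 -> 143

# Row 9 closes the SELL and opens a BUY, so 143 is the base for rows 8..1.
base = row9
row8 = base + 10.00 / 100 * base    # SHODL, F = 10 -> 157.3
row7 = row8 + 2.00 / 100 * base     # SHODL, F = 2  -> 160.16

print(round(row16, 2), round(row15, 2), round(row9, 2), round(row8, 2), round(row7, 2))
```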
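I have a well-documented answer to a similar question posted here, but let me adjust it slightly so that it applies to the question you just asked. Essentially, all you need to do is add two new breakpoints, at BCLOSE_SELL and SCLOSE_BUY, in the following way: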
df.index[df[type_col].isin(['BCLOSE', 'SCLOSE', 'BCLOSE_SELL', 'SCLOSE_BUY'])][::-1]
In the line above, type_col is the name of the column that specifies the action (e.g. SHODL or BCLOSE), which in your case is column E.
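As a small illustration, here is that line applied to column E from the question. The frame is rebuilt so the snippet runs on its own, and the list name `closing` is just a label introduced for this sketch.

```python
import pandas as pd

# Column E from the question, rebuilt so the snippet is self-contained.
df = pd.DataFrame({'E': ['SIT', 'SCLOSE', 'SHODL', 'SHODL', 'SHODL', 'SHODL',
                         'SHODL', 'SHODL', 'SHODL', 'SCLOSE_BUY', 'BCLOSE_SELL',
                         'BHODL', 'BHODL', 'BHODL', 'BHODL', 'BHODL', 'BHODL',
                         'BUY', 'SIT', 'SIT']})

type_col = 'E'
closing = ['BCLOSE', 'SCLOSE', 'BCLOSE_SELL', 'SCLOSE_BUY']

# Index labels of the closing rows, reversed so they are visited bottom-up.
breakpoints = df.index[df[type_col].isin(closing)][::-1]
print(list(breakpoints))  # [10, 9, 1]
```

Rows 10 and 9 are adjacent breakpoints, which is exactly the back-to-back BCLOSE_SELL / SCLOSE_BUY case from the question.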
Below you can find the complete, updated code, which works for both of your questions:
# basic setup
type_col = 'E'  # the name of the action type column
change_col = 'F'  # the name of the delta change column
res_col = 'G'  # the name of the resulting column
value = 100  # you can specify any initial value here
PERCENTAGE_CONST = 100
endpoints = [df.first_valid_index(), df.last_valid_index()]
# occurrences of 'BCLOSE', 'SCLOSE', 'BCLOSE_SELL' and 'SCLOSE_BUY' that break the sequence
breakpoints = df.index[df[type_col].isin(['BCLOSE', 'SCLOSE', 'BCLOSE_SELL', 'SCLOSE_BUY'])][::-1]
# removes the endpoints of the dataframe that do not break the structure
breakpoints = breakpoints.drop(endpoints, errors='ignore')
for i in range(len(breakpoints) + 1):
    prv = breakpoints[i - 1] - 1 if i else -1  # previous or first breakpoint
    try:
        nex = breakpoints[i] - 1  # next breakpoint
    except IndexError:
        nex = None  # last breakpoint
    # cumulative sum of values adjusted for the percentage change appended to the resulting column
    res = value + (df[change_col][prv: nex: -1] * value / PERCENTAGE_CONST).cumsum()[::-1]
    df.loc[res.index, res_col] = res
    # saving the value that will be the basis for percentage calculations
    # for the next breakpoint
    value = res.iloc[0]
The resulting output matches your expected result:
>>> df
              E      F       G
0           SIT    0.0  191.62
1        SCLOSE    1.0  191.62
2         SHODL   10.0  190.19
3         SHODL    5.0  175.89
4         SHODL    6.0  168.74
5         SHODL   -6.0  160.16
6         SHODL    6.0  168.74
7         SHODL    2.0  160.16
8         SHODL   10.0  157.30
9    SCLOSE_BUY   10.0  143.00
10  BCLOSE_SELL   -8.0  130.00
11        BHODL   33.0  138.00
12        BHODL  -15.0  105.00
13        BHODL    6.0  120.00
14        BHODL   -1.0  114.00
15        BHODL    5.0  115.00
16        BHODL   10.0  110.00
17          BUY    0.0  100.00
18          SIT    0.0  100.00
19          SIT    0.0  100.00
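If you want to double-check the table programmatically, the sketch below wraps the same idea in a function and compares it with the expected column from the question. The name `running_balance` is hypothetical. It also swaps the pandas slicing for plain NumPy positional slicing, so it assumes a non-empty frame whose rows are in the order shown.

```python
import numpy as np
import pandas as pd

def running_balance(df, type_col="E", change_col="F", start=100.0):
    """Hypothetical helper: bottom-up running balance with a locked base."""
    closing = ["BCLOSE", "SCLOSE", "BCLOSE_SELL", "SCLOSE_BUY"]
    n = len(df)
    is_close = df[type_col].isin(closing).to_numpy()
    # Closing rows by position, bottom-up; the first and last rows are skipped
    # because they do not split the frame.
    stops = [p for p in range(n - 2, 0, -1) if is_close[p]]
    stops.append(0)  # the final segment runs up to the top row
    change = df[change_col].to_numpy(dtype=float)
    out = np.empty(n)
    value, hi = start, n
    for lo in stops:
        seg = change[lo:hi][::-1]  # rows hi-1 down to lo
        out[lo:hi] = (value + (seg * value / 100).cumsum())[::-1]
        value, hi = out[lo], lo  # balance on the closing row is the next base
    return pd.Series(out, index=df.index, name="G")

df = pd.DataFrame({
    "E": ["SIT", "SCLOSE", "SHODL", "SHODL", "SHODL", "SHODL", "SHODL",
          "SHODL", "SHODL", "SCLOSE_BUY", "BCLOSE_SELL", "BHODL", "BHODL",
          "BHODL", "BHODL", "BHODL", "BHODL", "BUY", "SIT", "SIT"],
    "F": [0.0, 1.0, 10.0, 5.0, 6.0, -6.0, 6.0, 2.0, 10.0, 10.0, -8.0,
          33.0, -15.0, 6.0, -1.0, 5.0, 10.0, 0.0, 0.0, 0.0],
})
# Expected column G, copied from the question.
expected = [191.62, 191.62, 190.19, 175.89, 168.74, 160.16, 168.74, 160.16,
            157.3, 143, 130, 138, 105, 120, 114, 115, 110, 100, 100, 100]

result = running_balance(df)
print(np.allclose(result, expected))
```

Working on positions rather than index labels avoids the question of whether an integer slice on a Series is read as labels or as positions, at the cost of assuming the row order is already correct.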