Introduction
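I am using Python and pandas to backtest my own strategies on locally stored market data. Because I want to backtest these strategies quickly and the data is large (7,000,000+ rows), I try to vectorize all operations. This is already the case for the entry signal evaluation, and it works very well. Take-profit and stop-loss price thresholds are used as the exit criteria for each entry. Given the following DataFrame df with a datetime index: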
import pandas as pd
from pandas import Timestamp
import numpy as np
df = pd.DataFrame({
'open': {Timestamp('2021-01-03 22:11:00'): 1.22319, Timestamp('2021-01-03 22:12:00'): 1.22315, Timestamp('2021-01-03 22:15:00'): 1.22324, Timestamp('2021-01-03 22:16:00'): 1.22355, Timestamp('2021-01-03 22:17:00'): 1.22357},
'high': {Timestamp('2021-01-03 22:11:00'): 1.22319, Timestamp('2021-01-03 22:12:00'): 1.22318, Timestamp('2021-01-03 22:15:00'): 1.22358, Timestamp('2021-01-03 22:16:00'): 1.2236, Timestamp('2021-01-03 22:17:00'): 1.22361},
'low': {Timestamp('2021-01-03 22:11:00'): 1.22317, Timestamp('2021-01-03 22:12:00'): 1.22315, Timestamp('2021-01-03 22:15:00'): 1.22324, Timestamp('2021-01-03 22:16:00'): 1.22352, Timestamp('2021-01-03 22:17:00'): 1.22355},
'close': {Timestamp('2021-01-03 22:11:00'): 1.22317, Timestamp('2021-01-03 22:12:00'): 1.22315, Timestamp('2021-01-03 22:15:00'): 1.22358, Timestamp('2021-01-03 22:16:00'): 1.22352, Timestamp('2021-01-03 22:17:00'): 1.22356},
'longEntrySignal': {Timestamp('2021-01-03 22:11:00'): False, Timestamp('2021-01-03 22:12:00'): False, Timestamp('2021-01-03 22:15:00'): True, Timestamp('2021-01-03 22:16:00'): False, Timestamp('2021-01-03 22:17:00'): False},
'longEntry': {Timestamp('2021-01-03 22:11:00'): False, Timestamp('2021-01-03 22:12:00'): False, Timestamp('2021-01-03 22:15:00'): False, Timestamp('2021-01-03 22:16:00'): True, Timestamp('2021-01-03 22:17:00'): False},
'longEntryPrice': {Timestamp('2021-01-03 22:11:00'): np.nan, Timestamp('2021-01-03 22:12:00'): np.nan, Timestamp('2021-01-03 22:15:00'): np.nan, Timestamp('2021-01-03 22:16:00'): 1.22355, Timestamp('2021-01-03 22:17:00'): np.nan},
'longTpPrice': {Timestamp('2021-01-03 22:11:00'): np.nan, Timestamp('2021-01-03 22:12:00'): np.nan, Timestamp('2021-01-03 22:15:00'): np.nan, Timestamp('2021-01-03 22:16:00'): 1.2243451663854852, Timestamp('2021-01-03 22:17:00'): np.nan},
'longSlPrice': {Timestamp('2021-01-03 22:11:00'): np.nan, Timestamp('2021-01-03 22:12:00'): np.nan, Timestamp('2021-01-03 22:15:00'): np.nan, Timestamp('2021-01-03 22:16:00'): 1.2227548336145146, Timestamp('2021-01-03 22:17:00'): np.nan}})
print(df)
                        open     high      low    close  longEntrySignal  longEntry  longEntryPrice  longTpPrice  longSlPrice
2021-01-03 22:11:00  1.22319  1.22319  1.22317  1.22317            False      False             NaN          NaN          NaN
2021-01-03 22:12:00  1.22315  1.22318  1.22315  1.22315            False      False             NaN          NaN          NaN
2021-01-03 22:15:00  1.22324  1.22358  1.22324  1.22358             True      False             NaN          NaN          NaN
2021-01-03 22:16:00  1.22355  1.22360  1.22352  1.22352            False       True         1.22355     1.224345     1.222755
2021-01-03 22:17:00  1.22357  1.22361  1.22355  1.22356            False      False             NaN          NaN          NaN
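longEntrySignal being True indicates a signal to open a long position within the next candle. longEntry being True marks the candle in which the position is opened, and longEntryPrice stores the open price of that candle as the entry price. longTpPrice and longSlPrice are the prices at which the position should be closed once the take-profit or stop-loss criterion is reached.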
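The relationship between these columns can be sketched as follows. This is only an illustration, assuming the entry is taken at the open of the candle immediately after the signal; the frame name demo is hypothetical, the values are copied from the sample above, and the TP/SL prices are left out because the question does not say how they are derived.

```python
import pandas as pd

idx = pd.to_datetime([
    "2021-01-03 22:11", "2021-01-03 22:12", "2021-01-03 22:15",
    "2021-01-03 22:16", "2021-01-03 22:17",
])
demo = pd.DataFrame(
    {
        "open": [1.22319, 1.22315, 1.22324, 1.22355, 1.22357],
        "longEntrySignal": [False, False, True, False, False],
    },
    index=idx,
)

# A signal on one candle opens the position on the next candle (assumption).
demo["longEntry"] = demo["longEntrySignal"].shift(1, fill_value=False)
# The entry price is the open of the entry candle and NaN everywhere else.
demo["longEntryPrice"] = demo["open"].where(demo["longEntry"])
print(demo)
```

Both steps are plain column operations, so they stay vectorized on large frames.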
Desired output
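Depending on the chosen take-profit and stop-loss thresholds, several positions may be open at the same time, each with its own entry point (time) and its own take-profit and stop-loss thresholds.
In any case, my current problem is how to calculate the closing of a position after entry. The minimum information needed to evaluate strategy performance afterwards would be the exitPrice and exitTime columns.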
                        open     high      low    close  longEntrySignal  longEntry  longEntryPrice  longTpPrice  longSlPrice  exitPrice             exitTime
2021-01-03 22:11:00  1.22319  1.22319  1.22317  1.22317            False      False             NaN          NaN          NaN        NaN                  NaN
2021-01-03 22:12:00  1.22315  1.22318  1.22315  1.22315            False      False             NaN          NaN          NaN        NaN                  NaN
2021-01-03 22:15:00  1.22324  1.22358  1.22324  1.22358             True      False             NaN          NaN          NaN        NaN                  NaN
2021-01-03 22:16:00  1.22355  1.22360  1.22352  1.22352            False       True         1.22355     1.224345     1.222755   1.224345  2021-01-03 22:29:00
2021-01-03 22:17:00  1.22357  1.22361  1.22355  1.22356            False      False             NaN          NaN          NaN        NaN                  NaN
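exitPrice will be the corresponding take-profit (longTpPrice) or stop-loss (longSlPrice) threshold, and longSlPrice should be used if both thresholds are reached within the same row (candle). exitTime is the corresponding time.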
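The per-candle rule can be written down as a small function. This is a minimal sketch, not the asker's code: the name resolve_long_exit is hypothetical, and treating a price that exactly touches a threshold as a hit (<= and >=) is an assumption that can be tightened to strict comparisons.

```python
import math


def resolve_long_exit(high, low, tp_price, sl_price):
    """Return the exit price for one candle, or NaN if no threshold is reached."""
    # The stop loss is tested first, so it wins when both are reached in one candle.
    if low <= sl_price:
        return sl_price
    if high >= tp_price:
        return tp_price
    return math.nan


# Both thresholds inside one candle: the stop loss is returned.
print(resolve_long_exit(high=1.2250, low=1.2220, tp_price=1.2243, sl_price=1.2228))
```

Choosing the stop loss in the ambiguous case is the pessimistic reading, since one-minute candles do not reveal which level was touched first.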
Current approach
Currently, I reduce df to the rows with a long entry and then use the apply() function for the exit calculation:
entryDf = df[df['longEntry']].copy()
# In a row-wise apply, x.name is the row's index label (the entry time);
# x.index would be the column labels instead.
entryDf[['exitPrice', 'exitTime']] = entryDf.apply(
    lambda x: getLongExit(
        exitDf=df[['high', 'low']],
        entryPrice=x['longEntryPrice'],
        entryTime=x.name,
        takeProfit=x['longTpPrice'],
        stopLoss=x['longSlPrice'],
    ),
    axis=1,
    result_type='expand',
)
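getLongExit then basically uses .loc, .idxmax() and .idxmin() to determine whether and when the take-profit and/or stop-loss threshold is reached, and compares the results to check which one occurs earlier. It returns the corresponding exitPrice together with the corresponding exitTime.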
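The body of getLongExit is not shown in the question, so the following is only a guess at what such a function might look like. The name get_long_exit_sketch, its parameters, and the choice to start scanning at the candle after the entry are all assumptions; it uses only .loc and .idxmax().

```python
import numpy as np
import pandas as pd


def get_long_exit_sketch(exit_df, entry_time, take_profit, stop_loss):
    """Hypothetical reconstruction: return (exit price, exit time) for one entry."""
    # Only candles strictly after the entry candle are scanned (assumption).
    later = exit_df.loc[exit_df.index > entry_time]
    tp_hit = later["high"] >= take_profit
    sl_hit = later["low"] <= stop_loss
    # idxmax() on a boolean Series returns the label of the first True.
    tp_time = tp_hit.idxmax() if tp_hit.any() else pd.NaT
    sl_time = sl_hit.idxmax() if sl_hit.any() else pd.NaT
    if pd.isna(tp_time) and pd.isna(sl_time):
        return np.nan, pd.NaT
    # The stop loss wins when it comes first or falls in the same candle.
    if pd.isna(tp_time) or (not pd.isna(sl_time) and sl_time <= tp_time):
        return stop_loss, sl_time
    return take_profit, tp_time


# Toy candles (hypothetical values) to exercise the function.
times = pd.date_range("2021-01-03 22:16", periods=4, freq="min")
candles = pd.DataFrame(
    {"high": [1.0, 1.1, 1.3, 1.0], "low": [0.9, 0.95, 1.0, 0.5]}, index=times
)
print(get_long_exit_sketch(candles, times[0], take_profit=1.25, stop_loss=0.8))
```

Each call slices and scans the rest of the frame, so calling it once per entry through apply() scales poorly on millions of rows, which is the bottleneck described here.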
Based on the determined exitPrice, information such as the overall gain, the win rate, etc. can easily be calculated.
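As a rough illustration of that last step, the two statistics could be computed like this. The trades frame and its three trades are made-up values, and the calculation ignores position size, spread and commissions.

```python
import pandas as pd

# Hypothetical closed long trades; only the two price columns are needed.
trades = pd.DataFrame(
    {
        "longEntryPrice": [1.2000, 1.2100, 1.2050],
        "exitPrice": [1.2020, 1.2090, 1.2080],
    }
)

# Profit per trade in price units for a long position.
pnl = trades["exitPrice"] - trades["longEntryPrice"]
overall_gain = pnl.sum()
# Share of trades that closed above their entry price.
win_rate = (pnl > 0).mean()
print(overall_gain, win_rate)
```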
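I hope I have made my problem clear. I am fairly sure that using .ffill() or something similar could be faster, but I could not get it to work. I look forward to your suggestions - thanks!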
IIUC, I would use numba for this task:
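import numba
import numpy as np

@numba.njit
def get_long_exit(
    high_vals, low_vals, tp_prices, sl_prices, out_exit_price, out_exit_pos
):
    n = len(high_vals)
    for idx1 in range(n - 1):
        # rows without an entry have NaN thresholds, so skip them
        if np.isnan(tp_prices[idx1]):
            continue
        tp_entry, sl_entry = tp_prices[idx1], sl_prices[idx1]
        # scan the candles after the entry candle for the first threshold hit
        for idx2 in range(idx1 + 1, n):
            h, l = high_vals[idx2], low_vals[idx2]
            # I'm not sure about these ifs, but you can adjust them to your needs.
            # The stop loss is tested first, so it wins if both are hit in one candle.
            if sl_entry > l:
                out_exit_price[idx1] = sl_entry
                out_exit_pos[idx1] = idx2
                break
            elif tp_entry < h:
                out_exit_price[idx1] = tp_entry
                out_exit_pos[idx1] = idx2
                break

# Preallocate writable output buffers; -1 marks "no exit found".
# Integer positions are written instead of timestamps so that the
# output array does not have to hold datetime64 values inside numba.
exit_price = np.full(len(df), np.nan)
exit_pos = np.full(len(df), -1, dtype=np.int64)
# then fill the exit price/position values by:
get_long_exit(
    df["high"].to_numpy(),
    df["low"].to_numpy(),
    df["longTpPrice"].to_numpy(),
    df["longSlPrice"].to_numpy(),
    exit_price,
    exit_pos,
)
# Map the positions back to timestamps; rows without an exit stay NaT.
exit_time = np.full(len(df), np.datetime64("NaT"), dtype="datetime64[ns]")
hit = exit_pos >= 0
exit_time[hit] = df.index.to_numpy()[exit_pos[hit]]
df["exitPrice"] = exit_price
df["exitTime"] = exit_time
print(df)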