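I am aggregating data into daily bins. My data has multiple data points on the same day, although they fall on different seconds.
Because the index is not unique, a naive resample with forward fill fails: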
freq = "D"
compounded = compounded.asfreq(freq, method='ffill')
How can I apply a custom function to each resampled day so that multiple data points within the same day are aggregated correctly?
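You can pass a custom function to the resampler's agg(). The function receives all values that fall on a given day and returns a single aggregated value.
Here is an example that compounds the values with a cumulative product: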
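import pandas as pd

freq = "D"

# If we have closed two positions on the same day, asfreq() will fail unless we merge profit values
def custom_cumprod_resampler(intraday_series):
    # Days without any data points produce an empty series
    if len(intraday_series) == 0:
        return pd.NA
    # Compound the intraday returns: (1 + r1) * (1 + r2) * ... - 1
    daily_compounded = intraday_series.add(1).cumprod().sub(1)
    return daily_compounded.iloc[-1]

try:
    resampled_compounded = compounded.resample(freq).agg(custom_cumprod_resampler)
    # Carry the last known value forward over days with no data
    resampled_compounded = resampled_compounded.ffill()
except Exception as e:
    # profit_data comes from the surrounding code, which is not shown here
    raise RuntimeError(f"Daily binning failed for: {profit_data}") from e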
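
To try the approach on its own, here is a minimal self-contained sketch. The sample timestamps and return values are made up for illustration, and compound_returns is a hypothetical helper name, not part of pandas. It uses prod() rather than cumprod().iloc[-1], which should give the same final value, and returns a float NaN for empty days so the result stays numeric.

```python
import pandas as pd

# Made-up intraday returns: two data points on 2024-01-01,
# none on 2024-01-02, one on 2024-01-03.
compounded = pd.Series(
    [0.10, 0.20, 0.05],
    index=pd.to_datetime(
        [
            "2024-01-01 09:30:00",
            "2024-01-01 15:45:10",
            "2024-01-03 11:00:00",
        ]
    ),
)


def compound_returns(intraday_series):
    """Hypothetical helper: merge one day's returns into a single compounded return."""
    # Days without data arrive as an empty series
    if len(intraday_series) == 0:
        return float("nan")
    # (1 + r1) * (1 + r2) * ... - 1
    return intraday_series.add(1).prod() - 1


# One value per calendar day; the empty day is NaN at this point
daily = compounded.resample("D").agg(compound_returns)

# Forward fill the empty day, mirroring the answer above
daily_filled = daily.ffill()

print(daily)
print(daily_filled)
```

With this sample data, the first day should come out at roughly 0.32 (1.10 * 1.20 - 1), the second day is empty until it is forward filled, and the third day keeps its single value of 0.05. Whether forward filling is the right treatment for days without data depends on what the series represents.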