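Basically, I'm a beginner trying to compute a sum that covers today, the past 4 days, and a number of future days (in this case just the next day, i.e. tomorrow).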
Test
1995-07-01 1
1995-07-02 0
1995-07-03 0
1995-07-04 1
1995-07-05 0
1995-07-06 0
1995-07-07 0
1995-07-08 0
1995-07-09 0
1995-07-10 0
1995-07-11 1
To get the rolling sum of "today" plus the past 4 days I use:
df['Test'].rolling(5).sum()
1995-07-01 NaN
1995-07-02 NaN
1995-07-03 NaN
1995-07-04 NaN
1995-07-05 2.0
1995-07-06 1.0
1995-07-07 1.0
1995-07-08 1.0
1995-07-09 0.0
1995-07-10 0.0
1995-07-11 1.0
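However, I'm having trouble including the next day's value in that sum. What I want is for the output at 1995-07-10 to be 1, because it should include "tomorrow" (1995-07-11 is 1 in the test data).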
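df['Tomorrow'] = df['Test'].shift(-1)  # next day's value
df['Previous'] = df['Test'].rolling(4).sum()  # current row plus the 3 rows before it
df.sum(axis=1)  # row-wise sum; NaN values are skipped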
Output
1995-07-01 1.0
1995-07-02 0.0
1995-07-03 1.0
1995-07-04 3.0
1995-07-05 1.0
1995-07-06 1.0
1995-07-07 1.0
1995-07-08 0.0
1995-07-09 0.0
1995-07-10 1.0
1995-07-11 2.0
Or, if you want the first three rows to have values too, even though fewer than four rows are available:
df['Previous'] = df['Test'].rolling(4, min_periods=1).sum()
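The helper-column idea can be sketched on the question's data roughly as follows. Because rolling(4) includes the current row, this sketch shifts by one row first so that today is counted only once. The index name "date", the variable names, and the choice to treat a missing "tomorrow" or missing history as 0 are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Sample data from the question; the index name "date" is an assumption.
idx = pd.date_range("1995-07-01", periods=11, freq="D", name="date")
df = pd.DataFrame({"Test": [1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1]}, index=idx)

today = df["Test"]
# Next day's value; the last row has no "tomorrow", so it is treated as 0 here.
tomorrow = df["Test"].shift(-1, fill_value=0)
# Sum of the 4 days strictly before today: shift down one row, then roll over 4 rows.
# min_periods=1 lets the early rows use whatever history exists; the very first
# row has no history at all, so its NaN is replaced with 0.
previous = df["Test"].shift(1).rolling(4, min_periods=1).sum().fillna(0)

total = today + tomorrow + previous
print(total)
```

With these assumptions, 1995-07-10 comes out as 1.0, which is the value the question asks for.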
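I believe what you need is the shift() method. It lets you shift the data by a number of days, so you can then align it with the dates as needed.
See the documentation for shift() for details.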
df['Test'].shift(-1, fill_value=0).rolling(5).sum()
This gives:
date
1995-07-01 NaN
1995-07-02 NaN
1995-07-03 NaN
1995-07-04 NaN
1995-07-05 1.0
1995-07-06 1.0
1995-07-07 1.0
1995-07-08 0.0
1995-07-09 0.0
1995-07-10 1.0
1995-07-11 1.0
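To make the window coverage explicit, here is a small sketch under the same data. After shifting up by one row, a window of 5 covers tomorrow, today, and the 3 days before today; a window of 6 would be needed to reach 4 days back. The variable names and the index name "date" are assumptions for illustration.

```python
import pandas as pd

# Sample data from the question; the index name "date" is an assumption.
idx = pd.date_range("1995-07-01", periods=11, freq="D", name="date")
s = pd.Series([1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1], index=idx, name="Test")

# Each row now holds the NEXT day's value; the last row is filled with 0.
shifted = s.shift(-1, fill_value=0)

# Window of 5 on the shifted series: tomorrow + today + 3 previous days.
window5 = shifted.rolling(5).sum()
# Window of 6 on the shifted series: tomorrow + today + 4 previous days.
window6 = shifted.rolling(6).sum()

print(pd.DataFrame({"window5": window5, "window6": window6}))
```

One side effect of shifting before rolling is that the first value of the series is shifted out, so with the window of 6 the row for 1995-07-05 stays NaN even though its full date range exists in the original data.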
IIUC, use Series.shift:
days_before = 4
days_after = 1
df['Test'].rolling(days_before + days_after + 1).sum().shift(-days_after)
1995-07-01 NaN
1995-07-02 NaN
1995-07-03 NaN
1995-07-04 NaN
1995-07-05 2.0
1995-07-06 1.0
1995-07-07 1.0
1995-07-08 1.0
1995-07-09 0.0
1995-07-10 1.0
1995-07-11 NaN
Name: Test, dtype: float64
You can see that the last day is NaN because it does not have enough following days; you can fill it in afterwards if needed.
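One possible way to avoid the NaN values at both ends is sketched below. This is not necessarily the method the answer had in mind: the helper name window_sum is hypothetical, and treating missing past or future days as 0 is an assumption. The sketch pads the series with zero rows at the end so the shift has something to pull in, and uses min_periods=1 so the early rows sum whatever history exists.

```python
import pandas as pd


def window_sum(s, days_before, days_after):
    """Hypothetical helper (not part of pandas): sum over the past
    `days_before` rows, the current row, and the next `days_after` rows,
    treating rows outside the data as 0 (an assumption)."""
    window = days_before + days_after + 1
    # Append zero rows so the final dates still have "future" rows to include.
    pad = pd.Series([0] * days_after, dtype=s.dtype)
    padded = pd.concat([s, pad], ignore_index=True)
    # min_periods=1 allows partial windows at the start of the series.
    rolled = padded.rolling(window, min_periods=1).sum().shift(-days_after)
    # Drop the padding rows and restore the original index and name.
    return pd.Series(rolled.iloc[: len(s)].to_numpy(), index=s.index, name=s.name)


# Sample data from the question; the index name "date" is an assumption.
idx = pd.date_range("1995-07-01", periods=11, freq="D", name="date")
s = pd.Series([1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1], index=idx, name="Test")

strict = s.rolling(4 + 1 + 1).sum().shift(-1)  # the answer's version, NaN at the edges
filled = window_sum(s, days_before=4, days_after=1)
print(pd.DataFrame({"strict": strict, "filled": filled}))
```

Where the strict version has a value, the filled version agrees with it; the two differ only in the edge rows that the strict version leaves as NaN.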