I have the following column (timedelta objects), which is the result of subtracting one time column from another:
Duration
00:12:38.260000
00:01:00.750000
00:19:35.260000
00:00:29.990000
I am trying to apply the following to this column:
rolling(min_periods=3, window=5).sum()
I get the following error:
No numeric types to aggregate
Should I convert my durations? If so, how?
Short answer
Convert the durations to seconds with .total_seconds(), then take the rolling sum.
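As a minimal sketch of the short answer, here is the same idea applied to the four Duration values from the question. The variable names (duration, seconds, rolling_duration) are only illustrative, and converting the result back with pd.to_timedelta is optional if a plain number of seconds is enough.

```python
import pandas as pd

# The four durations from the question, parsed into a timedelta64[ns] Series.
duration = pd.Series(
    pd.to_timedelta(
        ["00:12:38.260000", "00:01:00.750000", "00:19:35.260000", "00:00:29.990000"]
    ),
    name="Duration",
)

# Convert to float seconds so the rolling aggregation sees a numeric dtype.
seconds = duration.dt.total_seconds()
rolling_seconds = seconds.rolling(window=5, min_periods=3).sum()

# Convert the float seconds back to timedelta; NaN entries become NaT.
rolling_duration = pd.to_timedelta(rolling_seconds, unit="s")
print(rolling_duration)
```

The first two rows come out as NaT because fewer than min_periods=3 values are available at that point.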
Long answer
Create your dataframe and the duration column:
import pandas as pd

dt1 = ['2019-12-01 10:00:00', '2019-12-01 10:01:00', '2019-12-01 10:00:30',
       '2019-12-01 10:02:30', '2019-12-01 10:05:30']
dt2 = ['2019-12-01 10:10:00', '2019-12-01 11:06:00', '2019-12-01 10:01:00',
       '2019-12-01 10:02:30', '2019-12-01 10:07:30']

df = pd.DataFrame({'dt1': dt1, 'dt2': dt2})

# Parse the strings into datetime64[ns] columns
df['dt1'] = pd.to_datetime(df['dt1'])
df['dt2'] = pd.to_datetime(df['dt2'])

# Subtracting two datetime columns yields a timedelta64[ns] column
df['duration'] = df['dt2'] - df['dt1']
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
dt1         5 non-null datetime64[ns]
dt2         5 non-null datetime64[ns]
duration    5 non-null timedelta64[ns]
dtypes: datetime64[ns](2), timedelta64[ns](1)
memory usage: 248.0 bytes
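Note that the duration column has dtype timedelta64[ns]. The rolling aggregation does not treat this as a numeric type, which is what the "No numeric types to aggregate" error is reporting.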
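Now convert to seconds with .total_seconds(), take the rolling sum, and convert the result back to a timedelta with pd.to_timedelta: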
df['duration_rolling_sum'] = pd.to_timedelta(df['duration'].dt.total_seconds().rolling(min_periods=3, window=5).sum(), unit='s')
df
                  dt1                 dt2 duration duration_rolling_sum
0 2019-12-01 10:00:00 2019-12-01 10:10:00 00:10:00                  NaT
1 2019-12-01 10:01:00 2019-12-01 11:06:00 01:05:00                  NaT
2 2019-12-01 10:00:30 2019-12-01 10:01:00 00:00:30             01:15:30
3 2019-12-01 10:02:30 2019-12-01 10:02:30 00:00:00             01:15:30
4 2019-12-01 10:05:30 2019-12-01 10:07:30 00:02:00             01:17:30
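If you need this in more than one place, the steps above could be wrapped in a small helper. This is only a sketch: rolling_timedelta_sum is a hypothetical name, not a pandas function, and it assumes the input is a Series with a timedelta64 dtype.

```python
import pandas as pd


def rolling_timedelta_sum(durations, window, min_periods=None):
    """Rolling sum of a timedelta Series, computed via float seconds.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Rolling aggregations need a numeric dtype, so work in seconds.
    seconds = durations.dt.total_seconds()
    summed = seconds.rolling(window=window, min_periods=min_periods).sum()
    # Convert back to timedelta; positions with too few values become NaT.
    return pd.to_timedelta(summed, unit="s")


# Same durations as the duration column in the example above.
durations = pd.Series(
    pd.to_timedelta(["00:10:00", "01:05:00", "00:00:30", "00:00:00", "00:02:00"])
)
result = rolling_timedelta_sum(durations, window=5, min_periods=3)
print(result)
```

Because the sum goes through floating-point seconds, durations with sub-microsecond precision may pick up small rounding differences on the way back to timedelta.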