Pandas resampler changes the datetime?


Initial data:

df.head()
df.tail()

Output:

                                  value
ts                                     
2017-09-20 21:00:45.514847+00:00  -60.0
2017-09-20 21:01:29.169977+00:00  -60.0
2017-09-20 21:02:13.694557+00:00  -60.0
2017-09-20 21:02:57.954950+00:00  -60.0
2017-09-20 21:03:40.615305+00:00  -60.0
...
                                  value
ts                                     
2017-09-21 20:56:27.126042+00:00  -60.0
2017-09-21 20:57:11.993958+00:00  -60.0
2017-09-21 20:57:55.010927+00:00  -60.0
2017-09-21 20:58:40.413179+00:00  -60.0
2017-09-21 20:59:25.451698+00:00  -60.0

As you can see, the data is timestamped (+03:00) and the values cover 1 day = 24H.

The goal is to resample the data over different time periods:

resample_params = [u'1H', u'2H', u'4H', u'6H', u'8H', u'12H',]

Let's do it:

for resample_rule in resample_params:
    # base=1 shifts the bin origin by one hour (older pandas API)
    r = df1.resample(resample_rule, closed='right', label='left', base=1)
    # mean-median-count
    result = r.agg(['mean', 'median', 'count', 'std', 'sem', 'mad'])
    result.fillna(0, inplace=True)
    counts = result.value['count']
    print(counts, '\nlen =', len(counts), 'sum =', sum(counts))

Output:

ts
2017-09-20 21:00:00+00:00    82
2017-09-20 22:00:00+00:00    82
2017-09-20 23:00:00+00:00    83
2017-09-21 00:00:00+00:00    83
2017-09-21 01:00:00+00:00    83
Freq: H, Name: count, dtype: int64 
len = 24 sum = 1977
ts
2017-09-20 21:00:00+00:00    164
2017-09-20 23:00:00+00:00    166
2017-09-21 01:00:00+00:00    166
2017-09-21 03:00:00+00:00    166
2017-09-21 05:00:00+00:00    165
Freq: 2H, Name: count, dtype: int64 
len = 12 sum = 1977
ts
2017-09-20 21:00:00+00:00    330
2017-09-21 01:00:00+00:00    332
2017-09-21 05:00:00+00:00    328
2017-09-21 09:00:00+00:00    330
2017-09-21 13:00:00+00:00    329
Freq: 4H, Name: count, dtype: int64 
len = 6 sum = 1977
ts
2017-09-20 19:00:00+00:00    330
2017-09-21 01:00:00+00:00    497
2017-09-21 07:00:00+00:00    493
2017-09-21 13:00:00+00:00    493
2017-09-21 19:00:00+00:00    164
Freq: 6H, Name: count, dtype: int64 
len = 5 sum = 1977
ts
2017-09-20 17:00:00+00:00    330
2017-09-21 01:00:00+00:00    660
2017-09-21 09:00:00+00:00    659
2017-09-21 17:00:00+00:00    328
Freq: 8H, Name: count, dtype: int64 
len = 4 sum = 1977
ts
2017-09-20 13:00:00+00:00    330
2017-09-21 01:00:00+00:00    990
2017-09-21 13:00:00+00:00    657
Freq: 12H, Name: count, dtype: int64 
len = 3 sum = 1977

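The periods u'1H', u'2H', u'4H' are fine.

u'6H', u'8H', u'12H' give len + 1 and shift the timestamps (look at the first row of each result).

I tried different base, closed and label parameters and different resample rules.

How do I get correct resampling for periods longer than 4H?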

python pandas
1 Answer

After digging further into the resample method, I found that base is an int and can be negative! base = -3 worked for me!
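
A rough way to see why -3 works where 1 does not, assuming (as the outputs in the question suggest) that this older resample anchors its bins at midnight plus `base % n` hours for an n-hour rule. The helper `first_left_edge` is hypothetical, written only for this illustration; it is not part of pandas.

```python
from datetime import datetime, timedelta

def first_left_edge(first, n_hours, base):
    """Left edge of the bin containing `first` (hypothetical helper).

    Assumption: bins are n_hours wide and anchored at
    midnight + (base % n_hours) hours of the first sample's day.
    """
    midnight = first.replace(hour=0, minute=0, second=0, microsecond=0)
    origin = midnight + timedelta(hours=base % n_hours)
    width = timedelta(hours=n_hours)
    steps = (first - origin) // width  # whole bins between origin and first
    return origin + steps * width

# First timestamp from the question (it does not sit exactly on a bin edge).
first = datetime(2017, 9, 20, 21, 0, 45, 514847)

for base in (1, -3):
    hours = [first_left_edge(first, n, base).hour for n in (1, 2, 4, 6, 8, 12)]
    print("base =", base, "->", hours)
# base = 1 -> [21, 21, 21, 19, 17, 13]
# base = -3 -> [21, 21, 21, 21, 21, 21]
```

The data starts at 21:00, and -3 is congruent to 21 modulo every n that divides 24, so base = -3 puts a bin edge at 21:00 for all of these rules. base = 1 only does so for 1H, 2H and 4H, which matches the shifted first rows (19:00, 17:00, 13:00) seen in the question.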

Output with base = -3:

ts
2017-09-20 21:00:00+00:00    82
2017-09-20 22:00:00+00:00    82
2017-09-20 23:00:00+00:00    83
2017-09-21 00:00:00+00:00    83
2017-09-21 01:00:00+00:00    83
Freq: H, Name: count, dtype: int64 
len = 24 sum = 1977
ts
2017-09-20 21:00:00+00:00    164
2017-09-20 23:00:00+00:00    166
2017-09-21 01:00:00+00:00    166
2017-09-21 03:00:00+00:00    166
2017-09-21 05:00:00+00:00    165
Freq: 2H, Name: count, dtype: int64 
len = 12 sum = 1977
ts
2017-09-20 21:00:00+00:00    330
2017-09-21 01:00:00+00:00    332
2017-09-21 05:00:00+00:00    328
2017-09-21 09:00:00+00:00    330
2017-09-21 13:00:00+00:00    329
Freq: 4H, Name: count, dtype: int64 
len = 6 sum = 1977
ts
2017-09-20 21:00:00+00:00    496
2017-09-21 03:00:00+00:00    494
2017-09-21 09:00:00+00:00    495
2017-09-21 15:00:00+00:00    492
Freq: 6H, Name: count, dtype: int64 
len = 4 sum = 1977
ts
2017-09-20 21:00:00+00:00    662
2017-09-21 05:00:00+00:00    658
2017-09-21 13:00:00+00:00    657
Freq: 8H, Name: count, dtype: int64 
len = 3 sum = 1977
ts
2017-09-20 21:00:00+00:00    990
2017-09-21 09:00:00+00:00    987
Freq: 12H, Name: count, dtype: int64 
len = 2 sum = 1977
ts
2017-09-20 21:00:00+00:00    1977
Freq: 24H, Name: count, dtype: int64 
len = 1 sum = 1977
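
For newer pandas, where `base` is no longer available (it was deprecated in 1.1 in favor of `origin` and `offset`, and removed in 2.0), a comparable result can probably be obtained with `offset`. The sketch below is not the original code: it uses synthetic data (one reading every 44 seconds, which is an assumption, since the real series is irregular) and passes each rule as a `pd.Timedelta` to sidestep frequency-alias differences between versions.

```python
import pandas as pd

# Synthetic stand-in for the question's data (assumed regular 44 s spacing),
# starting at 21:00:45 UTC and staying inside a single 24-hour window.
idx = pd.date_range(
    "2017-09-20 21:00:45", periods=1960, freq=pd.Timedelta(seconds=44), tz="UTC"
)
df = pd.DataFrame({"value": -60.0}, index=idx)

results = {}
for hours in (1, 2, 4, 6, 8, 12):
    counts = (
        df["value"]
        .resample(
            pd.Timedelta(hours=hours),
            closed="right",
            label="left",
            offset=pd.Timedelta(hours=21),  # anchor bins at 21:00 instead of midnight
        )
        .count()
    )
    results[hours] = counts
    print(hours, "h: first label =", counts.index[0],
          "len =", len(counts), "sum =", counts.sum())
```

With the bins anchored at 21:00, every rule should start at 2017-09-20 21:00:00+00:00 and produce 24 / hours bins, which is the behavior base = -3 gave in the older API.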