我想对包含timedelta格式的数据进行转换。
我的数据看起来像这样,其中Time列为timedelta类型。
user in out location overlap Time
0 ron 12/21/2021 10:11 12/21/2016 17:50 home 0 4:19:03
1 ron 12/21/2016 13:26 12/21/2016 13:52 office 2 0:25:28
2 april 12/21/2016 8:12 12/21/2016 17:27 office 0 8:15:03
3 april 12/21/2016 18:54 12/21/2016 22:56 office 0 4:02:36
4 andy 12/21/2016 8:57 12/21/2016 12:15 home 0 2:59:40
基于用户和重叠部分,我想对“时间”进行转换。我已经做到了:
groups = sample.groupby('user')['Time']
flag = sample.groupby('user')['overlap'].transform('max')
sample.loc[:,'time_new'] = np.select([flag.eq(0), flag.isin([1,2])], [groups.transform('sum'), groups.transform('max')])
但是出现以下错误:
TypeError: Cannot cast scalar from dtype('<m8[ns]') to dtype('<m8') according to the rule 'same_kind'
我如何正确进行转换?
将时间增量转换为计算秒数的float
,然后进行数学运算。然后,如果需要,请转换回timedelta
groups = sample['Time'].dt.total_seconds().groupby(sample['user'])
flag = sample.groupby('user')['overlap'].transform('max')
sample.loc[:,'time_new'] = np.select([flag.eq(0), flag.isin([1,2])],
[groups.transform('sum'), groups.transform('max')])
sample['time_new'] = pd.to_timedelta(sample['time_new'], unit='s')
user in out location overlap Time time_new
0 ron 12/21/2021 10:11 12/21/2016 17:50 home 0 04:19:03 04:19:03
1 ron 12/21/2016 13:26 12/21/2016 13:52 office 2 00:25:28 04:19:03
2 april 12/21/2016 8:12 12/21/2016 17:27 office 0 08:15:03 12:17:39
3 april 12/21/2016 18:54 12/21/2016 22:56 office 0 04:02:36 12:17:39
4 andy 12/21/2016 8:57 12/21/2016 12:15 home 0 02:59:40 02:59:40