Python transform('sum') on timedelta data

I want to apply a transform to data that contains timedelta values.

My data looks like this, where the Time column has a timedelta dtype:

    user                in               out location  overlap    Time
0    ron  12/21/2021 10:11  12/21/2016 17:50     home     0  4:19:03
1    ron  12/21/2016 13:26  12/21/2016 13:52   office     2  0:25:28
2  april   12/21/2016 8:12  12/21/2016 17:27   office     0  8:15:03
3  april  12/21/2016 18:54  12/21/2016 22:56   office     0  4:02:36
4   andy   12/21/2016 8:57  12/21/2016 12:15     home     0  2:59:40

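I want to transform Time per user, depending on overlap: the sum of Time when the user's maximum overlap is 0, and the maximum Time when it is 1 or 2. This is what I have so far: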

groups = sample.groupby('user')['Time']
flag = sample.groupby('user')['overlap'].transform('max')
sample.loc[:,'time_new'] = np.select([flag.eq(0), flag.isin([1,2])], [groups.transform('sum'), groups.transform('max')]) 

But it raises the following error:

TypeError: Cannot cast scalar from dtype('<m8[ns]') to dtype('<m8') according to the rule 'same_kind'

How can I do this transformation correctly?

python pandas timedelta
1 Answer

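Convert the timedeltas to floats holding the number of seconds, then do the math on those. Afterwards, if needed, convert the result back to timedelta: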

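import numpy as np
import pandas as pd

# Group Time as float seconds so np.select does not have to cast timedelta64
groups = sample['Time'].dt.total_seconds().groupby(sample['user'])

# Per-user maximum overlap, broadcast to every row
flag = sample.groupby('user')['overlap'].transform('max')
sample.loc[:,'time_new'] = np.select([flag.eq(0), flag.isin([1,2])],
                                     [groups.transform('sum'), groups.transform('max')])

# Convert the float seconds back to timedelta
sample['time_new'] = pd.to_timedelta(sample['time_new'], unit='s')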

    user                in               out location  overlap     Time time_new
0    ron  12/21/2021 10:11  12/21/2016 17:50     home        0 04:19:03 04:19:03
1    ron  12/21/2016 13:26  12/21/2016 13:52   office        2 00:25:28 04:19:03
2  april   12/21/2016 8:12  12/21/2016 17:27   office        0 08:15:03 12:17:39
3  april  12/21/2016 18:54  12/21/2016 22:56   office        0 04:02:36 12:17:39
4   andy   12/21/2016 8:57  12/21/2016 12:15     home        0 02:59:40 02:59:40
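
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt by hand from the rows shown in the question, keeping only the user, overlap and Time columns; that reduced frame and the intermediate names seconds and time_new are my own assumptions for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

# Rebuild the question's sample (only the columns the computation needs).
sample = pd.DataFrame({
    "user": ["ron", "ron", "april", "april", "andy"],
    "overlap": [0, 2, 0, 0, 0],
    "Time": pd.to_timedelta(
        ["4:19:03", "0:25:28", "8:15:03", "4:02:36", "2:59:40"]
    ),
})

# Work in float seconds so np.select only ever sees plain floats.
seconds = sample["Time"].dt.total_seconds()
groups = seconds.groupby(sample["user"])

# Per-user maximum of overlap, broadcast back to every row.
flag = sample.groupby("user")["overlap"].transform("max")

# Sum when the user's max overlap is 0, max when it is 1 or 2.
# Rows matching neither condition get np.select's default of 0.
time_new = np.select(
    [flag.eq(0), flag.isin([1, 2])],
    [groups.transform("sum"), groups.transform("max")],
)

# Convert the float seconds back to timedelta.
sample["time_new"] = pd.to_timedelta(time_new, unit="s")
print(sample)
```

Note that a user whose maximum overlap is anything other than 0, 1 or 2 would end up with a time_new of zero, because np.select falls back to its default of 0 when no condition matches. If that can happen in your data, pass an explicit default to np.select.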