Flag whether rejected_time within a group falls within 5 minutes after creation_timestamp


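I am trying to build an indicator that shows when a new application caused an older application to be rejected.

If any rejected_time within a personal_id falls within 5 minutes after a creation_timestamp, the rejection was caused by the new application. Based on this, I need to create the column new_application_causes_rejection, as shown in the example.

There are hundreds of thousands of personal_ids. Most have multiple application_ids, and the number of rows per application_id varies.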

personal_id  application_id  creation_timestamp          approved_amount  rejected_time               new_application_causes_rejection
5a           694f            2023-01-24 13:01:07.939534  8000.0           2023-01-24 13:13:15.499000  0
5a           694f            2023-01-24 13:01:07.939534  8000.0           2023-01-24 14:38:02.359000  1
5a           694f            2023-01-24 13:01:07.939534  8000.0           2023-01-24 14:37:18.616000  1
5a           694f            2023-01-24 13:01:07.939534  NaN              2023-01-24 13:03:59.626000  0
5a           43fa            2023-01-24 14:36:08.287521  NaN              2023-01-24 14:37:22.096000  0
5a           43fa            2023-01-24 14:36:08.287521  13000.0          2023-01-24 14:39:31.750000  1
5a           43fa            2023-01-24 14:36:08.287521  13000.0          2023-02-02 08:42:26.980106  1
5a           43fa            2023-01-24 14:36:08.287521  NaN              2023-01-24 14:37:22.948214  0
5a           a4b6            2023-01-24 14:38:42.625969  5000.0           2023-02-02 08:42:26.980106  0
5a           a4b7            2023-01-24 14:38:42.625969  NaN              2023-01-24 14:38:46.922000  0
5a           a4b8            2023-01-24 14:38:42.625969  8000.0           2023-02-02 08:42:26.980106  0
python pandas
1 Answer

Comparing each row's rejected_time with that row's own creation_timestamp, I get a different output:

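import pandas as pd

# Parse both columns as datetimes so they can be subtracted
df['creation_timestamp'] = pd.to_datetime(df['creation_timestamp'])
df['rejected_time'] = pd.to_datetime(df['rejected_time'])

# 1 where the rejection came less than 5 minutes after the row's own creation time
df['new'] = df['rejected_time'].sub(df['creation_timestamp']).lt(pd.Timedelta('5min')).astype(int)
print(df)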
   personal_id application_id         creation_timestamp approved_amount  \
0          5a           694f  2023-01-24 13:01:07.939534         8000.0    
1          5a           694f  2023-01-24 13:01:07.939534         8000.0    
2          5a           694f  2023-01-24 13:01:07.939534         8000.0    
3          5a           694f  2023-01-24 13:01:07.939534            NaN    
4          5a           43fa  2023-01-24 14:36:08.287521            NaN    
5          5a           43fa  2023-01-24 14:36:08.287521        13000.0    
6          5a           43fa  2023-01-24 14:36:08.287521        13000.0    
7          5a           43fa  2023-01-24 14:36:08.287521            NaN    
8          5a           a4b6  2023-01-24 14:38:42.625969         5000.0    
9          5a           a4b7  2023-01-24 14:38:42.625969            NaN    
10         5a           a4b8  2023-01-24 14:38:42.625969         8000.0    

                rejected_time  new_application_causes_rejection  new  
0  2023-01-24 13:13:15.499000                                 0    0  
1  2023-01-24 14:38:02.359000                                 1    0  
2  2023-01-24 14:37:18.616000                                 1    0  
3  2023-01-24 13:03:59.626000                                 0    1  
4  2023-01-24 14:37:22.096000                                 0    1  
5  2023-01-24 14:39:31.750000                                 1    1  
6  2023-02-02 08:42:26.980106                                 1    0  
7  2023-01-24 14:37:22.948214                                 0    1  
8  2023-02-02 08:42:26.980106                                 0    0  
9  2023-01-24 14:38:46.922000                                 0    1  
10 2023-02-02 08:42:26.980106                                 0    0  
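
One caveat about the comparison above: `.lt(pd.Timedelta(...))` is also true for negative differences, so a rejection recorded before the creation time would be flagged as well. The sketch below bounds the window on both sides with `Series.between`. The first two rows are taken from the sample; the third row (a rejection before creation) is hypothetical and only there to show the difference. The column names `lt_only` and `within_5_min_after` are made up for this illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    "creation_timestamp": pd.to_datetime([
        "2023-01-24 13:01:07.939534",
        "2023-01-24 13:01:07.939534",
        "2023-01-24 14:36:08.287521",
    ]),
    "rejected_time": pd.to_datetime([
        "2023-01-24 13:03:59.626000",
        "2023-01-24 13:13:15.499000",
        "2023-01-24 14:30:00.000000",  # hypothetical: rejected before creation
    ]),
})

delta = toy["rejected_time"] - toy["creation_timestamp"]

# Upper bound only: a negative delta also counts as "less than 5 minutes"
toy["lt_only"] = delta.lt(pd.Timedelta(minutes=5)).astype(int)

# Both bounds (inclusive): only rejections 0 to 5 minutes after creation
toy["within_5_min_after"] = delta.between(
    pd.Timedelta(0), pd.Timedelta(minutes=5)
).astype(int)

print(toy[["lt_only", "within_5_min_after"]])
```

Whether the boundaries should be inclusive is not stated in the question; `between` includes both ends by default.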

Details:

print (df['rejected_time'].sub(df['creation_timestamp']))
0    0 days 00:12:07.559466
1    0 days 01:36:54.419466
2    0 days 01:36:10.676466
3    0 days 00:02:51.686466
4    0 days 00:01:13.808479
5    0 days 00:03:23.462479
6    8 days 18:06:18.692585
7    0 days 00:01:14.660693
8    8 days 18:03:44.354137
9    0 days 00:00:04.296031
10   8 days 18:03:44.354137
dtype: timedelta64[ns]
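
The row-wise differences above only look at each row's own creation_timestamp, which is probably why the result differs from the expected column. One possible reading of the question is: flag a row when its rejected_time falls within 5 minutes after the creation_timestamp of a newer application (strictly later creation_timestamp) of the same personal_id. The following is a minimal sketch of that reading using `pd.merge_asof`, not a confirmed solution. The function name, the helper columns `latest_creation` and `_pos`, and the output column `caused_by_newer_application` are all hypothetical, and the sketch assumes rejected_time has no missing values.

```python
import pandas as pd

def flag_newer_application(df, window=pd.Timedelta(minutes=5)):
    """Flag rows rejected at most `window` after a newer application
    (strictly later creation_timestamp) of the same personal_id was created."""
    out = df.copy()
    out["creation_timestamp"] = pd.to_datetime(out["creation_timestamp"])
    out["rejected_time"] = pd.to_datetime(out["rejected_time"])

    # Distinct creation times per person, sorted as merge_asof requires
    creations = (
        out[["personal_id", "creation_timestamp"]]
        .drop_duplicates()
        .rename(columns={"creation_timestamp": "latest_creation"})
        .sort_values("latest_creation")
    )

    # Remember the original row order, then sort by the asof key
    left = out.assign(_pos=range(len(out))).sort_values("rejected_time")

    # For each row: the most recent creation of the same person at or before
    # rejected_time, kept only if it is no more than `window` earlier
    matched = pd.merge_asof(
        left, creations,
        left_on="rejected_time", right_on="latest_creation",
        by="personal_id", direction="backward", tolerance=window,
    )

    # A match counts only if it belongs to a newer application than the row's own
    matched["flag"] = (
        matched["latest_creation"] > matched["creation_timestamp"]
    ).astype(int)
    out["caused_by_newer_application"] = (
        matched.sort_values("_pos")["flag"].to_numpy()
    )
    return out

# The sample rows from the question (approved_amount omitted, it is not used)
c1, c2, c3 = ("2023-01-24 13:01:07.939534", "2023-01-24 14:36:08.287521",
              "2023-01-24 14:38:42.625969")
rows = [
    ("694f", c1, "2023-01-24 13:13:15.499000"),
    ("694f", c1, "2023-01-24 14:38:02.359000"),
    ("694f", c1, "2023-01-24 14:37:18.616000"),
    ("694f", c1, "2023-01-24 13:03:59.626000"),
    ("43fa", c2, "2023-01-24 14:37:22.096000"),
    ("43fa", c2, "2023-01-24 14:39:31.750000"),
    ("43fa", c2, "2023-02-02 08:42:26.980106"),
    ("43fa", c2, "2023-01-24 14:37:22.948214"),
    ("a4b6", c3, "2023-02-02 08:42:26.980106"),
    ("a4b7", c3, "2023-01-24 14:38:46.922000"),
    ("a4b8", c3, "2023-02-02 08:42:26.980106"),
]
sample = pd.DataFrame(
    rows, columns=["application_id", "creation_timestamp", "rejected_time"]
)
sample.insert(0, "personal_id", "5a")

result = flag_newer_application(sample)
print(result["caused_by_newer_application"].tolist())
```

On the sample data this reading agrees with new_application_causes_rejection on every row except the 43fa row rejected on 2023-02-02, which the question marks as 1 even though no application was created in the 5 minutes before it. `merge_asof` is used here because it avoids a full self-join per personal_id, which may matter with hundreds of thousands of IDs.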