R bupaR: get the trace for each case


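I am using the bupaR package for process analysis. Suppose my data, stored in a CSV file, looks like this (the file is already correctly sorted by case ID and timestamp):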

STATUS;timestamp;CASEID
created;16-02-2023 09:46:32;1
revised;13-04-2023 23:58:59;1
accepted;13-04-2023 23:59:59;1
created;16-02-2023 09:46:32;2
accepted;13-04-2023 23:59:59;2
created;14-12-2022 13:17:54;3
revised;02-01-2023 23:59:59;3
accepted;28-02-2023 19:37:01;3
submitted;03-03-2023 23:59:59;3
created;02-01-2023 07:45:43;5
created;24-01-2022 16:05:58;6
accepted;03-02-2022 23:59:59;6
created;24-01-2022 15:52:53;7
accepted;03-02-2022 23:59:59;7
created;15-08-2022 12:54:23;8
rejected;18-08-2022 23:59:59;8
created;21-03-2022 15:32:05;9
accepted;26-04-2022 23:59:59;9
created;21-03-2022 15:42:39;10

The first case, with ID 1, has the trace "created-revised-accepted": first a created event, then revised, then accepted.

I currently use the following code to create the process map:

library(bupaR)
library(processmapR)
library(edeaR)

datafile <- read.csv(file="pathtofile\\testfile.csv",header=T, sep=";")
datafile$timestampcolumn <- as.POSIXct(datafile$timestamp, format="%d-%m-%Y %H:%M:%S")

mytest <- simple_eventlog(datafile, case_id = "CASEID", activity_id = "STATUS", timestamp = "timestampcolumn")

process_map(mytest, type = frequency("absolute"))

This produces the process map (image not reproduced here).

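Now I want to add each case's trace to my original file. The trace is, of course, the same on every row of a given case. So the output should look like this (the events in a trace are separated by "-" as an example):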

STATUS;timestamp;CASEID;trace
created;16-02-2023 09:46:32;1;created-revised-accepted
revised;13-04-2023 23:58:59;1;created-revised-accepted
accepted;13-04-2023 23:59:59;1;created-revised-accepted
created;16-02-2023 09:46:32;2;created-accepted
accepted;13-04-2023 23:59:59;2;created-accepted
created;14-12-2022 13:17:54;3;created-revised-accepted-submitted
revised;02-01-2023 23:59:59;3;created-revised-accepted-submitted
accepted;28-02-2023 19:37:01;3;created-revised-accepted-submitted
submitted;03-03-2023 23:59:59;3;created-revised-accepted-submitted
created;02-01-2023 07:45:43;5;created
created;24-01-2022 16:05:58;6;created-accepted
accepted;03-02-2022 23:59:59;6;created-accepted
created;24-01-2022 15:52:53;7;created-accepted
accepted;03-02-2022 23:59:59;7;created-accepted
created;15-08-2022 12:54:23;8;created-rejected
rejected;18-08-2022 23:59:59;8;created-rejected
created;21-03-2022 15:32:05;9;created-accepted
accepted;26-04-2022 23:59:59;9;created-accepted
created;21-03-2022 15:42:39;10;created

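I tried filter_activity, trace_list (from the edeaR package) and other commands, but I could not figure it out. I want to use the result of the process_map algorithm / bupaR package code, so that it matches what the map shows. I could of course write my own algorithm that walks through each case and records the statuses, but I do not want to compute the traces by hand: this already exists somehow in the bupaR eventlog / process_map commands, and that is what I want to use. I want to drill into the details and see which cases have a specific trace according to the map. That is why it matters that the result is consistent with the bupaR output rather than programmed separately. The information must already be in there somehow, otherwise the map could not be drawn.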

So how can I achieve this?

r event-log process-mining bupar

1 Answer

I have never used any of these packages, but this is how I approached the problem:

  1. I looked at the class of mytest:
class(mytest)
# [1] "eventlog"   "log"        "tbl_df"     "tbl"        "data.frame"
  2. I looked at the methods defined for the class eventlog:
methods(class = "eventlog")
# [1] act_collapse                     activities                       activity_frequency              
# [4] activity_instance_id             activity_presence                add_end_activity                
# [7] add_start_activity               arrange                          calculate_queuing_times         
# [10] case_id                          case_list                        cases                           
# [13] detect_resource_inconsistencies  dotted_chart                     durations                       
# [16] end_activities                   events_to_activitylog            filter                          
# [19] filter_activity_instance         filter_attributes                filter_endpoints_condition      
# [22] filter_infrequent_flows          filter_lifecycle                 filter_lifecycle_presence       
# [25] filter_precedence_resource       filter_time_period               filter_trim                     
# [28] filter_trim_lifecycle            first_n                          fix_resource_inconsistencies    
# [31] group_by                         group_by_activity                group_by_activity_instance      
# [34] group_by_case                    group_by_resource                group_by_resource_activity      
# [37] idle_time                        last_n                           lifecycle_id                    
# [40] lifecycle_labels                 lifecycles                       lined_chart                     
# [43] mapping                          mutate                           n_activity_instances            
# [46] n_events                         number_of_repetitions            number_of_selfloops             
# [49] process_map                      process_matrix                   processing_time                 
# [52] redo_repetitions_referral_matrix redo_selfloops_referral_matrix   resource_frequency              
# [55] resource_id                      resource_map                     resource_matrix                 
# [58] resources                        sample_n                         select                          
# [61] set_activity_instance_id         set_timestamp                    setdiff                         
# [64] size_of_repetitions              size_of_selfloops                slice_activities                
# [67] slice_events                     standardize_lifecycle            start_activities                
# [70] summarise                        summary                          throughput_time                 
# [73] timestamp                        timestamps                       to_activitylog                  
# [76] trace_explorer                   trace_length                     trace_list                      
# [79] ungroup_eventlog                 unite
  3. I tried several of these functions until I found one that solves your problem: case_list
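
The same discovery steps work for any S3 object. The following is only a minimal sketch: it uses a plain data.frame as a stand-in for the event log so that it runs without bupaR installed, and it reads the "info" attribute that methods() attaches to its return value (described in ?methods). The variable names df, m and generics are illustrative.

```r
# A plain data.frame stands in for the eventlog object.
df <- data.frame(a = 1:3)

# Step 1: inspect the class vector.
class(df)

# Step 2: list the generics that have a method for that class.
m <- methods(class = "data.frame")
generics <- attr(m, "info")$generic

# Step 3: check whether a generic of interest is among them.
"summary" %in% generics
```

With bupaR loaded, the analogous check on the event log would presumably be "case_list" %in% attr(methods(class = "eventlog"), "info")$generic.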

Setup

library(bupaR)
library(processmapR)
library(edeaR)
library(dplyr)

d <- readr::read_delim(
"STATUS;timestamp;CASEID
created;16-02-2023 09:46:32;1
revised;13-04-2023 23:58:59;1
accepted;13-04-2023 23:59:59;1
created;16-02-2023 09:46:32;2
accepted;13-04-2023 23:59:59;2
created;14-12-2022 13:17:54;3
revised;02-01-2023 23:59:59;3
accepted;28-02-2023 19:37:01;3
submitted;03-03-2023 23:59:59;3
created;02-01-2023 07:45:43;5
created;24-01-2022 16:05:58;6
accepted;03-02-2022 23:59:59;6
created;24-01-2022 15:52:53;7
accepted;03-02-2022 23:59:59;7
created;15-08-2022 12:54:23;8
rejected;18-08-2022 23:59:59;8
created;21-03-2022 15:32:05;9
accepted;26-04-2022 23:59:59;9
created;21-03-2022 15:42:39;10", delim = ";")

d$timestampcolumn <- as.POSIXct(d$timestamp, format="%d-%m-%Y %H:%M:%S")
mytest <- simple_eventlog(d, 
                          case_id = "CASEID", 
                          activity_id = "STATUS", 
                          timestamp = "timestampcolumn")
process_map(mytest, type = frequency("absolute"))

Solution

d %>% 
  inner_join(case_list(mytest) %>% 
               select(CASEID, trace),
             "CASEID")
# # A tibble: 19 × 5
#    STATUS    timestamp           CASEID timestampcolumn     trace                             
#    <chr>     <chr>                <dbl> <dttm>              <chr>                             
#  1 created   16-02-2023 09:46:32      1 2023-02-16 09:46:32 created,revised,accepted          
#  2 revised   13-04-2023 23:58:59      1 2023-04-13 23:58:59 created,revised,accepted          
#  3 accepted  13-04-2023 23:59:59      1 2023-04-13 23:59:59 created,revised,accepted          
#  4 created   16-02-2023 09:46:32      2 2023-02-16 09:46:32 created,accepted                  
#  5 accepted  13-04-2023 23:59:59      2 2023-04-13 23:59:59 created,accepted                  
#  6 created   14-12-2022 13:17:54      3 2022-12-14 13:17:54 created,revised,accepted,submitted
#  7 revised   02-01-2023 23:59:59      3 2023-01-02 23:59:59 created,revised,accepted,submitted
#  8 accepted  28-02-2023 19:37:01      3 2023-02-28 19:37:01 created,revised,accepted,submitted
#  9 submitted 03-03-2023 23:59:59      3 2023-03-03 23:59:59 created,revised,accepted,submitted
# 10 created   02-01-2023 07:45:43      5 2023-01-02 07:45:43 created                           
# 11 created   24-01-2022 16:05:58      6 2022-01-24 16:05:58 created,accepted                  
# 12 accepted  03-02-2022 23:59:59      6 2022-02-03 23:59:59 created,accepted                  
# 13 created   24-01-2022 15:52:53      7 2022-01-24 15:52:53 created,accepted                  
# 14 accepted  03-02-2022 23:59:59      7 2022-02-03 23:59:59 created,accepted                  
# 15 created   15-08-2022 12:54:23      8 2022-08-15 12:54:23 created,rejected                  
# 16 rejected  18-08-2022 23:59:59      8 2022-08-18 23:59:59 created,rejected                  
# 17 created   21-03-2022 15:32:05      9 2022-03-21 15:32:05 created,accepted                  
# 18 accepted  26-04-2022 23:59:59      9 2022-04-26 23:59:59 created,accepted                  
# 19 created   21-03-2022 15:42:39     10 2022-03-21 15:42:39 created 
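
The question asked for "-" as the separator, whereas the trace column above is comma-separated. Below is a minimal sketch of the conversion, followed by a lookup of which cases follow a given trace. To stay runnable without bupaR, it uses a small hand-copied stand-in for the case_list(mytest) output (the data frame name traces is hypothetical), and it assumes that no activity name itself contains a comma.

```r
# Stand-in for case_list(mytest) %>% select(CASEID, trace);
# the values are copied from the output shown above.
traces <- data.frame(
  CASEID = c(1, 2, 3, 5, 6),
  trace  = c("created,revised,accepted",
             "created,accepted",
             "created,revised,accepted,submitted",
             "created",
             "created,accepted"),
  stringsAsFactors = FALSE
)

# Replace the comma separator with "-"; fixed = TRUE treats "," literally.
traces$trace <- gsub(",", "-", traces$trace, fixed = TRUE)

# Drill down: which cases follow the trace "created-accepted"?
matching_cases <- traces$CASEID[traces$trace == "created-accepted"]
print(traces)
print(matching_cases)
```

In the real pipeline the same gsub() call could be applied with mutate(trace = gsub(",", "-", trace, fixed = TRUE)) on the case_list(mytest) result before the join.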