Reshaping data from wide to long format and creating a calculated field in R


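The data below captures the monthly OPN (optimal product number) for each Adv (Adv_Code). Change_Dt captures the month in which the Adv changed status from A to B. All OPN before the change month belong to status A; all OPN from the change month onward belong to status B.

Here is the existing data: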

Adv_Code    Change_Dt   April_OPN   May_OPN June_OPN    July_OPN    Aug_OPN Sep_OPN Oct_OPN Nov_OPN Dec_OPN Jan_OPN Feb_OPN March_OPN
A201        April       0           0       0           0           0       0       0       0       0       0       0       0
A198        July        2           0       0           1           2       0       5       0       0       0       0       0
S1212       Nov         0           3       4           0           0       3       0       1       0       0       0       0
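
For anyone who wants to reproduce this, the table above can be read into a data frame as in the following minimal sketch. The object name df is an assumption chosen to match the name used in the answer, and stringsAsFactors = FALSE is an assumption that keeps Adv_Code and Change_Dt as plain character columns.

```r
# Sample data from the question, typed in as text (df is an assumed name)
df <- read.table(text = "Adv_Code Change_Dt April_OPN May_OPN June_OPN July_OPN Aug_OPN Sep_OPN Oct_OPN Nov_OPN Dec_OPN Jan_OPN Feb_OPN March_OPN
A201  April 0 0 0 0 0 0 0 0 0 0 0 0
A198  July  2 0 0 1 2 0 5 0 0 0 0 0
S1212 Nov   0 3 4 0 0 3 0 1 0 0 0 0",
  header = TRUE, stringsAsFactors = FALSE)

# One row per Adv, two id columns plus twelve monthly columns
dim(df)
# [1]  3 14
```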

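I want to create the data structure below by converting to long format and deriving Adv_Status from the OPN month, i.e. if Month_OPN < Change_Dt, Adv_Status is A, else B.

Month_OPN only runs from April to March, i.e. 12 months. OPN captures the monthly OPN for each Adv; it is the transpose of the values in the April_OPN through March_OPN columns for each Adv.

Expected output: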
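
Because the year here runs from April to March, "Month_OPN < Change_Dt" has to be read as a comparison of positions within that April-to-March sequence, not of calendar month numbers. A minimal sketch of the rule, assuming the month labels are spelled exactly as in the column names (fiscal_months and status_for are hypothetical names used only for illustration):

```r
# Month labels in April-to-March order, spelled as in the *_OPN column names
fiscal_months <- c("April", "May", "June", "July", "Aug", "Sep",
                   "Oct", "Nov", "Dec", "Jan", "Feb", "March")

# "A" for months strictly before the change month, "B" otherwise
status_for <- function(month, change) {
  ifelse(match(month, fiscal_months) < match(change, fiscal_months), "A", "B")
}

status_for(c("April", "June", "July", "Jan"), "July")
# [1] "A" "A" "B" "B"
```

Under this reading, January to March always come after April to December, so a change month of Feb would still give status A for April.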

Agent_Code  Change_Dt   Month_OPN   Adv_Status  OPN                                 
S1198201    April       April       B           0                                   
S1198201    April       May         B           0                                   
S1198201    April       June        B           0                                   
S1198201    April       July        B           0                                   
S1198201    April       Aug         B           0                                   
S1198201    April       Sep         B           0                                   
S1198201    April       Oct         B           0                                   
S1198201    April       Nov         B           0                                   
S1198201    April       Dec         B           0                                   
S1198201    April       Jan         B           0                                   
S1198201    April       Feb         B           0                                   
S1198201    April       Mar         B           0                                   
S1198203    July        April       A           2                                   
S1198203    July        May         A           0                                   
S1198203    July        June        A           0                                   
S1198203    July        July        B           1                                   
S1198203    July        Aug         B           2                                   
S1198203    July        Sep         B           0                                   
S1198203    July        Oct         B           5                                   
S1198203    July        Nov         B           0                                   
S1198203    July        Dec         B           0                                   
S1198203    July        Jan         B           0                                   
S1198203    July        Feb         B           0                                   
S1198203    July        Mar         B           0                                   
S1198212    Nov         April       A           0                                   
S1198212    Nov         May         A           3                                   
S1198212    Nov         June        A           4                                   
S1198212    Nov         July        A           0                                   
S1198212    Nov         Aug         A           0                                   
S1198212    Nov         Sep         A           3                                   
S1198212    Nov         Oct         A           0                                   
S1198212    Nov         Nov         B           1                                   
S1198212    Nov         Dec         B           0                                   
S1198212    Nov         Jan         B           0                                   
S1198212    Nov         Feb         B           0                                   
S1198212    Nov         Mar         B           0

Can someone help me do this in R?

r reshape
1 Answer

Consider base R's reshape, using the built-in constants month.name and month.abb to clean up the month labels and compute month numbers:

# RESHAPE
rdf <- reshape(df, idvar=c("Adv_Code", "Change_Dt"),
               varying=list(names(df)[-1][-1]), v.names="OPN",
               times=names(df)[-1][-1], timevar="Month_OPN",
               new.row.names=1:1E5, direction="long")

# CALCULATION
final_df <- within(rdf, {    
     # RETRIEVE MONTH NUMBER FROM MONTH NAME/MONTH ABBREV (e.g., JULY or JUL => 7)
     Change_Dt_Num <- sapply(Change_Dt, function(x) max(which(month.name==x), which(month.abb==x)))
     # REMOVE THE "_OPN" SUFFIX FROM Month_OPN VALUES
     Month_OPN <- sub("_OPN", "", Month_OPN)
     # RETRIEVE MONTH NUMBER FROM MONTH NAME/MONTH ABBREV (e.g., JULY or JUL => 7)
     Month_OPN_Num <- sapply(Month_OPN, function(x) max(which(month.name==x), which(month.abb==x)))

     # ASSIGN "A" BEFORE THE CHANGE MONTH, ELSE "B", COMPARING POSITIONS IN THE APRIL-MARCH YEAR
     # (APRIL => 0, ..., MARCH => 11) SO THAT A JAN-MARCH CHANGE MONTH IS ALSO HANDLED
     Adv_Status <- ifelse((Month_OPN_Num - 4) %% 12 < (Change_Dt_Num - 4) %% 12, "A", "B")
     # REMOVE HELPER COLUMNS (USED FOR ABOVE CALCULATION ONLY)    
     rm(Change_Dt_Num, Month_OPN_Num)
})

# RE-ORDER ROWS AND RESET ROW NAMES
final_df <- with(final_df, final_df[order(Adv_Code),])
row.names(final_df) <- NULL

Output

final_df
#    Adv_Code Change_Dt Month_OPN OPN Adv_Status
# 1      A198      July     April   2          A
# 2      A198      July       May   0          A
# 3      A198      July      June   0          A
# 4      A198      July      July   1          B
# 5      A198      July       Aug   2          B
# 6      A198      July       Sep   0          B
# 7      A198      July       Oct   5          B
# 8      A198      July       Nov   0          B
# 9      A198      July       Dec   0          B
# 10     A198      July       Jan   0          B
# 11     A198      July       Feb   0          B
# 12     A198      July     March   0          B
# 13     A201     April     April   0          B
# 14     A201     April       May   0          B
# 15     A201     April      June   0          B
# 16     A201     April      July   0          B
# 17     A201     April       Aug   0          B
# 18     A201     April       Sep   0          B
# 19     A201     April       Oct   0          B
# 20     A201     April       Nov   0          B
# 21     A201     April       Dec   0          B
# 22     A201     April       Jan   0          B
# 23     A201     April       Feb   0          B
# 24     A201     April     March   0          B
# 25    S1212       Nov     April   0          A
# 26    S1212       Nov       May   3          A
# 27    S1212       Nov      June   4          A
# 28    S1212       Nov      July   0          A
# 29    S1212       Nov       Aug   0          A
# 30    S1212       Nov       Sep   3          A
# 31    S1212       Nov       Oct   0          A
# 32    S1212       Nov       Nov   1          B
# 33    S1212       Nov       Dec   0          B
# 34    S1212       Nov       Jan   0          B
# 35    S1212       Nov       Feb   0          B
# 36    S1212       Nov     March   0          B
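
The two sapply() lookups in the answer can also be written in vectorized form with match(). The sketch below is an alternative under stated assumptions, not the answerer's code: month_num is a hypothetical helper name, and shifting to an April-based position with %% 12 assumes the year always starts in April.

```r
# Map full or abbreviated English month names to calendar month numbers
month_num <- function(x) {
  x <- as.character(x)
  full <- match(x, month.name)   # e.g. "July" => 7, NA if not a full name
  abb  <- match(x, month.abb)    # e.g. "Aug"  => 8, NA if not an abbreviation
  ifelse(is.na(full), abb, full)
}

m <- month_num(c("July", "Jul", "Aug", "March", "May"))
m
# [1] 7 7 8 3 5

# Position within an April-to-March year: April => 0, ..., March => 11
(m - 4) %% 12
# [1]  3  3  4 11  1
```

A label that matches neither constant comes back as NA, which makes misspelled month names easy to spot before the status is assigned.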

