Mutate a new variable using multiple conditions based on different variables


I have the following data frame, which represents the responses of five people. There are four possible responses: Yes, No, Inc, Vag.

data_test = data.frame(Val_Av=c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", "Inc", "Vag", "Yes"),
           Val_Am=c("No", "No", "No", "Inc", "No", "Yes", "Yes", "Inc", "Vag", NA),
           Val_ZM=c(NA, NA, NA, "Yes", "No", NA, "No", "Inc", "Vag", "Yes"),
           Val_FC=c("No", "No", "No", NA, "No", "Yes", "Yes", "Yes", "Inc", "No"),
           Val_CL=c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", NA, NA, "Yes"))

  Val_Av Val_Am Val_ZM Val_FC Val_CL
1     Yes     No   <NA>     No    Yes
2      No     No   <NA>     No     No
3     Inc     No   <NA>     No    Inc
4     Yes    Inc    Yes   <NA>    Yes
5      No     No     No     No     No
6     Yes    Yes   <NA>    Yes    Yes
7      No    Yes     No    Yes     No
8     Inc    Inc    Inc    Yes   <NA>
9     Vag    Vag    Vag    Inc   <NA>
10    Yes   <NA>    Yes     No    Yes

I would like to create another variable that summarises the responses according to a few rules:

  • If the response is the same for all variables, write that response (e.g. line 2 -> No, line 5 -> No, line 6 -> Yes)

  • If there is no Yes among the responses, concatenate all the unique values (e.g. line 3 -> Inc;No, line 9 -> Vag;Inc)

  • If there is a Yes among the responses and

       - there are strictly more Yes than other responses, write Yes (e.g. line 10 -> Yes, line 4 -> Yes)
       - there are as many Yes as other responses, write "Dif" (e.g. line 1 -> Dif)
       - there are strictly fewer Yes than other responses, concatenate all the unique values (e.g. line 8 -> Inc;Yes, line 7 -> No;Yes)
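
To make the counting in these rules concrete, here is a small base R sketch that tallies, for each row, how many responses are Yes and how many are something else. It assumes NA is simply ignored, as the examples for lines 2 and 6 suggest; n_yes and n_other are illustrative names, not part of the original data.

```r
data_test <- data.frame(
  Val_Av = c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", "Inc", "Vag", "Yes"),
  Val_Am = c("No", "No", "No", "Inc", "No", "Yes", "Yes", "Inc", "Vag", NA),
  Val_ZM = c(NA, NA, NA, "Yes", "No", NA, "No", "Inc", "Vag", "Yes"),
  Val_FC = c("No", "No", "No", NA, "No", "Yes", "Yes", "Yes", "Inc", "No"),
  Val_CL = c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", NA, NA, "Yes")
)

# Comparing a data frame with a string gives a logical matrix;
# rowSums() with na.rm = TRUE counts the TRUE cells per row and skips NA.
n_yes   <- rowSums(data_test == "Yes", na.rm = TRUE)
n_other <- rowSums(data_test != "Yes", na.rm = TRUE)

# Side-by-side view of the two counts for each row
data.frame(n_yes, n_other)
```

For line 1 both counts are 2 (hence Dif), while line 7 has 2 Yes against 3 other responses (hence No;Yes).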
    

I don't know how to proceed. I would appreciate any ideas. Thanks.

r testing mutate
1 Answer

I know how to use the ifelse function (and I think I could manage with case_when), but I don't know how to write the conditions.

Using base R, I have explicitly rewritten your statements with apply and nested ifelse calls. This is not the recommended approach in this case.

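# x is the character vector of one row's responses (NA included)
apply(data_test, 1L, \(x) 
      # Rule 1: a single distinct non-NA response -> return it
      ifelse(length(unique(x[!is.na(x)])) == 1L, 
             unique(x[!is.na(x)]), 
             # Rule 2: no Yes at all -> concatenate the unique values
             ifelse(all(x != "Yes", na.rm = TRUE), 
                    paste(unique(x[!is.na(x)]), collapse = ";"),
                    # Rule 3b: as many Yes as other responses -> "Dif"
                    ifelse(any(x == "Yes", na.rm = TRUE) & sum(x == "Yes", na.rm = TRUE) == sum(x != "Yes", na.rm = TRUE), 
                           "Dif", 
                           # Rule 3a: strictly more Yes -> "Yes"; otherwise (3c) concatenate
                           ifelse(any(x == "Yes", na.rm = TRUE) & sum(x == "Yes", na.rm = TRUE) > sum(x != "Yes", na.rm = TRUE), 
                                  "Yes", 
                                  paste(unique(x[!is.na(x)]), collapse = ";"))))))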
#>  [1] "Dif"     "No"      "Inc;No"  "Yes"     "No"      "Yes"     "No;Yes" 
#>  [8] "Inc;Yes" "Vag;Inc" "Yes"

Created on 2023-12-05 with reprex v2.0.2
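
A likely reason nested ifelse is discouraged here, though the answer does not spell it out: ifelse() is designed for vectors and returns a result shaped like its test, whereas each condition in the code above is a single TRUE/FALSE per row. For a scalar condition, plain if/else is the usual idiom. A minimal illustration (a and b are throwaway names):

```r
# ifelse() shapes its result like the test, so a length-1 test
# keeps only the first element of the "yes" value.
a <- ifelse(TRUE, c("Inc", "No"), "other")

# if/else returns the chosen value unchanged.
b <- if (TRUE) c("Inc", "No") else "other"

length(a)  # 1
length(b)  # 2
```

The code above is not affected, because every branch there produces a single string, but the difference is easy to trip over once a branch returns more than one value.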

Does this get you started on writing a more elegant and less redundant approach?
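
One possible direction, offered as a sketch rather than a definitive solution: drop the NA values once, count the Yes responses once, and use early returns instead of nesting. The helper name summarise_row and the column name Summary are hypothetical, chosen only for this example.

```r
data_test <- data.frame(
  Val_Av = c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", "Inc", "Vag", "Yes"),
  Val_Am = c("No", "No", "No", "Inc", "No", "Yes", "Yes", "Inc", "Vag", NA),
  Val_ZM = c(NA, NA, NA, "Yes", "No", NA, "No", "Inc", "Vag", "Yes"),
  Val_FC = c("No", "No", "No", NA, "No", "Yes", "Yes", "Yes", "Inc", "No"),
  Val_CL = c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", NA, NA, "Yes")
)

# Hypothetical helper: summarise one row of responses.
summarise_row <- function(x) {
  x <- x[!is.na(x)]                      # ignore missing responses
  vals <- unique(x)
  if (length(vals) == 1L) return(vals)   # everyone gave the same response

  n_yes   <- sum(x == "Yes")
  n_other <- length(x) - n_yes
  if (n_yes > n_other) return("Yes")                        # strictly more Yes
  if (n_yes > 0L && n_yes == n_other) return("Dif")         # tie between Yes and the rest
  paste(vals, collapse = ";")            # no Yes, or strictly fewer Yes
}

# apply() passes each row as a character vector; the result becomes a new column.
data_test$Summary <- apply(data_test, 1L, summarise_row)
data_test$Summary
```

On the sample data this gives the same ten values as the nested ifelse version. The "no Yes" and "strictly fewer Yes" cases share the final paste() because both rules ask for the concatenated unique values.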

Data

data_test = data.frame(Val_Av=c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", "Inc", "Vag", "Yes"),
                       Val_Am=c("No", "No", "No", "Inc", "No", "Yes", "Yes", "Inc", "Vag", NA),
                       Val_ZM=c(NA, NA, NA, "Yes", "No", NA, "No", "Inc", "Vag", "Yes"),
                       Val_FC=c("No", "No", "No", NA, "No", "Yes", "Yes", "Yes", "Inc", "No"),
                       Val_CL=c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", NA, NA, "Yes"))
