I have the following data frame, representing the responses of five people. There are four possible answers: Yes, No, Inc, Vag.
data_test = data.frame(Val_Av=c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", "Inc", "Vag", "Yes"),
Val_Am=c("No", "No", "No", "Inc", "No", "Yes", "Yes", "Inc", "Vag", NA),
Val_ZM=c(NA, NA, NA, "Yes", "No", NA, "No", "Inc", "Vag", "Yes"),
Val_FC=c("No", "No", "No", NA, "No", "Yes", "Yes", "Yes", "Inc", "No"),
Val_CL=c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", NA, NA, "Yes"))
   Val_Av Val_Am Val_ZM Val_FC Val_CL
1     Yes     No   <NA>     No    Yes
2      No     No   <NA>     No     No
3     Inc     No   <NA>     No    Inc
4     Yes    Inc    Yes   <NA>    Yes
5      No     No     No     No     No
6     Yes    Yes   <NA>    Yes    Yes
7      No    Yes     No    Yes     No
8     Inc    Inc    Inc    Yes   <NA>
9     Vag    Vag    Vag    Inc   <NA>
10    Yes   <NA>    Yes     No    Yes
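I would like to create another variable that summarises the responses according to the following rules:
If the response is the same for all variables, write that response (e.g. line 2 -> No, line 5 -> No, line 6 -> Yes)
If there is no Yes among the responses, concatenate all the unique values (e.g. line 3 -> Inc;No, line 9 -> Vag;Inc)
If there is a Yes among the responses and
- there are strictly more Yes than other responses, write Yes (e.g. line 10 -> Yes, line 4 -> Yes)
- there are as many Yes as other responses, write "Dif" (e.g. line 1 -> Dif)
- there are strictly fewer Yes than other responses, concatenate all the unique values (e.g. line 8 -> Inc;Yes, line 7 -> No;Yes)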
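I don't know how to proceed. I would appreciate any ideas. Thank you.
I know how to use the ifelse function (and I think I could manage with case_when), but I don't know how to write the conditions.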
Using base R, I have rewritten your statements explicitly with apply and nested ifelse statements. This is not the recommended approach in this case.
apply(data_test, 1L, \(x)
  ifelse(length(unique(x[!is.na(x)])) == 1L,
    unique(x[!is.na(x)]),
    ifelse(all(x != "Yes", na.rm = TRUE),
      paste(unique(x[!is.na(x)]), collapse = ";"),
      ifelse(any(x == "Yes", na.rm = TRUE) & sum(x == "Yes", na.rm = TRUE) == sum(x != "Yes", na.rm = TRUE),
        "Dif",
        ifelse(any(x == "Yes", na.rm = TRUE) & sum(x == "Yes", na.rm = TRUE) > sum(x != "Yes", na.rm = TRUE),
          "Yes",
          paste(unique(x[!is.na(x)]), collapse = ";"))))))
#> [1] "Dif" "No" "Inc;No" "Yes" "No" "Yes" "No;Yes"
#> [8] "Inc;Yes" "Vag;Inc" "Yes"
Created on 2023-12-05 with reprex v2.0.2
Does this get you started on writing a more elegant and less redundant approach?
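As one possible direction, here is a minimal sketch of a less redundant version. It assumes the same rules as in the question and drops the NA values once up front, so each condition is written only once. The helper name summarise_row is hypothetical and only used for illustration, and the data is repeated so that the block runs on its own.

```r
# Same data as in the question, repeated so the sketch is self-contained
data_test <- data.frame(
  Val_Av = c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", "Inc", "Vag", "Yes"),
  Val_Am = c("No", "No", "No", "Inc", "No", "Yes", "Yes", "Inc", "Vag", NA),
  Val_ZM = c(NA, NA, NA, "Yes", "No", NA, "No", "Inc", "Vag", "Yes"),
  Val_FC = c("No", "No", "No", NA, "No", "Yes", "Yes", "Yes", "Inc", "No"),
  Val_CL = c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", NA, NA, "Yes")
)

# Hypothetical helper: summarise the responses of a single row
summarise_row <- function(x) {
  x <- x[!is.na(x)]                  # ignore missing responses
  vals <- unique(x)                  # distinct responses, in order of appearance
  n_yes <- sum(x == "Yes")           # number of Yes responses
  n_other <- length(x) - n_yes       # number of non-Yes responses

  if (length(vals) == 1L) return(vals)                 # everyone agrees
  if (n_yes > 0L && n_yes == n_other) return("Dif")    # as many Yes as others
  if (n_yes > n_other) return("Yes")                   # strictly more Yes
  paste(vals, collapse = ";")                          # no Yes, or fewer Yes
}

result <- apply(data_test, 1L, summarise_row)
result
#>  [1] "Dif"     "No"      "Inc;No"  "Yes"     "No"      "Yes"     "No;Yes"
#>  [8] "Inc;Yes" "Vag;Inc" "Yes"
```

Because each row is handled one at a time, plain if statements with early returns are enough here, and the vectorised ifelse is not needed. The "no Yes" case and the "fewer Yes than others" case both end in the same paste call, so they share the final line.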
Data
data_test = data.frame(Val_Av=c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", "Inc", "Vag", "Yes"),
Val_Am=c("No", "No", "No", "Inc", "No", "Yes", "Yes", "Inc", "Vag", NA),
Val_ZM=c(NA, NA, NA, "Yes", "No", NA, "No", "Inc", "Vag", "Yes"),
Val_FC=c("No", "No", "No", NA, "No", "Yes", "Yes", "Yes", "Inc", "No"),
Val_CL=c("Yes", "No", "Inc", "Yes", "No", "Yes", "No", NA, NA, "Yes"))