I have a data frame like the following:
week_0 <- c(5,0,1,0,0,1)
week_1 <- c(5,0,4,0,2,1)
week_2 <- c(5,0,4,0,8,1)
week_3 <- c(5,0,4,0,8,3)
week_4 <- c(1,0,4,0,8,3)
week_5 <- c(1,0,4,0,8,3)
week_6 <- c(1,0,4,0,1,3)
week_7 <- c(1,0,4,0,1,3)
week_8 <- c(1,0,6,0,3,4)
week_9 <- c(2,4,6,7,3,4)
week_10 <- c(2,4,6,7,3,4)
Participant <- c("Lion","Cat","Dog","Snake","Tiger","Mouse")
test_data <- data.frame(Participant, week_0, week_1, week_2, week_3, week_4, week_5, week_6, week_7, week_8, week_9, week_10)
> test_data
  Participant week_0 week_1 week_2 week_3 week_4 week_5 week_6 week_7 week_8 week_9 week_10
1        Lion      5      5      5      5      1      1      1      1      1      2       2
2         Cat      0      0      0      0      0      0      0      0      0      4       4
3         Dog      1      4      4      4      4      4      4      4      6      6       6
4       Snake      0      0      0      0      0      0      0      0      0      7       7
5       Tiger      0      2      8      8      8      8      1      1      3      3       3
6       Mouse      1      1      1      3      3      3      3      3      4      4       4
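For each row, I want to find the value that occurs more often than any other, and return the name of the first column in which it appears. For example, in the first row that value is 1, so the output I want for the first row is week_4. In the second row the most frequent value is 0, so the output should be week_0, and so on.
So the final result should be: week_4, week_0, week_1, week_0, week_2, week_3
I tried using: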
apply(test_data,1,function(x) names(which.max(table(x))))
But I am not getting the result I am looking for.
Any suggestions on how to do this?
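Your code is a good start. You can pass its result to match() to find the position of that value's first occurrence in the row, then use that position to index into the column names: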
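apply(test_data[, -1], 1, function(x) {
  # most frequent value in the row, as a character string
  val <- names(which.max(table(x)))
  # column name at the position where that value first occurs
  names(test_data)[-1][[match(val, x)]]
})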
# "week_4" "week_0" "week_1" "week_0" "week_2" "week_3"
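To make the steps inside that anonymous function easier to follow, here is a minimal sketch that runs them on a single row. The named vector x is my own illustration (Lion's values copied from the printed table in the question), and counts, val, pos and first_col are just names I picked for the intermediate results.

```r
# Illustrative row: Lion's values, copied from the question's printed table
x <- c(week_0 = 5, week_1 = 5, week_2 = 5, week_3 = 5, week_4 = 1, week_5 = 1,
       week_6 = 1, week_7 = 1, week_8 = 1, week_9 = 2, week_10 = 2)

counts <- table(x)               # frequency of each distinct value, sorted by value
val <- names(which.max(counts))  # most frequent value, as a character string
pos <- match(val, x)             # position of its first occurrence in the row
first_col <- names(x)[pos]       # column name at that position

print(counts)
print(first_col)
```

One thing to keep in mind: table() sorts its entries by value, so if two values are tied for most frequent, which.max() picks the smaller value rather than the one that appears first in the row.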
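Note that I used test_data[, -1] to exclude the Participant column; otherwise, if no value occurs more than once, the code would return the participant's name, which is probably not what you want.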
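As a rough illustration of why the Participant column gets in the way: apply() converts a data frame to a matrix first, and a matrix can only hold one type. The two-row data frame df below is hypothetical and exists only to show the coercion; it is not the data from the question.

```r
# Hypothetical two-row data frame, used only to illustrate the coercion
df <- data.frame(Participant = c("Lion", "Cat"),
                 week_0 = c(5, 0),
                 week_1 = c(5, 0))

with_names <- as.matrix(df)          # what apply() works on for the full data frame
weeks_only <- as.matrix(df[, -1])    # what apply() works on without Participant

is_char <- is.character(with_names)  # every cell has become a string
is_num  <- is.numeric(weeks_only)    # values stay numeric

# With the name included, it is counted as one more value in the row
counts_with_name <- table(with_names[1, ])
print(counts_with_name)
```

So with the name column left in, the participant's name is tallied alongside the weekly values, which is why dropping it before calling apply() is the safer choice.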
A dplyr solution using add_count + slice_max:
library(dplyr)
test_data %>%
  tidyr::pivot_longer(starts_with('week')) %>%
  add_count(Participant, value) %>%
  slice_max(n, by = Participant, with_ties = FALSE)
# # A tibble: 6 × 4
#   Participant name   value     n
#   <chr>       <chr>  <dbl> <int>
# 1 Lion        week_4     1     5
# 2 Cat         week_0     0     9
# 3 Dog         week_1     4     7
# 4 Snake       week_0     0     9
# 5 Tiger       week_2     8     4
# 6 Mouse       week_3     3     5
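If you only need the column names as a plain character vector, which is the shape the question asks for, one option is to end the same pipeline with pull(name). This is a sketch under a few assumptions: tidyr is installed, dplyr is version 1.1.0 or later (for the by argument of slice_max), and test_data holds all eleven week columns shown in the question's printed table. The name first_modal_week is mine.

```r
library(dplyr)

# All eleven week columns, as in the question's printed table
test_data <- data.frame(
  Participant = c("Lion", "Cat", "Dog", "Snake", "Tiger", "Mouse"),
  week_0  = c(5, 0, 1, 0, 0, 1), week_1  = c(5, 0, 4, 0, 2, 1),
  week_2  = c(5, 0, 4, 0, 8, 1), week_3  = c(5, 0, 4, 0, 8, 3),
  week_4  = c(1, 0, 4, 0, 8, 3), week_5  = c(1, 0, 4, 0, 8, 3),
  week_6  = c(1, 0, 4, 0, 1, 3), week_7  = c(1, 0, 4, 0, 1, 3),
  week_8  = c(1, 0, 6, 0, 3, 4), week_9  = c(2, 4, 6, 7, 3, 4),
  week_10 = c(2, 4, 6, 7, 3, 4)
)

first_modal_week <- test_data %>%
  tidyr::pivot_longer(starts_with("week")) %>%         # one row per participant and week
  add_count(Participant, value) %>%                    # n = how often the value occurs in that row
  slice_max(n, by = Participant, with_ties = FALSE) %>% # keep one row per participant with the largest n
  pull(name)                                           # extract just the column names

print(first_modal_week)
```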