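I would like a line of code that finds the median of every 3 rows of values in the Fin1 column. For example, the first value would be 5.08, out of 1.54, 5.08 and 5.26. The second would be 5.27 (out of 2.79, 5.27 and 8.12). The entire dataset contains 36 rows, so this operation must happen 12 times.
(See the data frame below.)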
Block trial Trial Category Fin1 Fin2 Fin3 Fin4 Correct FastFin
2 1 2 1 1 1.5424 0.00000 0.00000 0.0000 1 1.5424
7 1 7 3 1 5.2617 0.97171 2.41070 3.8407 1 5.2617
9 1 9 2 1 5.0827 0.00000 0.00000 1.1977 1 5.0827
16 2 16 1 1 5.2732 1.28220 0.00000 3.0692 1 5.2732
19 2 19 2 1 8.1251 6.98210 1.52210 0.0000 1 8.1251
24 2 24 3 1 2.7960 1.87000 0.52903 0.0000 1 2.7960
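The trick here is figuring out how to "group" every three consecutive rows. Once we have that, it becomes a simple aggregation by group.
I'll use a grp variable for this, computed as cumsum((rownum - 1) %% 3 == 0), where rownum is the row number (row_number() in dplyr). The condition is TRUE on rows 1, 4, 7, ..., so the running sum increases by one at the start of each block of three rows.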
library(dplyr)
quux %>%
group_by(grp = cumsum((row_number() - 1) %% 3 == 0)) %>%
mutate(val = median(Fin1)) %>%
ungroup()
# # A tibble: 6 × 12
# Block trial Trial Category Fin1 Fin2 Fin3 Fin4 Correct FastFin grp val
# <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int> <dbl>
# 1 1 2 1 1 1.54 0 0 0 1 1.54 1 5.08
# 2 1 7 3 1 5.26 0.972 2.41 3.84 1 5.26 1 5.08
# 3 1 9 2 1 5.08 0 0 1.20 1 5.08 1 5.08
# 4 2 16 1 1 5.27 1.28 0 3.07 1 5.27 2 5.27
# 5 2 19 2 1 8.13 6.98 1.52 0 1 8.13 2 5.27
# 6 2 24 3 1 2.80 1.87 0.529 0 1 2.80 2 5.27
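As a quick sanity check of the grouping expression on its own, here is a minimal base R sketch for a 36-row dataset like the one in the question. The names n, idx, grp and grp_alt are illustrative, and the integer-division variant is an equivalent alternative rather than part of the code above.

```r
# Assumed number of rows, taken from the question (36 rows -> 12 groups).
n <- 36
idx <- seq_len(n)

# TRUE on rows 1, 4, 7, ...; cumsum() turns those block starts into group ids.
grp <- cumsum((idx - 1) %% 3 == 0)

# Equivalent alternative: integer division of the zero-based row index.
grp_alt <- (idx - 1) %/% 3 + 1

head(grp, 7)          # 1 1 1 2 2 2 3
max(grp)              # 12
all(grp == grp_alt)   # TRUE
```

Either form can be passed to group_by(); the cumsum() version is the one used in this answer.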
If you need this for all of the Fin# columns, then
quux %>%
group_by(grp = cumsum((row_number() - 1) %% 3 == 0)) %>%
mutate(across(starts_with("Fin"), ~ median(.x), .names = "{.col}_median")) %>%
ungroup()
# # A tibble: 6 × 15
# Block trial Trial Category Fin1 Fin2 Fin3 Fin4 Correct FastFin grp Fin1_median Fin2_median Fin3_median Fin4_median
# <int> <int> <int> <int> <dbl> <dbl> <dbl> <dbl> <int> <dbl> <int> <dbl> <dbl> <dbl> <dbl>
# 1 1 2 1 1 1.54 0 0 0 1 1.54 1 5.08 0 0 1.20
# 2 1 7 3 1 5.26 0.972 2.41 3.84 1 5.26 1 5.08 0 0 1.20
# 3 1 9 2 1 5.08 0 0 1.20 1 5.08 1 5.08 0 0 1.20
# 4 2 16 1 1 5.27 1.28 0 3.07 1 5.27 2 5.27 1.87 0.529 0
# 5 2 19 2 1 8.13 6.98 1.52 0 1 8.13 2 5.27 1.87 0.529 0
# 6 2 24 3 1 2.80 1.87 0.529 0 1 2.80 2 5.27 1.87 0.529 0
In base R, the same grouping can be built directly and used with ave():
grp <- cumsum((0:(nrow(quux)-1) %% 3) == 0)
grp
# [1] 1 1 1 2 2 2
# startsWith() returns a logical mask, so use it to subset the column names
Fins <- names(quux)[startsWith(names(quux), "Fin")]
newFins <- paste0(Fins, "_median")
Fins
# [1] "Fin1" "Fin2" "Fin3" "Fin4"
newFins
# [1] "Fin1_median" "Fin2_median" "Fin3_median" "Fin4_median"
quux[,newFins] <- lapply(setNames(quux[,Fins], paste0(Fins, "_median")), function(val) ave(val, grp, FUN = median))
quux
# Block trial Trial Category Fin1 Fin2 Fin3 Fin4 Correct FastFin Fin1_median Fin2_median Fin3_median Fin4_median
# 2 1 2 1 1 1.5424 0.00000 0.00000 0.0000 1 1.5424 5.0827 0.00 0.00000 1.1977
# 7 1 7 3 1 5.2617 0.97171 2.41070 3.8407 1 5.2617 5.0827 0.00 0.00000 1.1977
# 9 1 9 2 1 5.0827 0.00000 0.00000 1.1977 1 5.0827 5.0827 0.00 0.00000 1.1977
# 16 2 16 1 1 5.2732 1.28220 0.00000 3.0692 1 5.2732 5.2732 1.87 0.52903 0.0000
# 19 2 19 2 1 8.1251 6.98210 1.52210 0.0000 1 8.1251 5.2732 1.87 0.52903 0.0000
# 24 2 24 3 1 2.7960 1.87000 0.52903 0.0000 1 2.7960 5.2732 1.87 0.52903 0.0000
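If only one median per group is wanted (12 values for the 36-row dataset) rather than a value repeated on every row, tapply() is one option. This is a small sketch on the six Fin1 values from the sample data; fin1 and meds are illustrative names.

```r
# Fin1 values copied from the sample data.
fin1 <- c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796)

# Same grouping idea: a new group starts every third row.
grp <- cumsum((seq_along(fin1) - 1) %% 3 == 0)

# One median per group, returned as a vector named by group id.
meds <- tapply(fin1, grp, median)
meds   # group 1: 5.0827, group 2: 5.2732
```

ave() is the better fit when the result must line up with the original rows; tapply() is the better fit when a short summary vector is enough.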
Data
quux <- structure(list(Block = c(1L, 1L, 1L, 2L, 2L, 2L), trial = c(2L, 7L, 9L, 16L, 19L, 24L), Trial = c(1L, 3L, 2L, 1L, 2L, 3L), Category = c(1L, 1L, 1L, 1L, 1L, 1L), Fin1 = c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796), Fin2 = c(0, 0.97171, 0, 1.2822, 6.9821, 1.87), Fin3 = c(0, 2.4107, 0, 0, 1.5221, 0.52903), Fin4 = c(0, 3.8407, 1.1977, 3.0692, 0, 0), Correct = c(1L, 1L, 1L, 1L, 1L, 1L), FastFin = c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796)), class = "data.frame", row.names = c("2", "7", "9", "16", "19", "24"))
Create a matrix with 3 rows, then use matrixStats::colMedians.
> matrixStats::colMedians(matrix(quux$Fin1, nrow=3))
[1] 5.0827 5.2732
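This works because matrix() fills column by column, so each column holds three consecutive rows. If matrixStats is not installed, a base R sketch of the same idea uses apply(). It assumes the number of rows is a multiple of 3, which holds for the 36 rows in the question; fin1, m and col_meds are illustrative names.

```r
# Fin1 values copied from the sample data.
fin1 <- c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796)

# Guard for the assumption that the rows split evenly into blocks of 3.
stopifnot(length(fin1) %% 3 == 0)

# Column-major fill: column 1 = rows 1-3, column 2 = rows 4-6.
m <- matrix(fin1, nrow = 3)

# Median of each column (MARGIN = 2).
col_meds <- apply(m, 2, median)
col_meds
# [1] 5.0827 5.2732
```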
Data borrowed from r2evans.