Find the median of every 3 values in a column

Question (votes: 0, answers: 2)

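I want a line of code that finds the median of every 3 consecutive values in the Fin1 column. For example, the first value would be 5.08, the median of 1.54, 5.08, and 5.26. The second would be 5.27 (the median of 2.79, 5.27, and 8.12). The whole data set contains 36 rows, so this operation has to happen 12 times.

(See the data frame below.)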

   Block trial Trial Category   Fin1    Fin2    Fin3   Fin4 Correct FastFin
2      1     2     1        1 1.5424 0.00000 0.00000 0.0000       1  1.5424
7      1     7     3        1 5.2617 0.97171 2.41070 3.8407       1  5.2617
9      1     9     2        1 5.0827 0.00000 0.00000 1.1977       1  5.0827
16     2    16     1        1 5.2732 1.28220 0.00000 3.0692       1  5.2732
19     2    19     2        1 8.1251 6.98210 1.52210 0.0000       1  8.1251
24     2    24     3        1 2.7960 1.87000 0.52903 0.0000       1  2.7960
Tags: r, median

2 Answers

1 vote

The trick here is figuring out how to "group" every three consecutive rows. Once we have that, it becomes a simple aggregation by group.

I'll use a grp variable to do this, calculated with cumsum((rownum - 1) %% 3 == 0), which starts a new group at rows 1, 4, 7, and so on.
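To see what this index looks like on its own, here is a minimal base R sketch. It assumes a 36-row data set as described in the question; n, rownum, and alt are illustrative names, not part of the original answer. It also shows integer division as an equivalent way to build the same groups.

```r
n <- 36                      # assumed number of rows, as stated in the question
rownum <- seq_len(n)         # 1, 2, ..., 36

# TRUE at rows 1, 4, 7, ...; cumsum() turns each TRUE into the next group id
grp <- cumsum((rownum - 1) %% 3 == 0)

# Equivalent alternative: integer division of the zero-based row index
alt <- (rownum - 1) %/% 3 + 1

head(grp, 7)
# [1] 1 1 1 2 2 2 3
table(grp)                   # each of the 12 groups should hold exactly 3 rows
```

Either form works as a grouping key; the cumsum version is the one used in the rest of this answer.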

dplyr

library(dplyr)
quux %>%
  group_by(grp = cumsum((row_number() - 1) %% 3 == 0)) %>%
  mutate(val = median(Fin1)) %>%
  ungroup()
# # A tibble: 6 × 12
#   Block trial Trial Category  Fin1  Fin2  Fin3  Fin4 Correct FastFin   grp   val
#   <int> <int> <int>    <int> <dbl> <dbl> <dbl> <dbl>   <int>   <dbl> <int> <dbl>
# 1     1     2     1        1  1.54 0     0      0          1    1.54     1  5.08
# 2     1     7     3        1  5.26 0.972 2.41   3.84       1    5.26     1  5.08
# 3     1     9     2        1  5.08 0     0      1.20       1    5.08     1  5.08
# 4     2    16     1        1  5.27 1.28  0      3.07       1    5.27     2  5.27
# 5     2    19     2        1  8.13 6.98  1.52   0          1    8.13     2  5.27
# 6     2    24     3        1  2.80 1.87  0.529  0          1    2.80     2  5.27
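The mutate() call above repeats each median on every row of its group. If only one value per group is wanted (12 values for the 36 rows in the question), a possible variation is to swap mutate() for summarise(). This is a sketch, not part of the original answer; it rebuilds only the Fin1 column of the sample data so that it runs on its own, and per_group is an illustrative name.

```r
library(dplyr)

# Only Fin1 from the sample data, to keep the sketch self-contained
quux <- data.frame(Fin1 = c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796))

per_group <- quux %>%
  group_by(grp = cumsum((row_number() - 1) %% 3 == 0)) %>%
  summarise(Fin1_median = median(Fin1))   # collapses each group to one row

per_group$Fin1_median
# [1] 5.0827 5.2732
```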

If you need the medians for all of the Fin# columns, then:

quux %>%
  group_by(grp = cumsum((row_number() - 1) %% 3 == 0)) %>%
  mutate(across(starts_with("Fin"), ~ median(.x), .names = "{.col}_median")) %>%
  ungroup()
# # A tibble: 6 × 15
#   Block trial Trial Category  Fin1  Fin2  Fin3  Fin4 Correct FastFin   grp Fin1_median Fin2_median Fin3_median Fin4_median
#   <int> <int> <int>    <int> <dbl> <dbl> <dbl> <dbl>   <int>   <dbl> <int>       <dbl>       <dbl>       <dbl>       <dbl>
# 1     1     2     1        1  1.54 0     0      0          1    1.54     1        5.08        0          0            1.20
# 2     1     7     3        1  5.26 0.972 2.41   3.84       1    5.26     1        5.08        0          0            1.20
# 3     1     9     2        1  5.08 0     0      1.20       1    5.08     1        5.08        0          0            1.20
# 4     2    16     1        1  5.27 1.28  0      3.07       1    5.27     2        5.27        1.87       0.529        0   
# 5     2    19     2        1  8.13 6.98  1.52   0          1    8.13     2        5.27        1.87       0.529        0   
# 6     2    24     3        1  2.80 1.87  0.529  0          1    2.80     2        5.27        1.87       0.529        0   

base R

grp <- cumsum((0:(nrow(quux)-1) %% 3) == 0)
grp
# [1] 1 1 1 2 2 2
Fins <- names(quux)[startsWith(names(quux), "Fin")]  # column names, not a logical mask
newFins <- paste0(Fins, "_median")
Fins
# [1] "Fin1" "Fin2" "Fin3" "Fin4"
newFins
# [1] "Fin1_median" "Fin2_median" "Fin3_median" "Fin4_median"
quux[,newFins] <- lapply(setNames(quux[,Fins], paste0(Fins, "_median")), function(val) ave(val, grp, FUN = median))
quux
#    Block trial Trial Category   Fin1    Fin2    Fin3   Fin4 Correct FastFin Fin1_median Fin2_median Fin3_median Fin4_median
# 2      1     2     1        1 1.5424 0.00000 0.00000 0.0000       1  1.5424      5.0827        0.00     0.00000      1.1977
# 7      1     7     3        1 5.2617 0.97171 2.41070 3.8407       1  5.2617      5.0827        0.00     0.00000      1.1977
# 9      1     9     2        1 5.0827 0.00000 0.00000 1.1977       1  5.0827      5.0827        0.00     0.00000      1.1977
# 16     2    16     1        1 5.2732 1.28220 0.00000 3.0692       1  5.2732      5.2732        1.87     0.52903      0.0000
# 19     2    19     2        1 8.1251 6.98210 1.52210 0.0000       1  8.1251      5.2732        1.87     0.52903      0.0000
# 24     2    24     3        1 2.7960 1.87000 0.52903 0.0000       1  2.7960      5.2732        1.87     0.52903      0.0000
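ave() returns a vector as long as its input, which is why every row carries its group's median. For a single value per group in base R, tapply() is one option. The sketch below is an assumption-labelled variation rather than the original answer's code; it uses only the Fin1 values from the sample data, and meds is an illustrative name.

```r
Fin1 <- c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796)

# Same grouping index as above: a new group starts at positions 1, 4, 7, ...
grp <- cumsum((seq_along(Fin1) - 1) %% 3 == 0)

# One median per group; the result is named by group id
meds <- tapply(Fin1, grp, median)
meds
```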

Data

quux <- structure(list(Block = c(1L, 1L, 1L, 2L, 2L, 2L), trial = c(2L, 7L, 9L, 16L, 19L, 24L), Trial = c(1L, 3L, 2L, 1L, 2L, 3L), Category = c(1L, 1L, 1L, 1L, 1L, 1L), Fin1 = c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796), Fin2 = c(0, 0.97171, 0, 1.2822, 6.9821, 1.87), Fin3 = c(0, 2.4107, 0, 0, 1.5221, 0.52903), Fin4 = c(0, 3.8407, 1.1977, 3.0692, 0, 0), Correct = c(1L, 1L, 1L, 1L, 1L, 1L), FastFin = c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796)), class = "data.frame", row.names = c("2", "7", "9",  "16", "19", "24"))

1 vote

Create a matrix with 3 rows, then use matrixStats::colMedians:

> matrixStats::colMedians(matrix(quux$Fin1, nrow=3))
[1] 5.0827 5.2732
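This works because matrix() fills column by column, so each column ends up holding three consecutive values. The sketch below makes that visible and uses apply() as a base R stand-in for matrixStats::colMedians; it is an illustration using only the Fin1 values from the sample data, and m is an illustrative name.

```r
Fin1 <- c(1.5424, 5.2617, 5.0827, 5.2732, 8.1251, 2.796)

# Column-major fill: column 1 = values 1-3, column 2 = values 4-6
m <- matrix(Fin1, nrow = 3)
m

# Median of each column, i.e. of each run of three consecutive values
apply(m, 2, median)
# [1] 5.0827 5.2732
```

Note that this assumes the column length is a multiple of 3, as with the 36 rows in the question; otherwise matrix() recycles values (with a warning) and the last median would be wrong.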

Data borrowed from r2evans.
