Error with multiple variable groups in R


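I have a data frame containing several scales. For each participant I want the row mean and row sum of every scale, and for each scale I want the grand mean and standard deviation. I can't figure out how to use pmap_dbl to get this result. I tried writing a function, but it failed.

Here is a sample of the data: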

library(tidyverse)
df <- tibble(tep_1 = sample(c(0,1), 5, replace = TRUE),
             tep_2 = sample(c(0,1), 5, replace = TRUE),
             adarta_1 = sample(c(0,1), 5, replace = TRUE),
             adarta_2 = sample(c(0,1), 5, replace = TRUE),
             adarta_3 = sample(c(0,1), 5, replace = TRUE),
             adarta_4 = sample(c(0,1), 5, replace = TRUE),
             adarta_5 = sample(c(0,1), 5, replace = TRUE),
             adarta_6 = sample(c(0,1), 5, replace = TRUE))

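Here is my function, which does not work. Note: this function only tries to get the row sums, but I also need the row means, the grand mean, and the standard deviation: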

column_prefix <- c("tep", "adarta")

my_fun <- function(x, y) {
  x %>%
  select(starts_with(y)) %>%
    rowSums(x, na.rm = TRUE)
}

map2_dbl(.x = df, .y = column_prefix, .f = my_fun)

Error: Mapped vectors must have consistent lengths:
* `.x` has length 8
* `.y` has length 2

I would like to get this working so that I can use the function to produce this output:

library(tidyverse)
df <- df %>%
  mutate(tep_grand_mean = mean(c(tep_1, tep_2)),
         tep_sd = sd(tep_grand_mean),
         adarta_grand_mean = mean(c(adarta_1, adarta_2, adarta_3, adarta_4, adarta_5, adarta_6)),
         adarta_sd = sd(adarta_grand_mean),
         tep_sum = pmap_dbl(select(., starts_with("tep")), sum),
         tep_mean = rowMeans(select(., contains("tep")), na.rm = TRUE),
         adarta_sum = pmap_dbl(select(., starts_with("adarta")), sum),
         adarta_mean = rowMeans(select(., contains("adarta")), na.rm = TRUE))

r function iteration mapping purrr
1 answer

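Here, after a small change to the function (dropping the extra x argument inside rowSums), map alone may be enough. Define the function first, then map over the prefixes:

my_fun <- function(x, y) {
  x %>%
    select(starts_with(y)) %>%
    rowSums(na.rm = TRUE)
}

map(column_prefix, my_fun, x = df)
#[[1]]
#[1] 0 0 2 2 1

#[[2]]
#[1] 4 2 0 1 4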

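map2 is meant for the case where both objects have the same length. If one of them is a single element, wrap it in list() and it is recycled. Here df on its own counts as a list of 8 columns, which is why its length did not match the 2 prefixes.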
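To illustrate the recycling point, here is a minimal sketch. It assumes a small fixed data frame (hypothetical values, used instead of sample() so the result is reproducible) and the corrected my_fun with rowSums(na.rm = TRUE). Wrapping df in list() gives map2 a length-1 .x that is recycled against the two prefixes, and it should give the same result as map with x = df.

```r
library(dplyr)
library(purrr)

# Hypothetical fixed data, two items per scale, so the sums are predictable
df <- tibble(tep_1    = c(0, 1, 1, 0, 1),
             tep_2    = c(1, 1, 0, 0, 1),
             adarta_1 = c(1, 0, 1, 1, 0),
             adarta_2 = c(0, 0, 1, 1, 1))

column_prefix <- c("tep", "adarta")

# Row sums over all columns whose names start with the prefix y
my_fun <- function(x, y) {
  x %>%
    select(starts_with(y)) %>%
    rowSums(na.rm = TRUE)
}

# list(df) has length 1, so it is recycled for each of the 2 prefixes
res_map2 <- map2(list(df), column_prefix, my_fun)

# Equivalent call with map(): df is passed as the fixed argument x
res_map <- map(column_prefix, my_fun, x = df)
```

Both calls return a list with one numeric vector of row sums per prefix.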


If you need the row means for each group of columns that share a prefix, one option is split.default:

library(stringr)
df %>% 
    split.default(str_remove(names(.), "_\\d+$")) %>% 
    map_df(rowMeans)
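The question also asks for row sums, a grand mean, and a standard deviation per scale. The following is one possible sketch, not the only way to do it. scale_stats is a hypothetical helper name, the data values are made up, and it assumes that the grand mean and standard deviation are meant to be taken over all item responses of a scale (the question's sd(tep_grand_mean) would be the sd of a single number, which is NA).

```r
library(dplyr)
library(purrr)
library(tibble)

# Hypothetical fixed data, two items per scale
df <- tibble(tep_1    = c(0, 1, 1, 0, 1),
             tep_2    = c(1, 1, 0, 0, 1),
             adarta_1 = c(1, 0, 1, 1, 0),
             adarta_2 = c(0, 0, 1, 1, 1))

column_prefix <- c("tep", "adarta")

# Hypothetical helper: per-row and per-scale statistics for one prefix
scale_stats <- function(data, prefix) {
  items  <- data %>% select(starts_with(paste0(prefix, "_")))
  values <- unlist(items, use.names = FALSE)  # all responses of the scale
  n      <- nrow(items)

  stats <- list(
    sum        = unname(rowSums(items, na.rm = TRUE)),   # per participant
    mean       = unname(rowMeans(items, na.rm = TRUE)),  # per participant
    grand_mean = rep(mean(values, na.rm = TRUE), n),     # per scale (assumed)
    sd         = rep(sd(values, na.rm = TRUE), n)        # per scale (assumed)
  )
  names(stats) <- paste(prefix, names(stats), sep = "_")
  as_tibble(stats)
}

# One tibble of statistics per prefix, bound next to the original columns
stats_list <- map(column_prefix, scale_stats, data = df)
result <- bind_cols(c(list(df), stats_list))
```

result keeps the original item columns and adds tep_sum, tep_mean, tep_grand_mean, tep_sd and the matching adarta_ columns.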