Loop over multiple variables and pass each one into a function in R

I am working in R.

I have some data about staff at a school:

data <- data.frame(person_id = c(1, 2, 3, 4, 5, 6, 7, 8), 
                   disability_status = c("yes", "no", "yes", "no", "yes", "no", "yes", "no"),
                   age_group = c("20-30","30-40","20-30","30-40","20-30","30-40","20-30","30-40"), 
                   teacher = c("yes", "no", "no", "yes", "no","yes", "no", "yes" ))

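I wrote a function that counts the staff in each level of whichever variable is passed in. The "group_tag" argument is there to help with debugging later in my code.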

library(dplyr)  # provides %>%, mutate(), group_by(), summarise()

group_the_data <- function(data, 
                           variable, 
                           group_tag) {
  
  grouped_output <- data %>%
                    mutate(flag = 1) %>%
                    group_by({{variable}}) %>%
                    summarise(number_staff = sum(flag, na.rm = TRUE)) %>%
                    mutate(grouping_tag := {{group_tag}})
  
  return(grouped_output)
  
}

I then use the function to group by disability status, age group and teacher in turn:

disability_grouped <- group_the_data(data = data,
                                     variable = disability_status,
                                     group_tag = "disability status")

age_group_grouped <- group_the_data(data = data,
                                    variable = age_group,
                                    group_tag = "age group")

role_grouped <- group_the_data(data = data,
                               variable = teacher,
                               group_tag = "role")

Once I have the data frames I need, I bind them together:


all_data_grouped <- bind_rows(disability_grouped, age_group_grouped, role_grouped)

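Is there a way to loop over the variables so that I don't need to write the function call three times?

Or would it be a better idea to use one of the apply functions?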

Tags: r, loops, dplyr, apply

1 answer (3 votes)

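You can use lapply or purrr::map to iterate over the variables. To do that, loop over the column names as strings rather than over bare variable names, which means the function has to select the column with pick() inside group_by():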

library(tidyverse)

group_the_data <- function(data, 
                           variable,   # column name as a string
                           group_tag) {
  
  grouped_output <- data %>%
    mutate(flag = 1) %>%
    group_by(pick(all_of(variable))) %>% # pick the column named by the string
    summarise(number_staff = sum(flag, na.rm = TRUE)) %>%
    mutate(grouping_tag = group_tag)
  
  return(grouped_output)
  
}

purrr::map(colnames(data)[-1], ~ group_the_data(data, variable = .x, group_tag = .x)) %>% 
  bind_rows()

# A tibble: 6 × 5
  disability_status number_staff grouping_tag      age_group teacher
  <chr>                    <dbl> <chr>             <chr>     <chr>  
1 no                           4 disability_status NA        NA     
2 yes                          4 disability_status NA        NA     
3 NA                           4 age_group         20-30     NA     
4 NA                           4 age_group         30-40     NA     
5 NA                           4 teacher           NA        no     
6 NA                           4 teacher           NA        yes 
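
Since lapply is mentioned above but not shown, here is a minimal sketch of the same loop with base R's lapply in place of purrr::map. It assumes dplyr 1.1.0 or later (for pick()), and it swaps the flag column for n(), so number_staff comes out as an integer rather than a double; the name vars is only illustrative.

```r
library(dplyr)

# Same example data as in the question
data <- data.frame(
  person_id = c(1, 2, 3, 4, 5, 6, 7, 8),
  disability_status = c("yes", "no", "yes", "no", "yes", "no", "yes", "no"),
  age_group = c("20-30", "30-40", "20-30", "30-40", "20-30", "30-40", "20-30", "30-40"),
  teacher = c("yes", "no", "no", "yes", "no", "yes", "no", "yes")
)

# Count staff per level of one column; `variable` is a column name given as a string
group_the_data <- function(data, variable, group_tag) {
  data %>%
    group_by(pick(all_of(variable))) %>%
    summarise(number_staff = n(), .groups = "drop") %>%  # n() replaces sum(flag)
    mutate(grouping_tag = group_tag)
}

# Every column except person_id (illustrative name, not from the original post)
vars <- colnames(data)[-1]

# lapply returns a list of data frames; bind_rows stacks them into one
all_data_grouped <- lapply(vars, function(v) group_the_data(data, v, v)) %>%
  bind_rows()

print(all_data_grouped)
```

The only difference from the purrr::map call is the explicit function(v) in place of the ~ .x shorthand, so the choice between the two is mostly a matter of taste.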

Similarly, if you want "variable" and "group_tag" to take different values, use purrr::map2:

purrr::map2(colnames(data)[-1], 
            c("disability status", "age group", "role"), 
            ~ group_the_data(data, variable = .x, group_tag = .y)) %>% 
  bind_rows()

# A tibble: 6 × 5
  disability_status number_staff grouping_tag      age_group teacher
  <chr>                    <dbl> <chr>             <chr>     <chr>  
1 no                           4 disability status NA        NA     
2 yes                          4 disability status NA        NA     
3 NA                           4 age group         20-30     NA     
4 NA                           4 age group         30-40     NA     
5 NA                           4 role              NA        no     
6 NA                           4 role              NA        yes   
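
The stacked results above carry one mostly-NA column per grouping variable. If that layout is inconvenient, one alternative (a different technique from the answer's, shown only as a sketch) is to reshape the data to long format with tidyr::pivot_longer and count once, with no loop at all. The names tags, long_counts, variable and level are illustrative assumptions, not part of the original post.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
data <- data.frame(
  person_id = c(1, 2, 3, 4, 5, 6, 7, 8),
  disability_status = c("yes", "no", "yes", "no", "yes", "no", "yes", "no"),
  age_group = c("20-30", "30-40", "20-30", "30-40", "20-30", "30-40", "20-30", "30-40"),
  teacher = c("yes", "no", "no", "yes", "no", "yes", "no", "yes")
)

# Lookup from column name to the debugging tag used in the question
tags <- c(disability_status = "disability status",
          age_group = "age group",
          teacher = "role")

long_counts <- data %>%
  # One row per person per grouping variable
  pivot_longer(cols = all_of(names(tags)),
               names_to = "variable", values_to = "level") %>%
  # Count staff for each variable/level combination
  count(variable, level, name = "number_staff") %>%
  # Attach the human-readable tag for each variable
  mutate(grouping_tag = unname(tags[variable]))

print(long_counts)
```

This relies on all the grouping columns sharing a type (character here); columns of mixed types would need converting before pivot_longer can stack them.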