Summarising with dplyr and a for loop

Question

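I want to use dplyr inside a for loop to summarise each of my independent variables (columns) against the target variable.

This is my main data frame: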

contract_ID  Asurion  Variable_1  Variable_2  Variable_3
1            Y
2            Y
3            N
4            N        a           d           f
5            Y
6            Y

After grouping:

a1 <- a %>%
  group_by(Asurion, BhvrBnk_Donates_to_Env_Causes) %>%
  summarise(counT = n_distinct(CONTRACT_ID)) %>%
  mutate(perc = paste0(round(counT / sum(counT) * 100, 2), "%"))

Asurion  Variable_1  CounT  perc
Y        a           3      75%
Y        b           1      25%
N        a           1      50%
N        b           1      50%

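I would like this kind of summary for every variable in my data frame, and I want to use a for loop to produce it. How can I get the result I want?

This is what I tried, but it does not seem to work. It is for a school project, and I am required to use a for loop. Please help.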

categorical <- colnames(a)
### where categorical is the names of all columns in a
### I would like to have a for loop for every column in a and summarise in the following way.
### I would like to store each of the summarisations in a separate dataframe
for (i in categorical) {
  a[[i]] <- a %>%
    group_by(Asurion, get(i)) %>%
    summarise(counT = n_distinct(CONTRACT_ID)) %>%
    mutate(perc = paste0(round(counT / sum(counT) * 100, 2), "%"))
}
r for-loop group-by dplyr summarize
1 Answer

You may not actually need a for loop to get what you want.

library(dplyr)

df <- data.frame(contract_ID = 1:6,
               Asurion = c("Y", "Y", "N", "N", "Y", "Y"), 
               Variable_1 = c("a", "a", "b", "a", "b","a"), 
               Variable_2 = c("c", "d", "c", "d", "c", "d"), 
               Variable_3 = c("f", "g", "g", "f", "f", "f"))

# Summarise one column of df (passed unquoted) within each Asurion group
pct <- function(x) {
  df %>%
    group_by(Asurion, {{ x }}) %>%
    summarise(counT = n_distinct(contract_ID)) %>%
    mutate(perc = paste0(round(counT / sum(counT) * 100, 2), "%"))
}

pct(Variable_1)
pct(Variable_2)
pct(Variable_3)
> pct(Variable_1)
# A tibble: 4 x 4
# Groups:   Asurion [2]
  Asurion Variable_1 counT perc 
  <fct>   <fct>      <int> <chr>
1 N       a              1 50%  
2 N       b              1 50%  
3 Y       a              3 75%  
4 Y       b              1 25%  
> pct(Variable_2)
# A tibble: 4 x 4
# Groups:   Asurion [2]
  Asurion Variable_2 counT perc 
  <fct>   <fct>      <int> <chr>
1 N       c              1 50%  
2 N       d              1 50%  
3 Y       c              2 50%  
4 Y       d              2 50%  
> pct(Variable_3)
# A tibble: 4 x 4
# Groups:   Asurion [2]
  Asurion Variable_3 counT perc 
  <fct>   <fct>      <int> <chr>
1 N       f              1 50%  
2 N       g              1 50%  
3 Y       f              3 75%  
4 Y       g              1 25%  

If you really do have a lot of variables, you can use something like a for loop or apply to iterate over that last part.
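Since the question asks for a for loop specifically, here is a rough sketch of how that iteration might look. It assumes dplyr 1.0 or later (for `across()` and the `.groups` argument), and the names `vars` and `summaries` are just placeholders introduced for this example. Because the loop variable holds each column name as a string, the grouping uses `across(all_of(...))` rather than `{{ }}`.

```r
library(dplyr)

df <- data.frame(
  contract_ID = 1:6,
  Asurion     = c("Y", "Y", "N", "N", "Y", "Y"),
  Variable_1  = c("a", "a", "b", "a", "b", "a"),
  Variable_2  = c("c", "d", "c", "d", "c", "d"),
  Variable_3  = c("f", "g", "g", "f", "f", "f")
)

# Columns to summarise: everything except the ID and the target
# (an assumption about which columns count as "independent variables")
vars <- setdiff(colnames(df), c("contract_ID", "Asurion"))

# One summary table per variable, stored in a named list
summaries <- list()
for (v in vars) {
  summaries[[v]] <- df %>%
    # all_of() selects the grouping columns by the names held in strings
    group_by(across(all_of(c("Asurion", v)))) %>%
    # drop_last leaves the result grouped by Asurion only,
    # so sum(counT) below is taken within each Asurion group
    summarise(counT = n_distinct(contract_ID), .groups = "drop_last") %>%
    mutate(perc = paste0(round(counT / sum(counT) * 100, 2), "%"))
}

summaries$Variable_1
```

Each element of `summaries` is its own data frame, so `summaries$Variable_2` or `summaries[["Variable_3"]]` pulls out an individual table. Writing into a separate list, rather than assigning back into the data frame being looped over, keeps the input untouched. The same loop body could be passed to `lapply(vars, ...)` if a loop is not a hard requirement.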
