dplyr: Why do a grouped sum over named columns and a sum over indexed columns differ?

Question (votes: 1, answers: 1)

I am creating a new column, inside a function, that holds a grouped summary count. Why does:

iris %>% 
  group_by(Species) %>% 
  mutate(Count = sum(Sepal.Length + Sepal.Width + Petal.Length + Petal.Width))

not produce the same result as

iris %>% mutate(count = sum(.[1:ncol(.)]))

or

  iris %>% 
  group_by(Species) %>% 
  mutate(Count = map_if(is.numeric, sum(rowSums(.))))

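How can I build the summed count from column indices, so that it can go into a function that receives variable col_names? (That was the original reason for indexing.)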

Tags: r, dplyr

1 answer (2 votes)
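Some background on why the attempts in the question disagree: inside a magrittr pipeline, `.` stands for the whole data frame that was piped in, not for the rows of the current group, whereas bare column names inside a grouped `mutate()` are evaluated once per group. The second attempt also has no `group_by()` at all, and its index range includes the non-numeric Species column. The sketch below is only an illustration; the column names `rows_in_dot` and `rows_in_group` are made up for the demonstration.

```r
library(dplyr)

# Compare what `.` sees with what the grouped data mask sees.
check <- iris %>%
  group_by(Species) %>%
  mutate(
    rows_in_dot   = nrow(.),  # `.` is the full piped-in data frame (all groups)
    rows_in_group = n(),      # n() is evaluated separately for each group
    Count         = sum(Sepal.Length + Sepal.Width + Petal.Length + Petal.Width)
  ) %>%
  ungroup()

# One row per species: rows_in_dot stays at the full row count,
# while rows_in_group and Count are per-group values.
distinct(check, Species, rows_in_dot, rows_in_group, Count)
```

So any expression built on `.` ignores the grouping, which is why it cannot match the per-group sum of the named columns.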

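One approach is to nest after the group_by, loop over the nested 'data' with map, select the numeric columns (select_if), mutate to create 'Count' as the sum of the rowSums, and then unnest.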

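library(tidyverse)

iris %>% 
  group_by(Species) %>% 
  nest() %>%
  mutate(data = map(data, ~ .x %>% 
                              select_if(is.numeric) %>% 
                              # here `.` is this group's numeric columns only
                              mutate(Count = sum(rowSums(.))))) %>% 
                              # or use reduce with sum:
                              # mutate(Count = reduce(., `+`) %>% sum))) %>%
  unnest(cols = data)  # tidyr >= 1.0 asks for `cols`; older versions accepted a bare unnest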
# A tibble: 150 x 6
#   Species Sepal.Length Sepal.Width Petal.Length Petal.Width Count
#   <fct>          <dbl>       <dbl>        <dbl>       <dbl> <dbl>
# 1 setosa           5.1         3.5          1.4         0.2  507.
# 2 setosa           4.9         3            1.4         0.2  507.
# 3 setosa           4.7         3.2          1.3         0.2  507.
# 4 setosa           4.6         3.1          1.5         0.2  507.
# 5 setosa           5           3.6          1.4         0.2  507.
# 6 setosa           5.4         3.9          1.7         0.4  507.
# 7 setosa           4.6         3.4          1.4         0.3  507.
# 8 setosa           5           3.4          1.5         0.2  507.
# 9 setosa           4.4         2.9          1.4         0.2  507.
#10 setosa           4.9         3.1          1.5         0.1  507.
# ... with 140 more rows
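
For the "variable col_names inside a function" part of the question, a shorter variant that skips the nest/unnest round trip might look like the sketch below. It assumes dplyr >= 1.1.0 (for `pick()`); `add_group_count()` and its arguments are hypothetical names chosen for illustration, not part of any package. Positions are converted to names before grouping, so the indices always refer to the original column order.

```r
library(dplyr)

# Hypothetical helper: `cols` may be column positions or column names.
add_group_count <- function(data, group_col, cols) {
  # Resolve positions against the ungrouped data, so 1:4 means
  # "the first four columns of `data`" regardless of grouping.
  if (is.numeric(cols)) cols <- names(data)[cols]

  data %>%
    group_by(across(all_of(group_col))) %>%
    # pick() returns the current group's rows for the chosen columns
    mutate(Count = sum(rowSums(pick(all_of(cols))))) %>%
    ungroup()
}

by_index <- add_group_count(iris, "Species", 1:4)
by_name  <- add_group_count(
  iris, "Species",
  c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
)
```

Both calls should give the same Count column as the hand-written grouped sum in the question, because `pick()` is evaluated per group rather than on the whole data frame.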