How can I use group_by in R to sum over every unique factor level in a column and output the results as new columns?


I have a data frame with 4 columns. The years run from 2016 to 2018, and LOST_REASON has 15 unique "reasons" in each year:

Year1 LOST_REASON                   TotalLost
  <chr> <fct>                             <int>
1 2016  ""                                    0
2 2016  "Change in Business Strategy"        31
3 2016  "Data Issue"                         12
4 2016  "Lack of Adoption"                   21
5 2016  "Lack of Value"                      14
6 2016  "Lost to Competition"                20

How can I reformat the data frame produced by the following simple code:

df_test1 <- complete_df %>%
  mutate(full_year = format(as.Date(CLOSEDATE, format = "%m/%d/%Y"), "%Y-%m-%d")) %>%
  group_by(Year1, LOST_REASON) %>%
  summarise(TotalWon = sum(STAGENAME == 'Closed Won'), TotalLost = sum(STAGENAME == 'CS: Non-Renewal'))

so that it matches the output below, where TotalLost for each LOST_REASON level is placed in one column per year and a "Total" column is added:

                       Reason 2016 2017 2018 Total
1 Change in Business Strategy   31   39   45   151
2                  Data Issue   12   20   11    51
3            Lack of Adoption   21   25   26    89
4               Lack of Value   14   23   20    90
5         Lost to Competition   20   13   13    66
6                   No Budget   14   27   41   103
r dataframe group-by dplyr
1 Answer

One option is pivot_wider, after creating a row index within each value of the year column (Year1 in the question):

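library(dplyr)
library(tidyr)
library(data.table)  # provides rowid()

df_test1 %>%
   ungroup() %>%
   # keep only the columns being reshaped; any other column (e.g. TotalWon)
   # would be treated as a row identifier and prevent the years from lining up
   select(Year1, LOST_REASON, TotalLost) %>%
   # row index within each year (the column is named Year1 in the question)
   mutate(rn = rowid(Year1)) %>%
   pivot_wider(names_from = Year1, values_from = TotalLost) %>%
   mutate(Total = `2016` + `2017` + `2018`)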
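As a variation, here is a minimal sketch that builds the Total column without typing the year names by hand. It assumes the summarised data holds only the year, the reason, and the count. The `toy` data frame is a hypothetical stand-in for df_test1, using a few values copied from the tables in the question, and the names `wide` and `year_cols` are illustrative only.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for df_test1 (values copied from the question's tables)
toy <- tibble(
  Year1       = rep(c("2016", "2017", "2018"), each = 2),
  LOST_REASON = factor(rep(c("Change in Business Strategy", "Data Issue"), times = 3)),
  TotalLost   = c(31L, 12L, 39L, 20L, 45L, 11L)
)

# One row per reason, one column per year
wide <- toy %>%
  pivot_wider(names_from = Year1, values_from = TotalLost)

# Sum across every year column, whichever years happen to be present
year_cols <- setdiff(names(wide), "LOST_REASON")
wide$Total <- rowSums(wide[year_cols])

print(wide)
```

Because each (year, reason) pair occurs only once here, no row index is needed; LOST_REASON alone identifies the rows. If a reason is missing in some year, its cell will be NA, and `rowSums(..., na.rm = TRUE)` may be the behaviour you want.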