I have a data frame with 4 columns, where the years run from 2016 to 2018 and LOST_REASON takes 15 unique "reason" values in each year:
Year1 LOST_REASON TotalLost
<chr> <fct> <int>
1 2016 "" 0
2 2016 "Change in Business Strategy" 31
3 2016 "Data Issue" 12
4 2016 "Lack of Adoption" 21
5 2016 "Lack of Value" 14
6 2016 "Lost to Competition" 20
How can I reshape the data frame generated by the following simple code:
df_test1 <- complete_df %>%
  mutate(full_year = format(as.Date(CLOSEDATE, format = "%m/%d/%Y"), "%Y-%m-%d")) %>%
  group_by(Year1, LOST_REASON) %>%
  summarise(TotalWon = sum(STAGENAME == 'Closed Won'),
            TotalLost = sum(STAGENAME == 'CS: Non-Renewal'))
so that it matches the output below, with each "LOST_REASON" value in its own row, one column per year, and a "Total" column:
Reason 2016 2017 2018 Total
1 Change in Business Strategy 31 39 45 151
2 Data Issue 12 20 11 51
3 Lack of Adoption 21 25 26 89
4 Lack of Value 14 23 20 90
5 Lost to Competition 20 13 13 66
6 No Budget 14 27 41 103
One option is pivot_wider, after creating a row index based on the year column:
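library(dplyr)
library(tidyr)
library(data.table)  # provides rowid()

df_test1 %>%
  ungroup() %>%                  # summarise() leaves the result grouped by Year1
  mutate(rn = rowid(Year1)) %>%  # row index within each year (the column is Year1, not Year)
  pivot_wider(names_from = Year1, values_from = TotalLost) %>%
  mutate(Total = `2016` + `2017` + `2018`)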
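
As a self-contained illustration of the same reshaping, here is a minimal sketch on a toy data frame. The object names `toy` and `wide` are hypothetical, and the values are only the first three reasons copied from the tables in the question. It assumes that LOST_REASON alone identifies a row, so no row index is created, and it uses rowSums() so the year columns do not have to be listed by hand. If a TotalWon column is still present in the real data, it would probably need to be dropped first (for example with select(-TotalWon)), because pivot_wider would otherwise treat it as an identifying column.

```r
library(dplyr)
library(tidyr)

# Toy long-format data: three reasons per year, values taken from the question
toy <- tibble(
  Year1 = rep(c("2016", "2017", "2018"), each = 3),
  LOST_REASON = factor(rep(c("Change in Business Strategy",
                             "Data Issue",
                             "Lack of Adoption"), times = 3)),
  TotalLost = c(31L, 12L, 21L, 39L, 20L, 25L, 45L, 11L, 26L)
)

wide <- toy %>%
  # one column per year, one row per reason
  pivot_wider(names_from = Year1, values_from = TotalLost) %>%
  # sum every numeric (year) column row by row
  mutate(Total = rowSums(across(where(is.numeric)))) %>%
  rename(Reason = LOST_REASON)

print(wide)
```

With rowSums(), the Total column keeps working if more years appear later, whereas the explicit `2016` + `2017` + `2018` form has to be edited each time.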