Cumulative sum in R with date and categorical variables


I have this dataset:

df <- data.frame(Date = c("12-01-2019","12-01-2019","12-02-2019","12-02-2019","12-02-2019","12-03-2019"),
                 Country = c("France","USA","France","USA","Colombia","USA"))

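I would like to apply cumsum with dplyr and get the following result, with one row per Date and Country, including a 0 for countries that have not appeared yet: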

Date          Country cumsum
"12-01-2019" "France"   1
"12-01-2019" "USA"      1
"12-01-2019" "Colombia" 0
"12-02-2019" "France"   2
"12-02-2019" "USA"      2
"12-02-2019" "Colombia" 1
"12-03-2019" "France"   2
"12-03-2019" "USA"      3
"12-03-2019" "Colombia" 1

Any suggestions?

Thank you very much for your help.

Regards!

r dplyr cumsum
1 Answer

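We can count() the number of rows for each Date and Country combination, then complete() the missing dates for each Country and fill their count with 0. Finally, for each Country, we can take the cumsum() of the counts.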

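library(dplyr)

df %>%
  # Parse the month-day-year strings into Date objects
  mutate(Date = lubridate::mdy(Date)) %>%
  # Number of rows per Date/Country pair, stored in column n
  count(Date, Country) %>%
  # Add every missing Country/Date pair with a count of 0
  tidyr::complete(Country, Date = seq(min(Date), max(Date), by = 'day'), 
                  fill = list(n = 0)) %>%
  # Running total over dates within each Country
  group_by(Country) %>%
  mutate(n = cumsum(n))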


#  Country  Date           n
#  <chr>    <date>     <dbl>
#1 Colombia 2019-12-01     0
#2 Colombia 2019-12-02     1
#3 Colombia 2019-12-03     1
#4 France   2019-12-01     1
#5 France   2019-12-02     2
#6 France   2019-12-03     2
#7 USA      2019-12-01     1
#8 USA      2019-12-02     2
#9 USA      2019-12-03     3
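
For comparison, here is a minimal base R sketch of the same count, fill, then cumsum idea. It is not the approach used in the answer above, and it rests on two assumptions: the Date strings sort chronologically as text (true here because all dates share the same month and year), and every date of interest appears at least once in the data, since table() does not invent missing dates. The names counts, running and result are chosen only for illustration.

```r
df <- data.frame(
  Date = c("12-01-2019", "12-01-2019", "12-02-2019",
           "12-02-2019", "12-02-2019", "12-03-2019"),
  Country = c("France", "USA", "France", "USA", "Colombia", "USA")
)

# Cross-tabulate: rows are dates, columns are countries.
# Date/Country pairs that never occur get a count of 0.
counts <- table(df$Date, df$Country)

# Running total down each column, i.e. over dates within each country.
running <- apply(counts, 2, cumsum)

# Back to long form: one row per Date/Country pair.
result <- as.data.frame(as.table(running), responseName = "cumsum",
                        stringsAsFactors = FALSE)
names(result)[1:2] <- c("Date", "Country")

print(result)
```

If the dates span several months or years, converting Date with as.Date(df$Date, format = "%m-%d-%Y") before calling table() would avoid relying on text ordering.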