I have cumulative data like this:
df1 <- data.frame(code=c(1,1,1,1,1,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,5,5,5),
date=c("2020-01-01", "2020-01-01","2020-01-02","2020-01-03","2020-01-04","2020-01-01","2020-01-02","2020-01-03",
"2020-01-04","2020-01-01","2020-01-01","2020-01-02","2020-01-02","2020-01-03","2020-01-04","2020-01-01",
"2020-01-02","2020-01-04","2020-01-03","2020-01-01","2020-01-02","2020-01-03","2020-01-04"),
cumulative=c(2,3,3,4,4,4,4,6,6,7,8,10,13,14,16,1,2,3,5,1,2,3,5))
From this, I want to extract the maximum cumulative count for each code and each date:
df2 <- data.frame(code=c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5),
date=c("2020-01-01","2020-01-02","2020-01-03","2020-01-04","2020-01-01","2020-01-02","2020-01-03",
"2020-01-04","2020-01-01","2020-01-02","2020-01-03","2020-01-04","2020-01-01",
"2020-01-02","2020-01-03","2020-01-04","2020-01-01","2020-01-02","2020-01-03","2020-01-04"),
cumulative=c(3,3,4,4,4,4,6,6,8,13,14,16,1,2,3,5,1,2,3,5))
Now I have one cumulative figure per code per day. From here I want to compute the incidence over a 2-day span:
df3 <- data.frame(code=c(1,2,3,4,5),
incidence1=c(1,2,6,2,2),incidence2=c(1,2,3,3,3))
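incidence1 is the difference between 2020-01-01 and 2020-01-03, and incidence2 is the difference between 2020-01-02 and 2020-01-04.
What I want to know is 1) how to extract the maximum count within the same day, and 2) how to compute the difference between dates that are 2 days apart.
Please advise, thanks.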
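Here is one way to do it: within each code, put every alternate row into the same group and take the difference of the cumulative values within each group. To get the expected output in the format shown, we can use pivot_wider from tidyr. Note that this relies on df2 having exactly four rows per code, sorted by date.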
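To see what the alternate-row grouping does, here is a minimal base R sketch for a single code. It assumes four daily rows already sorted by date, and uses the cumulative values of code 3 from df2; the variable names are only for illustration.

```r
# One code with n = 4 daily rows, sorted by date (assumption of this sketch)
n <- 4
gr <- rep(seq(1, n / 2), 2)
gr
# rows 1 and 3 share gr = 1, rows 2 and 4 share gr = 2

# Cumulative values for code 3 in df2
cumulative <- c(8, 13, 14, 16)

# Difference within each pair: (14 - 8) and (16 - 13)
inc <- tapply(cumulative, gr, diff)
inc
```

Each group holds two values that are two days apart, so diff() returns a single number per group.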
library(dplyr)
library(tidyr)
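df2 %>%
  group_by(code) %>%
  # within each code, pair row 1 with row 3 and row 2 with row 4;
  # .add = TRUE keeps the grouping by code (dplyr < 1.0.0 spelled this add = TRUE)
  group_by(gr = rep(seq(1, n()/2), 2), .add = TRUE) %>%
  # each (code, gr) group has two rows, so diff() yields one value
  summarise(incidence = diff(cumulative)) %>%
  # spread gr into columns incidence1, incidence2
  pivot_wider(names_from = gr, values_from = incidence, names_prefix = "incidence")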
# code incidence1 incidence2
# <dbl> <dbl> <dbl>
#1 1 1 1
#2 2 2 2
#3 3 6 3
#4 4 2 3
#5 5 2 3
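The code above starts from df2. For part 1) of the question, getting the maximum per code and date, a minimal sketch with dplyr could look like the following. It uses a small subset of df1 (codes 1 and 2 only) so that it is self-contained; df1_small and daily_max are names made up for this example.

```r
library(dplyr)

# Subset of df1: codes 1 and 2, with a duplicated date for code 1
df1_small <- data.frame(
  code = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  date = c("2020-01-01", "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04",
           "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"),
  cumulative = c(2, 3, 3, 4, 4, 4, 4, 6, 6)
)

daily_max <- df1_small %>%
  group_by(code, date) %>%
  # keep the largest cumulative value reported for each code on each date
  summarise(cumulative = max(cumulative), .groups = "drop") %>%
  # ISO-formatted date strings sort chronologically
  arrange(code, date)

daily_max
```

Sorting by code and date at the end matters because the alternate-row grouping depends on row order.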