R: How to calculate incidence over a 2-day period from cumulative data after extracting the maximum value for each day?


I have cumulative data like this:

df1 <- data.frame(code=c(1,1,1,1,1,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,5,5,5), 
                 date=c("2020-01-01", "2020-01-01","2020-01-02","2020-01-03","2020-01-04","2020-01-01","2020-01-02","2020-01-03",
                        "2020-01-04","2020-01-01","2020-01-01","2020-01-02","2020-01-02","2020-01-03","2020-01-04","2020-01-01",
                        "2020-01-02","2020-01-04","2020-01-03","2020-01-01","2020-01-02","2020-01-03","2020-01-04"),
                 cumulative=c(2,3,3,4,4,4,4,6,6,7,8,10,13,14,16,1,2,3,5,1,2,3,5))

From this, I want to extract the maximum cumulative value for each code and each date:

df2 <- data.frame(code=c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5), 
                  date=c("2020-01-01","2020-01-02","2020-01-03","2020-01-04","2020-01-01","2020-01-02","2020-01-03",
                         "2020-01-04","2020-01-01","2020-01-02","2020-01-03","2020-01-04","2020-01-01",
                         "2020-01-02","2020-01-03","2020-01-04","2020-01-01","2020-01-02","2020-01-03","2020-01-04"),
                  cumulative=c(3,3,4,4,4,4,6,6,8,13,14,16,1,2,3,5,1,2,3,5))

Now I have one cumulative value per code per day. From here, I want to calculate the incidence over a 2-day period.

df3 <- data.frame(code=c(1,2,3,4,5),
                  incidence1=c(1,2,6,2,2),incidence2=c(1,2,3,3,3))

incidence1 is the difference in the cumulative value between 2020-01-01 and 2020-01-03, and incidence2 is the difference between 2020-01-02 and 2020-01-04.
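As a small illustration of this definition, base R's diff() with lag = 2 subtracts from each value the one two positions earlier. The sketch below uses the four daily values for code 3 from df2 and assumes exactly one value per consecutive day; the vector name x is only illustrative.

```r
# Daily maximum cumulative values for code 3 (2020-01-01 to 2020-01-04), taken from df2.
x <- c(8, 13, 14, 16)

# lag = 2 subtracts the value from two days earlier: 14 - 8 and 16 - 13.
incidence <- diff(x, lag = 2)
print(incidence)  # [1] 6 3
```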

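What I would like to know is 1) how to extract the maximum value within the same day, and 2) how to calculate the difference between values 2 days apart.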

Please advise. Thank you.

r dataframe max extraction
1 Answer

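Here is one approach: within each code, group every other row together and take the difference of the cumulative values in each group. To get the expected output in the format shown, we can use pivot_wider from tidyr.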

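library(dplyr)
library(tidyr)

# Assumes df2 is sorted by date and has exactly 4 rows (days) per code.
df2 %>%
  group_by(code) %>%
  # gr = 1, 2, 1, 2 within each code, pairing rows that are two days apart.
  # dplyr >= 1.0.0 spells this argument .add (formerly add).
  group_by(gr = rep(seq(1, n()/2), 2), .add = TRUE) %>%
  summarise(incidence = diff(cumulative)) %>%
  pivot_wider(names_from = gr, values_from = incidence, names_prefix = "incidence")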

#  code incidence1 incidence2
#  <dbl>      <dbl>      <dbl>
#1     1          1          1
#2     2          2          2
#3     3          6          3
#4     4          2          3
#5     5          2          3
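
The answer above starts from df2, so the first part of the question (taking the maximum per code and date) is not covered. A minimal sketch of both steps starting from df1 might look like the following. It assumes dplyr 1.0.0 or later and one row per consecutive day for each code after the first step; the names daily_max and incidence are only illustrative, and lag() is used here in place of the alternating-row grouping.

```r
library(dplyr)

# Sample data copied from the question.
df1 <- data.frame(
  code = c(1,1,1,1,1,2,2,2,2,3,3,3,3,3,3,4,4,4,4,5,5,5,5),
  date = c("2020-01-01", "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04",
           "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04",
           "2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02", "2020-01-03", "2020-01-04",
           "2020-01-01", "2020-01-02", "2020-01-04", "2020-01-03",
           "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"),
  cumulative = c(2,3,3,4,4,4,4,6,6,7,8,10,13,14,16,1,2,3,5,1,2,3,5)
)

# Step 1: keep the largest cumulative value for each code and date.
daily_max <- df1 %>%
  group_by(code, date) %>%
  summarise(cumulative = max(cumulative), .groups = "drop")

# Step 2: within each code, subtract the value from two rows (days) earlier.
# Assumption: after sorting, consecutive rows are consecutive days.
incidence <- daily_max %>%
  mutate(date = as.Date(date)) %>%
  arrange(code, date) %>%
  group_by(code) %>%
  mutate(incidence = cumulative - lag(cumulative, 2)) %>%
  ungroup() %>%
  filter(!is.na(incidence))

print(incidence)
```

With the data exactly as posted, code 4 gives 4 and 1 rather than the 2 and 3 shown in df3, because df1 lists 2020-01-04 before 2020-01-03 for that code (with values 3 and 5), whereas df2 assigns 3 to 2020-01-03 and 5 to 2020-01-04. The other codes match df3.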