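I have data from an observational study that includes the start and end of follow-up for each patient. I want to split this period by month, with one row per month.
The data frame looks like this: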
df <- data.frame(
PATIENT = c(1,2, 3, 4),
Start = c("2016-01-01", "2017-12-01", "2016-03-01", "2016-02-01"),
End = c("2016-05-30","2018-01-30","2016-06-30","2016-05-30")
)
The expected output would look like this:
data.frame(
PATIENT = c(1,1,1,1,1,2,2,3,3,3,3,4,4,4,4),
PERIOD = c("2016-01-01", "2016-02-01", "2016-03-01", "2016-04-01", "2016-05-01",
"2017-12-01","2018-01-01",
"2016-03-01","2016-04-01","2016-05-01","2016-06-01",
"2016-02-01","2016-03-01","2016-04-01","2016-05-01")
)
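(If anyone is curious, or if it helps someone in the same situation find this question: I am looking into using TrialEmulation, a new package for emulating trials from observational data.)
I have tried tmerge but struggled to get it to produce the desired result. I have also looked at TimeSplitter, without success.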
An approach using seq.Date.
First make sure the dates really are of class Date:
df$Start <- as.Date(df$Start)
df$End <- as.Date(df$End)
library(dplyr)
library(tidyr)
df %>%
mutate(PERIOD = list(seq.Date(Start, End, "month")),
Start = NULL, End = NULL, .by = PATIENT) %>%
unnest(PERIOD)
# A tibble: 15 × 2
PATIENT PERIOD
<dbl> <date>
1 1 2016-01-01
2 1 2016-02-01
3 1 2016-03-01
4 1 2016-04-01
5 1 2016-05-01
6 2 2017-12-01
7 2 2018-01-01
8 3 2016-03-01
9 3 2016-04-01
10 3 2016-05-01
11 3 2016-06-01
12 4 2016-02-01
13 4 2016-03-01
14 4 2016-04-01
15 4 2016-05-01
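If you would rather avoid the dplyr/tidyr dependency, the same idea can be sketched in base R. This is only a minimal sketch under the same assumptions as above (Start and End are valid dates with Start not after End); the names `periods` and `out` are made up for illustration.

```r
# Same input as in the question, with the dates already converted to class Date
df <- data.frame(
  PATIENT = c(1, 2, 3, 4),
  Start = as.Date(c("2016-01-01", "2017-12-01", "2016-03-01", "2016-02-01")),
  End = as.Date(c("2016-05-30", "2018-01-30", "2016-06-30", "2016-05-30"))
)

# Build one monthly sequence per row, running from Start up to (at most) End
periods <- lapply(seq_len(nrow(df)), function(i) {
  seq(df$Start[i], df$End[i], by = "month")
})

# Repeat each PATIENT id once per generated month and stack the dates
out <- data.frame(
  PATIENT = rep(df$PATIENT, lengths(periods)),
  PERIOD = do.call(c, periods)
)
```

Here `out` holds one row per patient-month, like the unnest() result above, but as a plain data.frame rather than a tibble.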