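I am working with electronic health record data and would like to create an indicator variable called "episode" that groups together antibiotic medications given within 7 days of each other. Below is a mock dataset and the output I would like. I am programming in R.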
df2 <- data.frame(
  id = c(1, 1, 1, 1, 1, 2, 2, 3, 4),
  date = c("2015-01-01 11:00",
           "2015-01-06 13:29",
           "2015-01-10 12:46",
           "2015-01-25 14:45",
           "2015-02-15 13:30",
           "2015-01-01 10:00",
           "2015-05-05 15:20",
           "2015-01-01 15:19",
           "2015-08-01 13:15"),
  abx = c("AMPICILLIN",
          "ERYTHROMYCIN",
          "NEOMYCIN",
          "AMPICILLIN",
          "VANCOMYCIN",
          "VANCOMYCIN",
          "NEOMYCIN",
          "PENICILLIN",
          "ERYTHROMYCIN"))
df2
Desired output:
id date abx episode
1 2015-01-01 11:00 AMPICILLIN 1
1 2015-01-06 13:29 ERYTHROMYCIN 1
1 2015-01-10 12:46 NEOMYCIN 1
1 2015-01-25 14:45 AMPICILLIN 2
1 2015-02-15 13:30 VANCOMYCIN 3
2 2015-01-01 10:00 VANCOMYCIN 1
2 2015-05-05 15:20 NEOMYCIN 1
3 2015-01-01 15:19 PENICILLIN 1
4 2015-08-01 13:15 ERYTHROMYCIN 1
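Use ave like this:
# Start a new episode at the first row and whenever the gap to the previous date is 7 days or more
grpno <- function(x) cumsum(c(TRUE, diff(x) >= 7))
# as.Date() drops the time of day, so gaps are counted in whole calendar days within each id
transform(df2, episode = ave(as.numeric(as.Date(date)), id, FUN = grpno))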
giving:
id date abx episode
1 1 2015-01-01 11:00 AMPICILLIN 1
2 1 2015-01-06 13:29 ERYTHROMYCIN 1
3 1 2015-01-10 12:46 NEOMYCIN 1
4 1 2015-01-25 14:45 AMPICILLIN 2
5 1 2015-02-15 13:30 VANCOMYCIN 3
6 2 2015-01-01 10:00 VANCOMYCIN 1
7 2 2015-05-05 15:20 NEOMYCIN 2
8 3 2015-01-01 15:19 PENICILLIN 1
9 4 2015-08-01 13:15 ERYTHROMYCIN 1
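To see why this works, here is a small hand trace of grpno on the dates for id 1. It is only an illustrative sketch: the names x, gaps, starts and episode are introduced here for the walkthrough and are not part of the answer's code.

```r
# Day numbers (days since 1970-01-01) for the five id 1 rows of df2
x <- as.numeric(as.Date(c("2015-01-01", "2015-01-06", "2015-01-10",
                          "2015-01-25", "2015-02-15")))

gaps <- diff(x)               # days between consecutive rows: 5 4 15 21
starts <- c(TRUE, gaps >= 7)  # TRUE FALSE FALSE TRUE TRUE (first row always starts an episode)
episode <- cumsum(starts)     # running count of episode starts: 1 1 1 2 3
episode
```

Note that with `>= 7` a gap of exactly 7 days opens a new episode. If "within 7 days" is meant to include a gap of exactly 7 days, the comparison would presumably need to be `> 7` instead. The approach also assumes the rows are already sorted by date within each id.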
Or, using dplyr with grpno from above:
library(dplyr)

df2 %>%
  group_by(id) %>%
  mutate(episode = date %>% as.Date %>% as.numeric %>% grpno) %>%
  ungroup
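Both versions compare calendar dates and ignore the time of day. If the 7-day window should instead be measured in exact elapsed time, a variant along the following lines might work. This is a sketch under assumptions: grpno_time and its window_days argument are hypothetical names introduced here, and the timestamps are assumed to be in the "%Y-%m-%d %H:%M" format and in UTC.

```r
# Hypothetical variant of grpno that works on POSIXct timestamps
# and measures gaps in fractional days rather than calendar days.
grpno_time <- function(x, window_days = 7) {
  gaps <- as.numeric(diff(x), units = "days")  # elapsed time between rows, in days
  cumsum(c(TRUE, gaps >= window_days))
}

# Made-up timestamps chosen to show where the two approaches differ
times <- as.POSIXct(c("2015-01-01 11:00", "2015-01-08 10:00", "2015-01-15 10:00"),
                    format = "%Y-%m-%d %H:%M", tz = "UTC")

grpno_time(times)                                        # 1 1 2
cumsum(c(TRUE, diff(as.numeric(as.Date(times))) >= 7))   # 1 2 3 (calendar-day version)
```

The first gap is 6 days and 23 hours, so the elapsed-time version keeps the first two rows in one episode, while the calendar-day version sees a 7-day gap and splits them. Inside the dplyr pipeline, the same function could be applied in mutate after converting date with as.POSIXct.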