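I have some time-to-event data that I'm working with. I want to filter the data from when a subject first enters the study up to their first observed event (I'm not concerned with recurrent events after the first one; I only want to explore the time to the first event).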
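I used filter together with the between function, which has always worked for me in the past, but it is a problem here because some subjects never have the event, so I get an error stating Error: Expecting a single value: [extent=0].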
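I think what I need is a way to filter from study entry up to the first event or, if a subject has no event, to keep all of that subject's data.
Here is an example of my data: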
## data
subject <- c("A", "A", "A", "A", "B", "B", "C", "C", "C", "D", "E", "E", "E", "E", "E", "F", "F", "F", "F", "F")
event <- c(0,0,1,0,0,0,0,0,1,0,0,1,0,1,1,0,0,0,0,0)
df <- data.frame(subject, event)
## create index to count the days the subject is in the study
library(tidyverse)
df <- df %>%
  group_by(subject) %>%
  mutate(ID = seq_along(subject))
df
# A tibble: 20 x 3
# Groups:   subject [6]
   subject event    ID
   <fct>   <dbl> <int>
 1 A           0     1
 2 A           0     2
 3 A           1     3
 4 A           0     4
 5 B           0     1
 6 B           0     2
 7 C           0     1
 8 C           0     2
 9 C           1     3
10 D           0     1
11 E           0     1
12 E           1     2
13 E           0     3
14 E           1     4
15 E           1     5
16 F           0     1
17 F           0     2
18 F           0     3
19 F           0     4
20 F           0     5
## filter event times between the start of the trial and when the subject has the event for the first time
df %>%
  group_by(subject) %>%
  filter(., between(row_number(),
                    left = which(ID == 1),
                    right = which(event == 1)))
That last part is where my error occurs.
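To see why the between() call fails, it may help to count how many positions which(event == 1) returns for each subject. The sketch below rebuilds the example data; the names hits and n_hits are made up for illustration. In the dplyr version that produced this error, between() appears to want exactly one value for right, so a count of 0 (no event) or more than 1 (recurrent events) would be a problem.

```r
library(dplyr)

# Same example data as in the question
subject <- c("A", "A", "A", "A", "B", "B", "C", "C", "C", "D",
             "E", "E", "E", "E", "E", "F", "F", "F", "F", "F")
event <- c(0,0,1,0,0,0,0,0,1,0,0,1,0,1,1,0,0,0,0,0)
df <- data.frame(subject, event)

# For each subject, count how many row positions which(event == 1) returns.
# Only a count of exactly 1 gives a single upper bound for the range.
hits <- df %>%
  group_by(subject) %>%
  summarise(n_hits = length(which(event == 1)))

hits
```

Subjects B, D and F have no event, so which() returns a zero-length vector for them, which lines up with the [extent=0] in the message. Subject E has three events, so even without the empty groups the upper bound would not be a single value.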
Is this what you're after?
df2 <- df %>%
  group_by(subject) %>%
  filter(cumsum(event) == 0 | (cumsum(event) == 1 & event == 1))
Result:
# A tibble: 16 x 2
# Groups:   subject [6]
   subject event
   <fct>   <dbl>
 1 A           0
 2 A           0
 3 A           1
 4 B           0
 5 B           0
 6 C           0
 7 C           0
 8 C           1
 9 D           0
10 E           0
11 E           1
12 F           0
13 F           0
14 F           0
15 F           0
16 F           0
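In case the condition in the filter looks cryptic, here is a rough walk-through of what it does, plus one possible alternative. Within each subject, cumsum(event) is a running count of events: it is 0 before the first event, and the row where it is 1 and event is also 1 is the first event itself. The columns n_events and keep, and the object names steps and alt, are just illustrative. The alternative uses match() with nomatch = n() so that subjects without an event fall back to their last row; treat it as a sketch, not the only way to do this.

```r
library(dplyr)

# Same example data as in the question
subject <- c("A", "A", "A", "A", "B", "B", "C", "C", "C", "D",
             "E", "E", "E", "E", "E", "F", "F", "F", "F", "F")
event <- c(0,0,1,0,0,0,0,0,1,0,0,1,0,1,1,0,0,0,0,0)
df <- data.frame(subject, event)

# Expose the intermediate values used by the cumsum() filter
steps <- df %>%
  group_by(subject) %>%
  mutate(n_events = cumsum(event),   # running event count per subject
         keep = n_events == 0 |      # rows before the first event
           (n_events == 1 & event == 1)) %>%  # the first event row itself
  ungroup()

# Alternative sketch: keep rows up to the position of the first event;
# match() returns n() (the group size) when the subject has no event
alt <- df %>%
  group_by(subject) %>%
  filter(row_number() <= match(1, event, nomatch = n())) %>%
  ungroup()

sum(steps$keep)  # 16 rows kept, same as the result above
nrow(alt)        # 16
```

For subject E (events 0, 1, 0, 1, 1) the running count is 0, 1, 1, 2, 3, so only the first two rows are kept; the third row has a count of 1 but is not itself an event, so it is dropped.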