I have a data frame df, and I need a lagged column so that I can compute the difference between consecutive times.
df
ColA ColB     Lag(ColB)
1    11:00:12 11:00:13
1    11:00:13 11:00:14
1    11:00:14 NA
2    11:00:15 11:00:16
2    11:00:16 11:00:17
2    11:00:17 NA
3    11:00:18 11:00:19
3    11:00:19 11:00:20
3    11:00:20 NA
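The lag should only be taken within each unique value of ColA. As you can see, wherever ColA changes from 1 to 2 and from 2 to 3, the lag is NA. Is it possible to achieve this?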
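Using dplyr and lubridate, you can compute the time difference within each group. Note that the column you want holds the next row's value, so the dplyr function to use is lead() rather than lag(); after group_by(ColA), the last row of each group gets NA.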
library(dplyr)
library(lubridate)
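# lead() is applied per ColA group, so each group's last row is NA
df %>% group_by(ColA) %>% mutate(NewLag = lead(ColB)) %>%
  # convert both times to seconds so diff is a plain number, even across minute boundaries
  mutate(diff = period_to_seconds(hms(NewLag)) - period_to_seconds(hms(ColB)))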
# A tibble: 9 x 5
# Groups:   ColA [3]
   ColA ColB     `Lag(ColB)` NewLag    diff
  <int> <chr>    <chr>       <chr>    <dbl>
1     1 11:00:12 11:00:13    11:00:13     1
2     1 11:00:13 11:00:14    11:00:14     1
3     1 11:00:14 NA          NA          NA
4     2 11:00:15 11:00:16    11:00:16     1
5     2 11:00:16 11:00:17    11:00:17     1
6     2 11:00:17 NA          NA          NA
7     3 11:00:18 11:00:19    11:00:19     1
8     3 11:00:19 11:00:20    11:00:20     1
9     3 11:00:20 NA          NA          NA
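
The question does not show how df is built, so here is a minimal reproducible sketch of the same idea. The data frame construction and the names NextB, diff_secs and result are assumptions made for illustration; they are not part of the original post. The times are stored as character strings, as the <chr> column type in the output above suggests.

```r
library(dplyr)
library(lubridate)

# Assumed reconstruction of the question's data: three groups of three times
df <- data.frame(
  ColA = rep(1:3, each = 3),
  ColB = sprintf("11:00:%02d", 12:20),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(ColA) %>%
  mutate(
    # next time within the same ColA group; NA on the group's last row
    NextB = lead(ColB),
    # difference in seconds between the next time and the current one
    diff_secs = period_to_seconds(hms(NextB)) - period_to_seconds(hms(ColB))
  ) %>%
  ungroup()

print(result)
```

Because lead() runs after group_by(ColA), it never reaches into the next group, which is what produces the NA at each group boundary. Calling ungroup() at the end is optional; it just prevents later operations from being applied per group by accident.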