I have a dataset:
x = data.frame(store=c("store1", "store1", "store1","store2","store2", "store3", "store3", "store4", "store4", "store4"),
pos=c("room1", "room2", "room2", "room1", "room1", "room1", "room1", "room2", "room2", "room3"),
error=c("error1", "error2", "error2", "error5", "error6", "error2", "error3", "error1", "error3", "error2"),
time = c("10:00:14", "10:00:44", "10:20:31", "10:24:11", "10:55:14", "10:20:10", "10:44:12", "10:04:34", "12:34:55", "10:12:17")
)
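For each store and position, I want to select the rows that have error2 or error5 in the error column and the maximum time in the time column. How can I do this?
The new dataset should look like this: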
x_new = data.frame(store=c("store1","store2", "store3", "store4"),
pos=c("room2", "room1", "room1", "room3"),
error=c("error2", "error5", "error2", "error2"),
time = c("10:20:31", "10:24:11", "10:20:10", "10:12:17")
)
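library(tidyverse)
library(chron)

x %>%
  # parse the "HH:MM:SS" strings so that max() compares times, not text
  mutate(time = chron::as.times(time)) %>%
  # keep only the errors of interest
  filter(error %in% c("error2", "error5")) %>%
  group_by(store, pos) %>%
  # keep the whole row with the latest time in each group;
  # summarise(time = max(time)) would drop the error column
  filter(time == max(time, na.rm = TRUE)) %>%
  ungroup()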
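
If you would rather not depend on chron, the following is a minimal sketch of the same idea using only dplyr. It assumes every time value is a zero-padded "HH:MM:SS" string, as in the sample data, so that comparing the strings as text gives the same order as comparing them as times. The name x_latest is just a placeholder for the result.

```r
library(dplyr)

# Same sample data as in the question
x <- data.frame(
  store = c("store1", "store1", "store1", "store2", "store2",
            "store3", "store3", "store4", "store4", "store4"),
  pos   = c("room1", "room2", "room2", "room1", "room1",
            "room1", "room1", "room2", "room2", "room3"),
  error = c("error1", "error2", "error2", "error5", "error6",
            "error2", "error3", "error1", "error3", "error2"),
  time  = c("10:00:14", "10:00:44", "10:20:31", "10:24:11", "10:55:14",
            "10:20:10", "10:44:12", "10:04:34", "12:34:55", "10:12:17")
)

x_latest <- x %>%
  # keep only the errors of interest
  filter(error %in% c("error2", "error5")) %>%
  group_by(store, pos) %>%
  # assumption: zero-padded "HH:MM:SS" strings sort chronologically,
  # so the character maximum is also the latest time
  filter(time == max(time)) %>%
  ungroup()

print(x_latest)
```

On the sample data this should return the same four rows as x_new. If two rows in a group share the latest time, both are kept; add distinct() or use slice_max() with with_ties = FALSE if you need exactly one row per group.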