I use the code below to compute the difference in seconds between two consecutive rows:
set.seed(79)
library(outbreaks)
library(lubridate)
# Import data
disease_df <- measles_hagelloch_1861[, 3, drop = FALSE]
# Generate a random time for each day
disease_df$time <- sample(1:86400, nrow(disease_df), replace = TRUE)
# as_hms() replaces the deprecated as.hms()
disease_df$time <- hms::as_hms(disease_df$time)
# Combine date and time
disease_df$time1 <- with(disease_df, ymd(date_of_prodrome) + hms(time))
# Sort data
disease_df <- disease_df[order(disease_df$time1), ]
# Difference in days between two consecutive rows
disease_df$diff <- as.numeric(difftime(disease_df$date_of_prodrome,
dplyr::lag(disease_df$date_of_prodrome, 1), units = 'days'))
# Difference in seconds between two consecutive rows
disease_df$diff1 <- as.numeric(difftime(disease_df$time1,
dplyr::lag(disease_df$time1, 1), units = 'secs'))
The data frame is created, but the last step fails with the error message: longer object length is not a multiple of shorter object length.
Could you please explain why difftime works fine with units = 'days' but raises an error with units = 'secs'? Thank you very much!
The time1 column is of class "POSIXlt". I am not sure why difftime with units = 'secs' fails on it, but if you convert the column to POSIXct, the error goes away.
disease_df$time1 <- as.POSIXct(disease_df$time1)
disease_df$diff1 <- as.numeric(difftime(disease_df$time1,
dplyr::lag(disease_df$time1, 1), units = 'secs'))
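To illustrate why the conversion can matter, here is a minimal sketch of how the two classes store their data. The timestamps are made up for the example and are not taken from the measles data. POSIXlt is a list of components under the hood, while POSIXct is a plain numeric vector, which is likely what vector helpers such as dplyr::lag expect. The exact behavior may differ between dplyr versions.

```r
# Hypothetical timestamps, for illustration only
x <- c("1861-11-01 08:00:00", "1861-11-01 08:00:30", "1861-11-02 08:00:30")

lt <- as.POSIXlt(x, tz = "UTC")  # list-based representation
ct <- as.POSIXct(x, tz = "UTC")  # numeric seconds since 1970-01-01 UTC

is.list(unclass(lt))     # TRUE: a list of components (sec, min, hour, ...)
is.numeric(unclass(ct))  # TRUE: one number per timestamp

# Gaps between consecutive POSIXct values, in seconds (base R, no lag needed)
gaps <- as.numeric(difftime(ct[-1], ct[-length(ct)], units = "secs"))
gaps  # 30 86400
```

Because a POSIXct vector has exactly one number per timestamp, shifting it by one position is unambiguous.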
Apparently dplyr is not happy with dplyr::lag(disease_df$time1, 1) because of the class of disease_df$time1.
Converting it to POSIXct fixes this, so you only need to update this part of the code:
# Combine date and time and convert to POSIXct
disease_df$time1 <- as.POSIXct(with(disease_df, ymd(date_of_prodrome) + hms(time)))
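As an alternative sketch, assuming the date is a Date and the time of day is available as seconds after midnight, the timestamp can be built as POSIXct from the start with base R only, so no POSIXlt value is created along the way. The dates and second offsets below are made up for illustration.

```r
# Hypothetical dates and seconds after midnight, for illustration only
dates <- as.Date(c("1861-11-01", "1861-11-03", "1861-11-01"))
secs  <- c(3600, 60, 7200)

# as.POSIXct() on a Date gives midnight UTC; adding seconds keeps the class POSIXct
stamp <- as.POSIXct(dates) + secs
stamp <- stamp[order(stamp)]
inherits(stamp, "POSIXct")  # TRUE

# Difference in seconds to the previous row; the first row has no predecessor
diff_secs <- c(NA, diff(as.numeric(stamp)))
diff_secs  # NA 3600 165660
```

Only differences are computed here, so the time zone used for printing does not affect the result.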