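I am using R to extract data from several PDF files. After extracting the data, I need to format the date fields. The dates come from the PDF files in more than one format: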
data = c( "1/10/2022 2:36:00 pm",
"1/11/2022 12:47:00 pm",
"1/12/2022 9:47:00 am",
"1/13/2022 9:21:00 am",
"1/14/2022 12:59:00 pm",
"1/10/2022 2:39:00 pm",
"1/11/2022 12:46:00 pm",
"1/12/2022 9:48:00 am",
"1/13/2022 9:22:00 am",
"1/14/2022 1:00:00 pm",
"1/10/2022 2:40:00 pm",
"1/11/2022 12:45:00 pm",
"1/12/2022 9:49:00 am",
"1/13/2022 9:23:00 am",
"1/14/2022 1:01:00 pm",
"1/10/2022 2:42:00 pm",
"1/11/2022 12:44:00 pm",
"1/12/2022 9:50:00 am",
"1/13/2022 9:24:00 am",
"1/14/2022 1:02:00 pm",
"1/10/2022 2:44:00 pm",
"1/11/2022 12:43:00 pm",
"1/12/2022 9:51:00 am",
"1/13/2022 9:25:00 am",
"1/14/2022 1:03:00 pm",
"10/01/2022 14:36:00",
"11/01/2022 12:47:00",
"12/01/2022 09:47:00",
"13/01/2022 09:21:00",
"14/01/2022 12:59:00",
"10/01/2022 14:39:00",
"11/01/2022 12:46:00",
"12/01/2022 09:48:00",
"13/01/2022 09:22:00",
"14/01/2022 13:00:00",
"10/01/2022 14:40:00",
"11/01/2022 12:45:00",
"12/01/2022 09:49:00",
"13/01/2022 09:23:00",
"14/01/2022 13:01:00",
"10/01/2022 14:42:00")
df <- data.frame(data)
I tried the following:
date_parser <- function(d) {
if (endsWith(d, "m")) {
as.POSIXct(d, format = "%m/%d/%Y %I:%M %p")
} else {
as.POSIXct(d, format = "%d/%m/%Y %H:%M")
}
}
df = df%>%
rowwise %>%
mutate(data = date_parser(data))
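I get this error message:
Error in mutate(): In argument: data_hora_leitura_coc08 = date_parser(data_hora_leitura_coc08). In row 1732. Caused by error in `if (endsWith(d, "m")) ...`: missing value where TRUE/FALSE needed
Also, some dates come out as NA. How can I fix that?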
Your sample data includes seconds, so you need to add %S to each format.
date_parser <- function(d) {
if (endsWith(d, "m")) {
as.POSIXct(d, format = "%m/%d/%Y %I:%M:%S %p")
} else {
as.POSIXct(d, format = "%d/%m/%Y %H:%M:%S")
}
}
library(dplyr)  # provides %>%, rowwise() and mutate()
df %>%
rowwise() %>%
mutate(data = date_parser(data))
# # A tibble: 41 × 1
# # Rowwise:
# data
# <dttm>
# 1 2022-01-10 14:36:00
# 2 2022-01-11 12:47:00
# 3 2022-01-12 09:47:00
# 4 2022-01-13 09:21:00
# 5 2022-01-14 12:59:00
# 6 2022-01-10 14:39:00
# 7 2022-01-11 12:46:00
# 8 2022-01-12 09:48:00
# 9 2022-01-13 09:22:00
# 10 2022-01-14 13:00:00
# # ℹ 31 more rows
# # ℹ Use `print(n = ...)` to see more rows
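The TRUE/FALSE error in the question is consistent with a missing value reaching `if`: `endsWith(NA_character_, "m")` returns `NA`, and `if (NA)` stops with an error. Below is a minimal sketch of a vectorised, NA-tolerant alternative in base R. The name `parse_mixed` and its `tz` argument are hypothetical, and the `"UTC"` default is an assumption, since the question does not state a time zone.

```r
# Hypothetical helper (not from the original post): parses a character
# vector that mixes "m/d/Y h:m:s am/pm" and "d/m/Y H:M:S" strings.
# Missing or unparseable inputs become NA instead of raising an error.
parse_mixed <- function(x, tz = "UTC") {  # tz default is an assumption
  # TRUE where the string ends in "am"/"pm"; grepl() returns FALSE for NA
  is_12h <- grepl("[ap]m$", x, ignore.case = TRUE)

  # Start with an all-NA POSIXct vector of the right length
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)

  # Fill each group using the format that matches it
  out[is_12h]  <- as.POSIXct(x[is_12h],  format = "%m/%d/%Y %I:%M:%S %p", tz = tz)
  out[!is_12h] <- as.POSIXct(x[!is_12h], format = "%d/%m/%Y %H:%M:%S",    tz = tz)
  out
}

res <- parse_mixed(c("1/10/2022 2:36:00 pm", "10/01/2022 14:36:00", NA))
print(res)
```

Because the function works on the whole column at once, it can be called as `df$data <- parse_mixed(df$data)` without `rowwise()`. Note that `%p` depends on the locale providing AM/PM strings, so this may need adjusting on systems where it does not.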
lubridate is exactly what you want. Try this:
library(lubridate)  # provides mdy_hms() and dmy_hms()
df %>%
mutate(parsed_data = if_else(endsWith(data, "m"), mdy_hms(data), dmy_hms(data)))
With parse_date_time:
library(lubridate)
df %>% mutate(new = parse_date_time(data, orders=c("mdY IMS p", "dmY HMS")))