I have a data frame in which one column contains POSIXct timestamps, all in UTC, and another column contains time zones, as shown below:
time_data
time_stamp time_zone
<dttm> <chr>
1 2020-02-06 07:08:59 America/Chicago
2 2020-02-06 07:43:50 America/Denver
3 2020-02-06 08:44:51 America/New_York
4 2020-02-06 08:45:07 America/New_York
5 2020-02-06 08:45:10 America/New_York
6 2020-02-06 08:45:14 America/New_York
7 2020-02-06 08:45:30 America/New_York
8 2020-02-06 08:45:47 America/Chicago
9 2020-02-06 08:45:48 America/New_York
10 2020-02-06 08:45:49 America/New_York
I know I can use the lubridate::with_tz function to convert a single UTC POSIXct timestamp to the local time of a given time zone, like this:
with_tz(time_data$time_stamp[1], tz=time_data$time_zone[1])
"2020-02-06 01:08:59 CST"
However, when I try to do this for the whole vector/column, I get an error:
with_tz(time_data$time_stamp, tz=time_data$time_zone)
Error in as.POSIXlt.POSIXct(x, tz) : invalid 'tz' value
Any tips would be greatly appreciated. Thanks!
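You can do this, but with multiple time zones it takes a bit more work than usual. with_tz() expects a single time zone string, which is why passing the whole time_zone column fails. You need to go through the rows one at a time, taking each time and its time zone and creating a new timestamp. I called your sample data mydf. In the code below, since time_stamp is a character column in mydf, I first convert it to a date-time object. Then, for each row, I take the time and the time zone and apply with_tz(). map2() does the iteration; you could also use mapply(). Since map2() returns a list, I call unnest() at the end.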
library(tidyverse)
library(lubridate)

mutate(mydf,
       # Parse the character column as UTC date-times
       time_stamp = as.POSIXct(time_stamp, format = "%Y-%m-%d %H:%M:%S", tz = "UTC"),
       # Convert each timestamp to its own row's time zone; map2() returns a list
       new_time = map2(.x = time_stamp, .y = time_zone,
                       .f = function(x, y) {with_tz(time = x, tzone = y)})) %>%
  unnest(new_time)
time_stamp time_zone new_time
<dttm> <chr> <dttm>
1 2020-02-06 07:08:59 America/Chicago 2020-02-06 01:08:59
2 2020-02-06 07:43:50 America/Denver 2020-02-06 01:43:50
3 2020-02-06 08:44:51 America/New_York 2020-02-06 02:44:51
4 2020-02-06 08:45:07 America/New_York 2020-02-06 02:45:07
5 2020-02-06 08:45:10 America/New_York 2020-02-06 02:45:10
6 2020-02-06 08:45:14 America/New_York 2020-02-06 02:45:14
7 2020-02-06 08:45:30 America/New_York 2020-02-06 02:45:30
8 2020-02-06 08:45:47 America/Chicago 2020-02-06 02:45:47
9 2020-02-06 08:45:48 America/New_York 2020-02-06 02:45:48
10 2020-02-06 08:45:49 America/New_York 2020-02-06 02:45:49
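One caveat about the output above: a POSIXct vector carries a single time zone attribute, so once the list is flattened the whole new_time column is displayed in one zone. Every row above is shifted by the America/Chicago offset; row 2 (America/Denver), for instance, shows 01:43:50 rather than 00:43:50. If what you need is the wall-clock time in each row's own zone, one option is to store it as text instead. Below is a minimal base R sketch of that idea, not part of the original answer. It uses the first three rows of the sample data, and the name local_time is just an illustrative choice.

```r
# Sample timestamps (UTC) and target zones, copied from the first three rows of mydf
time_stamp <- as.POSIXct(
  c("2020-02-06 07:08:59", "2020-02-06 07:43:50", "2020-02-06 08:44:51"),
  format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)
time_zone <- c("America/Chicago", "America/Denver", "America/New_York")

# Format each timestamp in its own zone. The result is a character vector,
# not POSIXct, because one POSIXct vector can hold only one time zone.
local_time <- vapply(
  seq_along(time_stamp),
  function(i) format(time_stamp[i], format = "%Y-%m-%d %H:%M:%S",
                     tz = time_zone[i], usetz = TRUE),
  character(1)
)

local_time
# "2020-02-06 01:08:59 CST" "2020-02-06 00:43:50 MST" "2020-02-06 03:44:51 EST"
```

The trade-off is that the result is no longer a date-time, so you cannot do arithmetic on it directly. If you still need that, keep the original UTC column alongside the text column, or leave new_time as a list column and skip unnest().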
DATA
mydf <- structure(list(time_stamp = c("2020-02-06 07:08:59", "2020-02-06 07:43:50",
"2020-02-06 08:44:51", "2020-02-06 08:45:07", "2020-02-06 08:45:10",
"2020-02-06 08:45:14", "2020-02-06 08:45:30", "2020-02-06 08:45:47",
"2020-02-06 08:45:48", "2020-02-06 08:45:49"), time_zone = c("America/Chicago",
"America/Denver", "America/New_York", "America/New_York", "America/New_York",
"America/New_York", "America/New_York", "America/Chicago", "America/New_York",
"America/New_York")), row.names = c(NA, -10L), class = c("tbl_df",
"tbl", "data.frame"))