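When looking at datetime values that overlap the time zone change from PDT to PST, R appears to switch time zones at 1:36:14 rather than at the expected 2:00:00. Specifically, R assigns the PST time zone to every datetime after 2021-11-07 01:36:14 (shown below):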
x <- c(
"2021-11-07 1:00:00",
"2021-11-07 1:00:01",
"2021-11-07 1:35:00",
"2021-11-07 1:36:00",
"2021-11-07 1:36:10",
"2021-11-07 1:36:14",
"2021-11-07 1:36:15",
"2021-11-07 1:36:30",
"2021-11-07 1:36:59",
"2021-11-07 1:45:00",
"2021-11-07 1:59:59",
"2021-11-07 2:00:00",
"2021-11-07 2:30:00"
)
x_pst <- as.POSIXct(x, tz = "PST8PDT")
> x_pst
# ...
[5] "2021-11-07 01:36:10 PDT" "2021-11-07 01:36:14 PDT"
[7] "2021-11-07 01:36:15 PST" "2021-11-07 01:36:30 PST"
# ...
In addition, lubridate appears to assign PST to all of the datetimes, including those before the switch (using the same data):
x_pst <- lubridate::as_datetime(x, tz = "PST8PDT")
> x_pst
[1] "2021-11-07 01:00:00 PST" "2021-11-07 01:00:01 PST"
[3] "2021-11-07 01:35:00 PST" "2021-11-07 01:36:00 PST"
[5] "2021-11-07 01:36:10 PST" "2021-11-07 01:36:14 PST"
[7] "2021-11-07 01:36:15 PST" "2021-11-07 01:36:30 PST"
[9] "2021-11-07 01:36:59 PST" "2021-11-07 01:45:00 PST"
[11] "2021-11-07 01:59:59 PST" "2021-11-07 02:00:00 PST"
[13] "2021-11-07 02:30:00 PST"
x_pst <- lubridate::ymd_hms(x, tz = "PST8PDT")
> x_pst
# same output as above
So why does the time zone switch at such a specific time, and what is lubridate doing by assigning PST to all of the datetimes before the change?
Session info:
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.0
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: US/Pacific
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base
loaded via a namespace (and not attached):
[1] compiler_4.3.1 generics_0.1.3 tools_4.3.1
[4] lubridate_1.9.3 timechange_0.2.0
This is not a complete answer, but I hope someone with more expertise can build on it.
as.POSIXct
Before getting into the code, I first want to provide some background. We start with the generic function as.POSIXct, which has several S3 methods defined.
as.POSIXct
#> function (x, tz = "", ...)
#> UseMethod("as.POSIXct")
methods(as.POSIXct)
#> [1] as.POSIXct.Date as.POSIXct.default as.POSIXct.numeric as.POSIXct.POSIXlt
#> see '?methods' for accessing help and source code
For the example given by the OP, since we are dealing with character data, the default method is dispatched. It is defined as:
as.POSIXct.default
#> function (x, tz = "", ...)
#> {
#> if (inherits(x, "POSIXct"))
#> return(if (missing(tz)) x else .POSIXct(x, tz))
#> if (is.null(x))
#> return(.POSIXct(numeric(), tz))
#> if (is.character(x) || is.factor(x))
#> return(as.POSIXct(as.POSIXlt(x, tz, ...), tz, ...))
#> if (is.logical(x) && all(is.na(x)))
#> return(.POSIXct(as.numeric(x), tz))
#> stop(gettextf("do not know how to convert '%s' to class %s",
#> deparse1(substitute(x)), dQuote("POSIXct")), domain = NA)
#> }
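This leads to a call to as.POSIXlt (the third condition above), a generic function that happens to have an S3 method for character input: as.POSIXlt.character. I won't paste the source code, but the heart of that function is strptime.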
strptime
#> function (x, format, tz = "")
#> .Internal(strptime(if (is.character(x)) x else if (is.object(x)) `names<-`(as.character(x),
#> names(x)) else `storage.mode<-`(x, "character"), format, tz))
You can view the C code here. I initially tried to follow the code logically, but that proved very difficult.
RApiDatetime
Fortunately, there is a package, RApiDatetime (thanks Dirk!), that provides the function RApiDatetime::rapistrptime. Calling it on the values provided by the OP:
RApiDatetime::rapistrptime(x, fmt = "%Y-%m-%d %H:%M:%OS", "PST8PDT")
#> $sec
#> [1] 0 1 0 0 10 14 15 30 59 0 59 0 0
#>
#> $min
#> [1] 0 0 35 36 36 36 36 36 36 45 59 0 30
#>
#> $hour
#> [1] 1 1 1 1 1 1 1 1 1 1 1 2 2
#>
#> $mday
#> [1] 7 7 7 7 7 7 7 7 7 7 7 7 7
#>
#> $mon
#> [1] 10 10 10 10 10 10 10 10 10 10 10 10 10
#>
#> $year
#> [1] 121 121 121 121 121 121 121 121 121 121 121 121 121
#>
#> $wday
#> [1] 0 0 0 0 0 0 0 0 0 0 0 0 0
#>
#> $yday
#> [1] 310 310 310 310 310 310 310 310 310 310 310 310 310
#>
#> $isdst
#> [1] 1 1 1 1 1 1 0 0 0 0 0 0 0
#>
#> $zone
#> [1] "PDT" "PDT" "PDT" "PDT" "PDT" "PDT" "PST" "PST" "PST" "PST" "PST" "PST" "PST"
#>
#> $gmtoff
#> [1] NA NA NA NA NA NA NA NA NA NA NA NA NA
#>
#> attr(,"class")
#> [1] "POSIXlt" "POSIXt"
#> attr(,"tzone")
#> [1] "PST8PDT" "PST" "PDT"
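For context on the isdst and zone values above: on 2021-11-07 the wall-clock hour from 01:00:00 to 01:59:59 occurs twice in US Pacific time, once as PDT and once as PST, so every input string in that hour has two valid readings and only 2:00:00 onward is unambiguously PST. A minimal C sketch of that repeat, assuming the POSIX rule string PST8PDT,M3.2.0,M11.1.0 is an acceptable stand-in for the PST8PDT zone (use_pacific_rules, format_local and show_fold are illustrative names, not part of R or RApiDatetime):

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Assumption: a POSIX rule string standing in for the "PST8PDT" zone.
   DST ends on the first Sunday of November at 02:00 local time. */
static void use_pacific_rules(void) {
    setenv("TZ", "PST8PDT,M3.2.0,M11.1.0", 1);
    tzset();
}

/* Render an epoch value as local wall-clock time plus zone abbreviation.
   Returns the tm_isdst flag that localtime_r computed. */
static int format_local(time_t t, char *buf, size_t len) {
    struct tm local;
    localtime_r(&t, &local);
    strftime(buf, len, "%Y-%m-%d %H:%M:%S %Z", &local);
    return local.tm_isdst;
}

/* Print the last second of PDT and the first second of PST. */
static void show_fold(void) {
    const time_t last_pdt = 1636275599; /* 2021-11-07 08:59:59 UTC */
    char buf[40];
    use_pacific_rules();
    for (time_t t = last_pdt; t <= last_pdt + 1; ++t) {
        int isdst = format_local(t, buf, sizeof buf);
        printf("%lld -> %s (isdst=%d)\n", (long long) t, buf, isdst);
    }
}
```

Calling show_fold() from main should print 01:59:59 PDT for 1636275599 and 01:00:00 PST for 1636275600, so one second of elapsed time moves the wall clock back by an hour.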
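We see that the isdst field looks worth investigating. After cloning the repository and some crude use of printf, I was able to follow the path more easily. We find that the real action behind isdst happens here.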
.
.
OK = tm->tm_year < 138 && tm->tm_year >= (have_broken_mktime() ? 70 : 02);
if(OK) {
res = (double) mktime(tm);
if (res == -1.) return res;
.
.
mktime
Finally, we arrive at the claim I made in the comments about mktime.
I wrote this very simple C++ function to see what happens to our struct after calling mktime:
#include <Rcpp.h>
using namespace Rcpp;
#include <time.h>
#include <stdio.h>
#include <stdint.h>
// [[Rcpp::export]]
void CheckMkTime(int tm_sec) {
  // Broken-down time for 2021-11-07 01:36:<tm_sec>
  struct tm info;
  info.tm_sec = tm_sec;
  info.tm_min = 36;
  info.tm_hour = 1;
  info.tm_mday = 7;
  info.tm_mon = 10;
  info.tm_year = 121;
  info.tm_wday = 0;
  info.tm_yday = 310;
  info.tm_isdst = -1; // unknown: let mktime decide whether DST applies
  // mktime normalizes info in place, so the fields printed below are its output
  time_t val = mktime(&info);
  printf("mktime_res: %jd\n, tm_gmtoff: %ld\n, tm_sec: %d\n,"
         "tm_min: %d\n, tm_hour: %d\n, tm_mday: %d\n, tm_mon: %d\n,"
         "tm_year: %d\n, tm_wday: %d\n, tm_yday: %d\n, tm_isdst: %d\n",
         (intmax_t) val, // %jd expects an intmax_t argument
         info.tm_gmtoff,
         info.tm_sec,
         info.tm_min,
         info.tm_hour,
         info.tm_mday,
         info.tm_mon,
         info.tm_year,
         info.tm_wday,
         info.tm_yday,
         info.tm_isdst);
}
Calling it with tm_sec = 14, we have:
CheckMkTime(14)
mktime_res: 1636274174
, tm_gmtoff: -25200
, tm_sec: 14
,tm_min: 36
, tm_hour: 1
, tm_mday: 7
, tm_mon: 10
,tm_year: 121
, tm_wday: 0
, tm_yday: 310
, tm_isdst: 1
With tm_sec = 15 we see:
CheckMkTime(15)
mktime_res: 1636277775
, tm_gmtoff: -28800
, tm_sec: 15
,tm_min: 36
, tm_hour: 1
, tm_mday: 7
, tm_mon: 10
,tm_year: 121
, tm_wday: 0
, tm_yday: 310
, tm_isdst: 0
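The two results differ by far more than the one second separating the inputs. A small sketch, using only the two mktime_res values printed above, that renders them in UTC and reports the gap (format_utc and compare_results are illustrative names):

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Format an epoch value as UTC; gmtime_r does not depend on the TZ setting. */
static void format_utc(time_t t, char *buf, size_t len) {
    struct tm utc;
    gmtime_r(&t, &utc);
    strftime(buf, len, "%Y-%m-%d %H:%M:%S UTC", &utc);
}

/* Print both mktime results in UTC and return the gap in seconds. */
static long compare_results(void) {
    const time_t res_14 = 1636274174; /* mktime_res from CheckMkTime(14) */
    const time_t res_15 = 1636277775; /* mktime_res from CheckMkTime(15) */
    char a[32], b[32];
    format_utc(res_14, a, sizeof a);
    format_utc(res_15, b, sizeof b);
    printf("%s\n%s\n", a, b);
    return (long) (res_15 - res_14);
}
```

compare_results() returns 3601: one second plus the one-hour difference between the PDT offset (-25200) and the PST offset (-28800). The two inputs are read as 08:36:14 UTC and 09:36:15 UTC respectively.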
So the problem is mktime, right? I'm not so sure... I wrote a pure C program and compiled it in the terminal:
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Build 2021-11-07 01:36:<tm_sec>, let mktime pick the DST flag, and print
   the result together with the fields mktime wrote back. */
static void check_mktime(const char *label, int tm_sec) {
    struct tm info = {0};
    info.tm_sec = tm_sec;
    info.tm_min = 36;
    info.tm_hour = 1;
    info.tm_mday = 7;
    info.tm_mon = 10;
    info.tm_year = 121;
    info.tm_wday = 0;
    info.tm_yday = 310;
    info.tm_isdst = -1; /* unknown: mktime decides */
    time_t val = mktime(&info);
    printf("%s mktime_res: %jd\n, tm_gmtoff: %ld\n, tm_sec: %d\n,"
           "tm_min: %d\n, tm_hour: %d\n, tm_mday: %d\n, tm_mon: %d\n,"
           "tm_year: %d\n, tm_wday: %d\n, tm_yday: %d\n, tm_isdst: %d\n",
           label,
           (intmax_t) val, /* %jd expects an intmax_t argument */
           info.tm_gmtoff,
           info.tm_sec,
           info.tm_min,
           info.tm_hour,
           info.tm_mday,
           info.tm_mon,
           info.tm_year,
           info.tm_wday,
           info.tm_yday,
           info.tm_isdst);
}

int main(void) {
    check_mktime(" sec = 14", 14);
    check_mktime("\n\n sec = 15", 15);
    return 0;
}
In the terminal we see:
% clang time_shift.c -o time_shift
% ./time_shift
sec = 14 mktime_res: 1636266974
, tm_gmtoff: -18000
, tm_sec: 14
,tm_min: 36
, tm_hour: 1
, tm_mday: 7
, tm_mon: 10
,tm_year: 121
, tm_wday: 0
, tm_yday: 310
, tm_isdst: 0
sec = 15 mktime_res: 1636266975
, tm_gmtoff: -18000
, tm_sec: 15
,tm_min: 36
, tm_hour: 1
, tm_mday: 7
, tm_mon: 10
,tm_year: 121
, tm_wday: 0
, tm_yday: 310
, tm_isdst: 0
We don't see the problem here... I'm really confused now. I tried looking at the mktime source code, but it is beyond me.
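One thing worth noting about the standalone program: it never sets TZ, so mktime uses the machine's local zone, and the reported tm_gmtoff of -18000 is UTC-5 rather than a Pacific offset. A hedged sketch of a closer comparison, assuming the POSIX rule string PST8PDT,M3.2.0,M11.1.0 stands in for PST8PDT, and passing tm_isdst explicitly instead of -1 so that the ambiguous time is pinned to one reading (pacific_mktime is a hypothetical helper):

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <time.h>

/* Convert 2021-11-07 01:36:<sec> under explicit Pacific rules, telling
   mktime which reading is wanted through tm_isdst (1 = PDT, 0 = PST). */
static time_t pacific_mktime(int sec, int isdst) {
    struct tm info = {0};
    /* Assumption: POSIX rule string standing in for the "PST8PDT" zone. */
    setenv("TZ", "PST8PDT,M3.2.0,M11.1.0", 1);
    tzset();
    info.tm_sec = sec;
    info.tm_min = 36;
    info.tm_hour = 1;
    info.tm_mday = 7;
    info.tm_mon = 10;   /* November, months are 0-based */
    info.tm_year = 121; /* years since 1900 */
    info.tm_isdst = isdst;
    return mktime(&info);
}
```

With a C library that honors the tm_isdst hint, pacific_mktime(14, 1) should give 1636274174 and pacific_mktime(14, 0) should give 1636277774, the PDT and PST readings of the same wall-clock time. With tm_isdst = -1, as in the programs above, which reading comes back for a time inside the repeated hour is left to the implementation.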