R switches from PDT to PST at 1:36:14 instead of at 2:00:00 - lubridate assigns the time zone before the switch


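When looking at datetime values that overlap the time zone change from PDT to PST, R appears to switch zones at 1:36:14 rather than at the expected 2:00:00. Specifically, R assigns PST to every datetime after 2021-11-07 01:36:14, as shown below: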

x <- c(
    "2021-11-07 1:00:00",
    "2021-11-07 1:00:01",
    "2021-11-07 1:35:00",
    "2021-11-07 1:36:00",
    "2021-11-07 1:36:10",
    "2021-11-07 1:36:14",
    "2021-11-07 1:36:15",
    "2021-11-07 1:36:30",
    "2021-11-07 1:36:59",
    "2021-11-07 1:45:00",
    "2021-11-07 1:59:59",
    "2021-11-07 2:00:00",
    "2021-11-07 2:30:00"
    )
x_pst <- as.POSIXct(x, tz = "PST8PDT")
> x_pst
# ...
[5] "2021-11-07 01:36:10 PDT" "2021-11-07 01:36:14 PDT"
[7] "2021-11-07 01:36:15 PST" "2021-11-07 01:36:30 PST"
# ...

On top of that, lubridate appears to assign PST to all of the datetimes, including those before the switch (same data):

x_pst <- lubridate::as_datetime(x, tz = "PST8PDT")
> x_pst
[1] "2021-11-07 01:00:00 PST" "2021-11-07 01:00:01 PST"
[3] "2021-11-07 01:35:00 PST" "2021-11-07 01:36:00 PST"
[5] "2021-11-07 01:36:10 PST" "2021-11-07 01:36:14 PST"
[7] "2021-11-07 01:36:15 PST" "2021-11-07 01:36:30 PST"
[9] "2021-11-07 01:36:59 PST" "2021-11-07 01:45:00 PST"
[11] "2021-11-07 01:59:59 PST" "2021-11-07 02:00:00 PST"
[13] "2021-11-07 02:30:00 PST"

x_pst <- lubridate::ymd_hms(x, tz = "PST8PDT")
> x_pst
# same output as above

So why does the time zone switch at such a specific time, and what is lubridate doing when it assigns PST to all of the datetimes before the change?

Session info:

> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.0

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: US/Pacific
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets 
[6] methods   base     

loaded via a namespace (and not attached):
[1] compiler_4.3.1   generics_0.1.3   tools_4.3.1     
[4] lubridate_1.9.3  timechange_0.2.0
r datetime lubridate
1 Answer

This is not a complete answer, but I hope someone with more expertise can build on it.

as.POSIXct

Before getting into the code, some background. We start with the generic function as.POSIXct, which has several S3 methods defined.

as.POSIXct
#> function (x, tz = "", ...)
#> UseMethod("as.POSIXct")

methods(as.POSIXct)
#> [1] as.POSIXct.Date    as.POSIXct.default as.POSIXct.numeric as.POSIXct.POSIXlt
#> see '?methods' for accessing help and source code

In the OP's example the input is a character vector, so dispatch lands on the default method, which is defined as:

as.POSIXct.default
#> function (x, tz = "", ...)
#> {
#>     if (inherits(x, "POSIXct"))
#>         return(if (missing(tz)) x else .POSIXct(x, tz))
#>     if (is.null(x))
#>         return(.POSIXct(numeric(), tz))
#>     if (is.character(x) || is.factor(x))
#>         return(as.POSIXct(as.POSIXlt(x, tz, ...), tz, ...))
#>     if (is.logical(x) && all(is.na(x)))
#>         return(.POSIXct(as.numeric(x), tz))
#>     stop(gettextf("do not know how to convert '%s' to class %s",
#>         deparse1(substitute(x)), dQuote("POSIXct")), domain = NA)
#> }

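This leads to a call to as.POSIXlt (the third condition above), a generic function that happens to have a character S3 method: as.POSIXlt.character. I won't paste the source, but the core of that function is strptime: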

strptime
#> function (x, format, tz = "")
#> .Internal(strptime(if (is.character(x)) x else if (is.object(x)) `names<-`(as.character(x),
#>     names(x)) else `storage.mode<-`(x, "character"), format, tz))

You can view the C code here. I initially tried to follow the code logically, but that proved very difficult.

RApiDatetime

Fortunately, there is a package, RApiDatetime (thanks, Dirk!), which provides the function RApiDatetime::rapistrptime. Calling it on the values the OP supplied:

RApiDatetime::rapistrptime(x, fmt = "%Y-%m-%d %H:%M:%OS", "PST8PDT")
#> $sec
#>  [1]  0  1  0  0 10 14 15 30 59  0 59  0  0
#>
#> $min
#>  [1]  0  0 35 36 36 36 36 36 36 45 59  0 30
#>
#> $hour
#>  [1] 1 1 1 1 1 1 1 1 1 1 1 2 2
#>
#> $mday
#>  [1] 7 7 7 7 7 7 7 7 7 7 7 7 7
#>
#> $mon
#>  [1] 10 10 10 10 10 10 10 10 10 10 10 10 10
#>
#> $year
#>  [1] 121 121 121 121 121 121 121 121 121 121 121 121 121
#>
#> $wday
#>  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0
#>
#> $yday
#>  [1] 310 310 310 310 310 310 310 310 310 310 310 310 310
#>
#> $isdst
#>  [1] 1 1 1 1 1 1 0 0 0 0 0 0 0
#>
#> $zone
#>  [1] "PDT" "PDT" "PDT" "PDT" "PDT" "PDT" "PST" "PST" "PST" "PST" "PST" "PST" "PST"
#>
#> $gmtoff
#>  [1] NA NA NA NA NA NA NA NA NA NA NA NA NA
#>
#> attr(,"class")
#> [1] "POSIXlt" "POSIXt"
#> attr(,"tzone")
#> [1] "PST8PDT" "PST"     "PDT"

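The isdst field looks worth investigating. After cloning the repository and adding some crude printf calls, I could follow the path more easily. It turns out that the real action behind isdst happens here: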

.
.
    OK = tm->tm_year < 138 && tm->tm_year >= (have_broken_mktime() ? 70 : 02);
    if(OK) {
    res = (double) mktime(tm);
    if (res == -1.) return res;
.
.
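
To make the OK condition easier to read: tm_year counts years since 1900, so 138 means 2038, 70 means 1970, and the octal literal 02 means 1902. Below is a minimal sketch that restates the check as a standalone function. The name in_mktime_range and its broken_mktime flag are hypothetical and only stand in for the snippet's have_broken_mktime() call; they are not part of R or RApiDatetime.

```c
#include <stdbool.h>

/* Hypothetical restatement of the OK condition in the snippet above.
   tm_year is years since 1900: 138 -> 2038, 70 -> 1970, 2 -> 1902. */
bool in_mktime_range(int tm_year, bool broken_mktime) {
    /* If mktime is flagged as broken, the lower bound is 1970;
       otherwise it is 1902. */
    int lower = broken_mktime ? 70 : 2;
    return tm_year < 138 && tm_year >= lower;
}
```

The OP's values all have tm_year = 121 (2021), so in_mktime_range(121, false) and in_mktime_range(121, true) are both true, and the conversion goes through the mktime branch.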

mktime

And finally we arrive at the claim I made in the comments about mktime.

I wrote this very simple C++ function to see what happens to our struct after calling mktime:

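#include <Rcpp.h>
using namespace Rcpp;

#include <stdint.h>
#include <stdio.h>
#include <time.h>

// Build 2021-11-07 01:36:<tm_sec>, call mktime(), and print both the
// return value and the fields mktime() filled in.
// [[Rcpp::export]]
void CheckMkTime(int tm_sec) {
    struct tm info = {0};  // zero the fields we do not set explicitly

    info.tm_sec = tm_sec;
    info.tm_min = 36;
    info.tm_hour = 1;
    info.tm_mday = 7;
    info.tm_mon = 10;      // months are 0-based: 10 = November
    info.tm_year = 121;    // years since 1900: 121 = 2021
    info.tm_wday = 0;
    info.tm_yday = 310;
    info.tm_isdst = -1;    // -1: let mktime() decide whether DST applies

    time_t val = mktime(&info);
    // %jd expects intmax_t, so cast the time_t explicitly.
    printf("mktime_res: %jd\n, tm_gmtoff: %ld\n, tm_sec: %d\n,"
           "tm_min: %d\n, tm_hour: %d\n, tm_mday: %d\n, tm_mon: %d\n,"
           "tm_year: %d\n, tm_wday: %d\n, tm_yday: %d\n, tm_isdst: %d\n",
           (intmax_t) val,
           info.tm_gmtoff,
           info.tm_sec,
           info.tm_min,
           info.tm_hour,
           info.tm_mday,
           info.tm_mon,
           info.tm_year,
           info.tm_wday,
           info.tm_yday,
           info.tm_isdst);
}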

Calling it with tm_sec = 14 gives:

CheckMkTime(14)
mktime_res: 1636274174
, tm_gmtoff: -25200
, tm_sec: 14
,tm_min: 36
, tm_hour: 1
, tm_mday: 7
, tm_mon: 10
,tm_year: 121
, tm_wday: 0
, tm_yday: 310
, tm_isdst: 1

With tm_sec = 15 we see:

CheckMkTime(15)
mktime_res: 1636277775
, tm_gmtoff: -28800
, tm_sec: 15
,tm_min: 36
, tm_hour: 1
, tm_mday: 7
, tm_mon: 10
,tm_year: 121
, tm_wday: 0
, tm_yday: 310
, tm_isdst: 0
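
Some context for these two results: on 2021-11-07 every US Pacific wall-clock time from 01:00:00 to 01:59:59 occurs twice, once in PDT and once in PST. With tm_isdst = -1 the caller leaves the choice to mktime, and the C standard does not say which reading to pick for an ambiguous time. Here is a minimal sketch that makes the choice explicit instead. The helper names use_us_pacific and pacific_epoch are hypothetical, and spelling out the rule in the TZ string (rather than the bare name PST8PDT used above) is my assumption so that the example does not depend on an installed time zone database. setenv and tzset are POSIX, not ISO C.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose setenv() and tzset() */
#endif
#include <stdlib.h>
#include <time.h>

/* Select US Pacific time through a POSIX TZ string. The rule part is the
   US schedule: DST from the second Sunday of March to the first Sunday
   of November, switching at 02:00 local time. */
void use_us_pacific(void) {
    setenv("TZ", "PST8PDT,M3.2.0,M11.1.0", 1);
    tzset();
}

/* Convert 2021-11-07 01:36:<sec> to seconds since the epoch.
   isdst = 1 asks for the PDT reading, isdst = 0 for the PST reading,
   and isdst = -1 leaves the decision to mktime(). */
time_t pacific_epoch(int sec, int isdst) {
    struct tm info = {0};
    info.tm_sec = sec;
    info.tm_min = 36;
    info.tm_hour = 1;
    info.tm_mday = 7;
    info.tm_mon = 10;    /* months are 0-based: 10 = November */
    info.tm_year = 121;  /* years since 1900: 121 = 2021 */
    info.tm_isdst = isdst;
    return mktime(&info);
}
```

After use_us_pacific(), pacific_epoch(14, 1) is 1636274174 (the value printed for tm_sec = 14 above) and pacific_epoch(15, 0) is 1636277775 (the value printed for tm_sec = 15). For the same second, the PST reading is exactly 3600 larger than the PDT reading. So both outputs above are valid readings of an ambiguous time; what is surprising is only that the implementation changes its preference between :14 and :15.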

So the problem is mktime, right?

I'm not so sure... I wrote a pure C program and compiled it in the terminal:

#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Build 2021-11-07 01:36:<tm_sec>, call mktime(), and print the result.
   prefix is printed first so the two reports stay visually separated. */
static void check(const char *prefix, int tm_sec) {
    struct tm info = {0};  /* zero the fields we do not set explicitly */

    info.tm_sec = tm_sec;
    info.tm_min = 36;
    info.tm_hour = 1;
    info.tm_mday = 7;
    info.tm_mon = 10;      /* months are 0-based: 10 = November */
    info.tm_year = 121;    /* years since 1900: 121 = 2021 */
    info.tm_wday = 0;
    info.tm_yday = 310;
    info.tm_isdst = -1;    /* -1: let mktime() decide whether DST applies */

    time_t val = mktime(&info);
    /* %jd expects intmax_t, so cast the time_t explicitly. */
    printf("%s sec = %d mktime_res: %jd\n, tm_gmtoff: %ld\n, tm_sec: %d\n,"
           "tm_min: %d\n, tm_hour: %d\n, tm_mday: %d\n, tm_mon: %d\n,"
           "tm_year: %d\n, tm_wday: %d\n, tm_yday: %d\n, tm_isdst: %d\n",
           prefix,
           tm_sec,
           (intmax_t) val,
           info.tm_gmtoff,
           info.tm_sec,
           info.tm_min,
           info.tm_hour,
           info.tm_mday,
           info.tm_mon,
           info.tm_year,
           info.tm_wday,
           info.tm_yday,
           info.tm_isdst);
}

int main(void) {
    check("", 14);
    check("\n\n", 15);
    return 0;
}

In the terminal we see:

% clang time_shift.c -o time_shift
% ./time_shift
 sec = 14 mktime_res: 1636266974
, tm_gmtoff: -18000
, tm_sec: 14
,tm_min: 36
, tm_hour: 1
, tm_mday: 7
, tm_mon: 10
,tm_year: 121
, tm_wday: 0
, tm_yday: 310
, tm_isdst: 0


 sec = 15 mktime_res: 1636266975
, tm_gmtoff: -18000
, tm_sec: 15
,tm_min: 36
, tm_hour: 1
, tm_mday: 7
, tm_mon: 10
,tm_year: 121
, tm_wday: 0
, tm_yday: 310
, tm_isdst: 0
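
One detail in this output is worth checking: tm_gmtoff is -18000, which is UTC-5. The Pacific offsets are -25200 (PDT) and -28800 (PST). Nothing in the standalone program selects a zone, so my assumption is that it ran under the machine's local time zone rather than PST8PDT, which would make it a different experiment from the R-side calls. The sketch below, with a hypothetical helper wall_clock, adds the offset back to an epoch value to recover the wall-clock reading, which is a quick way to sanity-check any of the numbers printed in this answer.

```c
#include <time.h>

/* Recover the local wall-clock reading from an epoch value and a UTC
   offset in seconds, written as "YYYY-MM-DD HH:MM:SS" into buf
   (which needs room for at least 20 bytes). */
void wall_clock(long long epoch, long gmtoff, char *buf, size_t len) {
    time_t shifted = (time_t) (epoch + gmtoff);
    struct tm *utc = gmtime(&shifted);  /* gmtime applies no zone offset */
    if (utc == NULL || strftime(buf, len, "%Y-%m-%d %H:%M:%S", utc) == 0) {
        if (len > 0) buf[0] = '\0';     /* signal failure with an empty string */
    }
}
```

Feeding it the pairs reported here, (1636266974, -18000) and (1636274174, -25200) both decode to 2021-11-07 01:36:14, and (1636277775, -28800) decodes to 2021-11-07 01:36:15. Each mktime call therefore returned a correct instant for the wall-clock time it was given; the runs differ only in which offset was applied.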

We don't see the problem here... Now I'm really confused. I tried looking at the mktime source code, but it is beyond me.
