Perl method to resolve ambiguous data stored in local time without reference to DST status

Question (votes: 0, answers: 1)

This is a follow-up to the answer to a prior question, in which @ikegami provided an example of converting local time in CST / CDT to UTC (Z).

He took the example I had pasted into a comment, where formatting is limited, and created this source data from it:

11/03/19 01:00:00AM CDT
11/03/19 01:59:58AM CDT
11/03/19 01:59:59AM CDT
11/03/19 01:00:00AM CST
11/03/19 01:00:01AM CST
11/03/19 01:59:59AM CST

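The source data in the comment was actually this (it was unreadable there because I did not know how to insert a multi-line code block in a comment):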

11/03/19 01:00:00AM CDT
11/03/19 01:30:00AM CDT
11/03/19 01:50:00AM CDT
11/03/19 01:00:00AM CDT
11/03/19 01:30:00AM CDT
11/03/19 01:50:00AM CDT
11/03/19 02:00:00AM CST
11/03/19 02:10:01AM CST

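The important difference is that the DST tag is derived from the date and time alone, so any time between 01:00 and 02:00 is tagged CDT, regardless of whether it is the "first" or the "second" occurrence of that local-time hour. That is the ambiguity to resolve. The data above is already the output of another script. The original source data looks like this: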

11/03/19 01:00:00A
11/03/19 01:30:00A
11/03/19 01:50:00A
11/03/19 01:00:00A
11/03/19 01:30:00A
11/03/19 01:50:00A
11/03/19 02:00:00A
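
To make the ambiguity concrete, here is a minimal sketch that converts the same wall-clock time under both zone abbreviations. It uses fixed offsets (CDT = UTC-5, CST = UTC-6) rather than the system time zone database, and the helper name `wall_to_utc` is only an illustration, not part of the original script.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);
use POSIX qw(strftime);

# Fixed UTC offsets in hours for the two US Central abbreviations.
my %offset_hours = ( CDT => -5, CST => -6 );

# Hypothetical helper: convert wall-clock fields plus a zone abbreviation
# to a UTC timestamp string.
sub wall_to_utc {
    my ($year, $mon, $mday, $hour, $min, $sec, $zone) = @_;
    # Treat the wall-clock fields as if they were UTC, then remove the offset.
    my $naive = timegm($sec, $min, $hour, $mday, $mon - 1, $year);
    my $epoch = $naive - $offset_hours{$zone} * 3600;
    return strftime('%Y-%m-%dT%H:%M:%SZ', gmtime $epoch);
}

print wall_to_utc(2019, 11, 3, 1, 30, 0, 'CDT'), "\n";   # first pass through 01:30
print wall_to_utc(2019, 11, 3, 1, 30, 0, 'CST'), "\n";   # second pass through 01:30
```

The two calls give instants one hour apart, which is why "01:30:00A" alone cannot be converted to UTC without knowing which pass through the hour it belongs to.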

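To resolve this, my first idea was to detect a time entry that is earlier than the previous entry, and then force the DST flag to Standard on that entry and all following entries until the computed DST flag changes.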
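
That idea can be sketched in isolation as follows. This is only an illustration under stated assumptions: all entries are on 11/03/19 and in file order, and `computed_dst` is a hypothetical stand-in for the flag that `localtime` would derive from the wall-clock time alone (DST until 02:00 on that day).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the DST flag computed from wall-clock time alone
# on 2019-11-03 in US Central: DST is reported for anything before 02:00.
sub computed_dst {
    my ($secs_since_midnight) = @_;
    return $secs_since_midnight < 2 * 3600 ? 1 : 0;
}

# Tag each "HH:MM:SS" entry (same day, in file order) with CDT or CST.
sub tag_zones {
    my @times = @_;
    my ($prev, $force_std) = (-1, 0);
    my @tagged;
    for my $t (@times) {
        my ($h, $m, $s) = split /:/, $t;
        my $secs = $h * 3600 + $m * 60 + $s;
        my $dst  = computed_dst($secs);

        $force_std = 1 if $secs < $prev;    # clock went backwards: repeated hour
        push @tagged, "$t " . ( $dst && !$force_std ? 'CDT' : 'CST' );
        $force_std = 0 if !$dst;            # computed flag now agrees; stop forcing
        $prev = $secs;
    }
    return @tagged;
}

print "$_\n" for tag_zones(qw(01:00:00 01:30:00 01:50:00 01:00:00 01:30:00 01:50:00 02:00:00));
```

The first three entries come out as CDT and the remaining four as CST. Note that the heuristic depends on the file containing at least one entry from the first pass through the hour; if logging only starts during the second pass, nothing goes backwards and the entries would be tagged CDT.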

Is this the best approach, or is there a "built-in" feature or library function that may already solve this problem?

perl datetime dst
1 Answer (0 votes)

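I wrote up this question, but developed a solution before I posted it. Here is what I came up with for the disambiguation; I hope it helps others. Special thanks to @ikegami for the examples of using Strptime and strftime to produce the final output in UTC format. This is the whole script, starting from timestamps in a format like "11/03/19 01:00:00A". In this example the data records are cropped to the timestamp, with the data that follows removed.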

#!/usr/bin/perl

# Parse data files stamped with local time and save them with UTC timestamps.

use strict;
use warnings;
use feature qw( say );
use IO::Handle;                 # provides STDOUT->autoflush on older Perls
use File::Basename;
use Time::Piece;
use Time::Local;
use DateTime::Format::Strptime qw( );

STDOUT->autoflush(1);

my $format = DateTime::Format::Strptime->new(
   pattern   => '%m/%d/%y %I:%M:%S%p %Z',
   locale    => 'en',
   zone_map  => { CST => '-0600', CDT => '-0500' }, # Handle non-standard time zone names.
   time_zone => 'America/Chicago',                  # Optional. Convert result to this tz.
   strict    => 1,
   on_error  => 'croak',
);


# Input file from command line.
@ARGV or die "Usage: $0 INFILE [OUTFILE]\n";

open (INFILE, '<', $ARGV[0])
  or die "Error cannot open $ARGV[0]: $!";

print "Opening $ARGV[0] for input     ";

my $base = fileparse($ARGV[0],'\..*');   # strip directory and extension
my $outfname = $base.'.csv';             # default output name: <base>.csv

# If a second parameter is given, use it as the output file instead.
$outfname = $ARGV[1] if defined $ARGV[1];

open (OUT, '>', $outfname)
  or die "Error cannot open $outfname: $!";
print "opening $outfname for output\n";


 my $lastparsedtime = 0;    # timestamp of the previous record
 my $ForceStdTime = 0;      # true while inside the repeated fall-back hour

 while (<INFILE>) {
  chomp;
  my $workline = $_;
  $workline =~ s/^\s+|\s+$//g;          # trim leading/trailing whitespace
  next unless length $workline;         # skip blank lines
  my @pline = split ' ', $workline;     # $pline[0] = date, $pline[1] = time, rest = data

  $pline[1] =~ s/([AP])$/${1}M/;        # expand the trailing A or P to AM or PM

  my $timestampstr = "$pline[0] $pline[1]";

  # Parse the wall-clock fields, e.g. "12/31/09 08:40:00AM". No zone is known yet.
  my $parsedtime = Time::Piece->strptime($timestampstr, '%m/%d/%y %I:%M:%S%p');

  # Interpret those fields in the system's local time zone, which must be
  # US Central for the CST/CDT tags below to be right.
  my $tltime = timelocal( $parsedtime->sec, $parsedtime->min, $parsedtime->hour,
                          $parsedtime->mday, $parsedtime->_mon, $parsedtime->year );

  # DST flag computed from date and time alone. CORE:: bypasses the
  # localtime override that Time::Piece exports.
  my $dst = (CORE::localtime($tltime))[8];

  if ( $parsedtime < $lastparsedtime ) {      # Time discontinuity: the clock went backwards
            $ForceStdTime  = 1;               # force standard time during the repeated hour in the fall
        }
  $lastparsedtime = $parsedtime;

  # Tag the timestamp with the zone abbreviation. The tag is kept unquoted so
  # that it matches the Strptime pattern exactly; quotes are added on output.
  my $zone = ($dst and not $ForceStdTime) ? 'CDT' : 'CST';
  my $dtlocaltag = "$pline[0] $pline[1] $zone";

    print "\"$dtlocaltag\"   ";

  if (not $dst) {
      $ForceStdTime = 0;        # reset force flag once localtime sees standard time
      }

   my $dt = $format->parse_datetime($dtlocaltag);
   my $local_dt_str = $dt->strftime("%Y-%m-%dT%H:%M:%S%z");
   $dt->set_time_zone('UTC');
   my $utc_dt_str = $dt->strftime("%Y-%m-%dT%H:%M:%SZ");
   print " $local_dt_str $utc_dt_str\n";
   print OUT "\"$utc_dt_str\"";

   #  --- snip --- processing of rest of data in text input line....

   print OUT "\n";

}  # end of while loop

close (INFILE);
close(OUT);
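
One way to sanity-check the timestamps a script like this writes is to confirm that they never go backwards: if the repeated hour were tagged with the wrong zone, the UTC values would jump back by up to an hour. The following is a minimal sketch of such a check, not part of the script above. The helper name `out_of_order` is hypothetical, and it relies on all timestamps sharing the same `YYYY-MM-DDTHH:MM:SSZ` layout, so that plain string comparison matches chronological order.

```perl
use strict;
use warnings;

# Hypothetical helper: return the 1-based positions of entries that are
# not strictly later than the entry before them.
sub out_of_order {
    my @stamps = @_;
    my @bad;
    for my $i (1 .. $#stamps) {
        push @bad, $i + 1 if $stamps[$i] le $stamps[$i - 1];
    }
    return @bad;
}

# Correctly disambiguated: 01:50 CDT followed by 01:00 CST.
my @good = out_of_order('2019-11-03T06:50:00Z', '2019-11-03T07:00:00Z');

# Second 01:00 wrongly tagged CDT: UTC jumps back to 06:00.
my @bad  = out_of_order('2019-11-03T06:50:00Z', '2019-11-03T06:00:00Z');

print 'good: ', scalar(@good), " problem(s)\n";
print 'bad:  ', scalar(@bad),  " problem(s) at position(s) @bad\n";
```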

The input data file looks like this:

11/03/19 12:50:00A
11/03/19 01:00:00A
11/03/19 01:10:00A
11/03/19 01:20:00A
11/03/19 01:30:00A
11/03/19 01:40:00A
11/03/19 01:50:00A
11/03/19 01:00:00A
11/03/19 01:10:00A
11/03/19 01:20:00A
11/03/19 01:30:00A
11/03/19 01:40:00A
11/03/19 01:50:00A
11/03/19 02:00:00A
11/03/19 02:10:01A
03/08/20 12:50:00A
03/08/20 01:00:00A
03/08/20 01:10:00A
03/08/20 01:20:00A
03/08/20 01:30:00A
03/08/20 01:40:00A
03/08/20 01:50:00A
03/08/20 03:00:00A
03/08/20 03:10:00A

The output data file looks like this:

"2019-11-03T05:50:00Z"
"2019-11-03T06:00:00Z"
"2019-11-03T06:10:00Z"
"2019-11-03T06:20:00Z"
"2019-11-03T06:30:00Z"
"2019-11-03T06:40:00Z"
"2019-11-03T06:50:00Z"
"2019-11-03T07:00:00Z"
"2019-11-03T07:10:00Z"
"2019-11-03T07:20:00Z"
"2019-11-03T07:30:00Z"
"2019-11-03T07:40:00Z"
"2019-11-03T07:50:00Z"
"2019-11-03T08:00:00Z"
"2019-11-03T08:10:01Z"
"2020-03-08T06:50:00Z"
"2020-03-08T07:00:00Z"
"2020-03-08T07:10:00Z"
"2020-03-08T07:20:00Z"
"2020-03-08T07:30:00Z"
"2020-03-08T07:40:00Z"
"2020-03-08T07:50:00Z"
"2020-03-08T08:00:00Z"
"2020-03-08T08:10:00Z"