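I recorded some timestamped data using time.time(). I want to evaluate the data with pandas and convert the timestamps to datetime objects for easier handling. However, when I try this, all of my times are off by one hour. This example reproduces the problem on my machine: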
import datetime as dt
import pandas as pd
origin = dt.datetime(2024, 1, 15).timestamp()
timestamps = [origin + 3600 * i for i in range(10)]
print([dt.datetime.fromtimestamp(t).isoformat() for t in timestamps])
print(pd.to_datetime(timestamps, unit='s'))
Output:
['2024-01-15T00:00:00', '2024-01-15T01:00:00', '2024-01-15T02:00:00', '2024-01-15T03:00:00', '2024-01-15T04:00:00', '2024-01-15T05:00:00', '2024-01-15T06:00:00', '2024-01-15T07:00:00', '2024-01-15T08:00:00', '2024-01-15T09:00:00']
DatetimeIndex(['2024-01-14 23:00:00', '2024-01-15 00:00:00',
'2024-01-15 01:00:00', '2024-01-15 02:00:00',
'2024-01-15 03:00:00', '2024-01-15 04:00:00',
'2024-01-15 05:00:00', '2024-01-15 06:00:00',
'2024-01-15 07:00:00', '2024-01-15 08:00:00'],
dtype='datetime64[ns]', freq=None)
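I suspect this is related to my time zone (I am in UTC+1), but I am unsure how to handle it. If possible, I would like to avoid specifying a time zone explicitly (although I will if necessary). I want to get the same times as dt.datetime.fromtimestamp(). How can I do that?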
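You should create the origin as a UTC datetime. Otherwise the naive datetime is interpreted in your local time zone when .timestamp() is called, while pd.to_datetime(..., unit='s') treats the values as seconds since the Unix epoch in UTC: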
import datetime as dt
import pandas as pd
origin = dt.datetime(2024, 1, 15, tzinfo=dt.timezone.utc).timestamp()
timestamps = [origin + 3600 * i for i in range(10)]
print(pd.to_datetime(timestamps, unit='s'))
Output:
DatetimeIndex(['2024-01-15 00:00:00', '2024-01-15 01:00:00',
'2024-01-15 02:00:00', '2024-01-15 03:00:00',
'2024-01-15 04:00:00', '2024-01-15 05:00:00',
'2024-01-15 06:00:00', '2024-01-15 07:00:00',
'2024-01-15 08:00:00', '2024-01-15 09:00:00'],
dtype='datetime64[ns]', freq=None)
Similarly, for the standard library:
print([dt.datetime.fromtimestamp(t, dt.timezone.utc).isoformat() for t in timestamps])
Output:
['2024-01-15T00:00:00+00:00', '2024-01-15T01:00:00+00:00', '2024-01-15T02:00:00+00:00', '2024-01-15T03:00:00+00:00', '2024-01-15T04:00:00+00:00', '2024-01-15T05:00:00+00:00', '2024-01-15T06:00:00+00:00', '2024-01-15T07:00:00+00:00', '2024-01-15T08:00:00+00:00', '2024-01-15T09:00:00+00:00']
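If the goal is instead to get the same local wall-clock times as dt.datetime.fromtimestamp(), the following is a minimal sketch of two possible approaches, not the only way to do it. The first lets the standard library do the local conversion and needs no explicit time zone. The second parses the values as UTC and converts them in pandas; the zone name "Europe/Berlin" is only an assumption standing in for a UTC+1 location and should be replaced with your own zone.

```python
import datetime as dt
import pandas as pd

# Same setup as in the question: a naive origin, interpreted as local time.
origin = dt.datetime(2024, 1, 15).timestamp()
timestamps = [origin + 3600 * i for i in range(10)]

# Option 1: convert with the standard library first, so pandas receives
# naive local datetimes. No time zone has to be named explicitly.
local_naive = pd.to_datetime([dt.datetime.fromtimestamp(t) for t in timestamps])

# Option 2: parse as UTC, then convert to a named zone.
# "Europe/Berlin" is an assumed example zone, not taken from the question.
utc_index = pd.to_datetime(timestamps, unit="s", utc=True)
local_aware = utc_index.tz_convert("Europe/Berlin")
# Drop the zone information to keep only the local wall-clock times.
local_wall = local_aware.tz_localize(None)

print(local_naive)
print(local_wall)
```

Option 1 always matches dt.datetime.fromtimestamp() on the machine where it runs, but the result carries no zone information. Option 2 keeps the conversion explicit and handles daylight saving transitions through the zone database, at the cost of naming the zone.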