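My dataframe has a datetime column without a time zone. Because of daylight saving time (DST) changes, I ran into problems when manipulating the data: rows are "lost" when the clocks go forward and "duplicated" when the clocks go back.
I wrote a function to add the time zone, which is supposed to take DST into account automatically.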
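import pandas as pd
import pytz

def fix_dutch_clock_shift(df):
    df_fixed = df.copy()
    # Assign the result; pd.to_datetime does not modify the column in place
    df_fixed['datetime'] = pd.to_datetime(df_fixed['datetime'])
    amsterdam_tz = pytz.timezone('Europe/Amsterdam')
    # Localize each naive timestamp to Europe/Amsterdam, row by row
    for index, row in df_fixed.iterrows():
        dt = row['datetime']
        dt = amsterdam_tz.localize(dt)
        df_fixed.at[index, 'datetime'] = dt
    return df_fixed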
This works well when the clocks go forward: the offset changes from +01:00 to +02:00 at the correct datetime. But when the clocks go back I still get duplicates.
Datetime, Long, Short
2022-10-30 02:00:00+02:00,61.58,61.58
2022-10-30 02:15:00+02:00,49.98,49.98
2022-10-30 02:30:00+02:00,26.72,26.72
2022-10-30 02:45:00+02:00,-111.32,-111.32
2022-10-30 02:00:00+02:00,-111.32,-111.32
2022-10-30 02:15:00+02:00,-130.19,-130.19
2022-10-30 02:30:00+02:00,-6.69,-6.69
2022-10-30 02:45:00+02:00,-130.19,-130.19
But I want the offset to switch back from +02:00 to +01:00 when 2022-10-30 02:00 is reached for the second time.
Datetime, Long, Short
2022-10-30 02:00:00+02:00,61.58,61.58
2022-10-30 02:15:00+02:00,49.98,49.98
2022-10-30 02:30:00+02:00,26.72,26.72
2022-10-30 02:45:00+02:00,-111.32,-111.32
2022-10-30 02:00:00+01:00,-111.32,-111.32
2022-10-30 02:15:00+01:00,-130.19,-130.19
2022-10-30 02:30:00+01:00,-6.69,-6.69
2022-10-30 02:45:00+01:00,-130.19,-130.19
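IIUC, your datetime data carries a UTC offset of +02:00 all year round, whereas it should be +01:00 or +02:00 depending on whether DST is in effect.
You can achieve this by first removing the offset and then setting the correct time zone: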
# to datetime, then remove UTC offset (+02:00)
df["dt"] = pd.to_datetime(df["Datetime"]).dt.tz_localize(None)
# set time zone and infer dst transition times
df["dt"] = df["dt"].dt.tz_localize("Europe/Amsterdam", ambiguous="infer")
df
Datetime Long Short dt
0 2022-10-30 02:00:00+02:00 61.58 61.58 2022-10-30 02:00:00+02:00
1 2022-10-30 02:15:00+02:00 49.98 49.98 2022-10-30 02:15:00+02:00
2 2022-10-30 02:30:00+02:00 26.72 26.72 2022-10-30 02:30:00+02:00
3 2022-10-30 02:45:00+02:00 -111.32 -111.32 2022-10-30 02:45:00+02:00
4 2022-10-30 02:00:00+02:00 -111.32 -111.32 2022-10-30 02:00:00+01:00
5 2022-10-30 02:15:00+02:00 -130.19 -130.19 2022-10-30 02:15:00+01:00
6 2022-10-30 02:30:00+02:00 -6.69 -6.69 2022-10-30 02:30:00+01:00
7 2022-10-30 02:45:00+02:00 -130.19 -130.19 2022-10-30 02:45:00+01:00
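To check the result, here is a minimal self-contained sketch that rebuilds the eight sample rows from the question and applies the same two steps. The `utc` column is an addition for illustration only (it is not part of the original data); converting to UTC is one way to confirm that the repeated hour no longer produces duplicate timestamps. Note that `ambiguous="infer"` relies on the rows being in chronological order, which is assumed here.

```python
import pandas as pd

# Sample rows from the question: the 02:00-02:45 hour appears twice,
# always with a fixed +02:00 offset.
raw = [
    "2022-10-30 02:00:00+02:00",
    "2022-10-30 02:15:00+02:00",
    "2022-10-30 02:30:00+02:00",
    "2022-10-30 02:45:00+02:00",
    "2022-10-30 02:00:00+02:00",
    "2022-10-30 02:15:00+02:00",
    "2022-10-30 02:30:00+02:00",
    "2022-10-30 02:45:00+02:00",
]
df = pd.DataFrame({"Datetime": raw})

# Step 1: parse, then drop the fixed offset but keep the wall-clock time.
naive = pd.to_datetime(df["Datetime"]).dt.tz_localize(None)

# Step 2: attach the real time zone; "infer" uses the row order to decide
# which occurrence of the repeated hour is DST and which is standard time.
df["dt"] = naive.dt.tz_localize("Europe/Amsterdam", ambiguous="infer")

# Illustrative extra column: in UTC every instant should be distinct.
df["utc"] = df["dt"].dt.tz_convert("UTC")

print(df[["dt", "utc"]])
print("unique in UTC:", df["utc"].is_unique)
```

If the rows are not sorted, or only one half of the repeated hour is present, `ambiguous="infer"` cannot decide and raises an error; in that case a boolean array passed as `ambiguous` (True for DST) is the documented alternative.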