How to get a continuous date index based on the first index value in pandas


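I have a df with the index shown below. At some point the dates jump from 2024-03-03 back to 2023-02-25. I want to replace the wrong part (2023...) with a logical continuation of the correct part.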

Sample:

2024-02-23    -5.60000
2024-02-24   -13.00000
2024-02-25   -27.20000
2024-02-26    -4.20000
2024-02-27   -11.20000
2024-02-28   -14.73625
2024-02-29   -19.37000
2024-03-01   -16.89000
2024-03-02    -5.97000
2024-03-03    -1.30000
2023-02-25   -35.40000
2023-02-26   -28.70000
2023-02-27   -26.40000
2023-02-28   -15.40000
2023-03-01   -14.10000
2023-03-02   -11.20000
2023-03-03   -21.00000
2023-03-04   -17.00000
2023-03-05   -17.60000
2023-03-06    -6.70000

How can I do this in a clean, Pythonic way?

python pandas datetime
1 Answer

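To correct a DataFrame index in which part of the dates are wrong (for example, they jump back a year), you can locate the discontinuity and then shift the incorrect dates by the necessary time delta: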

import pandas as pd


dates = ['2024-02-23', '2024-02-24', '2024-02-25', '2024-02-26', '2024-02-27', 
         '2024-02-28', '2024-02-29', '2024-03-01', '2024-03-02', '2024-03-03',
         '2023-02-25', '2023-02-26', '2023-02-27', '2023-02-28', '2023-03-01',
         '2023-03-02', '2023-03-03', '2023-03-04', '2023-03-05', '2023-03-06']
values = [-5.6, -13.0, -27.2, -4.2, -11.2, -14.73625, -19.37, -16.89, -5.97,
          -1.3, -35.4, -28.7, -26.4, -15.4, -14.1, -11.2, -21.0, -17.0, -17.6, -6.7]
df = pd.DataFrame(values, index=pd.to_datetime(dates), columns=['Values'])

# Find the first position where the date goes backwards instead of forwards
steps = df.index.to_series().diff()
pos = (steps < pd.Timedelta(0)).to_numpy().argmax()
# Shift everything from that position on so it continues one day after the
# last correct date (assumes exactly one backwards jump, as in the sample)
offset = df.index[pos - 1] + pd.Timedelta(days=1) - df.index[pos]
df.index = df.index[:pos].append(df.index[pos:] + offset)

print(df)
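
If every row is known to be exactly one day after the previous one, with no genuine gaps, a shorter alternative is to ignore the broken dates and rebuild the whole index from its first value with pd.date_range. That spacing holds for the sample, but it is an assumption about the full data. A minimal sketch, using a shortened copy of the sample and a hypothetical helper name rebuild_daily_index:

```python
import pandas as pd


def rebuild_daily_index(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose index is a daily range starting at df.index[0].

    Hypothetical helper: assumes one row per day with no missing days.
    """
    out = df.copy()
    out.index = pd.date_range(start=df.index[0], periods=len(df), freq="D")
    return out


# Shortened sample: the dates jump back after 2024-03-03
dates = ["2024-03-02", "2024-03-03", "2023-02-25", "2023-02-26"]
sample = pd.DataFrame({"Values": [-5.97, -1.3, -35.4, -28.7]},
                      index=pd.to_datetime(dates))

fixed = rebuild_daily_index(sample)
print(fixed)
```

This version does not preserve real gaps in the data, so it only fits when the rows really are consecutive days. If the correct part of the index may contain gaps, shifting only the rows after the discontinuity is the safer choice.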