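I'm getting an error when trying to predict with a fitted forecaster object. I followed this SarimaxForecaster tutorial and it ran without any errors. However, my dataset is hourly, and when I try to predict I get the following error -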
TypeError: Expected index of type <class 'pandas.core.indexes.datetimes.DatetimeIndex'> for `last_window`. Got <class 'pandas.core.indexes.range.RangeIndex'>.
It seems to be related to last_window in the function. See the reproducible code and the full error log below -
# libraries
import pandas as pd
import numpy as np
from skforecast.Sarimax import Sarimax
from skforecast.ForecasterSarimax import ForecasterSarimax
# dataset
start_date = '2023-01-01'
end_date = '2024-01-01'
date_range = pd.date_range(start=start_date, end=end_date, freq='D')
qty = np.random.randint(low=10, high=100, size=len(date_range))
data = pd.DataFrame({'date': date_range, 'qty': qty})
data.set_index('date', inplace=True)
end_train = '2023-11-01'
data_train = data.loc[:end_train]
data_test = data.loc[end_train:]
# changing to series for skforecast
data_train_series = pd.Series(data_train['qty'].values, index=data_train.index, name='qty')
data_test_series = pd.Series(data_test['qty'].values, index=data_test.index, name='qty')
# sarimax forecaster
forecaster = ForecasterSarimax(
    regressor=Sarimax(order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
)
forecaster.fit(y=data_train_series, suppress_warnings=True)
# predict
predictions = forecaster.predict(steps=len(data_test_series))
Full error log -
TypeError Traceback (most recent call last)
File ~/Desktop/School/2024_projects/time_series_feature_engineering/practice/01_bakery_skforecast_quickstart.py:1
----> 1 predictions = forecaster.predict(steps=len(data_test_series))
File ~/Desktop/School/2024_projects/time_series_feature_engineering/.venv/lib/python3.9/site-packages/skforecast/ForecasterSarimax/ForecasterSarimax.py:363, in ForecasterSarimax.predict(self, steps, last_window, last_window_exog, exog)
359 # Needs to be a new variable to avoid arima_res_.append when using
360 # self.last_window. It already has it stored.
361 last_window_check = last_window if last_window is not None else self.last_window
--> 363 check_predict_input(
364 forecaster_name = type(self).__name__,
365 steps = steps,
366 fitted = self.fitted,
367 included_exog = self.included_exog,
368 index_type = self.index_type,
369 index_freq = self.index_freq,
370 window_size = self.window_size,
371 last_window = last_window_check,
372 last_window_exog = last_window_exog,
373 exog = exog,
374 exog_type = self.exog_type,
375 exog_col_names = self.exog_col_names,
376 interval = None,
377 alpha = None,
378 max_steps = None,
379 levels = None,
380 series_col_names = None
381 )
383 # If not last_window is provided, last_window needs to be None
384 if last_window is not None:
File ~/Desktop/School/2024_projects/time_series_feature_engineering/.venv/lib/python3.9/site-packages/skforecast/utils/utils.py:643, in check_predict_input(forecaster_name, steps, fitted, included_exog, index_type, index_freq, window_size, last_window, last_window_exog, exog, exog_type, exog_col_names, interval, alpha, max_steps, levels, series_col_names)
638 _, last_window_index = preprocess_last_window(
639 last_window = last_window.iloc[:0],
640 return_values = False
641 )
642 if not isinstance(last_window_index, index_type):
--> 643 raise TypeError(
644 (f"Expected index of type {index_type} for `last_window`. "
645 f"Got {type(last_window_index)}.")
646 )
647 if isinstance(last_window_index, pd.DatetimeIndex):
648 if not last_window_index.freqstr == index_freq:
TypeError: Expected index of type <class 'pandas.core.indexes.datetimes.DatetimeIndex'> for `last_window`. Got <class 'pandas.core.indexes.range.RangeIndex'>.
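I ran into the same error, and the cause can be found in the docstring of the preprocess_last_window() function. If the frequency of the DatetimeIndex is not set, this function converts the DatetimeIndex to a RangeIndex, which is why the error message says a DatetimeIndex was expected but a RangeIndex was provided. In my opinion this is a rather misleading error message, since you did in fact provide a pd.Series with a DatetimeIndex. If you add the following line of code before end_train = '2023-11-01', it should work.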
data.index = pd.date_range(start=data.index[0], end=data.index[-1], freq='D')
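To confirm that the missing frequency is the culprit, you can inspect the index with pandas alone. The sketch below is only an illustration, not part of the original reproducer: it rebuilds a frame like the one in the question, shows that the index has no frequency after set_index, and then restores it with DataFrame.asfreq, which I'm suggesting as an alternative to the line above. The names freq_before, data_fixed and data_train are my own, and I'm assuming the data has no missing dates (asfreq would otherwise insert NaN rows for the gaps).

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (values are arbitrary).
date_range = pd.date_range(start='2023-01-01', end='2024-01-01', freq='D')
rng = np.random.default_rng(0)
qty = rng.integers(low=10, high=100, size=len(date_range))
data = pd.DataFrame({'date': date_range, 'qty': qty})
data = data.set_index('date')

# The index is a DatetimeIndex, but its frequency was lost in set_index.
freq_before = data.index.freq
print(type(data.index).__name__, freq_before)

# Alternative fix (assumption: no gaps in the dates): attach the frequency.
data_fixed = data.asfreq('D')
print(data_fixed.index.freqstr)

# Label slicing keeps the frequency, so the train split carries it too.
data_train = data_fixed.loc[:'2023-11-01']
print(data_train.index.freqstr)
```

For hourly data like yours, pass the hourly frequency alias instead of 'D'. Once the index reports a frequency, the fit and predict calls from the question can be used unchanged.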