TypeError when predicting with a fitted ForecasterSarimax object in Skforecast

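I get an error when I try to predict with a fitted forecaster object. I followed this SarimaxForecaster tutorial and it ran without errors. However, my dataset is hourly, and when I try to predict I get the following error: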

TypeError: Expected index of type <class 'pandas.core.indexes.datetimes.DatetimeIndex'> for `last_window`. Got <class 'pandas.core.indexes.range.RangeIndex'>.

It seems to be related to last_window in the function. See the reproducible code and the full error log below.

# libraries
import pandas as pd
import numpy as np
from skforecast.Sarimax import Sarimax
from skforecast.ForecasterSarimax import ForecasterSarimax

# dataset
start_date = '2023-01-01'
end_date = '2024-01-01'

date_range = pd.date_range(start=start_date, end=end_date, freq='D')
qty = np.random.randint(low=10, high=100, size=len(date_range))
data = pd.DataFrame({'date': date_range, 'qty': qty})
data.set_index('date', inplace=True)

end_train = '2023-11-01'

# split on the DataFrame defined above (it is named `data`, not `df`)
data_train = data.loc[:end_train]
data_test  = data.loc[end_train:]


# changing to series for skforecast
data_train_series = pd.Series(data_train['qty'].values, index=data_train.index, name='qty')
data_test_series  = pd.Series(data_test['qty'].values, index=data_test.index, name='qty')

# sarimax forecaster
forecaster = ForecasterSarimax(
                 regressor=Sarimax(order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
            )

forecaster.fit(y=data_train_series, suppress_warnings=True)

# predict
predictions = forecaster.predict(steps=len(data_test_series))

Full error log:

TypeError                                 Traceback (most recent call last)
File ~/Desktop/School/2024_projects/time_series_feature_engineering/practice/01_bakery_skforecast_quickstart.py:1
----> 1 predictions = forecaster.predict(steps=len(data_test_series))

File ~/Desktop/School/2024_projects/time_series_feature_engineering/.venv/lib/python3.9/site-packages/skforecast/ForecasterSarimax/ForecasterSarimax.py:363, in ForecasterSarimax.predict(self, steps, last_window, last_window_exog, exog)
    359 # Needs to be a new variable to avoid arima_res_.append when using 
    360 # self.last_window. It already has it stored.
    361 last_window_check = last_window if last_window is not None else self.last_window
--> 363 check_predict_input(
    364     forecaster_name  = type(self).__name__,
    365     steps            = steps,
    366     fitted           = self.fitted,
    367     included_exog    = self.included_exog,
    368     index_type       = self.index_type,
    369     index_freq       = self.index_freq,
    370     window_size      = self.window_size,
    371     last_window      = last_window_check,
    372     last_window_exog = last_window_exog,
    373     exog             = exog,
    374     exog_type        = self.exog_type,
    375     exog_col_names   = self.exog_col_names,
    376     interval         = None,
    377     alpha            = None,
    378     max_steps        = None,
    379     levels           = None,
    380     series_col_names = None
    381 )
    383 # If not last_window is provided, last_window needs to be None
    384 if last_window is not None:

File ~/Desktop/School/2024_projects/time_series_feature_engineering/.venv/lib/python3.9/site-packages/skforecast/utils/utils.py:643, in check_predict_input(forecaster_name, steps, fitted, included_exog, index_type, index_freq, window_size, last_window, last_window_exog, exog, exog_type, exog_col_names, interval, alpha, max_steps, levels, series_col_names)
    638 _, last_window_index = preprocess_last_window(
    639                            last_window   = last_window.iloc[:0],
    640                            return_values = False
    641                        ) 
    642 if not isinstance(last_window_index, index_type):
--> 643     raise TypeError(
    644         (f"Expected index of type {index_type} for `last_window`. "
    645          f"Got {type(last_window_index)}.")
    646     )
    647 if isinstance(last_window_index, pd.DatetimeIndex):
    648     if not last_window_index.freqstr == index_freq:

TypeError: Expected index of type <class 'pandas.core.indexes.datetimes.DatetimeIndex'> for `last_window`. Got <class 'pandas.core.indexes.range.RangeIndex'>.


forecasting arima sarimax
1 Answer

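I ran into the same error. The cause is described in the docstring of the preprocess_last_window() function: if the DatetimeIndex has no frequency set, the function converts it to a RangeIndex. That is why the message says a DatetimeIndex was expected but a RangeIndex was received. I find the message rather misleading, because you did in fact pass a pd.Series with a DatetimeIndex; what is missing is the frequency on that index. If you add the following line before end_train = '2023-11-01', it should work.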

data.index = pd.date_range(start=data.index[0], end=data.index[-1], freq='D')
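
To check this without fitting a model, here is a minimal pandas-only sketch. It rebuilds the question's dataset, shows that the index has no frequency after set_index, and then applies two possible fixes: the line above, and DataFrame.asfreq as an alternative. The names freq_before, fixed_a and fixed_b, and the seeded generator, are illustrative only and not part of the original code.

```python
import numpy as np
import pandas as pd

# Rebuild the question's dataset: a daily date column moved into the index.
date_range = pd.date_range(start="2023-01-01", end="2024-01-01", freq="D")
rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
qty = rng.integers(low=10, high=100, size=len(date_range))
data = pd.DataFrame({"date": date_range, "qty": qty}).set_index("date")

# The index is still a DatetimeIndex, but its frequency was lost when the
# dates passed through a DataFrame column.
freq_before = data.index.freq

# Option 1 (the line from this answer): rebuild the index with an explicit freq.
fixed_a = data.copy()
fixed_a.index = pd.date_range(start=data.index[0], end=data.index[-1], freq="D")

# Option 2 (an alternative): let pandas conform the index to a daily frequency.
fixed_b = data.asfreq("D")

print("freq before fix:", freq_before)
print("freq after option 1:", fixed_a.index.freqstr)
print("freq after option 2:", fixed_b.index.freqstr)

# Slicing into train/test afterwards keeps the frequency on the index.
train = fixed_b.loc[:"2023-11-01"]
print("freq of train slice:", train.index.freqstr)
```

Both options assume the series has no missing dates. If dates are missing, option 1 raises a length mismatch and option 2 inserts rows filled with NaN, which would need handling before fitting.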