Statsmodels ARIMA - 使用predict（）和forecast（）的不同结果

Question

我会使用（Statsmodels）ARIMA来预测系列中的值：

plt.plot(ind, final_results.predict(start=0 ,end=26))
plt.plot(ind, forecast.values)
plt.show()

我想我会从这两个图得到相同的结果，但我得到了这个：

我想知道问题是关于预测还是预测

Answer 1

从图表中可以看出，您正在使用forecast()进行样本预测，使用预测进行样本内预测。根据ARIMA方程的性质，样本外预测往往会收敛到长预测期的样本均值。

为了找出forecast()和predict()如何适用于不同场景，我系统地比较了ARIMA_results类中的各种模型。随意重现与statsmodels_arima_comparison.py in this repository的比较。我查看了order=(p,d,q)的每个组合，仅将p, d, q限制为0或1.例如，可以使用order=(1,0,0)获得简单的自回归模型。简而言之，我使用以下(stationary) time series查看了三个选项：

答：迭代样本内预测形成了历史。历史由前80％的时间序列组成，测试集由最后的20％组成。然后我预测了测试集的第一点，为历史增加了真实值，预测了第二点等。这将给出模型预测质量的评估。

for t in range(len(test)):
    model = ARIMA(history, order=order)
    model_fit = model.fit(disp=-1)
    yhat_f = model_fit.forecast()[0][0]
    yhat_p = model_fit.predict(start=len(history), end=len(history))[0]
    predictions_f.append(yhat_f)
    predictions_p.append(yhat_p)
    history.append(test[t])

B.接下来，我通过迭代预测测试系列的下一个点来查看样本外预测，并将此预测附加到历史记录中。

for t in range(len(test)):
    model_f = ARIMA(history_f, order=order)
    model_p = ARIMA(history_p, order=order)
    model_fit_f = model_f.fit(disp=-1)
    model_fit_p = model_p.fit(disp=-1)
    yhat_f = model_fit_f.forecast()[0][0]
    yhat_p = model_fit_p.predict(start=len(history_p), end=len(history_p))[0]
    predictions_f.append(yhat_f)
    predictions_p.append(yhat_p)
    history_f.append(yhat_f)
    history_f.append(yhat_p)

C.我使用forecast(step=n)参数和predict(start, end)参数，以便使用这些方法进行内部多步预测。

model = ARIMA(history, order=order)
    model_fit = model.fit(disp=-1)
    predictions_f_ms = model_fit.forecast(steps=len(test))[0]
    predictions_p_ms = model_fit.predict(start=len(history), end=len(history)+len(test)-1)

结果表明：

A.预测和预测AR的产量相同结果，但ARMA的结果不同：test time series chart

B.预测和预测产量AR和ARMA的不同结果：test time series chart

C.预测和预测产量相同的AR结果，但ARMA的结果不同：test time series chart

此外，比较B.和C中看似相同的方法。我发现结果中存在细微但可见的差异。

我认为这些差异主要是由于“在forecast()中对原始内源变量的水平进行预测”和predict()预测水平差异（compare the API reference）。

此外，鉴于我更信任statsmodels函数的内部功能，而不是我的简单迭代预测循环（这是主观的），我建议使用forecast(step)或predict(start, end)。

Answer 2

继续noteven2degrees的回复，我提交了一个拉取请求，以便在方法B中从history_f.append(yhat_p)更正为history_p.append(yhat_p)。

此外，正如noteven2degrees建议的那样，与forecast()不同，predict()需要一个参数typ='levels'来输出预测，而不是差异预测。

在上述两个变化之后，方法B产生与方法C相同的结果，而方法C花费的时间少得多，这是合理的。两者都趋向于趋势，因为我认为这与模型本身的平稳性有关。

无论在哪种方法中，forecast()和predict()都会产生与p，d，q的任何配置相同的结果。

Statsmodels ARIMA - 使用predict（）和forecast（）的不同结果

问题描述投票：6回答：2

2个回答

最新问题

Statsmodels ARIMA - 使用predict（）和forecast（）的不同结果

问题描述 投票：6回答：2

2个回答

最新问题

问题描述投票：6回答：2