Time series with pycaret hangs in compare_models

Problem description

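I am trying to do time series forecasting with the pycaret AutoML package in Google Colab, using the data from the following link: parts_revenue_data. When I try to compare models and find the best one, the code hangs and stays at 20%.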

The code is shown below.

# Only enable critical logging (Optional)
import os
os.environ["PYCARET_CUSTOM_LOGGING_LEVEL"] = "CRITICAL"

def what_is_installed():
    from pycaret import show_versions
    show_versions()

try:
    what_is_installed()
except ModuleNotFoundError:
    !pip install pycaret
    what_is_installed()

import pandas as pd
import numpy as np
import pycaret
pycaret.__version__ # 3.1.0

df = pd.read_csv('parts_revenue.csv', delimiter=';')

from pycaret.utils.time_series import clean_time_index

cleaned = clean_time_index(data=df,
                           index_col='Posting Date',
                           freq='D')

# Verify the resulting DataFrame
print(cleaned.head(n=50))

# parts['MA12'] = parts['Parts Revenue'].rolling(12).mean()


# import plotly.express as px
# fig = px.line(parts, x="Posting Date", y=["Parts Revenue", 
#                "MA12"], template = 'plotly_dark')
# fig.show()

import time
import numpy as np

from pycaret.time_series import *

# We want to forecast the next 12 days of data and we will use 3 
# fold cross-validation to test the models.
fh = 12 # or alternately fh = np.arange(1,13)
fold = 3

# Global Figure Settings for notebook ----
# Depending on whether you are using jupyter notebook, jupyter lab, 
# Google Colab, you may have to set the renderer appropriately
# NOTE: Setting to a static renderer here so that the notebook 
# saved size is reduced.
fig_kwargs = {
              # "renderer": "notebook",
              "renderer": "png",
              "width": 1000,
              "height": 600,
             }

"""## EDA"""

eda = TSForecastingExperiment()
eda.setup(cleaned,
          fh=fh,
          numeric_imputation_target = 0,
          fig_kwargs=fig_kwargs
        )

eda.plot_model()


eda.plot_model(plot="diagnostics",
               fig_kwargs={"height": 800, "width": 1000}
              )

eda.plot_model(
    plot="diff",
    data_kwargs={
        "lags_list": [[1], [1, 7]],
        "acf": True,
        "pacf": True,
        "periodogram": True,
    },
    fig_kwargs={"height": 800, "width": 1500},
)


"""## Modeling"""

exp = TSForecastingExperiment()
exp.setup(data = cleaned,
          fh=fh,
          numeric_imputation_target = 0.0,
          fig_kwargs=fig_kwargs,
          seasonal_period = 5
      )

# compare baseline models
best = exp.compare_models(errors='raise') # CODE HANGS HERE!

# plot the forecast for the next 24 periods (days, since freq='D')
exp.plot_model(best,
               plot='forecast',
               data_kwargs={'fh': 24}
              )

Is this related to a bug in pycaret, or is there a problem with my code?

python time-series google-colaboratory forecasting pycaret
1 Answer

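Note: I don't have enough reputation to comment, so I am putting this quasi-workaround here. I can delete it later if needed, or move it to a comment once I have enough reputation.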

I have also seen compare_models run unusually slowly on time series on my MBP with an M1 Max (i.e., more than 10 minutes on a dataset of about 4,000 records). I have not tried it in Colab.

After noticing that it was hanging on Auto ARIMA, I excluded that model from the list as shown below. This reduced the run time to about 1 minute.

# compare baseline models
best = exp.compare_models(errors="raise", exclude=["auto_arima"])

I know this is not a solution in itself, but perhaps it can help unblock you.
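One way to see which model is the slow one is to fit the candidates one at a time and print each name before it starts. The following is a minimal sketch of that idea, not pycaret's own mechanism: `time_each_model` and `fit_one` are hypothetical names introduced here, and the commented usage assumes that `exp.models()` returns a table indexed by model ID and that `exp.create_model` accepts such an ID.

```python
import time
from typing import Callable, Dict, Iterable


def time_each_model(
    model_ids: Iterable[str],
    fit_one: Callable[[str], object],
) -> Dict[str, float]:
    """Fit each model on its own and record the wall-clock seconds it took.

    The name is printed before fitting starts, so if a model hangs,
    the last printed name identifies it.
    """
    timings: Dict[str, float] = {}
    for model_id in model_ids:
        print(f"fitting {model_id} ...", flush=True)
        start = time.perf_counter()
        try:
            fit_one(model_id)
        except Exception as err:
            # A failing model is reported and skipped, not timed.
            print(f"  {model_id} failed: {err!r}", flush=True)
            continue
        timings[model_id] = time.perf_counter() - start
        print(f"  {model_id} took {timings[model_id]:.1f}s", flush=True)
    return timings


# Hypothetical usage with the experiment from the question (assumed, not run here):
# model_ids = exp.models().index.tolist()
# timings = time_each_model(
#     model_ids,
#     lambda model_id: exp.create_model(model_id, verbose=False),
# )
```

If the loop stalls, the last name printed is the model to pass to exclude.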

Environment details:

  • Python 3.10.12
  • pycaret==3.1.0
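
Details like these can be collected with the standard library. This is a small sketch; `environment_details` is a hypothetical helper, and listing `sktime` and `pmdarima` alongside `pycaret` is an assumption about which packages are relevant to the time series models.

```python
import platform
from importlib import metadata


def environment_details(packages=("pycaret",)):
    """Return the Python version and the installed version of each package."""
    details = {"python": platform.python_version()}
    for name in packages:
        try:
            details[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of raising.
            details[name] = "not installed"
    return details


# Package names other than pycaret are assumptions for illustration.
for name, version in environment_details(("pycaret", "sktime", "pmdarima")).items():
    print(f"{name}=={version}")
```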