How to use the ets function in R for time series with predictor variables


I have this dataset:

dat1197=structure(list(Dates = structure(c(18993, 19024, 19052, 19083, 
19113, 19144, 19174, 19205, 19236, 19266, 19297, 19327, 19358, 
19389, 19417, 19448, 19478, 19509, 19539, 19570, 19601, 19631, 
19662, 19692, 19723), class = "Date"), total = c(290107L, 198827L, 
369809L, 328653L, 230351L, 319991L, 361509L, 263837L, 423810L, 
267680L, 195494L, 236771L, 202171L, 286674L, 313943L, 303044L, 
307096L, 170928L, 144136L, 189956L, 232079L, 201174L, 199433L, 
150333L, 195069L), conv_count = c(31L, 9414L, 10662L, 10817L, 
10544L, 10824L, 11828L, 13365L, 11795L, 12731L, 12961L, 11215L, 
16180L, 20123L, 16419L, 16190L, 17597L, 16966L, 18805L, 16072L, 
18493L, 17952L, 24781L, 25582L, 712L), unique_id_publishers = c(4270L, 
4838L, 4227L, 4628L, 4300L, 5178L, 4297L, 8440L, 7616L, 10328L, 
7959L, 6239L, 7429L, 7748L, 7189L, 6837L, 7393L, 6773L, 7028L, 
7395L, 7473L, 10730L, 8814L, 64489L, 5464L), median_seconds = c(7881.49604743083, 
7881.49604743083, 488.966666666667, 488.966666666667, 531.916666666667, 
488.966666666667, 531.916666666667, 595, 574.75, 604.25, 595, 
721.25, 595, 1000.75, 1479.5, 1196.5, 2514.5, 2324, 2642.5, 828, 
4821, 4344.5, 6468, 3941, 8822), total_forecasted = c(252179.383228222, 
211378.341678112, 298854.813540318, 297876.900653167, 298769.06537375, 
297419.968269761, 293248.366585249, 282633.709438049, 290279.426901374, 
283780.066745602, 284744.759870922, 292012.326293479, 271309.781396652, 
249031.822264103, 259416.064075342, 264210.373105608, 241258.234178068, 
246833.896200638, 234745.99587691, 268889.359224122, 208522.098966603, 
214275.525057851, 159854.631183384, 144778.271030721, 236571.818861993
)), row.names = c(NA, -25L), class = "data.frame")

I want to fit a time series model with predictors. My dependent variable is total. The variables conv_count, unique_id_publishers and median_seconds are the predictors that should explain total.

This is what I tried. The code iterates over the ETS model parameters to find the model with the highest R-squared.

library(forecast)
library(zoo)
library(data.table)  # needed for as.data.table() below

# Convert the dataset to data.table
dat1197 <- as.data.table(dat1197)

# Convert the Dates column to Date format
dat1197$Dates <- as.Date(paste(dat1197$Dates, "-01", sep=""))
# Create a time series without a Dates column

# Dividing the sample into training and test
train_data <- dat1197[Dates < as.Date("2023-11-01")]
test_data <- dat1197[Dates >= as.Date("2023-11-01") & Dates <= as.Date("2024-01-01")]
ts_data <- zoo(train_data[, c("total")])
# Specifying predictors
xreg <- train_data[, c("conv_count", "unique_id_publishers", "median_seconds")]
# Convert predictors to a numeric matrix
xreg_matrix <- as.matrix(xreg)

best_model <- NULL
best_r_squared <- 0

# Loop for selecting ETS model parameters with maximum R-squared
for (error in c("A", "M")) {
   for (trend in c("N", "A", "Ad", "M")) {
     for (seasonal in c("N", "A", "Ad", "M")) {
       model <- ets(ts_data, model = paste0(error, trend, seasonal), xreg = xreg_matrix)
       r_squared <- accuracy(model)$R2
       if (r_squared > best_r_squared) {
         best_model <- model
         best_r_squared <- r_squared
       }
     }
   }
}

# Obtaining forecasts for the test period
forecast_data <- forecast(best_model, xreg = as.matrix(test_data[, c("conv_count", "unique_id_publishers", "median_seconds")]), newdata = as.matrix(test_data[, c("conv_count", "unique_id_publishers ", "median_seconds")]), h = nrow(test_data))

I get this error:

Error in ets(ts_data, model = paste0(error, trend, seasonal), xreg = xreg_matrix) :
   No model able to be fitted

What am I doing wrong, and what is the correct way to fit a time series model with my predictors? Any help is appreciated.

r time-series forecasting
1 Answer
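  1. ets() has no xreg argument. See the help file. The smooth::es() function does allow covariates.
  2. Looping over the models like this is pointless, because ets() does this internally if you do not specify the model argument.
  3. R-squared is a poor way to choose a forecasting model. It does not account for model complexity, and it measures correlation rather than forecast accuracy. To see the problem, imagine forecasts that are exactly half of the corresponding observations: they correlate perfectly with the data, yet every forecast is off by 50%.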
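Points 1 and 2 could be put into practice roughly as follows. This is a sketch under stated assumptions, not the answerer's code: it lets ets() choose the model itself (without covariates), and for the covariates it swaps in a different technique, regression with ARIMA errors via forecast::auto.arima(), which does accept xreg. The data are copied from the question (median_seconds rounded to two decimals). Leaving out seasonal terms is my own choice, made because the training set covers fewer than two full years.

```r
library(forecast)

# Data copied from the question; median_seconds rounded to 2 decimals.
total <- c(290107, 198827, 369809, 328653, 230351, 319991, 361509, 263837,
           423810, 267680, 195494, 236771, 202171, 286674, 313943, 303044,
           307096, 170928, 144136, 189956, 232079, 201174, 199433, 150333,
           195069)
xreg_all <- cbind(
  conv_count = c(31, 9414, 10662, 10817, 10544, 10824, 11828, 13365, 11795,
                 12731, 12961, 11215, 16180, 20123, 16419, 16190, 17597,
                 16966, 18805, 16072, 18493, 17952, 24781, 25582, 712),
  unique_id_publishers = c(4270, 4838, 4227, 4628, 4300, 5178, 4297, 8440,
                           7616, 10328, 7959, 6239, 7429, 7748, 7189, 6837,
                           7393, 6773, 7028, 7395, 7473, 10730, 8814, 64489,
                           5464),
  median_seconds = c(7881.50, 7881.50, 488.97, 488.97, 531.92, 488.97,
                     531.92, 595, 574.75, 604.25, 595, 721.25, 595, 1000.75,
                     1479.5, 1196.5, 2514.5, 2324, 2642.5, 828, 4821,
                     4344.5, 6468, 3941, 8822)
)

# Same split as in the question: Jan 2022 to Oct 2023 for training,
# Nov 2023 to Jan 2024 for testing.
n_train <- 22
n_all <- length(total)
y_train <- ts(total[1:n_train], start = c(2022, 1), frequency = 12)
y_test <- ts(total[(n_train + 1):n_all], start = c(2023, 11), frequency = 12)
x_train <- xreg_all[1:n_train, , drop = FALSE]
x_test <- xreg_all[(n_train + 1):n_all, , drop = FALSE]

# ETS without covariates: "Z" lets ets() pick the error and trend types
# itself (by AICc by default); the final "N" disables seasonality (assumption).
fit_ets <- ets(y_train, model = "ZZN")
fc_ets <- forecast(fit_ets, h = length(y_test))

# Regression with ARIMA errors: a different model class that accepts xreg.
fit_arima <- auto.arima(y_train, xreg = x_train, seasonal = FALSE)
fc_arima <- forecast(fit_arima, xreg = x_test)

# Compare the two on the held-out months rather than on in-sample R-squared.
print(accuracy(fc_ets, y_test))
print(accuracy(fc_arima, y_test))
```

Note that forecasting with covariates requires their future values. Here the test-period values are known, but in a real forecast they would have to be predicted or planned themselves.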
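The R-squared argument in point 3 can be checked numerically. The sketch below uses base R only. The pred vector is hypothetical (the first five total values from the question, halved) and is not the output of any fitted model.

```r
# First five `total` values from the question
actual <- c(290107, 198827, 369809, 328653, 230351)

# Hypothetical forecasts that are exactly half of the observations
pred <- actual / 2

# R-squared as squared correlation between observations and forecasts
r_squared <- cor(actual, pred)^2

# Error-based accuracy measures
rmse <- sqrt(mean((actual - pred)^2))
mape <- mean(abs((actual - pred) / actual)) * 100

print(c(r_squared = r_squared, rmse = rmse, mape = mape))
```

R-squared comes out as 1 (up to floating-point rounding) even though the mean absolute percentage error is 50%, which is why an error measure computed on held-out data is a more sensible selection criterion.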