Time series model specification

Question

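I want to run a regression analysis in R to explain the variation in my DV, g_law_tot, which is the growth rate of the total budget from t-1 to t. I have annual data from 1994 to 2023, with a number of political, institutional and economic variables as IVs. I am not interested in forecasting; I only want to understand which factors are most relevant in explaining the annual percentage change in the budget. Is a time series approach the right choice? How do I work out which specific model best fits my case? I am lost in too many videos, readings and blogs.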

Thank you very much!

Here is my df:

df <- structure(list(year = c(1994, 1995, 1996, 1997, 1998, 1999, 2000, 
2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 
2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 
2023), end_legislative_term = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 
0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0), 
    technocratic = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0), C_enpp = c(7.88, 
    7.88, 6.07, 6.07, 6.07, 6.07, 6.07, 5.45, 5.45, 5.45, 5.45, 
    5.45, 5.09, 5.09, 3.08, 3.08, 3.08, 3.08, 3.08, 3.52, 3.52, 
    3.52, 3.52, 3.52, 4.38, 4.38, 4.38, 4.38, 5.64, 5.64), leading_ch = c(0, 
    1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 
    0, 1, 0, 1, 0, 1, 0, 1, 0, 1), steering_center = c(0, 0, 
    0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 
    1, 1, 1, 0, 0, 0, 0, 0, 0), gpd_growth = c(2.683230835, 1.266540391, 
    1.830276287, 1.810314719, 1.625659888, 3.78691271, 1.951454556, 
    0.25403202, 0.13847967, 1.424073911, 0.817498767, 1.790934831, 
    1.486917388, -0.962084833, -5.280713695, 1.713274462, 0.707045938, 
    -2.980514474, -1.841095694, -0.004870179, 0.778657835, 1.293374251, 
    1.667491666, 0.926246385, 0.482993514, -8.974277401, 8.313634284, 
    3.724824143, 0.691742081, NA), g_law_tot = c(13.1533324313674, 
    6.60474423446604, -11.8440505854976, 2.8432575110453, 10.6431848644662, 
    7.29624633255459, 2.81686804707955, -0.0275394074771063, 
    -0.570750090159, -0.192522133860973, -0.553271817228385, 
    -2.94082586015983, 4.28422079688282, 5.5247420527077, -4.450067276262, 
    -0.0110042515499731, -3.47617754038942, -0.98458875847075, 
    2.91395911842109, 4.04542948833524, 5.56044089389971, -2.05342343455547, 
    -0.0998891519709444, 2.70233019929693, 2.48770368114364, 
    2.31876491767493, 15.1682979912092, 6.53052765440996, 4.09519458060663, 
    -4.94486838456369)), row.names = c(NA, -30L), class = c("tbl_df", 
"tbl", "data.frame"))

Answer 1

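I don't work with time series often, but the vignettes for the modeltime ecosystem are very clear and not hard to follow. Your data look like they may lack seasonality, but again, this is not my field at all.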

I've added a workflow based on the vignette:

library(tidyverse)
library(tidymodels)
library(modeltime)
library(timetk)
library(vip)

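# Build a proper Date from the numeric year; as.Date() does not read a bare
# number such as 1994 as a calendar year
df <- df |> mutate(year = as.Date(paste0(year, "-01-01")))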

# Split (not mandatory)
df_splits <- initial_time_split(df)
df_te <- testing(df_splits)
df_tr <- training(df_splits)

# Classic ARIMA model; see also:
# https://cran.r-project.org/web/packages/modeltime/vignettes/getting-started-with-modeltime.html
fit_arima <-
  arima_reg() |>
  set_engine(engine = "auto_arima") |>
  fit(g_law_tot ~ year, data = df_tr)

# Linear regression on all remaining columns (year included)
fit_lm <-
  linear_reg() |>
  set_engine(engine = "lm") |>
  fit(g_law_tot ~ ., data = df_tr)

# Variable importance for the linear model
fit_lm |> vip::vip()

# Calibrate both models on the test set once, then reuse the result
calib <- modeltime_table(fit_arima, fit_lm) |>
  modeltime_calibrate(new_data = df_te)

# Plot the test-set forecasts against the actual series
calib |>
  modeltime_forecast(new_data = df_te, actual_data = df) |>
  plot_modeltime_forecast(.interactive = FALSE)

# Accuracy table on the test set
calib |>
  modeltime_accuracy() |>
  table_modeltime_accuracy(.interactive = FALSE)
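
Since the question is about explanation rather than forecasting, one possible starting point (my assumption, not something the vignette prescribes) is a plain linear regression on the full sample, followed by a check for leftover serial correlation in the residuals. The sketch below uses base R only. The data are copied from the question's df and rounded to two decimals; the choice of three predictors and of lag = 4 in the test is arbitrary and only for illustration.

```r
# Subset of the question's df, values rounded to two decimals
year <- 1994:2023
g_law_tot <- c(13.15, 6.60, -11.84, 2.84, 10.64, 7.30, 2.82, -0.03, -0.57, -0.19,
               -0.55, -2.94, 4.28, 5.52, -4.45, -0.01, -3.48, -0.98, 2.91, 4.05,
               5.56, -2.05, -0.10, 2.70, 2.49, 2.32, 15.17, 6.53, 4.10, -4.94)
gpd_growth <- c(2.68, 1.27, 1.83, 1.81, 1.63, 3.79, 1.95, 0.25, 0.14, 1.42,
                0.82, 1.79, 1.49, -0.96, -5.28, 1.71, 0.71, -2.98, -1.84, 0.00,
                0.78, 1.29, 1.67, 0.93, 0.48, -8.97, 8.31, 3.72, 0.69, NA)
C_enpp <- c(7.88, 7.88, rep(6.07, 5), rep(5.45, 5), 5.09, 5.09, rep(3.08, 5),
            rep(3.52, 5), rep(4.38, 4), 5.64, 5.64)
end_legislative_term <- as.integer(year %in% c(2000, 2005, 2012, 2017))

dat <- data.frame(year, g_law_tot, gpd_growth, C_enpp, end_legislative_term)

# OLS on the full sample; lm() drops the 2023 row because gpd_growth is NA there
fit <- lm(g_law_tot ~ gpd_growth + C_enpp + end_legislative_term, data = dat)
print(summary(fit))

# Ljung-Box test on the residuals: a small p-value would suggest that serial
# correlation remains and that a model with time series structure is worth a look
lb <- Box.test(residuals(fit), lag = 4, type = "Ljung-Box")
print(lb)
```

If the residuals show no clear autocorrelation, the coefficient table from summary(fit) is a reasonable first read on which variables matter. With only about 30 annual observations, any conclusion should be treated with caution.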
