Why does ordinal regularized regression (ordinalNet) give coefficients with the wrong sign?

Question

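I have data with an ordered factor response ("low", "med", "high") and a large number of predictors. I am using the ordinalNet package to fit an ordinal regularized regression model.

This is my first time using the package, so I tested it on simulated data. However, the signs of the coefficients are the opposite of what I expected.

On this simulated data I expect a positive coefficient for x. This is visible in the plot (the probability of med and then high clearly increases as x increases). A non-regularized approach (MASS::polr) produces the positive coefficient I expect. However, the coefficients produced by ordinalNet are negative.

Thanks for any help!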

library(tidyverse)
library(MASS)
#> 
#> Attaching package: 'MASS'
#> The following object is masked from 'package:dplyr':
#> 
#>     select
library(ordinalNet)

# simulated data
dat <- 
  expand_grid(
    repeats = 1:10,
    id = 1:3
  ) %>%
  mutate(
    x = id + runif(length(id), -1, 1),
    x2 = runif(length(id), -1, 1),
    y = factor(c("low", "med", "high"), c("low", "med", "high"), ordered = TRUE)[id],
  )
  
# inspect
dat
#> # A tibble: 30 × 5
#>    repeats    id     x     x2 y    
#>      <int> <int> <dbl>  <dbl> <ord>
#>  1       1     1 0.498  0.539 low  
#>  2       1     2 1.31  -0.406 med  
#>  3       1     3 2.30   0.774 high 
#>  4       2     1 0.163  0.408 low  
#>  5       2     2 1.99  -0.504 med  
#>  6       2     3 3.86   0.931 high 
#>  7       3     1 0.822 -0.799 low  
#>  8       3     2 1.96  -0.267 med  
#>  9       3     3 2.83   0.769 high 
#> 10       4     1 0.127  0.139 low  
#> # … with 20 more rows

# confirm that high values of x produce higher probability of medium and then high
dat %>%
  ggplot() + 
  aes(x, y) + 
  geom_point()


# with MASS::polr
# https://stats.oarc.ucla.edu/r/dae/ordinal-logistic-regression/
model <- polr(y ~ x + x2, data = dat)
coef(model)
#>         x        x2 
#> 3.6007872 0.6991214

# with ordinalNet
form <- y ~ x + x2
x <- model.matrix(form, dat)[, -1]
y <- model.frame(form, dat)[, 1, drop = TRUE]
rmodel <- ordinalNetCV(x, y)
#> Fitting ordinalNet on full training data
#> Fitting ordinalNet on fold 1 of 5 
#> Fitting ordinalNet on fold 2 of 5 
#> Fitting ordinalNet on fold 3 of 5 
#> Fitting ordinalNet on fold 4 of 5 
#> Fitting ordinalNet on fold 5 of 5 
#> Done
coef(rmodel$fit)
#> (Intercept):1 (Intercept):2             x            x2 
#>     4.2554900     8.4649827    -3.4500300    -0.6367011
# why do coefficients have the wrong sign?

Created on 2023-03-20 with reprex v2.0.2

Answer 1

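reverse: Logical. If TRUE, then the "backward" form of the model is fit, i.e. the model is defined with response categories in reverse order. For example, the reverse cumulative model with K+1 response categories applies the link function to the cumulative probabilities P(Y ≥ 2), ..., P(Y ≥ K+1), rather than P(Y ≤ 1), ..., P(Y ≤ K).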
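
The effect of this argument on the coefficient signs can be written out explicitly. The following is a sketch, assuming the logit link and a parallel (proportional odds) model; the symbols \zeta_k, \alpha_k and \beta are notation introduced here for illustration and do not appear in the original post.

```latex
% MASS::polr (as stated in its documentation): linear predictor is subtracted
\operatorname{logit} P(Y \le k \mid x) = \zeta_k - x^\top \beta

% ordinalNet, cumulative family, reverse = FALSE (default): linear predictor is added
\operatorname{logit} P(Y \le k \mid x) = \alpha_k + x^\top \beta

% ordinalNet, cumulative family, reverse = TRUE: link applied to the upper tail
\operatorname{logit} P(Y \ge k + 1 \mid x) = \alpha_k + x^\top \beta

% Because P(Y \ge k+1) = 1 - P(Y \le k), and \operatorname{logit}(1 - p) = -\operatorname{logit}(p):
\operatorname{logit} P(Y \le k \mid x) = -\alpha_k - x^\top \beta
```

Under these assumptions, the default ordinalNet fit and the polr fit describe the same kind of model with the sign of \beta flipped, which is consistent with the output in the question (about 3.60 for x from polr versus about -3.45 from ordinalNet). With reverse = TRUE, a positive coefficient again means that larger x shifts probability toward the higher categories.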

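Dear Arthur, I ran into exactly the same problem as you, and I am not sure whether my answer is correct. When we fit a cumulative model, we should be very careful about the direction in which the link function accumulates the probabilities. In this case, you can add the argument reverse=TRUE to the model-fitting call.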
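
As a minimal sketch of that suggestion, the code below fits the model both ways on data simulated in the spirit of the question and compares the coefficients. It assumes the ordinalNet package is installed. The seed, the object names (x_mat, fit_forward, fit_reverse) and the use of ordinalNet() rather than ordinalNetCV() are choices made here for illustration, not part of the original post.

```r
# Sketch only: assumes the ordinalNet package is installed.
library(ordinalNet)

set.seed(1)  # arbitrary seed, chosen only to make the sketch reproducible

# Simulated data similar to the question: x rises with the response level,
# x2 is pure noise.
id <- rep(1:3, times = 10)
x_mat <- cbind(
  x  = id + runif(length(id), -1, 1),
  x2 = runif(length(id), -1, 1)
)
y <- factor(c("low", "med", "high")[id],
            levels = c("low", "med", "high"), ordered = TRUE)

# Default (forward) cumulative model: the link is applied to P(Y <= k).
fit_forward <- ordinalNet(x_mat, y)

# Reverse cumulative model: the link is applied to P(Y >= k + 1).
fit_reverse <- ordinalNet(x_mat, y, reverse = TRUE)

# Compare the sign of the coefficient on x under each parameterisation.
coef(fit_forward)
coef(fit_reverse)
```

If the reasoning above holds, the coefficient on x should be negative in the forward fit and positive in the reverse fit. According to the package documentation, ordinalNetCV passes additional arguments on to ordinalNet, so ordinalNetCV(x, y, reverse = TRUE) should have the same effect in the question's code; treat that as something to verify against your installed version.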
