Here is my code:
library(MASS)
library(caret)
df <- Boston
set.seed(3721)
cv.10.folds <- createFolds(df$medv, k = 10)
lasso_grid <- expand.grid(fraction=c(1,0.1,0.01,0.001))
lasso <- train(medv ~ .,
               data = df,
               preProcess = c("center", "scale"),
               method = 'lasso',
               tuneGrid = lasso_grid,
               trControl = trainControl(method = "cv",
                                        number = 10,
                                        index = cv.10.folds))
lasso
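Unlike with a linear model, I cannot find the coefficients of the lasso regression model from summary(lasso). What should I do? Or could I perhaps use glmnet instead?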
When you train with method = "lasso", enet from the elasticnet package is called:
lasso$finalModel$call
elasticnet::enet(x = as.matrix(x), y = y, lambda = 0)
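The vignette says:
The LARS-EN algorithm computes the complete elastic net solution simultaneously for ALL values of the shrinkage parameter in the same computational cost as a least squares fit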
Under lasso$finalModel$beta.pure you have all 16 sets of coefficients, one set for each of the 16 L1-norm values stored in lasso$finalModel$L1norm:
length(lasso$finalModel$L1norm)
[1] 16
dim(lasso$finalModel$beta.pure)
[1] 16 13
You can also see them using predict:
predict(lasso$finalModel,type="coef")
$s
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
$fraction
[1] 0.00000000 0.06666667 0.13333333 0.20000000 0.26666667 0.33333333
[7] 0.40000000 0.46666667 0.53333333 0.60000000 0.66666667 0.73333333
[13] 0.80000000 0.86666667 0.93333333 1.00000000
$mode
[1] "step"
$coefficients
crim zn indus chas nox rm age
0 0.00000000 0.0000000 0.00000000 0.0000000 0.0000000 0.000000 0.00000000
1 0.00000000 0.0000000 0.00000000 0.0000000 0.0000000 0.000000 0.00000000
2 0.00000000 0.0000000 0.00000000 0.0000000 0.0000000 1.677765 0.00000000
3 0.00000000 0.0000000 0.00000000 0.0000000 0.0000000 2.571071 0.00000000
4 0.00000000 0.0000000 0.00000000 0.0000000 0.0000000 2.716138 0.00000000
5 0.00000000 0.0000000 0.00000000 0.2586083 0.0000000 2.885615 0.00000000
6 -0.05232643 0.0000000 0.00000000 0.3543411 0.0000000 2.953605 0.00000000
7 -0.13286554 0.0000000 0.00000000 0.4095229 0.0000000 2.984026 0.00000000
8 -0.21665925 0.0000000 0.00000000 0.5196189 -0.5933941 3.003512 0.00000000
9 -0.32168140 0.3326103 0.00000000 0.6044308 -1.0246080 2.973693 0.00000000
10 -0.33568474 0.3771889 -0.02165730 0.6165190 -1.0728128 2.967696 0.00000000
11 -0.42820289 0.4522827 -0.09212253 0.6407298 -1.2474934 2.932427 0.00000000
12 -0.62605363 0.7005114 0.00000000 0.6574277 -1.5655601 2.832726 0.00000000
13 -0.88747102 1.0150162 0.00000000 0.6856705 -1.9476465 2.694820 0.00000000
14 -0.91679342 1.0613165 0.09956489 0.6837833 -2.0217269 2.684401 0.00000000
15 -0.92906457 1.0826390 0.14103943 0.6824144 -2.0587536 2.676877 0.01948534
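The `s` values in this output are step indices, because `mode` is "step" here. `predict` for an enet object can also index the path by the fraction of the maximum L1 norm. A minimal sketch of that, assuming a direct call to `elasticnet::enet` on hand-standardized predictors to mirror the call shown earlier (the names `fit`, `by_fraction` and `last_step` are mine, not from the question):

```r
library(MASS)        # Boston data
library(elasticnet)  # enet() and its predict method

# Assumption: standardize the predictors by hand to mimic caret's
# preProcess = c("center", "scale"), then call enet() the way caret does.
x <- scale(as.matrix(Boston[, setdiff(names(Boston), "medv")]))
y <- Boston$medv
fit <- enet(x = x, y = y, lambda = 0)   # lambda = 0 gives the lasso path

# Coefficients at 10%, 50% and 100% of the maximum L1 norm (one row each)
by_fraction <- predict(fit, type = "coefficients",
                       mode = "fraction", s = c(0.1, 0.5, 1))
by_fraction$coefficients

# fraction = 1 and the last step of the path should be the same solution
last_step <- predict(fit, type = "coefficients", s = nrow(fit$beta.pure))
all.equal(unname(by_fraction$coefficients[3, ]),
          unname(last_step$coefficients))
```

With a caret fit, the same call should work on `lasso$finalModel`, with `s = lasso$bestTune$fraction` in place of the hard-coded fractions.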
The hyperparameter tuned by caret is the fraction of the maximum L1 norm, so in the results you posted it is 1, i.e. the maximum:
lasso
The lasso
506 samples
13 predictor
Pre-processing: centered (13), scaled (13)
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 51, 51, 51, 50, 51, 50, ...
Resampling results across tuning parameters:
fraction RMSE Rsquared MAE
0.001 9.182599 0.5075081 6.646013
0.010 9.022117 0.5075081 6.520153
0.100 7.597607 0.5572499 5.402851
1.000 6.158513 0.6033310 4.140362
RMSE was used to select the optimal model using the smallest value.
The final value used for the model was fraction = 1.
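One detail in this printout is worth a second look: the sample sizes are about 51 rather than about 455. `createFolds()` returns the held-out rows of each fold by default, while the `index` argument of `trainControl()` expects the rows used for training, so each resample here is fit on roughly a tenth of the data. A minimal sketch of one possible fix, assuming the intent was ordinary 10-fold cross-validation (the names `train_folds` and `lasso_cv` are mine):

```r
library(MASS)
library(caret)

set.seed(3721)
# returnTrain = TRUE makes createFolds() return the training rows of each
# fold, which is what trainControl(index = ) expects
train_folds <- createFolds(Boston$medv, k = 10, returnTrain = TRUE)

lasso_cv <- train(medv ~ ., data = Boston,
                  preProcess = c("center", "scale"),
                  method = "lasso",
                  tuneGrid = expand.grid(fraction = c(1, 0.1, 0.01, 0.001)),
                  trControl = trainControl(method = "cv", number = 10,
                                           index = train_folds))

# Each resample should now train on roughly 9/10 of the 506 rows
range(lengths(train_folds))
lasso_cv$results
```

The coefficient path in `finalModel` should not change, since the final model is refit on all rows either way; only the resampled error estimates, and possibly the selected fraction, can differ.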
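To get the coefficients for the best fraction (here fraction = 1, which is the last of the 16 steps, hence s = 16):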
predict(lasso$finalModel,type="coef",s=16)
$s
[1] 16
$fraction
[1] 1
$mode
[1] "step"
$coefficients
crim zn indus chas nox rm
-0.92906457 1.08263896 0.14103943 0.68241438 -2.05875361 2.67687661
age dis rad tax ptratio black
0.01948534 -3.10711605 2.66485220 -2.07883689 -2.06264585 0.85010886
lstat
-3.74733185
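On the glmnet part of the question: caret also offers method = "glmnet", where the lasso corresponds to `alpha = 1` and the tuning parameter is the penalty `lambda` rather than a fraction. A minimal sketch, assuming the glmnet package is installed; the lambda grid and the names `glmnet_grid`, `lasso_glmnet` and `glmnet_coef` are arbitrary choices of mine, not values from the question:

```r
library(MASS)
library(caret)
library(glmnet)

set.seed(3721)
# alpha = 1 selects the lasso penalty; the lambda grid is an arbitrary example
glmnet_grid <- expand.grid(alpha = 1,
                           lambda = 10^seq(-3, 0, length.out = 20))

lasso_glmnet <- train(medv ~ ., data = Boston,
                      preProcess = c("center", "scale"),
                      method = "glmnet",
                      tuneGrid = glmnet_grid,
                      trControl = trainControl(method = "cv", number = 10))

# Coefficients (intercept included) at the lambda selected by caret
glmnet_coef <- coef(lasso_glmnet$finalModel,
                    s = lasso_glmnet$bestTune$lambda)
glmnet_coef
```

As with the elasticnet fit, these coefficients are on the scale of the centered and scaled predictors, not the original units.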