Does cv.glmnet in R return double the MSE for binary data?


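I noticed that when the outcome is binary, the minimum MSE and MAE that cv.glmnet reports through "$cvm" or "plot()" are twice the values I compute directly from the predictions. This is confusing. Is the rationale for this explained somewhere, or am I misunderstanding something?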

# sample R code 
# generate binomial data
n<-10000
x1<-runif(n,-2,2); x2<-runif(n,-2,2)
p<-exp(x1+x2)/(1+exp(x1+x2))
y<-rbinom(n,1,p)

# predictors
x<-matrix(0,n,10)
for (k in 1:ncol(x)) {
  a<-runif(1)
  x[,k]<-0.5*(a*x1+(1-a)*x2+runif(n,-2,2))
}

# training data
xa<-x[seq(n/2),]; ya<-y[seq(n/2)]
# test data
xb<-x[-1*seq(n/2),]; yb<-y[-1*seq(n/2)]

library(glmnet)
cvfit1<-cv.glmnet(xa,ya,family="binomial",alpha=1,type.measure="mse",nfolds=10)
cvfit2<-cv.glmnet(xa,ya,family="binomial",alpha=1,type.measure="mae",nfolds=10)

min(cvfit1$cvm) # 0.354
min(cvfit2$cvm) # 0.710
plot(cvfit1) # also this has min 0.354 on y-axis

# But mse and mae from predicted values are half of the previous:
pp<-predict(cvfit1,newx=xb,s="lambda.min",type="response")
mean((pp-yb)^2) # 0.179
mean(abs(pp-yb)) # 0.358
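
One possible explanation, offered as an assumption rather than something confirmed in the question: for family="binomial", the cross-validated error may be summed over both class-indicator columns (1 - y, y) instead of being computed on the single column P(y = 1). Because the two columns have errors of equal magnitude, that sum is exactly twice the single-column value for both MSE and MAE. The sketch below uses only base R and made-up probabilities (p is a stand-in, not a glmnet prediction) to show the arithmetic; it is not glmnet's actual source.

```r
# Hypothetical illustration of the factor of two; not glmnet's code.
set.seed(1)
n <- 1000
y <- rbinom(n, 1, 0.5)   # binary outcome
p <- runif(n)            # stand-in for predicted P(y = 1)

# Error measured on the single column P(y = 1)
mse_one <- mean((y - p)^2)
mae_one <- mean(abs(y - p))

# Error summed over both class columns: (1 - y, y) versus (1 - p, p)
Y <- cbind(1 - y, y)
P <- cbind(1 - p, p)
mse_two <- mean(rowSums((Y - P)^2))
mae_two <- mean(rowSums(abs(Y - P)))

# Both ratios are 2 (up to floating point)
c(mse_two / mse_one, mae_two / mae_one)
```

If that assumption holds, halving min(cvfit1$cvm) and min(cvfit2$cvm) gives about 0.177 and 0.355, which is close to the 0.179 and 0.358 obtained from the test-set predictions above.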