Code:
library(caret); library(kernlab); data(spam)
inTrain <- createDataPartition(y=spam$type,
p=0.75, list=FALSE)
training <- spam[inTrain,]
testing <- spam[-inTrain,]
M <- abs(cor(training[,-58]))
diag(M) <- 0
which(M > 0.8,arr.ind=T)
preProc <- preProcess(log10(spam[,-58]+1),method="pca",pcaComp=2)
spamPC <- predict(preProc,log10(spam[,-58]+1))
preProc <- preProcess(log10(training[,-58]+1),method="pca",pcaComp=2)
trainPC <- predict(preProc,log10(training[,-58]+1))
modelFit <- train(training$type ~ .,method="glm",preProc = "pca",data=trainPC)
When I run this, the following error occurs:
Error in `[.data.frame`(data, , all.vars(Terms), drop = FALSE) :
  undefined columns selected
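Your code contains several errors. The formula refers to training$type while data = trainPC holds only the principal components and has no type column, and PCA ends up being requested twice: once by hand with preProcess() and once more through preProc = "pca" inside train().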
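The "undefined columns selected" message can be reproduced without caret. The following is a minimal base-R sketch, assuming (as the error message in the question suggests) that train() subsets data by all.vars() of the formula terms. The toy data frames pcs and labels are hypothetical stand-ins for trainPC and training.

```r
# Hypothetical stand-in for trainPC: component scores only, no outcome column.
pcs <- data.frame(PC1 = c(-1.2, 0.3, 0.8, -0.5, 1.1, -0.4),
                  PC2 = c(0.4, -0.9, 0.2, 0.7, -0.6, 0.1))
# Hypothetical stand-in for training: holds the outcome.
labels <- data.frame(type = factor(c("spam", "nonspam", "spam",
                                     "nonspam", "spam", "nonspam")))

# A formula written like training$type ~ . mentions names that are not
# columns of the data frame passed as data.
bad_vars <- all.vars(terms(labels$type ~ ., data = pcs))
missing_cols <- setdiff(bad_vars, names(pcs))
print(missing_cols)

# Selecting those names from the data frame fails.
msg <- tryCatch({
  pcs[, bad_vars, drop = FALSE]
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# With the outcome stored in the same data frame and referenced by its
# bare column name, every variable in the formula can be found.
pcs$type <- labels$type
good_vars <- all.vars(terms(type ~ ., data = pcs))
selected <- pcs[, good_vars, drop = FALSE]
print(names(selected))
```

The point of the sketch is only that the left-hand side of the formula should be a bare column name that exists in data.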
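The code below does what you are trying to do. If y is a factor, the glm method sets the family argument itself, but it is better to specify it explicitly.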
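library(caret); library(kernlab); data(spam)
# Split the data: 75% for training, the rest for testing
inTrain <- createDataPartition(y = spam$type,
                               p = 0.75, list = FALSE)
training <- spam[inTrain, ]
testing <- spam[-inTrain, ]
# Refer to the outcome by its column name; train() applies PCA itself
modelFit <- train(type ~ .,
                  data = training,
                  method = "glm",
                  preProc = "pca",
                  family = binomial)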
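You will get some warnings like the one below, but you would get those from running a plain glm as well.
glm.fit: fitted probabilities numerically 0 or 1 occurred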
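This warning typically appears when the predictors separate the classes perfectly or almost perfectly. A small base-R sketch reproduces it with a plain glm; the data here are made up for illustration and are not the spam data.

```r
# Hypothetical toy data: y is 0 for x <= 3 and 1 for x >= 4 (perfect separation).
x <- c(1, 2, 3, 4, 5, 6)
y <- c(0, 0, 0, 1, 1, 1)

# Collect warning messages instead of letting them print.
warn_msgs <- character()
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    warn_msgs <<- c(warn_msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(warn_msgs)
# The fitted probabilities are pushed towards 0 and 1.
print(round(fitted(fit), 3))
```

The model is still returned and can be used for prediction; the warning mainly signals that the coefficient estimates are unstable.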