How to plot the final C5.0 decision tree model from a caret::train object (library C50)

Problem description

I trained a decision tree model using the train function from the caret package:

gr = expand.grid(trials = c(1, 10, 20), model = c("tree", "rules"), winnow = c(TRUE, FALSE))
dt = train(y ~ ., data = train, method = "C5.0", trControl = trainControl(method = 'cv', number = 10), tuneGrid = gr)

Now I want to plot the decision tree of the final model, but this command does not work:

plot(dt$finalModel)

Error in data.frame(eval(parse(text = paste(obj$call)[xspot])), eval(parse(text = paste(obj$call)[yspot])),  : 
  arguments imply differing number of rows: 4160, 208, 0

Someone has already asked this question here: topic

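The solution suggested there is to define the corresponding C5.0 model manually, using bestTune from the fitted train object, and then plot that C5.0 model as usual: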

c5model = C5.0(x = x, y = y, trials = dt$bestTune$trials, rules = dt$bestTune$model == "rules", control = C5.0Control(winnow = dt$bestTune$winnow))
plot(c5model)

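I tried this. Yes, it does plot a C5.0 model, BUT the predicted probabilities from the train object and from the manually re-created C5.0 model do not match. My guess is that this is because the manually re-created C5.0 model does not involve the 10-fold cross-validation.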

So my question is: is it possible to extract the final C5.0 model from the caret::train object and plot that decision tree?
r decision-tree r-caret c5.0
1 Answer
The predicted probabilities should be the same; see below:

library(MASS)
library(caret)
library(C50)
library(partykit)

traindata = Pima.tr
testdata = Pima.te

gr = expand.grid(trials = c(1, 2), model = c("tree"), winnow = c(TRUE, FALSE))
dt = train(x = traindata[, -ncol(testdata)], y = traindata[, ncol(testdata)], method = "C5.0", trControl = trainControl(method = 'cv', number = 3), tuneGrid = gr)

c5model = C5.0.default(x = traindata[, -ncol(testdata)], y = traindata[, ncol(testdata)], trials = dt$bestTune$trials, rules = dt$bestTune$model == "rules", control = C5.0Control(winnow = dt$bestTune$winnow))

all.equal(predict(c5model, testdata[, -ncol(testdata)], type = "prob"), predict(dt$finalModel, testdata[, -ncol(testdata)], type = "prob"))
[1] TRUE

So I suggest you double-check whether your predictions really differ.
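One way to make that check more informative is sketched below. It uses small toy matrices (hypothetical values, not output from the models above) in place of the two predict(..., type = "prob") results, and shows how all.equal tolerates floating-point noise while the largest absolute gap tells you how big a real mismatch is.

```r
# Toy stand-ins for the two probability matrices (hypothetical values,
# standing in for predict(dt$finalModel, ...) and predict(c5model, ...)).
p_train  <- matrix(c(0.8, 0.2, 0.3, 0.7), ncol = 2, byrow = TRUE,
                   dimnames = list(NULL, c("No", "Yes")))
p_manual <- p_train + 1e-10          # differs only by floating-point noise
p_other  <- p_train
p_other[1, ] <- c(0.6, 0.4)          # a genuinely different prediction

# all.equal() returns TRUE or a character description, so wrap it in isTRUE()
same_noise <- isTRUE(all.equal(p_train, p_manual))  # within default tolerance
same_other <- isTRUE(all.equal(p_train, p_other))   # real mismatch

# Size of the largest disagreement between two probability matrices
max_diff <- max(abs(p_train - p_other))
```

If the comparison on your own models fails, the size of max_diff gives a hint whether the two fits were really built from the same data and tuning values.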

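The error you see when plotting the final model from caret comes from what is stored under $call, which is odd. We can replace it with a call that works for plotting: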

plot(c5model)


finalMod = dt$finalModel
finalMod$call = c5model$call
plot(finalMod)


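Or you can rewrite the call yourself from the results of your training, but as you can see it gets a bit complicated with the expression (or at least I am not very good with it):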

newcall = substitute(
  C5.0.default(x = X, y = Y, trials = ntrials, rules = RULES, control = C5.0Control(winnow = WINNOW)),
  list(
    X = quote(traindata[, -ncol(traindata)]),
    Y = quote(traindata[, ncol(traindata)]),
    RULES = dt$bestTune$model == "rules",
    ntrials = dt$bestTune$trials,
    WINNOW = dt$bestTune$winnow)
)
finalMod = dt$finalModel
finalMod$call = newcall
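To see why the stored call matters, here is a base-R sketch of the mechanism. The names df and k and the use of head() are toy assumptions chosen for illustration, not part of the C50 code. The sketch shows how substitute() keeps quoted pieces symbolic while splicing in plain values, and how pasting a call yields argument text that can be parsed and evaluated again, which is the pattern visible in the error message above (eval(parse(text = paste(obj$call)[...]))).

```r
df <- data.frame(a = 1:3, b = 4:6)   # toy data (hypothetical)
k  <- 2                               # toy tuning value (hypothetical)

# Build an unevaluated call: the quote()d argument stays an expression,
# while the value of k is inserted directly.
cl <- substitute(
  head(x = X, n = N),
  list(X = quote(df[, -ncol(df)]), N = k)
)

# paste() on a call gives one string per element: function name, then arguments
parts <- paste(cl)

# Re-parsing an argument's text recovers the data only if the names it
# mentions (here df) can be found when it is evaluated.
x_again <- eval(parse(text = parts[2]))

# The call as a whole can also be evaluated later
result <- eval(cl)
```

This is why the replacement call should refer to objects, such as traindata, that still exist under those names when plot() is run.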
