R mlr: What is the difference between benchmark and resample when searching for hyperparameters?


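I am searching for the best hyperparameter settings, and I realized that I can do this in two ways in mlr: with the benchmark function and with the resample function. What is the difference between the two?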

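If I go with benchmark, I can compare multiple models and extract the tuned parameters, which is an advantage over resample. With resample, by contrast, I can only tune one model at a time, and I also noticed that my CPU usage skyrockets.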

How and when should I use one over the other?

data(BostonHousing, package = "mlbench")

BostonHousing$chas <- as.integer(levels(BostonHousing$chas))[BostonHousing$chas]

library('mlr')
library('parallel')
library("parallelMap")

# ---- define learning tasks -------
regr.task = makeRegrTask(id = "bh", data = BostonHousing, target = "medv")

# ---- tune Hyperparameters -------- 

set.seed(1234)

# Define a search space for each learner's parameters
ps_xgb = makeParamSet(
  makeIntegerParam("nrounds",lower=5,upper=50),
  makeIntegerParam("max_depth",lower=3,upper=15),
  # makeNumericParam("lambda",lower=0.55,upper=0.60),
  # makeNumericParam("gamma",lower=0,upper=5),
  makeNumericParam("eta", lower = 0.01, upper = 1),
  makeNumericParam("subsample", lower = 0, upper = 1),
  makeNumericParam("min_child_weight",lower=1,upper=10),
  makeNumericParam("colsample_bytree",lower = 0.1,upper = 1)
)

# Choose a resampling strategy
rdesc = makeResampleDesc("CV", iters = 5L)

# Choose a performance measure
meas = rmse

# Choose a tuning method
ctrl = makeTuneControlRandom(maxit = 30L)

# Make the learners: a plain linear model as baseline and a tuning wrapper around xgboost
tuned.lm = makeLearner("regr.lm")
tuned.xgb = makeTuneWrapper(learner = "regr.xgboost", resampling = rdesc, measures = meas,
                           par.set = ps_xgb, control = ctrl, show.info = FALSE)

# -------- Benchmark experiments -----------
# Two learners to be compared
lrns = list(tuned.lm, tuned.xgb)

# Set up parallelization
parallelStart(mode = "socket", #multicore #socket
              cpus = detectCores(),
              # level = "mlr.tuneParams",
              mc.set.seed = TRUE)

# Conduct the benchmark experiment
bmr = benchmark(learners = lrns, 
                tasks = regr.task,
                resamplings = rdesc,
                measures = rmse, 
                keep.extract = TRUE,
                models = FALSE,
                show.info = FALSE)

parallelStop()

# ------ Extract HyperParameters -----
bmr_hp <- getBMRTuneResults(bmr)
bmr_hp$bh$regr.xgboost.tuned[[1]]


# ------ Same tuning wrapper evaluated with resample() -----
res <-
  resample(
    tuned.xgb,
    regr.task,
    resampling = rdesc,
    extract = getTuneResult, #getFeatSelResult, getTuneResult
    show.info = TRUE,
    measures = meas
  )

res$extract
Tags: r, benchmarking, resampling, hyperparameters, mlr
1 Answer

"If I go with benchmark, I can compare multiple models and extract the tuned parameters, which is an advantage over resample."

You can do this with resample() as well.


benchmark() is just a wrapper around resample() that makes it more convenient to run experiments over multiple tasks, learners, and resamplings.
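
To illustrate the point, here is a minimal sketch of comparing several learners and pulling out the tuned parameters with resample() alone. It is not the asker's exact setup: it assumes the bh.task example task that ships with mlr, swaps regr.xgboost for regr.rpart to keep the run light, and the search ranges and iteration counts are arbitrary illustrative choices.

```r
library(mlr)

set.seed(1234)

# Assumption: bh.task is the Boston Housing example task bundled with mlr;
# regr.rpart stands in for regr.xgboost purely to keep the sketch small.
task = bh.task

inner = makeResampleDesc("CV", iters = 3L)  # folds used inside the tuning wrapper
outer = makeResampleDesc("CV", iters = 3L)  # folds used to estimate performance

# Illustrative search space, not a recommendation
ps = makeParamSet(
  makeIntegerParam("maxdepth", lower = 2L, upper = 10L),
  makeIntegerParam("minsplit", lower = 5L, upper = 30L)
)
ctrl = makeTuneControlRandom(maxit = 5L)

lrns = list(
  makeLearner("regr.lm"),
  makeTuneWrapper("regr.rpart", resampling = inner, measures = rmse,
                  par.set = ps, control = ctrl, show.info = FALSE)
)

# Fix the outer folds once so every learner is evaluated on the same splits
rin = makeResampleInstance(outer, task = task)

# One resample() call per learner; getTuneResult only applies to tune wrappers
res = lapply(lrns, function(lrn) {
  if (inherits(lrn, "TuneWrapper")) {
    resample(lrn, task, resampling = rin, measures = rmse,
             extract = getTuneResult, show.info = FALSE)
  } else {
    resample(lrn, task, resampling = rin, measures = rmse, show.info = FALSE)
  }
})
names(res) = vapply(lrns, getLearnerId, character(1))

# Aggregated RMSE per learner
perf = vapply(res, function(r) unname(r$aggr), numeric(1))
print(perf)

# Tuned hyperparameters, one set per outer fold
tuned = res[["regr.rpart.tuned"]]$extract
print(lapply(tuned, function(tr) tr$x))
```

Passing the same list of learners to benchmark() with keep.extract = TRUE and then calling getBMRTuneResults() should give the same kind of information with less bookkeeping, which is the convenience the wrapper is meant to provide.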
