How do I resolve the best_estimator_ error caused by GridSearch?

Question  Votes: 1  Answers: 1

I want to print the best_estimator_ attribute of RandomizedSearchCV after fitting, but something goes wrong. Here is my main code:

import time

from xgboost.sklearn import XGBRegressor

from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import RandomizedSearchCV

# Candidate values sampled by the search
parameters = {
    'min_child_weight': [2, 3, 4],
}

# Note: 'colsample_btree' is kept as posted; it looks like a typo for 'colsample_bytree'
xlf = XGBRegressor(learning_rate=0.1, n_estimators=50, max_depth=5, min_child_weight=1,
                   subsample=0.8, colsample_btree=0.8, objective='reg:linear',
                   scale_pos_weight=1, random_state=27)
n_iter_search = 5

gsearch = RandomizedSearchCV(xlf, param_distributions=parameters, n_iter=n_iter_search, cv=2, iid=False)

# time.perf_counter() is used because time.clock() was removed in Python 3.8
start = time.perf_counter()
gsearch.fit(x_train.values, y_train.values, eval_set=[(x_test.values, y_test.values)], eval_metric="rmse",
            early_stopping_rounds=20)
end = time.perf_counter()
print('RandomSearch Running time: %s Seconds' % (end - start))
print("Best score: %0.3f" % gsearch.best_score_)
best_estimator = gsearch.best_estimator_
print("Best parameters set",best_estimator)

Here is my error message:

Traceback (most recent call last):
  File "D:\PythonProject\TestPackagePytorch\code.py", line 213, in <module>
    print("Best parameters set",best_estimator)
  File "E:\Anaconda3\envs\tensorflow\lib\site-packages\sklearn\base.py", line 279, in __repr__
    repr_ = pp.pformat(self)
  File "E:\Anaconda3\envs\tensorflow\lib\pprint.py", line 144, in pformat
    self._format(object, sio, 0, 0, {}, 0)
  File "E:\Anaconda3\envs\tensorflow\lib\pprint.py", line 161, in _format
    rep = self._repr(object, context, level)
  File "E:\Anaconda3\envs\tensorflow\lib\pprint.py", line 393, in _repr
    self._depth, level)
  File "E:\Anaconda3\envs\tensorflow\lib\site-packages\sklearn\utils\_pprint.py", line 170, in format
    changed_only=self._changed_only)
  File "E:\Anaconda3\envs\tensorflow\lib\site-packages\sklearn\utils\_pprint.py", line 414, in _safe_repr
    params = _changed_params(object)
  File "E:\Anaconda3\envs\tensorflow\lib\site-packages\sklearn\utils\_pprint.py", line 98, in _changed_params
    if (repr(v) != repr(init_params[k]) and
KeyError: 'base_score'

During training, the following warning also appeared:


  This may not be accurate due to some parameters are only used in language bindings but
  passed down to XGBoost core.  Or some parameters are not used but slip through this
  verification. Please open an issue if you find above cases.

I also tried GridSearchCV, but it failed with the same error message. I tried other datasets as well and got the same error. Please help!

python tensorflow scikit-learn keyerror
1 Answer
0 votes

The problem seems to be the sklearn version. If you are using an older version of sklearn, upgrade it in Anaconda with the following command.

conda install scikit-learn=0.23.1
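
Before upgrading, it may help to check which versions are actually installed in the active environment. The snippet below is a minimal sketch that uses only the standard library (Python 3.8 or later); the helper name installed_version is my own and not part of sklearn or xgboost.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names as published on PyPI / conda.
for name in ("scikit-learn", "xgboost"):
    print(name, installed_version(name) or "not installed")
```

Run it inside the same environment that raised the error (the traceback points at an Anaconda environment named tensorflow), since each environment has its own packages.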

I tried your code with a sample model and was able to get a result. I am using sklearn version 0.22.2.
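
To illustrate how a mismatch between libraries could end in KeyError: 'base_score', here is a toy sketch in plain Python. It is not the code of sklearn or xgboost. It assumes (my assumption, not confirmed by the traceback) that the estimator reports parameters from get_params() that are not named in its own __init__ signature, while the pretty-printer indexes that signature directly, as the lookup init_params[k] in the traceback suggests. All class and function names are hypothetical.

```python
import inspect


class ToyBase:
    """Hypothetical parent that declares the full parameter list."""

    def __init__(self, base_score=0.5, max_depth=3):
        self.base_score = base_score
        self.max_depth = max_depth


class ToyRegressor(ToyBase):
    """Hypothetical child whose __init__ hides most parameters behind **kwargs."""

    def __init__(self, objective="reg:linear", **kwargs):
        super().__init__(**kwargs)
        self.objective = objective

    def get_params(self, deep=True):
        # Reports the parent's parameters as well as its own.
        return {"objective": self.objective,
                "base_score": self.base_score,
                "max_depth": self.max_depth}


def init_defaults(estimator):
    """Map each parameter named in __init__'s signature to its default."""
    signature = inspect.signature(estimator.__init__)
    return {name: p.default for name, p in signature.parameters.items()}


def changed_params_strict(estimator):
    """Index the signature directly; fails on parameters it does not name."""
    defaults = init_defaults(estimator)
    return {k: v for k, v in estimator.get_params().items()
            if repr(v) != repr(defaults[k])}


def changed_params_tolerant(estimator):
    """Treat parameters missing from the signature as changed."""
    defaults = init_defaults(estimator)
    return {k: v for k, v in estimator.get_params().items()
            if k not in defaults or repr(v) != repr(defaults[k])}


model = ToyRegressor(max_depth=5)
try:
    changed_params_strict(model)
except KeyError as error:
    print("KeyError:", error)
print(changed_params_tolerant(model))
```

The strict variant fails on the first parameter that the signature does not name, while the tolerant variant simply reports it. Whether your installed versions behave like either variant is something to verify against their source.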

Below is the sample output of your code, which includes base_score.

RandomSearch Running time: 4.218714999999975 Seconds
Best score: 0.789
Best parameters set XGBRegressor(base_score=0.5, booster='gbtree', colsample_btree=0.8,
             colsample_bylevel=1, colsample_bynode=1, colsample_bytree=1,
             gamma=0, importance_type='gain', learning_rate=0.1,
             max_delta_step=0, max_depth=5, min_child_weight=4, missing=None,
             n_estimators=50, n_jobs=1, nthread=None, objective='reg:linear',
             random_state=27, reg_alpha=0, reg_lambda=1, scale_pos_weight=1,
             seed=None, silent=None, subsample=0.8, verbosity=1)
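
If upgrading is not an option, a possible workaround is to avoid the estimator's repr, since the traceback shows the failure inside __repr__ rather than in the search itself. The sketch below falls back to get_params(), which sklearn-style estimators provide. The helper describe_estimator and the stand-in class BrokenReprEstimator are hypothetical names used only for illustration.

```python
def describe_estimator(estimator):
    """Return repr(estimator), falling back to its get_params() if repr fails."""
    try:
        return repr(estimator)
    except KeyError:
        # Build a readable string from the parameter dictionary instead.
        params = estimator.get_params()
        args = ", ".join(f"{key}={value!r}" for key, value in sorted(params.items()))
        return f"{type(estimator).__name__}({args})"


class BrokenReprEstimator:
    """Hypothetical stand-in whose repr fails like the one in the traceback."""

    def get_params(self, deep=True):
        return {"min_child_weight": 4, "max_depth": 5}

    def __repr__(self):
        raise KeyError("base_score")


print(describe_estimator(BrokenReprEstimator()))
```

With the objects from the question this would be print("Best parameters set", describe_estimator(gsearch.best_estimator_)). If only the searched values matter, gsearch.best_params_ is a plain dict and can be printed directly.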