XGBoost with Hyperopt: problem encountered during hyperparameter tuning


I am trying to tune the hyperparameters of an XGBoost classifier with Hyperopt, but I am running into an error. The code I am using and the error are shown below.

Step 1: Objective function

import csv
import numpy as np
import xgboost as xgb
from hyperopt import STATUS_OK
from timeit import default_timer as timer
MAX_EVALS = 200
N_FOLDS = 10
def objective(params, n_folds = N_FOLDS):
    """Objective function for XGBoost Hyperparameter Optimization"""
    # Keep track of evals
    global ITERATION
    ITERATION += 1
#     # Retrieve the subsample if present otherwise set to 1.0
#     subsample = params['boosting_type'].get('subsample', 1.0)
#     # Extract the boosting type
#     params['boosting_type'] = params['boosting_type']['boosting_type']
#     params['subsample'] = subsample
    # Make sure parameters that need to be integers are integers
    # (colsample_bytree is a fraction in (0, 1], so it must stay a float)
    for parameter_name in ['max_depth', 'min_child_weight']:
        params[parameter_name] = int(params[parameter_name])
    start = timer()
    # Perform n_folds cross validation
    cv_results = xgb.cv(params, train_set, num_boost_round = 10000, 
                       nfold = n_folds, early_stopping_rounds = 100, 
                       metrics = 'auc', seed = 50)
    run_time = timer() - start
    # Extract the best score
    best_score = np.max(cv_results['auc-mean'])
    # Loss must be minimized
    loss = 1 - best_score
    # Boosting rounds that returned the highest cv score
    n_estimators = int(np.argmax(cv_results['auc-mean']) + 1)
    # Write to the csv file ('a' means append); the with-block closes the file
    with open(out_file, 'a', newline='') as of_connection:
        writer = csv.writer(of_connection)
        writer.writerow([loss, params, ITERATION, n_estimators,
                         run_time])
    # Dictionary with information for evaluation
    return {'loss': loss, 'params': params, 'iteration': ITERATION,
           'estimators': n_estimators, 'train_time': run_time, 
           'status': STATUS_OK}

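I have also defined the search space and the optimization algorithm. When I run Hyperopt, I get the error below. It is raised inside the objective function.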

Error: KeyError: 'auc-mean'

<ipython-input-62-8d4e97f16929> in objective(params, n_folds)
     25     run_time = timer() - start
     26     # Extract the best score
---> 27     best_score = np.max(cv_results['auc-mean'])
     28     # Loss must be minimized
     29     loss = 1 - best_score
machine-learning data-science xgboost hyperopt
1 Answer

First, print cv_results and check which keys it actually contains.
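One way to do that check is sketched below. `find_metric_column` is a hypothetical helper, not part of XGBoost or Hyperopt, and the column list at the bottom is an assumed example of what `xgb.cv(..., metrics='auc')` typically returns. In the real objective you would pass `cv_results.columns` instead.

```python
def find_metric_column(columns, metric="auc", split="test"):
    """Return the '<split>-<metric>-mean' column name, or fail with a helpful message.

    Hypothetical helper: `columns` is any iterable of column names,
    e.g. `cv_results.columns` from the DataFrame returned by xgb.cv.
    """
    columns = list(columns)
    wanted = f"{split}-{metric}-mean"
    if wanted in columns:
        return wanted
    # List what is actually there, so the mismatch is obvious
    raise KeyError(f"{wanted!r} not found; available columns: {columns}")


# Assumed example of the columns xgb.cv produces for metrics='auc'
example_columns = ["train-auc-mean", "train-auc-std",
                   "test-auc-mean", "test-auc-std"]
print(example_columns)
print(find_metric_column(example_columns))
```

Failing with the list of available columns makes a naming mismatch such as 'auc-mean' versus 'test-auc-mean' visible on the first evaluation.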

In the example notebook linked below, the keys are 'test-auc-mean' and 'train-auc-mean', not 'auc-mean'.

See cell 5 here: https://www.kaggle.com/tilii7/bayesian-optimization-of-xgboost-parameters
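Assuming your cv_results has the same keys as that notebook, the score-extraction step of the objective could be sketched as follows. `summarize_cv` is a hypothetical helper, and a plain dict with made-up scores stands in here for the DataFrame that `xgb.cv` returns. The indexing works the same way on the real object.

```python
def summarize_cv(cv_results, column="test-auc-mean"):
    """Return (loss, n_estimators) from cross-validation results.

    Hypothetical helper: `cv_results` is anything indexable by column name
    (the xgb.cv DataFrame, or a dict of lists as in the demo below).
    """
    scores = list(cv_results[column])
    best_score = max(scores)
    # Boosting rounds are 1-based, list positions are 0-based
    n_estimators = scores.index(best_score) + 1
    # Hyperopt minimizes, so turn the AUC into a loss
    loss = 1 - best_score
    return loss, n_estimators


# Made-up scores, for illustration only
demo_results = {
    "train-auc-mean": [0.80, 0.85, 0.90],
    "test-auc-mean": [0.70, 0.75, 0.74],
}
loss, n_estimators = summarize_cv(demo_results)
print(loss, n_estimators)
```

Inside the objective this corresponds to reading `cv_results['test-auc-mean']` wherever the original code reads `cv_results['auc-mean']`. Scoring on the test column rather than the train column keeps the search from rewarding overfitting.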
