Custom function in sklearn's make_scorer

Question

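I'm trying to create a custom scoring function to run GridSearchCV on a classification problem, and I don't think I fully understand how it works (I have read the documentation). My goal is to give different weights to different types of misclassification. My code is below; good and excellent are the two classes my samples belong to. I think the problem arises when GridSearchCV passes the true and predicted values to score_func, but I'm not sure how to fix it.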

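from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def score_func(y, y_pred):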
    '''score function for grid search'''
    error = 0
    for i in range(len(y)):
        if y[i] == 'excellent':
            if y_pred[i] == 'excellent':
                error += 10
            elif y_pred[i] == 'good':
                error += 5
    return error

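# By default the scorer is called on predicted labels, which is what
# needs_proba=False and needs_threshold=False requested; recent scikit-learn
# releases no longer accept those two arguments.
score_f = make_scorer(score_func)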

RF = make_pipeline(
        StandardScaler(),
        RandomForestClassifier(random_state=101, criterion = 'gini')
        )

gs_rf = GridSearchCV(estimator=RF, param_grid=param_grid, scoring=score_f,
                     cv=KFold(n_splits=5, shuffle=True, random_state=1234)).fit(X_data, y_data)

Thanks in advance!

Answer 1

If your goal is to associate weights with the labels, you don't need to write a custom function.

Just use the class_weight parameter of RandomForestClassifier.

weight_dict = {'excellent':10, 'good':5}
RandomForestClassifier(random_state=101, criterion='gini', class_weight=weight_dict)
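
Note that class_weight changes how the forest is trained, whereas a scorer changes how GridSearchCV ranks parameter candidates. If a custom scorer is still wanted, the following is only a minimal sketch. The question does not show the actual error, so this assumes the trouble comes from indexing y[i] by position when y is a pandas Series whose index is no longer 0..n-1 after the cross-validation split. The name weighted_score, the synthetic data, and the small parameter grid are hypothetical stand-ins, not part of the original post.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def weighted_score(y_true, y_pred):
    """Score true 'excellent' samples: 10 if predicted 'excellent', 5 if predicted 'good'."""
    # Converting to NumPy arrays makes every comparison positional, so this
    # also works when y_true arrives as a pandas Series with a shuffled index.
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    is_excellent = y_true == "excellent"
    hits = np.sum(is_excellent & (y_pred == "excellent"))
    near = np.sum(is_excellent & (y_pred == "good"))
    return float(10 * hits + 5 * near)


# greater_is_better=True tells GridSearchCV to maximize the returned value.
score_f = make_scorer(weighted_score, greater_is_better=True)

# Synthetic stand-in data (hypothetical); replace with X_data and y_data.
X_demo, y_num = make_classification(n_samples=200, n_features=8, random_state=0)
y_demo = np.where(y_num == 1, "excellent", "good")

pipe = make_pipeline(StandardScaler(), RandomForestClassifier(random_state=101))
# Hypothetical grid; make_pipeline names the step after the lowercased class.
param_grid = {"randomforestclassifier__n_estimators": [10, 20]}

gs = GridSearchCV(pipe, param_grid=param_grid, scoring=score_f,
                  cv=KFold(n_splits=5, shuffle=True, random_state=1234))
gs.fit(X_demo, y_demo)
print(gs.best_params_, gs.best_score_)
```

Because the score is a raw sum rather than an average, it grows with the size of each validation fold; with equally sized folds this does not affect which candidate is ranked best.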
