How do I use GridSearchCV on a custom estimator?

Question

I built a custom estimator using sklearn's BaseEstimator and ClassifierMixin. But when it comes to cross-validation, GridSearchCV gives me nan for every score. Here is the estimator code:

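import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import RidgeCV
from sklearn.metrics import accuracy_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

class RegressionClassifier(ClassifierMixin, BaseEstimator):
    def __init__(self, regressor=RidgeCV(cv=10, fit_intercept=True), alpha=0, n_components=1):
        self.alpha = alpha
        self.n_components = n_components
        self.regressor = regressor
        self.estimator = None

    def fit(self, X, y):
        # Impute, scale, reduce dimensionality, then regress.
        pipe = Pipeline(steps=[
            ('imputate', SimpleImputer(missing_values=np.nan, strategy="median")),
            ('scale', StandardScaler()),
            ('reduce', PCA(self.n_components)),
            ('regress', self.regressor)
        ])
        self.estimator = pipe.fit(X, y)
        return self

    def predict(self, X):
        predictions = self.estimator.predict(X)
        # Bucket the regression output into three ranges around zero.
        converter = [
            predictions < -self.alpha,
            (-self.alpha <= predictions) & (predictions < self.alpha),
            predictions >= self.alpha
        ]
        classes = [2, 0, 0]
        predicted_class = np.select(converter, classes)
        return predicted_class

    def score(self, X, y):
        # train_new_y is a global Series defined outside this snippet.
        y_true = train_new_y[y.index]
        # Debugging check: y_true is compared with itself.
        return accuracy_score(y_true, y_true)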

This should return 1 as the score, because I am computing the accuracy of the same values against themselves.

The estimator works as follows:

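Pipeline (performs the linear regression) -> output (vector of regression results)
Converter (takes the pipeline results) -> output (classes obtained by applying some magic to the regression results)
Score should take the resulting classes and output the accuracy between those classes and the target vector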
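The converter step described above can be sketched in isolation. This is only an illustration: the helper name `to_classes` and the label triple `(2, 0, 1)` are assumptions made for the example, not the mapping used in the estimator above.

```python
import numpy as np

def to_classes(predictions, alpha, labels=(2, 0, 1)):
    """Map continuous regression outputs to three class labels.

    `labels` is a hypothetical (below, middle, above) triple chosen
    for illustration only.
    """
    predictions = np.asarray(predictions, dtype=float)
    conditions = [
        predictions < -alpha,                             # clearly negative
        (-alpha <= predictions) & (predictions < alpha),  # inside the band
        predictions >= alpha,                             # clearly positive
    ]
    # np.select picks, per element, the label of the first true condition.
    return np.select(conditions, list(labels))

print(to_classes([-0.8, -0.1, 0.05, 0.6], alpha=0.5))
```

Note that with alpha=0 the middle band is empty, so only the first and last labels can ever be produced.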

Grid search results:

Fitting 5 folds for each of 100 candidates, totalling 500 fits
[CV 1/5] END ........alpha=0.0, n_components=50.0;, score=nan total time=   0.2s
[CV 2/5] END ........alpha=0.0, n_components=50.0;, score=nan total time=   0.1s
[CV 3/5] END ........alpha=0.0, n_components=50.0;, score=nan total time=   0.2s
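A nan score in a log like this generally means that fitting or scoring raised an exception for that fold and GridSearchCV recorded its error_score (np.nan by default) instead. One way to surface the underlying exception is to pass error_score="raise". The sketch below uses a stand-in pipeline and synthetic data (both assumptions, not the setup from the question) and shows that a float component count such as 2.0 is rejected by PCA, while an integer grid runs. Whether this is what happened in the question is not confirmed by the post.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic regression data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = X[:, 0] + 0.1 * rng.normal(size=60)

pipe = Pipeline(steps=[("reduce", PCA()), ("regress", Ridge())])

# Float component counts, analogous to the n_components=50.0 seen in the log.
float_grid = {"reduce__n_components": [2.0, 4.0]}
search = GridSearchCV(pipe, float_grid, cv=3, error_score="raise")
try:
    search.fit(X, y)
    failure = None
except Exception as exc:  # exact exception type differs between versions
    failure = exc
print("float grid:", type(failure).__name__)

# The same search with integer counts completes normally.
int_grid = {"reduce__n_components": [2, 4]}
search_ok = GridSearchCV(pipe, int_grid, cv=3, error_score="raise").fit(X, y)
print("int grid best score:", search_ok.best_score_)
```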

Answer 1

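I solved the problem like this:

Create a custom scoring function (it is important to know that the API must be func(estimator, X, y), where X is the data matrix passed by GridSearchCV and y is the corresponding output)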

def scorer(estimator, X, y):
    predicted_class = estimator.predict(X)
    return accuracy_score(predicted_class, train_new_y[y.index])

Pass this function to GridSearchCV as the scoring argument

olm = RegressionClassifier()
params = {"alpha": np.linspace(0, 1, 10), "n_components": np.linspace(50, 200, 10)}
poly_cv = GridSearchCV(olm, params, scoring=scorer, verbose=3)
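For reference, here is a self-contained sketch of the same idea that runs on synthetic data. Several things in it are assumptions made for illustration: the `ThresholdClassifier` class is a simplified, hypothetical stand-in for the estimator in the question, the data is random, the scorer uses the `y` passed in by GridSearchCV rather than a global like `train_new_y`, and the `n_components` grid uses integers because PCA expects an integer component count.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class ThresholdClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical stand-in: regress, then threshold into classes 0/1."""

    def __init__(self, alpha=0.0, n_components=1):
        self.alpha = alpha
        self.n_components = n_components

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # Fitted state is stored in trailing-underscore attributes.
        self.pipe_ = Pipeline(steps=[
            ("scale", StandardScaler()),
            ("reduce", PCA(n_components=self.n_components)),
            ("regress", Ridge()),
        ]).fit(X, y)
        return self

    def predict(self, X):
        return (self.pipe_.predict(X) >= self.alpha).astype(int)


def scorer(estimator, X, y):
    # GridSearchCV calls this with the validation fold's X and y.
    return accuracy_score(y, estimator.predict(X))


# Synthetic binary classification data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

params = {
    "alpha": np.linspace(0, 1, 5),
    "n_components": [1, 2, 4],  # integer component counts
}
search = GridSearchCV(ThresholdClassifier(), params, scoring=scorer, cv=3)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Passing a callable as scoring means GridSearchCV uses it in place of the estimator's own score method, so the search no longer depends on how score is implemented.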
