Scikit-learn: cross-validate on the training data, then fit the model to the test data

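I want to run n-fold cross-validation on the training data, and then use the tuned parameters to fit the model and predict on the test subset.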

from sklearn.model_selection import train_test_split
from sklearn import datasets
from sklearn import linear_model
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import TimeSeriesSplit

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, 
                                                        random_state=1234)

lm = linear_model.LinearRegression() 
cv = TimeSeriesSplit(n_splits=10).split(y_train) # [Question: 1]
cv_score = cross_val_score(lm, X_train, y_train, cv=cv, scoring="r2")

My questions are:

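  • [Question 1] Assuming this were logistic regression, is this correct if I want to account for class imbalance (see line 12 of the code)?
  • [Question 2] How do I fit the model based on the cross_val_score results and use it on the X_test data to predict y_test?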
python scikit-learn cross-validation
1 Answer
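You will need GridSearchCV. You can then retrieve the best model and use it on the test set. Example:

from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV, train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=1234)

parameters = {'kernel': ('linear', 'rbf'), 'C': [1, 10]}
svc = svm.SVC(gamma="scale")
clf = GridSearchCV(svc, parameters, cv=5)
clf.fit(X_train, y_train)  # tune on the training data only, so the test set stays unseen
y_pred = clf.best_estimator_.predict(X_test)  # predict with the refit best model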
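For the classification variant raised in Question 1, the whole workflow might look like the sketch below. It rests on several assumptions that are not in the original post: LogisticRegression with class_weight="balanced" as the classifier, StratifiedKFold in place of TimeSeriesSplit so that each fold keeps roughly the same class proportions, accuracy as the metric, and an illustrative grid over C. Note that cross_val_score only returns scores and does not leave a fitted model behind, so a separate fit on the training data is still needed before predicting.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    GridSearchCV,
    StratifiedKFold,
    cross_val_score,
    train_test_split,
)

iris = datasets.load_iris()
# stratify keeps the class proportions similar in the train and test subsets
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=1234, stratify=iris.target
)

# Assumption: stratified folds instead of TimeSeriesSplit, since the rows are not ordered in time
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1234)

# Assumption: class_weight="balanced" as one possible way to handle unequal class sizes
clf = LogisticRegression(class_weight="balanced", max_iter=1000)

# Cross-validated scores on the training data only (no fitted model is kept)
cv_scores = cross_val_score(clf, X_train, y_train, cv=cv, scoring="accuracy")

# Illustrative grid; GridSearchCV refits the best setting on all of X_train by default
search = GridSearchCV(clf, {"C": [0.1, 1, 10]}, cv=cv, scoring="accuracy")
search.fit(X_train, y_train)

y_pred = search.predict(X_test)               # predictions for the held-out test set
test_accuracy = search.score(X_test, y_test)  # accuracy on the held-out test set
print(cv_scores.mean(), search.best_params_, test_accuracy)
```

The test subset is used only for the final prediction and score, so it plays no part in choosing the parameters.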