When I use scikit-learn in Python to run cross-validation and get scores for different metrics (accuracy, precision, etc.), like this:
from sklearn.model_selection import cross_val_score
result_accuracy = cross_val_score(classifier, X_train, y_train, scoring='accuracy', cv=10)
result_precision = cross_val_score(classifier, X_train, y_train, scoring='precision', cv=10)
result_recall = cross_val_score(classifier, X_train, y_train, scoring='recall', cv=10)
result_f1 = cross_val_score(classifier, X_train, y_train, scoring='f1', cv=10)
Does each call to cross_val_score() for a different metric split the training data into the same 10 folds? If not, do I need to create the 10 folds with KFold first, like this:
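from sklearn.model_selection import KFold
seed = 7
# random_state only takes effect when shuffle=True; recent scikit-learn versions raise an error otherwise
kf = KFold(n_splits=10, shuffle=True, random_state=seed)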
result_accuracy = cross_val_score(classifier, X_train, y_train, scoring='accuracy', cv=kf)
result_precision = cross_val_score(classifier, X_train, y_train, scoring='precision', cv=kf)
result_recall = cross_val_score(classifier, X_train, y_train, scoring='recall', cv=kf)
result_f1 = cross_val_score(classifier, X_train, y_train, scoring='f1', cv=kf)
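cross_val_score does not take a random_state parameter, so passing one raises a TypeError. With cv=10 and a classifier, it uses StratifiedKFold without shuffling, so every call already produces the same 10 folds. If you want shuffled but reproducible folds, pass a splitter with shuffle=True and a fixed random_state as cv:
from sklearn.model_selection import StratifiedKFold
result_accuracy = cross_val_score(classifier, X_train, y_train, scoring='accuracy', cv=StratifiedKFold(n_splits=10, shuffle=True, random_state=42))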
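If the goal is several metrics on identical folds, one option is cross_validate, which accepts a list of scorers and fits each fold only once. The following is a minimal sketch rather than a definitive recipe: the synthetic dataset from make_classification and the LogisticRegression classifier are assumptions standing in for the X_train, y_train and classifier from the question.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

# Assumed stand-ins for the question's X_train, y_train and classifier.
X_train, y_train = make_classification(n_samples=200, n_features=10, random_state=0)
classifier = LogisticRegression(max_iter=1000)

# A single splitter with a fixed seed, so all metrics are scored on the same 10 folds.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

# One call computes every metric per fold instead of refitting once per metric.
results = cross_validate(
    classifier,
    X_train,
    y_train,
    scoring=["accuracy", "precision", "recall", "f1"],
    cv=cv,
)

for metric in ("accuracy", "precision", "recall", "f1"):
    scores = results[f"test_{metric}"]
    print(f"{metric}: mean={scores.mean():.3f}")
```

Each entry results["test_<metric>"] is an array with one score per fold, so the per-fold values of different metrics line up with each other.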