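Suppose I have data with 1000 features. I want to apply SVM-RFE to this data, removing 10% of the features at each step. How can I get the accuracy at every level of the elimination process? For example, I want the performance with 1000 features, 900 features, 800 features, ..., 2 features and 1 feature. I would also like to keep track of which features remain at each level.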
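The current framework neither scores the model nor stores the feature set at each iteration of RFE.
You may be able to get the scores by using the private function that exists for the RFECV class.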
>>> from sklearn.datasets import make_friedman1
>>> from sklearn.feature_selection import RFE
>>> from sklearn.svm import SVR
>>> from sklearn.model_selection._validation import _score
>>> X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)
>>> estimator = SVR(kernel="linear")
>>> selector = RFE(estimator, n_features_to_select=5, step=1)
>>> from sklearn.metrics import check_scoring
>>> scorer = check_scoring(estimator, 'r2')
>>> selector._fit(
... X, y, lambda estimator, features:
... _score(estimator, X[:, features], y, scorer))
RFE(estimator=SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1,
gamma='scale', kernel='linear', max_iter=-1, shrinking=True,
tol=0.001, verbose=False),
n_features_to_select=5, step=1, verbose=0)
>>> selector.scores_
[0.6752027280057595, 0.6750531506827873, 0.6722333425078437, 0.6684835939207456, 0.6669024507875724, 0.6751247326304468]
>>> selector.ranking_
array([1, 1, 1, 1, 1, 6, 4, 3, 2, 5])
If you want to retrieve the feature set at each level/iteration, you would need to edit the fit method.
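If you would rather not depend on private functions, one possible workaround is to write the elimination loop yourself using only public API. The sketch below is not how RFE is implemented internally. It is a hand-rolled loop that assumes a linear estimator exposing `coef_`, ranks features by squared weight, and records the cross-validated score and the surviving feature indices at each level. The helper name `rfe_with_history` and its parameters are hypothetical, and the small synthetic dataset only stands in for your 1000-feature data.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def rfe_with_history(estimator, X, y, step=0.1, cv=3):
    """Hypothetical helper: manual RFE-style loop that keeps, for every
    level, the remaining feature indices and the mean CV score."""
    n_features = X.shape[1]
    # Assumption: a step in (0, 1) is a fraction of the ORIGINAL feature
    # count; anything else is taken as an absolute number of features.
    n_remove = max(1, int(step * n_features)) if 0 < step < 1 else int(step)
    remaining = np.arange(n_features)
    history = []
    while True:
        # Score the current feature subset with cross-validation.
        score = cross_val_score(estimator, X[:, remaining], y, cv=cv).mean()
        history.append((remaining.copy(), score))
        if remaining.size == 1:
            break
        # Rank the remaining features by the squared weights of a linear model.
        fitted = clone(estimator).fit(X[:, remaining], y)
        importance = (np.atleast_2d(fitted.coef_) ** 2).sum(axis=0)
        # Drop the least important features, but always keep at least one.
        k = min(n_remove, remaining.size - 1)
        keep = np.sort(np.argsort(importance)[k:])
        remaining = remaining[keep]
    return history


# Small synthetic example (20 features instead of 1000) to keep it fast.
X, y = make_classification(n_samples=60, n_features=20, n_informative=5,
                           random_state=0)
history = rfe_with_history(SVC(kernel="linear"), X, y, step=0.1, cv=3)
for features, score in history:
    print(len(features), round(score, 3), features.tolist())
```

Each entry of `history` pairs the indices of the features still in play with the score obtained on exactly those features, so you get both the performance curve and the feature set per level. Note that with this loop the last elimination jumps straight to a single feature once fewer than `n_remove + 1` remain.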