How to extract the names of the features used by the first tree of a scikit-learn GradientBoostingRegressor


I want to print the feature names of the first estimator of my GradientBoostingRegressor, but I get the following error. scikit-learn version = 1.2.2

model.estimators_[0]._final_estimator.feature_names_in_

Output:
AttributeError                            Traceback (most recent call last)
Cell In[115], line 1
----> 1 model.estimators_[0]._final_estimator.feature_names_in_

AttributeError: 'GradientBoostingRegressor' object has no attribute 'feature_names_in_'
1 Answer

You wrote that you specifically want the feature names of the first estimator of the ensemble. Unfortunately, feature names are not stored on the individual trees, which is why you get the error

AttributeError: 'GradientBoostingRegressor' object has no attribute 'feature_names_in_'

However, since the individual trees are trained on the same feature set as the whole model, the feature names of the main GradientBoostingRegressor apply to each of its decision trees. You can therefore extract the ensemble's feature names (and hence the names available to the first tree) like this:

model.feature_names_in_
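
Note that feature_names_in_ is only set when the model was fit on input that carries string column names (for example a pandas DataFrame); with a plain NumPy array the attribute does not exist. A minimal sketch of that, not part of the original snippet:

import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor

# Fit on a DataFrame so scikit-learn records the column names
data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

model = GradientBoostingRegressor(n_estimators=10, random_state=0)
model.fit(X, y)

print(model.feature_names_in_)  # array of the 8 column names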

If instead you are interested in the names of the features the first tree actually uses (i.e. splits on), you can do it like this:

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.datasets import fetch_california_housing

# Load the dataset
data = fetch_california_housing()
X, y = data.data, data.target
feature_names = data.feature_names

# Create and fit the GradientBoostingRegressor
model = GradientBoostingRegressor(max_features=0.5, random_state=0)
model.fit(X, y)  # fit directly on the NumPy arrays, without converting to a DataFrame

# Access the tree of the first boosting stage (first estimator)
first_tree = model.estimators_[0, 0]

# Collect the feature indices the tree splits on; leaf nodes carry negative indices and are filtered out
used_feature_indices = {i for i in first_tree.tree_.feature if i >= 0}

# Map indices back to feature names
used_feature_names = [feature_names[i] for i in used_feature_indices]

print("All feature names:", feature_names)
print("Names of features used in the first tree:", used_feature_names)
print("Names of features not used in the first tree:", set(feature_names) - set(used_feature_names))
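
The same index-to-name mapping works for any boosting stage, not just the first. As a small sketch built on the code above (the helper name used_features_in_stage is just illustrative):

def used_features_in_stage(model, feature_names, stage=0):
    # For regression there is exactly one tree per boosting stage: estimators_[stage, 0]
    tree = model.estimators_[stage, 0]
    # tree_.feature holds the split feature index per node; leaves carry negative values
    used = sorted({i for i in tree.tree_.feature if i >= 0})
    return [feature_names[i] for i in used]

print("Features used in stage 10:", used_features_in_stage(model, feature_names, stage=10))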
    