How to extract the names of the features used by the first tree of a scikit-learn GradientBoostingRegressor


I want to print the feature names of the first estimator of my GradientBoostingRegressor, but I get the following error. scikit-learn version = 1.2.2

model.estimators_[0]._final_estimator.feature_names_in_

Output:
AttributeError                            Traceback (most recent call last)
Cell In[115], line 1
----> 1 model.estimators_[0]._final_estimator.feature_names_in_

AttributeError: 'GradientBoostingRegressor' object has no attribute 'feature_names_in_'
1 Answer

You wrote that you specifically want the feature names of the first estimator of the ensemble. Unfortunately, feature names are not stored on the individual trees, which is why you get the error

AttributeError: 'GradientBoostingRegressor' object has no attribute 'feature_names_in_'

However, since the individual trees are trained on the same feature set as the whole model, the feature names of the main GradientBoostingRegressor apply to each of its decision trees. You can therefore extract the ensemble's feature names (and hence the names available to the first tree) like this:

model.feature_names_in_
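
Note that feature_names_in_ is only set when the model was fit on input that carries string column names (for example a pandas DataFrame); with a plain NumPy array the attribute does not exist. A minimal sketch of that, not part of the original snippet:

import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor

# Fit on a DataFrame so scikit-learn records the column names
data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

model = GradientBoostingRegressor(n_estimators=10, random_state=0)
model.fit(X, y)

print(model.feature_names_in_)  # array of the 8 column names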

If instead you are interested in the names of the features the first tree actually uses (i.e. splits on), you can do it like this:

from sklearn.ensemble import GradientBoostingRegressor
from sklearn.datasets import fetch_california_housing

# Load the dataset
data = fetch_california_housing()
X, y = data.data, data.target
feature_names = data.feature_names

# Create and fit the GradientBoostingRegressor
model = GradientBoostingRegressor(max_features=0.5, random_state=0)
model.fit(X, y)  # fit directly on the NumPy arrays, without converting to a DataFrame

# Access the tree of the first boosting stage (first estimator)
first_tree = model.estimators_[0, 0]

# Collect the feature indices the tree splits on; leaf nodes carry negative indices and are filtered out
used_feature_indices = {i for i in first_tree.tree_.feature if i >= 0}

# Map indices back to feature names
used_feature_names = [feature_names[i] for i in used_feature_indices]

print("All feature names:", feature_names)
print("Names of features used in the first tree:", used_feature_names)
print("Names of features not used in the first tree:", set(feature_names) - set(used_feature_names))
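
The same index-to-name mapping works for any boosting stage, not just the first. As a small sketch built on the code above (the helper name used_features_in_stage is just illustrative):

def used_features_in_stage(model, feature_names, stage=0):
    # For regression there is exactly one tree per boosting stage: estimators_[stage, 0]
    tree = model.estimators_[stage, 0]
    # tree_.feature holds the split feature index per node; leaves carry negative values
    used = sorted({i for i in tree.tree_.feature if i >= 0})
    return [feature_names[i] for i in used]

print("Features used in stage 10:", used_features_in_stage(model, feature_names, stage=10))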
    