Converting XGBoost Shapley values into a SHAP explainer object

0 votes · 2 answers

I am trying to convert XGBoost SHAP values into a SHAP explainer object. Using the example from [here][1] with the built-in SHAP library takes days (even on a subsampled dataset), whereas the XGBoost library takes minutes. However, I want to output a beeswarm plot like the one shown in the example [here][2].

My idea is that I can use the XGBoost library to recover the SHAP values and then plot them with the SHAP library, but the beeswarm plot requires an explainer object. How can I convert my XGBoost booster object into an explainer object?

Here is what I have tried:

import xgboost
import shap

# Pull SHAP values straight from the trained booster via pred_contribs
booster = model.get_booster()
d_test = xgboost.DMatrix(X_test[0:100], y_test[0:100])
shap_values = booster.predict(d_test, pred_contribs=True)
shap.plots.beeswarm(shap_values)

This returns:

TypeError: The beeswarm plot requires an `Explanation` object as the `shap_values` argument.

To clarify: if possible, I want to create the explainer object from the values generated by the xgboost built-in library. Avoiding the shap.Explainer or shap.TreeExplainer calls is a priority, because they take days rather than minutes to return.

[1]: https://shap.readthedocs.io/en/latest/example_notebooks/tabular_examples/tree_based_models/Python%20Version%20of%20Tree%20SHAP.html
[2]: https://shap.readthedocs.io/en/latest/example_notebooks/api_examples/plots/beeswarm.html#A-simple-beeswarm-summary-plot

python xgboost shap
2 Answers
1 vote

You need to convert the SHAP values obtained from the XGBoost model into a SHAP Explanation object. The Explanation object is the standard format in the SHAP library: it contains not only the SHAP values but also additional information such as feature names and base values, as shown below:

import numpy as np
import shap

# Assuming shap_values, X_test are defined as in your code

# Create an explainer with your model
explainer = shap.Explainer(booster, X_test[0:100])

# Alternatively, create the explainer using the TreeExplainer if the above line gives trouble
# explainer = shap.TreeExplainer(booster)

# Get the expected value (base value) - it's often the output value for the background dataset
expected_value = explainer.expected_value

# If your model is a multi-class model, you will have multiple expected values
if isinstance(expected_value, np.ndarray):
    expected_value = expected_value[0]

# Create the SHAP Explanation object. Note that with pred_contribs=True the
# last column of shap_values is the bias term, so drop it to match the features.
shap_explanation = shap.Explanation(shap_values[:, :-1],
                                    base_values=expected_value,
                                    data=X_test.iloc[0:100], # assuming X_test is a DataFrame
                                    feature_names=X_test.columns.tolist())

Now that you have the Explanation object, you can use it to create the beeswarm plot:

shap.plots.beeswarm(shap_explanation)
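Note that shap.Explainer(booster, X_test[0:100]) re-introduces the kind of explainer call the question is trying to avoid. If even that is too slow, you can skip the explainer entirely: with pred_contribs=True, XGBoost appends the bias (expected value) term as the last column of its output, so it can supply the base values directly. A minimal sketch, assuming shap_values and X_test are defined as in the question:

import shap

# The last column of the pred_contribs output is the bias (expected value);
# the remaining columns are the per-feature SHAP values.
shap_explanation = shap.Explanation(shap_values[:, :-1],
                                    base_values=shap_values[:, -1],
                                    data=X_test.iloc[0:100],
                                    feature_names=X_test.columns.tolist())

shap.plots.beeswarm(shap_explanation)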


0 votes

If what you are building is an Explanation object (rather than an Explainer, as stated in your question), then you can do the following:

import xgboost as xgb
import shap
from sklearn.model_selection import train_test_split

X, y = shap.datasets.california()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

d_train = xgb.DMatrix(X_train, y_train)
d_test = xgb.DMatrix(X_test, y_test)

# NOTE: device "cuda" assumes a GPU is available; use "cpu" otherwise
params = {"objective": "reg:squarederror", "tree_method": "hist", "device": "cuda"}

model = xgb.train(params, d_train, 100)
shap_values = model.predict(d_test, pred_contribs=True)

# Drop the last column: with pred_contribs=True it holds the bias (expected value) term
exp = shap.Explanation(shap_values[:, :-1], data=X_test, feature_names=X.columns)
shap.summary_plot(exp)
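Since exp is a full Explanation object, the newer plotting API referenced in the question also accepts it directly; shap.summary_plot defaults to the same beeswarm-style dot plot, so the two calls are interchangeable here:

shap.plots.beeswarm(exp)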
