I am trying to convert XGBoost SHAP values into a SHAP Explanation object. Using the example from [here][1] with the built-in SHAP library takes days (even on a subsampled dataset), whereas the XGBoost library takes minutes. However, I want to produce a beeswarm plot like the one shown in the example [here][2].

My idea is that I can use the XGBoost library to recover the SHAP values and then plot them with the SHAP library, but the beeswarm plot requires an explainer object. How can I convert my XGBoost booster object into an explainer object?
Here is what I have tried:

```python
import shap
import xgboost

booster = model.get_booster()
d_test = xgboost.DMatrix(X_test[0:100], y_test[0:100])
shap_values = booster.predict(d_test, pred_contribs=True)
shap.plots.beeswarm(shap_values)
```
This returns:

```
TypeError: The beeswarm plot requires an `Explanation` object as the `shap_values` argument.
```
To clarify, I would like to create the Explanation object from the values produced by XGBoost's built-in method, if possible. Avoiding the `shap.Explainer` or `shap.TreeExplainer` calls is a priority, since they take days rather than minutes to return.

[1]: https://shap.readthedocs.io/en/latest/example_notebooks/tabular_examples/tree_based_models/Python%20Version%20of%20Tree%20SHAP.html
[2]: https://shap.readthedocs.io/en/latest/example_notebooks/api_examples/plots/beeswarm.html#A-simple-beeswarm-summary-plot
You need to convert the SHAP values obtained from your XGBoost model into a SHAP `Explanation` object. The `Explanation` object is the standard format in the SHAP library; it holds not only the SHAP values but also additional information such as feature names and base values, like so:
```python
import numpy as np
import shap

# Assuming booster, shap_values, X_test are defined as in your code.
# Note: with pred_contribs=True the last column is the bias term,
# so slice it off to match the number of features.
shap_values = shap_values[:, :-1]

# Create an explainer with your model
explainer = shap.Explainer(booster, X_test[0:100])
# Alternatively, use TreeExplainer if the line above gives trouble
# explainer = shap.TreeExplainer(booster)

# Get the expected value (base value), typically the average model
# output over the background dataset
expected_value = explainer.expected_value
# A multi-class model has one expected value per class
if isinstance(expected_value, np.ndarray):
    expected_value = expected_value[0]

# Create the SHAP Explanation object
shap_explanation = shap.Explanation(shap_values,
                                    base_values=expected_value,
                                    data=X_test.iloc[0:100],  # assuming X_test is a DataFrame
                                    feature_names=X_test.columns.tolist())
```
Now that you have the `Explanation` object, you can use it to create the beeswarm plot:

```python
shap.plots.beeswarm(shap_explanation)
```
If you are constructing an `Explanation` object (rather than an `Explainer`, as stated in your question), then you can do the following:
```python
import xgboost as xgb
import shap
from sklearn.model_selection import train_test_split

X, y = shap.datasets.california()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

d_train = xgb.DMatrix(X_train, y_train)
d_test = xgb.DMatrix(X_test, y_test)

# "device": "cuda" assumes a GPU is available; drop it to train on CPU
params = {"objective": "reg:squarederror", "tree_method": "hist", "device": "cuda"}
model = xgb.train(params, d_train, 100)

# pred_contribs returns one column per feature plus a final bias column
shap_values = model.predict(d_test, pred_contribs=True)

# Drop the bias column so the shape matches the feature names
exp = shap.Explanation(shap_values[:, :-1], data=X_test, feature_names=X.columns)
shap.summary_plot(exp)
```