How do I get SHAP values for a caret model in R?

Question

I am trying to get SHAP values for a model that I built with caret. It is a random forest (RF) model, and the data is:

data = structure(list(Main_Street = structure(c(2L, 3L, 2L, 1L, 3L, 
2L, 3L, 1L, 2L, 2L), .Label = c("64", "70", "270"), class = "factor"), 
    Blocked_Lanes = c(3L, 4L, 2L, 1L, 1L, 2L, 6L, 3L, 3L, 3L), 
    Total_Vehicle_Count = c(1L, 2L, 2L, 2L, 1L, 4L, 3L, 2L, 2L, 
    1L), Tractor_Trailer_Count = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L), Weather_Winter_Storm = structure(c(1L, 2L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("No", "Yes"), class = "factor"), 
    Weather_Rain = structure(c(2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    2L, 1L), .Label = c("No", "Yes"), class = "factor"), Injuries_Count = c(0L, 
    0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L), Accident_Overturned_Car = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L), .Label = c("No", "Yes"
    ), class = "factor"), Fatalities_Count = c(0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L), Speed = c(65L, 46L, 10L, 42L, 40L, 
    21L, 15L, 57L, 59L, 59L), Total_Volume = c(48.7, 22.5, 47.3, 
    102, 138, 75.3, 60.5, 83.3, 18, 26.7), Occupancy = c(3.5, 
    1.7, 40.8, 23.8, 14.1, 31, 27.1, 4.9, 2.6, 2.5), Lanes_Cleared_Duration = c(53L, 
    35L, 32L, 4L, 11L, 35L, 42L, 12L, 36L, 69L)), row.names = c(NA, 
-10L), class = "data.frame")

The RF model is:

library(caret)

fitControl <- trainControl(method = "repeatedcv",
                           number = 10,
                           repeats = 10)

randomforestGrid <- expand.grid(mtry = 2:floor(sqrt(61))) # tuneGrid must be a data frame
set.seed(2356)
# `training` is the training split, which is not shown in the question
rf_model <- train(Lanes_Cleared_Duration ~ .,
                  data = training,
                  method = "rf",
                  trControl = fitControl,
                  metric = "RMSE",
                  verbose = FALSE,
                  tuneGrid = randomforestGrid,
                  ntree = 500) # randomForest() takes a single `ntree`; `n.trees` is the gbm argument name

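There are many resources on how to draw SHAP plots, but none of them fit my data, and I keep getting errors. For example, this post asks a similar question, but it was never resolved. Here is a plot similar to the one I want to produce: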

Is it also possible to export a data frame containing the SHAP values for each variable?

r machine-learning r-caret shap interpretation
1 Answer

Here is an example, slightly modified from our {kernelshap} README:

library(caret)
library(kernelshap)
library(shapviz)

fit <- train(
  Sepal.Length ~ ., 
  data = iris, 
  method = "rf", 
  tuneGrid = data.frame(mtry = 2:4),
  trControl = trainControl(method = "oob")
)

# take subsample as bg_X if data has >500 rows or so
s <- kernelshap(fit, X = iris[, -1], bg_X = iris) 
sv <- shapviz(s)
sv_importance(sv, kind = "bee")
sv_dependence(sv, v = colnames(iris[, -1]))

head(s$S)
     Sepal.Width Petal.Length Petal.Width     Species
[1,]  0.18710551   -0.7689923 -0.11966640 -0.02138098
[2,] -0.04975942   -0.8421627 -0.16929579 -0.02247297
[3,] -0.05134404   -0.9807516 -0.21007903 -0.02603232
[4,] -0.01474815   -0.8314441 -0.18571834 -0.02234505
[5,]  0.16345002   -0.8066228 -0.13735372 -0.02104766
[6,]  0.27269103   -0.6231013 -0.06449333 -0.01560054
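
To export the SHAP values as a data frame, as asked in the question, something along these lines should work. This is only a sketch: so that it runs without fitting a model, `S` is a hand-typed stand-in for `s$S` built from the six rows printed above, and the output file name is an arbitrary choice.

```r
# Stand-in for s$S: the six rows printed above, typed in by hand.
S <- matrix(
  c( 0.18710551, -0.7689923, -0.11966640, -0.02138098,
    -0.04975942, -0.8421627, -0.16929579, -0.02247297,
    -0.05134404, -0.9807516, -0.21007903, -0.02603232,
    -0.01474815, -0.8314441, -0.18571834, -0.02234505,
     0.16345002, -0.8066228, -0.13735372, -0.02104766,
     0.27269103, -0.6231013, -0.06449333, -0.01560054),
  ncol = 4, byrow = TRUE,
  dimnames = list(NULL, c("Sepal.Width", "Petal.Length",
                          "Petal.Width", "Species"))
)

# One row per observation, one column per feature.
shap_df <- as.data.frame(S)

# Mean absolute SHAP value per feature, a common importance summary.
shap_importance <- sort(colMeans(abs(shap_df)), decreasing = TRUE)

# Write the values to disk (arbitrary file name in a temp directory).
out_file <- file.path(tempdir(), "shap_values.csv")
write.csv(shap_df, out_file, row.names = FALSE)
```

With the real objects, replace the stand-in matrix with `s$S`, that is `shap_df <- as.data.frame(s$S)`.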
