Assessing variable importance of REEMforest and MERF using the "iml" package


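I recently started using the LongituRF package. I am fitting it to some data and would like to use the iml package to assess variable importance. I have used iml before and like its features. With LongituRF, however, I cannot get the variable importance assessment to work.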

In the code below, I create some data and fit REEMforest and MERF from the LongituRF package to it. I then try to assess variable importance, but I get this error message:

Error in initialize(...) : Please call Predictor$new() with the y target vector.

Clearly, Predictor$new() is not set up correctly in my code.

At the end of the example code, I also fit a randomForest to the same data and assess its variable importance. As you can see, it works fine there.

Do you know how I can solve this?

Example code:

# libraries ---------------------------------------------------------------

install.packages("LongituRF")
# (S)REEMforest is an adaptation of the random forest regression method to longitudinal data introduced by Capitaine et al. (2020) <doi:10.1177/0962280220946080>
library(LongituRF)

install.packages("iml")
# for assessing variable importance
library(iml)

# -------------------------------------------------------------------------
# a function that creates some data for me

dgp_math_s <- function(ni,nj, RI_sd, sigma2 = 1,
                       gamma00 = 0, gamma01 = 0, gamma10 = 0, gamma02 = 0, gamma20 = 0){
  
  dgp_grid <- expand.grid(
    ni = 1:ni,
    nj = 1:nj,
    studying = NA,
    atmosphere = NA,
    motivation = NA,
    math_score = NA, 
    Rij = NA,
    U0j = NA
  )
  
  dgp_grid$atmosphere <- rep(rbinom(nj,1,0.5), each = length(1:ni))
  # create a random binary level-2 predictor, same value for the whole cluster
  
  dgp_grid$U0j <- rep(rnorm(nj, mean = 3, sd = RI_sd), each = ni)
  #create level 2 residual 
  
  dgp_grid$Rij <- rnorm(ni*nj, mean = 3, sd = sqrt(sigma2))
  # create level 1 residual with sigma2 = 1
  
  dgp_grid$studying <-sample(0:5, ni*nj, replace = TRUE)
  # create level 1 explanatory/predictor variable (integers sampled uniformly from 0 to 5)
  
  dgp_grid$motivation <-sample(0:5, ni*nj, replace = TRUE)
  # create level 1 explanatory/predictor variable (integers sampled uniformly from 0 to 5)
  
  dgp_grid$math_score <-
    gamma00 + gamma10 * dgp_grid$studying + gamma20 * dgp_grid$motivation + gamma01 * dgp_grid$atmosphere +
    dgp_grid$U0j + dgp_grid$Rij
  #create math_score
  
  return(dgp_grid)
}
# -------------------------------------------------------------------------

dgp_math<-dgp_math_s(ni = 20, nj = 20, RI_sd = 2, gamma10 = 0, gamma01 = 0)
#create data 


# Fitting REEMforest ------------------------------------------------------

predictors <- dgp_math[, c("studying", "atmosphere","motivation")]

outcome <- dgp_math$math_score
outcome <- as.vector(outcome)


SREEMF <- LongituRF::REEMforest(X=predictors,Y=dgp_math$math_score,Z=matrix(rep(1, nrow(dgp_math)), ncol = 1),
                                id=dgp_math$nj,time=dgp_math$ni,ntree=100,sto="none", mtry = 2)

#Fitting REEMforest


# Fitting MERF ------------------------------------------------------------

MERF <- LongituRF::MERF(X=predictors,Y=dgp_math$math_score,Z=matrix(rep(1, nrow(dgp_math)), ncol = 1),
                                id=dgp_math$nj,time=dgp_math$ni,ntree=100,sto="none", mtry = 2)

#Fitting MERF


# Assessing variable importance using "iml" -------------------------------


pred <- Predictor$new(SREEMF$forest, data = cbind(predictors, dgp_math$math_score))

imp <- iml::FeatureImp$new(pred, loss = "mse", compare = "difference")$results
# Variable importance of REEMforest




pred <- Predictor$new(MERF$forest, data = cbind(predictors, dgp_math$math_score))

imp <- iml::FeatureImp$new(pred, loss = "mse", compare = "difference")$results
# Variable importance of MERF



# example using CARTforest ------------------------------------------------
install.packages("randomForest")
library(randomForest)

mybreimanforest <- randomForest::randomForest(math_score ~ studying + motivation + atmosphere, data = dgp_math, ntree= 500)


## Variable importance using iml -------------------------------------------

pred_breimanforest <- Predictor$new(mybreimanforest, data = dgp_math)

imp_breimanforest <- FeatureImp$new(pred_breimanforest, loss = "mse", compare = "difference")$results
#this works for the randomforest
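
For reference, the quantity requested with loss = "mse" and compare = "difference" is, roughly, how much the mean squared error grows when one feature is shuffled. The base-R sketch below illustrates that idea only; it is not iml's implementation. The simulated data, the lm model, and the helper names mse and perm_importance are hypothetical and chosen for illustration.

```r
set.seed(42)

# Simulated data: studying drives the outcome, motivation does not
n <- 300
dat <- data.frame(
  studying   = sample(0:5, n, replace = TRUE),
  motivation = sample(0:5, n, replace = TRUE)
)
dat$math_score <- 1.5 * dat$studying + rnorm(n)

# Any model with a predict() method would do; lm keeps the sketch dependency-free
fit <- lm(math_score ~ studying + motivation, data = dat)

mse <- function(y, yhat) mean((y - yhat)^2)
base_err <- mse(dat$math_score, predict(fit, newdata = dat))

# Importance = mean MSE after shuffling one feature, minus the unshuffled MSE
perm_importance <- function(feature, n_rep = 10) {
  errs <- replicate(n_rep, {
    shuffled <- dat
    shuffled[[feature]] <- sample(shuffled[[feature]])
    mse(dat$math_score, predict(fit, newdata = shuffled))
  })
  mean(errs) - base_err
}

imp <- sapply(c("studying", "motivation"), perm_importance)
print(imp)
```

Because the computation needs the true outcome to evaluate the loss, any tool that automates it has to be told what the target vector is.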

r random-forest longitudinal iml
1 Answer

Has this problem been solved? I also have longitudinal data and am trying to apply feature selection to it.
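
One direction that may resolve the error, offered as a sketch rather than a verified fix: the message asks for a target vector, so the outcome can be passed through the y argument of Predictor$new() instead of being bound into data with cbind(). The example assumes that the forest element returned by LongituRF is an ordinary randomForest object fitted through the x/y interface, which would explain why iml cannot recover the outcome from a formula. It mimics that situation with a plain randomForest fit on simulated data; the data and the names X, y and rf_xy are illustrative, not taken from the question.

```r
library(randomForest)
library(iml)

set.seed(1)

# Simulated stand-in for the question's predictors and outcome
n <- 200
X <- data.frame(
  studying   = sample(0:5, n, replace = TRUE),
  atmosphere = rbinom(n, 1, 0.5),
  motivation = sample(0:5, n, replace = TRUE)
)
y <- 2 * X$studying + rnorm(n)

# Fitted through the x/y interface, so the model object carries no formula
# (assumption: this resembles the $forest element of a LongituRF fit)
rf_xy <- randomForest(x = X, y = y, ntree = 100)

# Features only in `data`; the outcome is supplied separately via `y`
pred <- Predictor$new(rf_xy, data = X, y = y)

imp <- FeatureImp$new(pred, loss = "mse", compare = "difference")
print(imp$results)
```

Applied to the objects in the question, the analogous call would be Predictor$new(SREEMF$forest, data = predictors, y = outcome), and likewise for MERF$forest. As far as I understand the MERF and REEMforest algorithms, the inner forest is trained on the outcome with the estimated random effects removed, so importances computed against the raw outcome should be interpreted with some care.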
