在 R 中创建提升图

Question

假设我有以下数据框，其中包含与他们相关的一些分数：

Score | hasDefaulted
10    | 0
13    | 0
15    | 1
17    | 0
...

我想用 R 制作一个提升图，首先按分数对人口进行排序，然后在 X 轴上显示人口百分比，在 Y 轴上显示默认百分比。我找不到一个好的软件包可以让我控制执行此操作。我已经探索了Package Lift以及Package Gains，但我不知道如何对它们进行足够的控制来完成我上面描述的操作。例如，当我尝试使用 Package Lift 时，如

plotLift(sort(dataFrame$Score, decreasing=FALSE), dataFrame$hasDefaulted)

我得到一些奇怪的情节：

但考虑到我的愿望，该图最终应该看起来像累积密度函数。

有人可以向我展示如何正确使用此类软件包，或者指导我找到满足要求的软件包吗？预先感谢。

Answer 1

我总是尝试构建自己的代码，而不是尝试不太灵活的代码。

我认为您可以解决这个问题：

# Creating the data frame
df <- data.frame("Score"=runif(100,1,100),
                 "hasDefaulted"=round(runif(100,0,1),0))

# Ordering the dataset
df <- df[order(df$Score),]

# Creating the cumulative density
df$cumden <- cumsum(df$hasDefaulted)/sum(df$hasDefaulted)

# Creating the % of population
df$perpop <- (seq(nrow(df))/nrow(df))*100

# Ploting
plot(df$perpop,df$cumden,type="l",xlab="% of Population",ylab="% of Default's")

这就是你想要的吗？

Answer 2

我认为您正在寻找增益图，而不是提升图。我注意到他们之间有些混乱。您可以参考提升图了解更多信息。

require(ROCR)
data(ROCR.simple)
pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)

gain <- performance(pred, "tpr", "rpp")
plot(gain, main = "Gain Chart")

Answer 3

即使这个问题是大约 5 年前提出的，我也想分享一下，我最近发现了一个很好的软件包，可以帮助构建 GAIN 和 LIFT 图表，并显示增益和提升表：软件包名称是

CustomerScoringMetrics

。

功能：

cumGainsChart()

、

cumGainsChart()

、

liftChart()

、

liftTable()

等

Answer 4

我编写了一个package来计算准确率并绘制CAP曲线。您可以从 GitHub 下载它。

CAP_plot_AR <- 
 
function(logistic_probability_vector,given_outcome_vector_from_data){
   

 
  length <- length(logistic_probability_vector)
  sum_exited <- sum(given_outcome_vector_from_data)
  prop_Exit = sum_exited/length


  randomModel <- 1:length
  randomModel <- randomModel *prop_Exit
  #transform scale to percentage
  randomModel_percent <- randomModel/sum_exited


  perfectModel <- rep(sum_exited,length)
  perfectModel[1:sum_exited] <- 1:sum_exited
  #transform scale to percentage
  perfectModel_percent <- perfectModel/sum_exited

  #give outcomes proper order from the highest model probability
  logistic_probability_vector <- sort(logistic_probability_vector, decreasing = T)
  given_outcome_vector_from_data <- given_outcome_vector_from_data[as.numeric(names(logistic_probability_vector))]

  logisticRegression <- rep(0,length)
  logisticRegression[1] <- 0 + given_outcome_vector_from_data[1]
  for(i in 2:length){
    logisticRegression[i] <- logisticRegression[i-1] +given_outcome_vector_from_data[i]
  }
  #transform scale to percentage
  logisticRegression_percent <- logisticRegression/sum_exited




  yscale <- 1:length
  yscale <- yscale/length
  data <- data.frame(random = randomModel_percent, logistic=logisticRegression_percent,perfect=perfectModel_percent)

  p <- ggplot2::ggplot(data=data, ggplot2::aes(yscale,y=random)) +
    ggplot2::geom_line() +
    ggplot2::geom_line(ggplot2::aes(yscale,logistic))+
    ggplot2::geom_line(ggplot2::aes(yscale,perfect)) +
    ggplot2::scale_y_continuous(labels = scales::percent ) +
    ggplot2::scale_x_continuous(labels = scales::percent )


  print(p)


  between <- MESS::auc(1:length,logisticRegression)-MESS::auc(1:length,randomModel)
  over <- MESS::auc(1:length,perfectModel)-MESS::auc(1:length,randomModel)
  AR <- between/over
  print(AR)
}

在 R 中创建提升图

问题描述投票：0回答：4

4个回答

最新问题

在 R 中创建提升图

问题描述 投票：0回答：4

4个回答

最新问题

问题描述投票：0回答：4