"Invalid argument type" error when iterating with lapply() to compute sensitivity, specificity, and accuracy

Problem description

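Problem:

Compute sensitivity, specificity, and accuracy with confusionMatrix() for each cutoff in a loop or sequence, where the cutoffs are seq(0.1, 0.9, by = 0.1).

Goal:

Values to iterate over: 0.1 to 0.9 in steps of 0.1. Sensitivity, specificity, and accuracy are computed through a custom confusion-matrix function, which handles the case where caret::confusionMatrix fails because the factor levels differ.

Empty records have been removed.

Problem code:

Below is the R code being run. On the executable line, lapply passes each value of the sequence (0.1 to 0.9) to the function compute_seq_accuracy.func():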

computed_accuracies <- lapply(compute_for_values, compute_seq_accuracy.func)

My question: please help me identify the cause of the following error:

Error in !all.equal(nrow(data), ncol(data)) : invalid argument type
In addition: Warning message:
In Ops.factor(loans_predict_fcm, as.numeric(value)) :

 Error in !all.equal(nrow(data), ncol(data)) : invalid argument type 
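
The expression quoted in the error, !all.equal(nrow(data), ncol(data)), can be reproduced in base R without the loan data. The following is a minimal sketch, assuming only that the two counts differ (the values 2L and 0L are made up for illustration): all.equal() returns a character description rather than FALSE when its arguments differ, and applying ! to a character string raises this error.

```r
# all.equal() returns TRUE on a match, but a character description otherwise
same     <- all.equal(2L, 2L)
res_diff <- all.equal(2L, 0L)   # hypothetical row/column counts that differ
print(same)
print(is.character(res_diff))

# Negating that character string is what raises the error;
# tryCatch() captures the message instead of stopping the script
msg <- tryCatch(!res_diff, error = function(e) conditionMessage(e))
print(msg)

# Wrapping the comparison in isTRUE() gives a plain logical instead
not_square <- !isTRUE(all.equal(2L, 0L))
print(not_square)
```

So the message most likely means that the table passed to confusionMatrix() was not square, not that an argument had the wrong type in the usual sense.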

R code: work-in-progress solution

This is the R code under development. About 90% of it works, up to the error above:

# function
compute_seq_accuracy.func <- function(value) {
        tryCatch({
                p <- factor(ifelse(model_prediction < value, 0, 1)) 
                confusion_table <- compute_confusion_matrix.func(loans_train_data$statusRank, p) 
                c_matrix <- confusionMatrix(confusion_table) 
                return(c_matrix$overall['Accuracy']) 
        }, 
        error = function(e) return(NULL)
        )
}
# function
compute_confusion_matrix.func <- function(y, p) {
        confusion_table <- table(y, p)
        if(nrow(confusion_table)!=ncol(confusion_table)){
                missings <- setdiff(colnames(confusion_table),rownames(confusion_table))
                missing_mat <- mat.or.vec(nr = length(missings), nc = ncol(confusion_table))
                confusion_table  <- as.table(rbind(as.matrix(confusion_table), missing_mat))
                rownames(confusion_table) <- colnames(confusion_table)
        }
        return(confusion_table)
}

# works ok here
x <- compute_confusion_matrix.func(loans_train_data$statusRank, model_prediction)
confusion_matrix <- confusionMatrix(x)
confusion_matrix$byClass['Sensitivity']
confusion_matrix$byClass['Specificity']
confusion_matrix$overall['Accuracy']

compute_for_values = seq(0.1,0.9, by=0.1)

## WIP error in !all.equal(nrow(data, ncol(data)))
computed_accuracies <- lapply(compute_for_values, compute_seq_accuracy.func)
names(computed_accuracies) <- compute_for_values
computed_accuracies[which.max(computed_accuracies)]

Data and messages

Warning messages from the tryCatch version:

> computed_accuracies <- sapply(compute_for_values, compute_seq_accuracy.func, simplify = FALSE)
Warning messages:
1: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
2: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
3: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
4: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
5: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
6: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
7: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
8: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
9: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
> 

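Partial fix

I identified the wrong dataset: model_prediction. It is a factor, which caused the message "In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors":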

> head(model_prediction, 50)
 [1] Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Bad  Good
[26] Good Good Good Good Bad  Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good
Levels: Bad Good
> 
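
Why a factor breaks the thresholding step can be checked on a toy vector. This is a sketch under stated assumptions: pred_labels and y are hypothetical stand-ins for model_prediction and loans_train_data$statusRank, not the real data.

```r
# Hypothetical stand-in for model_prediction: class labels stored as a factor
pred_labels <- factor(c("Good", "Good", "Bad", "Good"))

# '<' is not defined for unordered factors: R warns and returns NA for every element
cmp <- suppressWarnings(pred_labels < 0.5)
print(cmp)

# ifelse() propagates the NAs, so the resulting factor has no levels at all
p <- factor(ifelse(cmp, 0, 1))
print(nlevels(p))

# Cross-tabulating against a zero-level factor cannot give a square table
y <- factor(c("Good", "Bad", "Bad", "Good"))
tab <- table(y, p)
print(dim(tab))
```

With every comparison returning NA, the prediction factor is empty for all nine cutoffs, which matches the nine identical warnings.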

Corrected dataset:

 head(loans_predict,50)
    11413      2561     25337      1643     14264     24191     33989     28193     21129      7895     29007     26622      3065 
0.8375821 0.7516343 0.8375704 0.7671279 0.7201578 0.7917037 0.8980501 0.8259884 0.8604232 0.8664207 0.7609676 0.7753622 0.9321958 
    11423      3953      5789     30150      6070      1486     13195     30344     26721       716     24609     22196     10770 
0.8325967 0.9459098 0.5903160 0.5997290 0.9045176 0.6782181 0.7546154 0.8381577 0.7943421 0.7198638 0.4522069 0.7129170 0.8632025 
    18042      3710     21750     23492     10680      5088     10434      3228      8696     29688     33847      2997     24772 
0.8941667 0.6445716 0.7659989 0.2616490 0.7402274 0.7115220 0.8985310 0.7300686 0.8737217 0.6712457 0.7037675 0.6868837 0.7534947 
    28396      6825     27619     26433     25542     33853     32926     33585     20362      6895     20634 
0.7516796 0.7261610 0.8437550 0.8662871 0.8620579 0.9355447 0.6786310 0.6017286 0.9340776 0.9022817 0.7832571 
> 
> compute_for_values
[1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
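
With numeric probabilities, one way to keep the table square for every cutoff is to fix the factor levels of the predictions up front. The sketch below uses base R only; probs, actual, and threshold_table are hypothetical names, the data are made up, and the probabilities are assumed to be for the "Good" class.

```r
# Hypothetical data: predicted probabilities of "Good" and the true labels
probs  <- c(0.84, 0.75, 0.26, 0.59, 0.93, 0.45)
actual <- factor(c("Good", "Good", "Bad", "Good", "Good", "Bad"),
                 levels = c("Bad", "Good"))

# Declaring the levels explicitly keeps the table 2 x 2
# even when one class is never predicted at a given cutoff
threshold_table <- function(probs, actual, cutoff) {
  predicted <- factor(ifelse(probs < cutoff, "Bad", "Good"),
                      levels = levels(actual))
  table(predicted = predicted, actual = actual)
}

# At cutoff 0.1 every case is predicted "Good", yet the "Bad" row is still present
tab_low <- threshold_table(probs, actual, 0.1)
print(dim(tab_low))
print(tab_low)
```

Because the empty class shows up as a row of zeros, no rows or columns have to be padded afterwards.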
r lapply predict
1 Answer

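Consider wrapping your method in tryCatch to catch exceptions and return NULL on error. You can then investigate which of the 0.1 steps raises the error, and remove the NULL elements at the end with Filter. The code below also uses sapply (a wrapper around lapply), which returns a named list when it is given a character vector as input and simplify = FALSE.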

compute_seq_accuracy.func <- function(value) {
     tryCatch({
        p <- factor(ifelse(loans_predict_fcm < as.numeric(value), 'Bad', 'Good')) 
        confusion_table <- compute_confusion_matrix.func(loans_train_data$statusRank, p) 
        c_matrix <- confusionMatrix(confusion_table) 
        return(c_matrix$overall['Accuracy']) 
     }, 
        error = function(e) return(NULL)
     )
}

compute_for_values <- as.character(seq(0.1, 0.9, by=0.1))

## WIP error in !all.equal(nrow(data, ncol(data))) 
computed_accuracies <- sapply(compute_for_values, compute_seq_accuracy.func, simplify = FALSE)

# REMOVE NULLs FROM LIST
computed_accuracies <- Filter(length, computed_accuracies)
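
The same pattern can be tried end to end on made-up data. This is only a sketch: probs, actual, and accuracy_at are hypothetical, accuracy is computed directly from the table in base R instead of through confusionMatrix(), and the extra "oops" entry is there to show a failing input being caught and filtered out.

```r
# Hypothetical data: predicted probabilities of "Good" and the true labels
probs  <- c(0.84, 0.75, 0.26, 0.59, 0.93, 0.45)
actual <- factor(c("Good", "Good", "Bad", "Good", "Good", "Bad"),
                 levels = c("Bad", "Good"))

# Returns the accuracy at one cutoff, or NULL if anything fails
accuracy_at <- function(value) {
  tryCatch({
    cutoff <- suppressWarnings(as.numeric(value))
    if (is.na(cutoff)) stop("cutoff is not numeric")
    predicted <- factor(ifelse(probs < cutoff, "Bad", "Good"),
                        levels = levels(actual))
    tab <- table(predicted, actual)
    sum(diag(tab)) / sum(tab)   # share of cases on the diagonal
  }, error = function(e) NULL)
}

# A character input makes sapply(..., simplify = FALSE) return a named list
cutoffs <- c(as.character(seq(0.1, 0.9, by = 0.1)), "oops")
results <- sapply(cutoffs, accuracy_at, simplify = FALSE)

# Drop the NULL entries (length 0), then pick the best remaining cutoff
results <- Filter(length, results)
best <- names(results)[which.max(unlist(results))]
print(best)
```

Returning NULL hides the reason for a failure, so while debugging it can help to call message(conditionMessage(e)) inside the error handler before returning.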