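Problem:
Compute the sensitivity, specificity and accuracy reported by confusionMatrix() in a loop over the sequence of values seq(0.1, 0.9, by = 0.1).
Goal:
Iterate over the values 0.1 to 0.9 in steps of 0.1 and compute sensitivity, specificity and accuracy through a custom-coded confusion matrix function, which handles the level mismatch when caret::confusionMatrix fails because the factor levels differ.
Null records have already been removed.
Problem code area:
Below is the R code being executed. In the executing line, lapply applies each value of the sequence (0.1 to 0.9) to the function compute_seq_accuracy.func():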
computed_accuracies <- lapply(compute_for_values, compute_seq_accuracy.func)
My question: please help identify the cause of the error, for example
Error in !all.equal(nrow(data), ncol(data)) : invalid argument type
In addition: Warning message:
In Ops.factor(loans_predict_fcm, as.numeric(value)) :
Error in !all.equal(nrow(data), ncol(data)) : invalid argument type
R code: WIP solution
This is the R code under development. About 90% of it works, up to the point where the error above occurs:
# function: accuracy of the predictions at one threshold value
compute_seq_accuracy.func <- function(value) {
  tryCatch({
    p <- factor(ifelse(model_prediction < value, 0, 1))
    # call the helper under the name it is defined with below
    confusion_table <- compute_confusion_matrix.func(loans_train_data$statusRank, p)
    c_matrix <- confusionMatrix(confusion_table)
    return(c_matrix$overall['Accuracy'])
  },
  error = function(e) return(NULL)
  )
}
# function: build the table and pad it with zero rows until it is square
compute_confusion_matrix.func <- function(y, p) {
  confusion_table <- table(y, p)
  if (nrow(confusion_table) != ncol(confusion_table)) {
    missings <- setdiff(colnames(confusion_table), rownames(confusion_table))
    missing_mat <- mat.or.vec(nr = length(missings), nc = ncol(confusion_table))
    confusion_table <- as.table(rbind(as.matrix(confusion_table), missing_mat))
    rownames(confusion_table) <- colnames(confusion_table)
  }
  return(confusion_table)
}
# works ok here
x <- compute_confusion_matrix.func(loans_train_data$statusRank, model_prediction)
confusion_matrix <- confusionMatrix(x)
confusion_matrix$byClass['Sensitivity']
confusion_matrix$byClass['Specificity']
confusion_matrix$overall['Accuracy']
compute_for_values = seq(0.1,0.9, by=0.1)
## WIP error in !all.equal(nrow(data, ncol(data)))
computed_accuracies <- lapply(compute_for_values, compute_seq_accuracy.func)
names(computed_accuracies) <- compute_for_values
computed_accuracies[which.max(computed_accuracies)]
Data and messages
Warning messages from the tryCatch version:
> computed_accuracies <- sapply(compute_for_values, compute_seq_accuracy.func, simplify = FALSE)
Warning messages:
1: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
2: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
3: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
4: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
5: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
6: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
7: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
8: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
9: In Ops.factor(model_prediction, value) : ‘<’ not meaningful for factors
>
Partial fix
The faulty dataset has been identified: model_prediction. It is a factor, which is what triggers the Ops.factor(model_prediction, value) warnings listed above:
> head(model_prediction, 50)
[1] Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Bad Good
[26] Good Good Good Good Bad Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good Good
Levels: Bad Good
>
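The chain from the warning to the error can be reproduced in base R. This is a minimal sketch on made-up toy data (pred_factor and y are hypothetical stand-ins for model_prediction and loans_train_data$statusRank), and the last step only evaluates the expression quoted in the error message rather than calling caret itself:

```r
# Toy stand-in for model_prediction: class labels stored as a factor
pred_factor <- factor(c("Good", "Bad", "Good"), levels = c("Bad", "Good"))

# `<` is not defined for an unordered factor: R warns and returns NA everywhere
cmp <- suppressWarnings(pred_factor < 0.5)
print(cmp)

# ifelse() propagates the NAs, so the resulting factor ends up with no levels
p_bad <- factor(ifelse(cmp, 0, 1))
print(nlevels(p_bad))

# Tabulating against a two-level outcome then gives a non-square table
y <- factor(c("Good", "Bad", "Good"), levels = c("Bad", "Good"))
tab <- table(y, p_bad)
print(dim(tab))

# all.equal() returns a character string when its arguments differ,
# and negating a string with `!` is what raises the reported error
msg <- tryCatch(!all.equal(nrow(tab), ncol(tab)),
                error = function(e) conditionMessage(e))
print(msg)
```

If this reading is right, padding the table with extra rows cannot help, because the predicted factor has no columns to match in the first place. The comparison has to be made on numeric probabilities.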
Corrected dataset:
head(loans_predict,50)
11413 2561 25337 1643 14264 24191 33989 28193 21129 7895 29007 26622 3065
0.8375821 0.7516343 0.8375704 0.7671279 0.7201578 0.7917037 0.8980501 0.8259884 0.8604232 0.8664207 0.7609676 0.7753622 0.9321958
11423 3953 5789 30150 6070 1486 13195 30344 26721 716 24609 22196 10770
0.8325967 0.9459098 0.5903160 0.5997290 0.9045176 0.6782181 0.7546154 0.8381577 0.7943421 0.7198638 0.4522069 0.7129170 0.8632025
18042 3710 21750 23492 10680 5088 10434 3228 8696 29688 33847 2997 24772
0.8941667 0.6445716 0.7659989 0.2616490 0.7402274 0.7115220 0.8985310 0.7300686 0.8737217 0.6712457 0.7037675 0.6868837 0.7534947
28396 6825 27619 26433 25542 33853 32926 33585 20362 6895 20634
0.7516796 0.7261610 0.8437550 0.8662871 0.8620579 0.9355447 0.6786310 0.6017286 0.9340776 0.9022817 0.7832571
>
> compute_for_values
[1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
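With numeric probabilities, one way to keep the table square at every threshold is to fix the factor levels when the classes are created. The sketch below uses made-up values; probs, observed and threshold_to_class are hypothetical names, and it assumes that a probability below the threshold means "Bad":

```r
# Hypothetical numeric probabilities standing in for loans_predict
probs <- c(0.84, 0.75, 0.26, 0.45, 0.93)
# Hypothetical observed outcome standing in for loans_train_data$statusRank
observed <- factor(c("Good", "Good", "Bad", "Bad", "Good"),
                   levels = c("Bad", "Good"))

threshold_to_class <- function(probs, value, levels = c("Bad", "Good")) {
  # Passing `levels` keeps both classes in the factor even if one is never predicted
  factor(ifelse(probs < value, levels[1], levels[2]), levels = levels)
}

# At 0.1 every case is predicted "Good", yet the table is still 2 x 2
p <- threshold_to_class(probs, 0.1)
tab <- table(observed, p)
print(tab)
```

Because the empty class is kept as a zero column, no padding step is needed before the table is passed on.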
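Consider wrapping your method in tryCatch to catch exceptions and return NULL on error. You can then investigate further which of the values from 0.1 to 0.9 causes the error, and the NULL elements can be removed at the end with Filter. The code below also uses sapply (a wrapper around lapply), which returns a named list when a character vector is used as input.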
compute_seq_accuracy.func <- function(value) {
  tryCatch({
    # loans_predict_fcm must hold numeric probabilities; `<` is not meaningful for a factor
    p <- factor(ifelse(loans_predict_fcm < as.numeric(value), 'Bad', 'Good'))
    # helper name matches its definition: compute_confusion_matrix.func
    confusion_table <- compute_confusion_matrix.func(loans_train_data$statusRank, p)
    c_matrix <- confusionMatrix(confusion_table)
    return(c_matrix$overall['Accuracy'])
  },
  error = function(e) return(NULL)
  )
}
# character input so that sapply names the result by threshold
compute_for_values <- as.character(seq(0.1, 0.9, by = 0.1))
## WIP error in !all.equal(nrow(data, ncol(data)))
computed_accuracies <- sapply(compute_for_values, compute_seq_accuracy.func, simplify = FALSE)
# REMOVE NULLs FROM LIST (R is case-sensitive: the function is length, not LENGTH)
computed_accuracies <- Filter(length, computed_accuracies)
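Since the stated goal also covers sensitivity and specificity, the whole sweep can be sketched in base R without caret. Everything here is an assumption made for illustration: the data are simulated, metrics_at is a hypothetical helper, and "Bad" is treated as the positive class because it is the first factor level:

```r
set.seed(1)  # simulated data, for illustration only
probs <- runif(200)                                   # stand-in for loans_predict
observed <- factor(ifelse(runif(200) < probs, "Good", "Bad"),
                   levels = c("Bad", "Good"))         # stand-in for statusRank

metrics_at <- function(value, positive = "Bad") {
  lv <- levels(observed)
  # Same levels on both sides, so the table is always 2 x 2
  pred <- factor(ifelse(probs < value, "Bad", "Good"), levels = lv)
  tab <- table(pred, observed)          # rows = predicted, columns = observed
  negative <- setdiff(lv, positive)
  tp <- tab[positive, positive]; fn <- tab[negative, positive]
  tn <- tab[negative, negative]; fp <- tab[positive, negative]
  c(Accuracy    = (tp + tn) / sum(tab),
    Sensitivity = tp / (tp + fn),
    Specificity = tn / (tn + fp))
}

compute_for_values <- seq(0.1, 0.9, by = 0.1)
# One row per threshold, one column per metric
results <- t(sapply(compute_for_values, metrics_at))
rownames(results) <- compute_for_values
print(round(results, 3))

# Threshold with the highest accuracy
best <- compute_for_values[which.max(results[, "Accuracy"])]
print(best)
```

Returning a named vector from each call lets sapply assemble a matrix, so the three metrics can be compared across thresholds side by side instead of being pulled out of a list one at a time.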