I computed the misclassification rate for every leaf node.
samples = 3635 + 1101 = 4736, class = Cash, misclassification rate = 1101 / 4736 = 0.232.
samples = 47436 + 44556 = 91992, class = Cash, misclassification rate = 44556 / 91992 = 0.484.
samples = 7072 + 15252 = 22324, class = Credit Card, misclassification rate = 7072 / 22324 = 0.317.
samples = 1294 + 1456 = 2750, class = Credit Card, misclassification rate = 1294 / 2750 = 0.471.
samples = 7238 + 22295 = 29533, class = Credit Card, misclassification rate = 7238 / 29533 = 0.245.
I'm finding it hard to get the AUC value from here. Could you help me with this? I would be very grateful.
from sklearn.metrics import roc_auc_score


def create_actual_prediction_arrays(n_pos, n_neg):
    # Every sample in a leaf gets the same score: the leaf's positive fraction.
    prob = n_pos / (n_pos + n_neg)
    y_true = [1] * n_pos + [0] * n_neg
    y_score = [prob] * (n_pos + n_neg)
    return y_true, y_score


total_y_true = []
total_y_score = []
# (n_pos, n_neg) for each leaf, treating Cash as the positive class
for n_pos, n_neg in [(3635, 1101), (47436, 44556), (7072, 15252), (1294, 1456), (7238, 22295)]:
    y_true, y_score = create_actual_prediction_arrays(n_pos, n_neg)
    total_y_true += y_true
    total_y_score += y_score

print("auc_score =", roc_auc_score(y_true=total_y_true, y_score=total_y_score))
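Explanation: this collects the true y values and the predicted y_score values from all leaf nodes, using each leaf's positive-class fraction as the score for every sample in that leaf, and computes the AUC score over the combined arrays.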
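If you would rather not build arrays with one entry per sample, the same quantity can probably be computed straight from the leaf counts. AUC equals the probability that a randomly chosen positive sample gets a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below is only an illustration under that definition: the function name `auc_from_leaf_counts` is made up here, it assumes Cash is the positive class and that no leaf is empty, and it uses nothing beyond the standard library.

```python
from fractions import Fraction


def auc_from_leaf_counts(leaves):
    """AUC from (n_pos, n_neg) counts per leaf (hypothetical helper).

    Each leaf is scored by its positive fraction n_pos / (n_pos + n_neg).
    Assumes every leaf holds at least one sample.
    """
    # Merge leaves that share exactly the same score, so that ties
    # across leaves are handled the same way as ties inside a leaf.
    merged = {}
    for n_pos, n_neg in leaves:
        score = Fraction(n_pos, n_pos + n_neg)
        pos, neg = merged.get(score, (0, 0))
        merged[score] = (pos + n_pos, neg + n_neg)

    total_pos = sum(pos for pos, _ in merged.values())
    total_neg = sum(neg for _, neg in merged.values())
    if total_pos == 0 or total_neg == 0:
        raise ValueError("AUC is undefined without both classes present")

    # Walk from the highest score down. A positive in the current group
    # beats every negative in lower-scored groups and ties (worth 0.5)
    # with the negatives in its own group.
    neg_below = total_neg
    wins = 0.0
    for score in sorted(merged, reverse=True):
        pos, neg = merged[score]
        neg_below -= neg
        wins += pos * (neg_below + 0.5 * neg)

    return wins / (total_pos * total_neg)


# (n_pos, n_neg) per leaf, taken from the question; Cash is assumed positive.
leaves = [(3635, 1101), (47436, 44556), (7072, 15252), (1294, 1456), (7238, 22295)]
auc = auc_from_leaf_counts(leaves)
print("auc_score =", auc)
```

Because there are only five distinct scores, the ROC curve has just five segments, so this should agree with the array-based calculation while needing only a handful of arithmetic operations.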