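R's ROCR package provides ROC curve plotting options that color-code the curve by threshold and label threshold values along it:
The closest I can get with Python is something like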
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

fpr, tpr, thresholds = roc_curve(qualityTrain.PoorCare, qualityTrain.Pred1)
plt.plot(fpr, tpr, label='ROC curve', color='b')
plt.gca().set_aspect('equal')  # gca() reuses the current axes instead of creating new ones
plt.xlim([-0.05, 1.05])
plt.ylim([-0.05, 1.05])
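which gives a plain, single-color ROC curve.
Is there a package that provides functionality equivalent to R's ability to label (using print.cutoffs.at) and color-code (using colorize) the thresholds? Presumably this information is in thresholds, returned by sklearn.metrics.roc_curve, but I can't figure out how to use it to color-code and label the plot.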
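For reference, the three arrays returned by roc_curve are index-aligned: thresholds[i] is the score cutoff that produces the point (fpr[i], tpr[i]). The first entry is only a sentinel above the maximum score, not a real cutoff. A small sketch with made-up data (y_true and y_score are illustrative values, not the qualityTrain columns):

```python
import numpy as np
from sklearn.metrics import roc_curve

# Made-up binary labels and scores, purely for illustration.
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Each threshold belongs to the ROC point stored at the same index.
for f, t, thr in zip(fpr, tpr, thresholds):
    print(f"threshold={thr:.2f} -> fpr={f:.2f}, tpr={t:.2f}")
```

So labeling or coloring the curve comes down to indexing fpr, tpr and thresholds with the same i.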
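Usage, with a plot_roc helper that takes the fpr, tpr, thresholds tuple returned by roc_curve directly:
roc_data = sklearn.metrics.roc_curve(...)
plot_roc(*roc_data, label_every=5)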
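The body of a helper with that call signature (fpr, tpr, thresholds, label_every) isn't shown here. Below is a minimal sketch of what one might look like. The name plot_roc_labeled, the ax parameter and the styling are all assumptions for illustration, not the original author's code. It roughly mimics what ROCR's print.cutoffs.at does.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_roc_labeled(fpr, tpr, thresholds, label_every=5, ax=None):
    """Plot an ROC curve and annotate every `label_every`-th threshold.

    Hypothetical helper, written only to illustrate the idea.
    """
    fpr, tpr = np.asarray(fpr, dtype=float), np.asarray(tpr, dtype=float)
    if ax is None:
        _, ax = plt.subplots(figsize=(6, 6))
    ax.plot(fpr, tpr, color="b", label="ROC curve")
    ax.plot([0, 1], [0, 1], color="gray", linestyle="--")  # chance line

    # Mark a subset of the ROC points and label each with its threshold.
    idx = np.arange(0, len(thresholds), label_every)
    ax.scatter(fpr[idx], tpr[idx], color="b", s=15)
    for i in idx:
        ax.annotate(f"{thresholds[i]:.2f}", (fpr[i], tpr[i]),
                    textcoords="offset points", xytext=(5, -10))

    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.set_xlim(-0.05, 1.05)
    ax.set_ylim(-0.05, 1.05)
    ax.set_aspect("equal")
    ax.legend(loc="lower right")
    return ax
```

It returns the axes rather than calling plt.show(), so it can be combined with other plots; call plt.show() afterwards to display the figure.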
A standalone version that computes the ROC curve itself and labels every thresholds_every-th threshold:
import sklearn.metrics  # roc_curve and auc
import matplotlib.pyplot as plt

def plot_roc(labels, predictions, positive_label, thresholds_every=10, title=''):
    # fp: false positive rates. tp: true positive rates
    fp, tp, thresholds = sklearn.metrics.roc_curve(labels, predictions, pos_label=positive_label)
    roc_auc = sklearn.metrics.auc(fp, tp)

    plt.figure(figsize=(16, 16))
    plt.plot(fp, tp, label='ROC curve (area = %0.2f)' % roc_auc, linewidth=2, color='darkorange')
    plt.plot([0, 1], [0, 1], color='navy', linestyle='--', linewidth=2)  # chance line
    plt.xlabel('False positive rate')
    plt.ylabel('True positive rate')
    plt.xlim([-0.03, 1.0])
    plt.ylim([0.0, 1.03])
    plt.title(title)
    plt.legend(loc="lower right")
    plt.grid(True)

    # label every `thresholds_every`-th threshold, colored by its position in the array
    thresholds_length = len(thresholds)
    color_map = plt.get_cmap('jet', thresholds_length)
    for i in range(0, thresholds_length, thresholds_every):
        # keep at most five characters of the value, e.g. '0.123'
        threshold_text = str(thresholds[i])[:5]
        plt.text(fp[i] - 0.03, tp[i] + 0.005, threshold_text,
                 fontdict={'size': 15}, color=color_map(i / thresholds_length))
    plt.show()
Result, for this example call:
labels = [1, 1, 2, 2, 2, 3]
predictions = [0.7, 0.99, 0.9, 0.3, 0.7, 0.01]  # predicted scores for class 1
plot_roc(labels, predictions, positive_label=1, thresholds_every=1, title="ROC Curve - Class 1")
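Note that plot_roc colors only the threshold labels; the curve itself stays one color. To get closer to ROCR's colorize, one option is to draw the curve as a matplotlib LineCollection whose segments are colored by threshold. This is a sketch under assumptions: plot_roc_colorized and its parameters are made-up names, and coloring each segment by the threshold at its end point is just one reasonable choice.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.collections import LineCollection


def plot_roc_colorized(fpr, tpr, thresholds, ax=None, cmap="jet"):
    """Draw the ROC curve with each segment colored by its threshold."""
    fpr, tpr = np.asarray(fpr, dtype=float), np.asarray(tpr, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    if ax is None:
        _, ax = plt.subplots(figsize=(6, 6))

    # Build one line segment per pair of consecutive ROC points.
    points = np.column_stack([fpr, tpr]).reshape(-1, 1, 2)
    segments = np.concatenate([points[:-1], points[1:]], axis=1)

    # Color each segment by the threshold at its end point. This skips
    # thresholds[0], which is only a sentinel above the maximum score.
    lc = LineCollection(segments, cmap=cmap, linewidth=2)
    lc.set_array(thresholds[1:])
    ax.add_collection(lc)
    ax.figure.colorbar(lc, ax=ax, label="Threshold")

    ax.set_xlim(-0.05, 1.05)
    ax.set_ylim(-0.05, 1.05)
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    return ax
```

Typical use would be plot_roc_colorized(*sklearn.metrics.roc_curve(y_true, y_score)) followed by plt.show(), where y_true and y_score stand for your own labels and scores.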