Model Evaluation Metrics: Precision, Recall, ROC and AUC

Date: 2022-12-08 18:43:02

ACC, Precision and Recall

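These concepts are defined for a binary classifier. Below, TP, FP, TN and FN are the numbers of true positives, false positives, true negatives and false negatives.

  • Accuracy is the fraction of all samples that are classified correctly: (TP + TN) / (TP + FP + TN + FN).
  • Precision is the fraction of samples predicted positive that are truly positive: TP / (TP + FP). It is measured against the predictions. In information retrieval it describes how exact the retrieved results are (Chinese: 查准率).
  • Recall is the fraction of truly positive samples that are predicted positive: TP / (TP + FN). It is measured against the ground-truth samples. In information retrieval it describes how complete the retrieved results are (Chinese: 查全率).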
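
As a minimal sketch of the three definitions above, the following Python function computes them from confusion-matrix counts. The function name `basic_metrics`, the example counts, and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration, not part of the original note.

```python
def basic_metrics(tp, fp, tn, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Precision: of everything predicted positive, how much is truly positive.
    # Returning 0.0 for an empty denominator is an illustrative convention.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    # Recall: of everything truly positive, how much was predicted positive.
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return accuracy, precision, recall


# Hypothetical counts: 100 samples, 13 of them truly positive.
acc, prec, rec = basic_metrics(tp=8, fp=2, tn=85, fn=5)
print(acc, prec, rec)
```

With these made-up counts the accuracy is high mostly because negatives dominate, while recall shows that 5 of the 13 positives were missed.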

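To raise precision, a classifier has to predict positives more conservatively, which usually lowers recall. One way to take both into account is to plot a curve; another common summary is the F1 score, the harmonic mean of precision and recall: \(F_1 = 2PR/(P+R)\), where \(P\) is precision and \(R\) is recall.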
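
The harmonic mean can be sketched in a few lines of Python. The name `f1_score` here is a local helper defined for this example (not an import from any library), and returning 0.0 when both inputs are zero is an assumed convention.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # assumed convention for the degenerate case
    return 2 * precision * recall / (precision + recall)


balanced = f1_score(0.5, 0.5)   # equal inputs give the same value back
lopsided = f1_score(1.0, 0.1)   # very conservative classifier
print(balanced, lopsided)
```

Unlike the arithmetic mean, which would give 0.55 for the second case, the harmonic mean stays close to the smaller of the two values, so a classifier cannot score well on F1 by maximizing only one of them.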

Curves

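  • P-R (Precision-Recall) curve. The horizontal axis is recall and the vertical axis is precision. A model typically predicts positive when its score exceeds some threshold and negative otherwise, so each point on the curve gives the precision and recall at one threshold.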
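
A rough sketch of how the points of a P-R curve could be computed by sweeping the threshold over the observed scores. The helper `pr_points`, the toy labels and scores, and the choice of "score >= threshold" as the positive rule are all assumptions for illustration; the sketch also assumes at least one positive label.

```python
def pr_points(labels, scores):
    """Return one (recall, precision) pair per distinct threshold."""
    n_pos = sum(labels)  # assumed to be > 0
    points = []
    # Sweep thresholds from the highest score down to the lowest.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        # tp + fp >= 1 because t is itself one of the scores.
        points.append((tp / n_pos, tp / (tp + fp)))
    return points


labels = [1, 1, 0, 1, 0, 0]               # toy ground truth
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # toy model scores
pr = pr_points(labels, scores)
for recall, precision in pr:
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

Lowering the threshold never decreases recall, while precision can move in either direction, which is why P-R curves are often jagged.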

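  • ROC (Receiver Operating Characteristic) curve. The horizontal axis is FPR (False Positive Rate), FP / (FP + TN), and the vertical axis is TPR (True Positive Rate), TP / (TP + FN), which is exactly recall. The ROC curve can always be taken to lie above the line \(y=x\): a classifier whose curve falls below the diagonal can be flipped above it by replacing the predicted probability with \(p \leftarrow 1-p\).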
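
The following sketch computes ROC points the same way, by sweeping a threshold, and illustrates the \(p \leftarrow 1-p\) remark on a deliberately bad toy classifier. The helper `roc_points` and the data are hypothetical; the sketch assumes both classes are present.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, starting from (0, 0), one per threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is positive
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


labels = [0, 0, 1, 1]
bad_scores = [0.9, 0.8, 0.3, 0.1]        # ranks negatives above positives
flipped = [1 - p for p in bad_scores]    # apply p <- 1 - p

roc_bad = roc_points(labels, bad_scores)
roc_flipped = roc_points(labels, flipped)
print(roc_bad)
print(roc_flipped)
```

For this toy data every point of the first curve lies on or below the diagonal, and every point of the flipped curve lies on or above it.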

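It follows from the definitions that the P-R curve changes substantially across test sets with different degrees of class imbalance, while the ROC curve is comparatively stable. Precision combines TP and FP, which are counted over different classes, whereas TPR and FPR are each normalized within a single class.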

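For example, with TP = FP = TN = FN = 1, precision = 1/2 and recall = 1/2.
If the negative samples are replicated N times, then TN = FP = N, so precision = 1/(N+1) while recall stays at 1/2: the P-R point moves a lot. The ROC point does not move, because FPR = FP / (FP + TN) = N / (2N) = 1/2 and TPR = recall = 1/2 are both unchanged.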
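
The arithmetic of this example can be checked with a short script. The helper name `point_metrics` and the particular values of N are chosen only for illustration.

```python
def point_metrics(tp, fp, tn, fn):
    """Return (precision, recall, FPR) for one confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)   # identical to TPR
    fpr = fp / (fp + tn)
    return precision, recall, fpr


# Replicate the negatives N times: TN = FP = N, positives untouched.
for n in (1, 10, 100):
    precision, recall, fpr = point_metrics(tp=1, fp=n, tn=n, fn=1)
    print(f"N={n}: precision={precision:.4f} recall={recall} FPR={fpr}")
```

Only the precision column changes as N grows; recall (TPR) and FPR stay at 1/2, which is the point the example makes about the ROC curve.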

  • AUC (Area Under Curve). The area under the ROC curve; larger is better. A perfect ranking gives 1, and random guessing gives about 0.5.
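
One simple way to approximate the area, given a list of (FPR, TPR) points, is the trapezoidal rule. This is a sketch under that assumption rather than the method used by any particular library; `auc_trapezoid` and the sample curves are made up for the example.

```python
def auc_trapezoid(points):
    """Area under a piecewise-linear curve through (FPR, TPR) points."""
    pts = sorted(points)  # order by FPR, then TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        # Each segment contributes a trapezoid; vertical steps add nothing.
        area += (x1 - x0) * (y0 + y1) / 2
    return area


perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
diagonal = [(0.0, 0.0), (1.0, 1.0)]
staircase = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]

auc_perfect = auc_trapezoid(perfect)
auc_diagonal = auc_trapezoid(diagonal)
auc_staircase = auc_trapezoid(staircase)
print(auc_perfect, auc_diagonal, auc_staircase)
```

The three toy curves cover the reference cases: a perfect ranking, the chance diagonal, and something in between.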

Misc

The ROC curve was first used during World War II for the analysis of radar signals before it was employed in signal detection theory. Following the attack on Pearl Harbor in 1941, the United States army began new research to increase the prediction of correctly detected Japanese aircraft from their radar signals. For these purposes they measured the ability of a radar receiver operator to make these important distinctions, which was called the Receiver Operating Characteristic.

References

[1] 如何解释召回率与准确率? ("How to explain recall and accuracy?") - Zhihu. https://www.zhihu.com/question/19645541

Images from:
Precision and recall - Wikipedia. https://en.wikipedia.org/wiki/Precision_and_recall
Receiver operating characteristic - Wikipedia. https://en.wikipedia.org/wiki/Receiver_operating_characteristic
Precision-recall curves – what are they and how are they used?. https://acutecaretesting.org/en/articles/precision-recall-curves-what-are-they-and-how-are-they-used