We propose the Variation Calibration Error (VCE), a metric for assessing the calibration of machine learning classifiers. The metric can be viewed as an extension of the well-known Expected Calibration Error (ECE), which assesses only the calibration of the maximum probability, or confidence. Other measures of the variation of a probability distribution, such as the Shannon entropy, have the advantage of taking the full probability distribution into account. We show how the ECE approach can be extended from assessing confidence calibration to assessing the calibration of any measure of variation. We present numerical examples on synthetic predictions that are perfectly calibrated by design, demonstrating that in this scenario the VCE has the desired property of approaching zero as the number of data samples increases, in contrast to another entropy-based calibration metric proposed in the literature, the UCE.
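To make the binned, ECE-style construction concrete, the sketch below illustrates one way such a variation-based calibration estimate could be computed: predictions are binned by the Shannon entropy of the predicted distribution, and the mean predicted entropy in each bin is compared with the entropy of the empirical label frequencies in that bin. This is a minimal illustration under our own assumptions, not the paper's exact VCE definition, and the function and variable names are hypothetical.

```python
import numpy as np

def shannon_entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy of each probability vector (one measure of variation)."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p), axis=axis)

def binned_variation_calibration_error(probs, labels, n_bins=10):
    """Illustrative ECE-style binned estimator (sketch only; the paper's VCE
    may differ in detail): bin samples by the entropy of the predicted
    distribution, then compare the mean predicted entropy in each bin with
    the entropy of the empirical label frequencies in that bin, weighted
    by the bin's share of the data."""
    pred_var = shannon_entropy(probs)          # per-sample predicted variation
    n_classes = probs.shape[1]
    max_var = np.log(n_classes)                # entropy lies in [0, log K]
    edges = np.linspace(0.0, max_var, n_bins + 1)
    bin_ids = np.clip(np.digitize(pred_var, edges) - 1, 0, n_bins - 1)

    vce, n = 0.0, len(labels)
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        mean_pred_var = pred_var[mask].mean()
        # empirical label distribution within the bin
        freq = np.bincount(labels[mask], minlength=n_classes) / mask.sum()
        emp_var = shannon_entropy(freq)
        vce += (mask.sum() / n) * abs(mean_pred_var - emp_var)
    return vce

# Usage on synthetic, perfectly calibrated predictions: labels are drawn from
# the predicted distributions themselves, so the estimate should shrink
# toward zero as the number of samples grows, mirroring the property
# described above.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(5), size=10_000)
labels = np.array([rng.choice(5, p=p) for p in probs])
print(binned_variation_calibration_error(probs, labels))
```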