This paper provides both an introduction to and a detailed overview of the principles and practice of classifier calibration. A well-calibrated classifier correctly quantifies the level of uncertainty or confidence associated with its instance-wise predictions. This is essential for critical applications, optimal decision making, cost-sensitive classification, and for some types of context change. Calibration research has a rich history which predates the birth of machine learning as an academic field by decades. However, a recent increase in interest in calibration has led to new methods and to the extension of calibration from the binary to the multiclass setting. The space of options and issues to consider is large, and navigating it requires the right set of concepts and tools. We provide both introductory material and up-to-date technical details of the main concepts and methods, including proper scoring rules and other evaluation metrics, visualisation approaches, a comprehensive account of post-hoc calibration methods for binary and multiclass classification, and several advanced topics.