Calibration in Deep Learning: A Survey of the State-of-the-Art

Calibrating deep neural models plays an important role in building reliable, robust AI systems for safety-critical applications. Recent work has shown that modern neural networks with high predictive capability are often poorly calibrated and produce unreliable predictions. Although deep learning models achieve remarkable performance on various benchmarks, model calibration and reliability remain relatively underexplored. Ideal deep models should not only have high predictive performance but also be well calibrated. Several methods have recently been proposed to calibrate deep models using different mechanisms. In this survey, we review the state-of-the-art calibration methods and explain the principles by which they perform model calibration. First, we start with the definition of model calibration and explain the root causes of model miscalibration. Then we introduce the key metrics used to measure calibration. This is followed by a summary of calibration methods, which we roughly classify into four categories: post-hoc calibration, regularization methods, uncertainty estimation, and composition methods. We also cover some recent advancements in calibrating large models, particularly large language models (LLMs). Finally, we discuss some open issues, challenges, and potential directions.

