Deep learning models, including modern systems such as large language models, are well known to produce unreliable estimates of the uncertainty of their decisions. To improve the quality of a model's confidence levels, a property known as calibration, common approaches add either data-dependent or data-independent regularization terms to the training loss. Data-dependent regularizers have recently been introduced in the context of conventional frequentist learning to penalize deviations between confidence and accuracy. In contrast, data-independent regularizers are at the core of Bayesian learning, where they enforce adherence of the variational distribution in the model parameter space to a prior density. The former approach cannot quantify epistemic uncertainty, while the latter is severely affected by model misspecification. In light of the limitations of both methods, this paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs), that applies both regularizers while optimizing over a variational distribution as in Bayesian learning. Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.
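For readers unfamiliar with the ECE metric named above, the following is a minimal sketch of the commonly used equal-width binned estimator: the sample-weighted average, over confidence bins, of the gap between accuracy and mean confidence. The abstract does not specify the binning scheme used in the paper, so the function name `expected_calibration_error`, the default of 10 bins, and the bin boundaries are assumptions made for illustration only.

```python
import numpy as np


def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE estimate (illustrative; binning scheme is an assumption).

    confidences: predicted probability of the chosen class, in [0, 1].
    correct:     1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)

    # Equal-width bin edges on [0, 1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Map each sample to a bin index in {0, ..., n_bins - 1};
    # bins are half-open on the left, i.e. (lo, hi].
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue  # empty bins contribute nothing
        accuracy = correct[mask].mean()
        avg_confidence = confidences[mask].mean()
        # Weight the gap by the fraction of samples in this bin.
        ece += mask.mean() * abs(accuracy - avg_confidence)
    return float(ece)
```

As a usage example, four predictions made with confidence 0.9, of which three are correct, fall in a single bin with accuracy 0.75, so the estimate is |0.75 - 0.9| = 0.15. A reliability diagram plots the same per-bin accuracy against per-bin confidence.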