Deep discriminative approaches such as random forests and deep neural networks have recently found applications in many important real-world scenarios. However, deploying these learning algorithms in safety-critical applications raises concerns, particularly about confidence calibration for both in-distribution and out-of-distribution data points. Many popular methods for in-distribution (ID) calibration, such as isotonic regression and Platt's sigmoidal regression, achieve excellent ID calibration, but often at the cost of classification accuracy. Moreover, these methods are not calibrated over the entire feature space, which leads to overconfidence on out-of-distribution (OOD) samples. In this paper, we leverage the fact that deep models, including both random forests and deep networks, learn internal representations that are unions of polytopes with affine activation functions, and we conceptualize both as partitioning rules of the feature space. We replace the affine function in each polytope populated by training data with a Gaussian kernel. We give sufficient conditions under which the proposed methods are consistent estimators of the corresponding class-conditional densities. Moreover, our experiments on both tabular and vision benchmarks show that the proposed approaches obtain well-calibrated posteriors while mostly preserving or improving the classification accuracy of the original algorithm in the in-distribution region, and that they extrapolate beyond the training data to handle out-of-distribution inputs appropriately.
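The core idea of replacing the affine function in each populated polytope with a Gaussian kernel can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation. Several things here are assumptions made only for illustration: the fixed `thresholds` stand in for the partition a trained forest or network would induce, the names `fit_cells`, `class_density`, `posterior`, and `MIN_STD` are hypothetical, and the small `bias` term is one simple way to make the posterior fall back to the class priors far from the training data.

```python
import math
from collections import defaultdict

MIN_STD = 1e-3  # assumed floor on the kernel width, to avoid degenerate cells


def fit_cells(xs, ys, thresholds):
    """Fit one Gaussian per (cell, class) for every cell that holds training data.

    In 1-D a "polytope" is just an interval; `thresholds` are assumed to come
    from some already-trained partitioning model.
    """
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        cell = sum(x > t for t in thresholds)  # index of the interval containing x
        groups[(cell, y)].append(x)

    counts = defaultdict(int)
    for (_, y), pts in groups.items():
        counts[y] += len(pts)

    kernels = defaultdict(list)  # class -> list of (weight, mean, std)
    for (_, y), pts in groups.items():
        mean = sum(pts) / len(pts)
        var = sum((p - mean) ** 2 for p in pts) / len(pts)
        std = max(math.sqrt(var), MIN_STD)
        kernels[y].append((len(pts) / counts[y], mean, std))

    priors = {y: counts[y] / len(xs) for y in counts}
    return kernels, priors


def class_density(x, class_kernels):
    """Class-conditional density estimate: weighted sum of per-cell Gaussians."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in class_kernels
    )


def posterior(x, kernels, priors, bias=1e-9):
    """Bayes rule on the estimated densities.

    The `bias` constant is an assumption of this sketch: once every Gaussian
    has decayed to zero, the scores reduce to the priors times `bias`.
    """
    scores = {y: priors[y] * (class_density(x, kernels[y]) + bias) for y in priors}
    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}


# Toy data: class 0 near 0.5, class 1 near 4.5, one split at 2.5.
kernels, priors = fit_cells(
    [0.0, 0.5, 1.0, 4.0, 4.5, 5.0], [0, 0, 0, 1, 1, 1], thresholds=[2.5]
)
print(posterior(0.5, kernels, priors))     # inside the class-0 cell
print(posterior(1000.0, kernels, priors))  # far from all training data
```

Because each Gaussian decays away from its cell, the estimated densities vanish far from the training data instead of growing without bound as an affine function would, which is the intuition behind the OOD behavior described above.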