Prevalent deep learning models suffer from significant over-confidence under distribution shifts. In this paper, we propose Density-Softmax, a single deterministic approach to uncertainty estimation that combines a density function with the softmax layer. By using the likelihood value of the latent representation, our approach produces more uncertain predictions when test samples are distant from the training samples. Theoretically, we prove that Density-Softmax is distance aware, which means that its associated uncertainty metrics are monotonic functions of distance metrics. Distance awareness has been shown to be a necessary condition for a neural network to produce high-quality uncertainty estimates. Empirically, our method has computational efficiency similar to that of a standard softmax model on shifted CIFAR-10, CIFAR-100, and ImageNet datasets across modern deep learning architectures. Notably, Density-Softmax uses 4 times fewer parameters than Deep Ensembles and has 6 times lower latency than Rank-1 Bayesian Neural Network, while obtaining competitive predictive performance and lower calibration errors under distribution shifts.
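To make the idea of combining a density function with the softmax layer concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a Gaussian fitted to training latents as the density model, a likelihood rescaled to (0, 1], and a combination that multiplies the logits by that scaled likelihood. These choices, and all function names, are illustrative assumptions that the abstract does not specify.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def fit_gaussian(latents):
    """Fit a Gaussian to training latents (assumed stand-in density model)."""
    mean = latents.mean(axis=0)
    cov = np.cov(latents, rowvar=False) + 1e-6 * np.eye(latents.shape[1])
    return mean, np.linalg.inv(cov)


def scaled_likelihood(latents, mean, precision):
    """Gaussian likelihood divided by its maximum, so values lie in (0, 1].

    The value decreases monotonically with the Mahalanobis distance
    between a latent vector and the training mean.
    """
    diff = latents - mean
    sq_dist = np.einsum("ni,ij,nj->n", diff, precision, diff)
    return np.exp(-0.5 * sq_dist)


def density_softmax(logits, likelihood):
    """Assumed combination: scale the logits by the likelihood, then softmax."""
    return softmax(likelihood[:, None] * logits)


def entropy(probs):
    """Predictive entropy, used here as the uncertainty metric."""
    return -(probs * np.log(probs + 1e-12)).sum(axis=-1)


# Toy data: 2-D training latents, one test point at the training mean
# and one far away. Both receive the same confident logits.
rng = np.random.default_rng(0)
train_latents = rng.normal(size=(500, 2))
mean, precision = fit_gaussian(train_latents)

test_latents = np.stack([mean, mean + 10.0])
test_logits = np.array([[4.0, 0.0, 0.0], [4.0, 0.0, 0.0]])

likelihood = scaled_likelihood(test_latents, mean, precision)
probs = density_softmax(test_logits, likelihood)
uncertainty = entropy(probs)
```

In this toy setup the far test point has a likelihood close to zero, so its logits shrink toward zero and the prediction approaches the uniform distribution, while the point at the training mean keeps its original softmax output.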