Discriminatively trained, deterministic neural networks are the de facto choice for classification problems. However, even though they achieve state-of-the-art results on in-domain test sets, they tend to be overconfident on out-of-distribution (OOD) data. For instance, ReLU networks, a popular class of neural network architectures, have been shown to almost always yield high-confidence predictions when the test data are far away from the training set, even when they are trained with OOD data. We overcome this problem by adding a term to the output of the neural network that corresponds to the logit of an extra class, which we design to dominate the logits of the original classes as we move away from the training data. This technique provably prevents arbitrarily high confidence on far-away test data while retaining simple discriminative point-estimate training. Evaluation on various benchmarks demonstrates strong performance against competitive baselines on both far-away and realistic OOD data.
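The idea of an extra-class logit that dominates far from the training data can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: a single linear layer stands in for the network, the extra logit is a squared distance to a hypothetical training mean `train_mean`, and `scale` is a made-up parameter. The quadratic form is chosen only because the logits of a ReLU network are piecewise affine in the input and so grow at most linearly along any ray, while a quadratic term eventually outgrows them.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def logits_with_extra_class(x, W, b, train_mean, scale=1.0):
    """Return the K original logits followed by one extra-class logit.

    Assumption: the extra logit is quadratic in the distance to the
    training mean. This is an illustrative choice, not necessarily
    the form used by the authors.
    """
    class_logits = W @ x + b  # linear stand-in for the network output
    extra_logit = scale * np.sum((x - train_mean) ** 2)
    return np.append(class_logits, extra_logit)


# Hypothetical toy setup: 3 classes, 2 input features.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))
b = np.zeros(3)
train_mean = np.zeros(2)

# Move away from the training mean along a fixed direction.
direction = np.array([1.0, 1.0])
for alpha in (0.1, 1.0, 10.0, 100.0):
    p = softmax(logits_with_extra_class(alpha * direction, W, b, train_mean))
    print(f"alpha={alpha:6.1f}  max original-class prob={p[:-1].max():.4f}  "
          f"extra-class prob={p[-1]:.4f}")
```

In this sketch the probability mass shifts to the extra class as `alpha` grows, so the confidence assigned to any original class shrinks far from the training data. A practical version would need an extra logit that stays small near the training data so that in-domain predictions are unaffected.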