Leveraging a model's outputs, specifically its logits, is a common approach to estimating the test accuracy of a pre-trained neural network on out-of-distribution (OOD) samples without access to the corresponding ground-truth labels. Despite their ease of implementation and computational efficiency, current logit-based methods are vulnerable to overconfidence, which introduces prediction bias, especially under natural distribution shift. In this work, we first study the relationship between logits and generalization performance from the perspective of the low-density separation assumption. Our findings motivate our proposed method MaNo, which (1) applies a data-dependent normalization to the logits to reduce prediction bias, and (2) takes the $L_p$ norm of the matrix of normalized logits as the estimation score. Our theoretical analysis highlights the connection between this score and the model's uncertainty. We conduct an extensive empirical study on common unsupervised accuracy estimation benchmarks and demonstrate that MaNo achieves state-of-the-art performance across various architectures under synthetic, natural, and subpopulation shifts.
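The two-step recipe described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes softmax as a stand-in for the data-dependent normalization (the method's actual normalization may differ), and the averaging convention for the entrywise $L_p$ norm is an assumption.

```python
import numpy as np

def mano_style_score(logits: np.ndarray, p: float = 4.0) -> float:
    """Sketch of a logit-based accuracy-estimation score.

    logits: array of shape (n_samples, n_classes), raw network outputs
            on unlabeled (possibly OOD) test data.
    """
    # Step 1: data-dependent normalization of the logits.
    # Softmax is used here as a stand-in normalization; the paper's
    # specific data-dependent normalization may differ.
    z = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)

    # Step 2: entrywise L_p norm of the matrix of normalized logits,
    # averaged over entries so the score is comparable across datasets
    # (this scaling convention is an assumption).
    n, k = probs.shape
    return float(((np.abs(probs) ** p).sum() / (n * k)) ** (1.0 / p))
```

With this convention the score lies between $1/K$ (uniform, maximally uncertain predictions) and $(1/K)^{1/p}$ (one-hot, maximally confident predictions), so a higher score indicates lower predictive uncertainty on the test set.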