Post-hoc out-of-distribution (OOD) detection has garnered intensive attention in reliable machine learning. Many efforts have been dedicated to deriving score functions based on logits, distances, or rigorous data distribution assumptions to identify low-scoring OOD samples. Nevertheless, these estimated scores may fail to accurately reflect the true data density or may impose impractical constraints. To provide a unified perspective on density-based score design, we propose a novel theoretical framework grounded in Bregman divergence, which extends the distributions under consideration to the exponential family. Leveraging the conjugation constraint revealed by our theorem, we introduce a \textsc{ConjNorm} method that reframes density function design as a search for the optimal norm coefficient $p$ for a given dataset. In light of the computational challenges of normalization, we devise an unbiased and analytically tractable estimator of the partition function using Monte Carlo-based importance sampling. Extensive experiments across OOD detection benchmarks empirically demonstrate that our proposed \textsc{ConjNorm} establishes a new state-of-the-art in a variety of OOD detection setups, outperforming the current best method by up to 13.25$\%$ and 28.19$\%$ (FPR95) on CIFAR-100 and ImageNet-1K, respectively.
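The abstract's partition-function estimator can be illustrated with a generic importance-sampling sketch. This is not the paper's exact \textsc{ConjNorm} estimator; it is a minimal toy showing the underlying identity $Z = \mathbb{E}_{x \sim q}\left[e^{f(x)}/q(x)\right]$, which is unbiased for any proposal $q$ that covers the support. All function names here are illustrative assumptions.

```python
import math
import random

def importance_sampling_partition(log_density, proposal_sample, proposal_pdf,
                                  n=100_000, seed=0):
    """Unbiased Monte Carlo estimate of the partition function
    Z = \int exp(log_density(x)) dx, using samples x_i ~ q:
        Z ≈ (1/n) * sum_i exp(log_density(x_i)) / q(x_i).
    Illustrative sketch only, not the paper's ConjNorm estimator."""
    rng = random.Random(seed)  # fixed seed for a reproducible estimate
    total = 0.0
    for _ in range(n):
        x = proposal_sample(rng)
        total += math.exp(log_density(x)) / proposal_pdf(x)
    return total / n

# Toy check: the unnormalized Gaussian exp(-x^2/2) has Z = sqrt(2*pi) ≈ 2.5066.
# A uniform proposal on [-5, 5] covers essentially all of the mass.
est = importance_sampling_partition(
    log_density=lambda x: -0.5 * x * x,
    proposal_sample=lambda rng: rng.uniform(-5.0, 5.0),
    proposal_pdf=lambda x: 0.1,  # density of Uniform(-5, 5)
)
print(est)  # close to sqrt(2*pi)
```

The estimator is unbiased for any valid proposal, but its variance depends heavily on how well $q$ matches the target density; the paper's contribution lies in making such an estimate analytically tractable for exponential-family scores, which this generic sketch does not attempt.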