Deep generative models that admit tractable likelihood computation, including normalizing flows, often assign unexpectedly high likelihoods to out-of-distribution (OOD) inputs. We mitigate this likelihood paradox by modulating input entropy according to semantic similarity: inputs that are less similar to an in-distribution memory bank receive stronger perturbations. We provide a theoretical analysis showing that this entropy control widens the expected log-likelihood gap between in-distribution and OOD samples in favor of in-distribution inputs, and we explain why the procedure works without any additional training of the density model. Evaluating against likelihood-based OOD detectors on standard benchmarks, we find consistent AUROC improvements over the baselines, supporting our analysis.
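To make the procedure concrete, the following is a minimal sketch of similarity-scaled input perturbation for likelihood-based OOD scoring, assuming a frozen density model and a separate embedding function; the names `embed_fn`, `log_prob_fn`, `memory_bank`, and `sigma_max` are illustrative placeholders, not the paper's actual interface.

```python
# Hypothetical sketch: perturbation strength shrinks as an input grows
# more similar to an in-distribution memory bank; the perturbed input is
# then scored under a pretrained likelihood model. All names are
# illustrative assumptions, not the paper's implementation.
import numpy as np


def ood_score(x, memory_bank, embed_fn, log_prob_fn,
              sigma_max=0.2, rng=None):
    """Return an OOD score for input x (higher = more likely OOD).

    memory_bank: (N, d) array of embeddings of in-distribution samples.
    embed_fn:    maps an input to a d-dimensional embedding.
    log_prob_fn: log-likelihood under the frozen generative model.
    sigma_max:   perturbation scale applied at zero similarity.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Semantic similarity: max cosine similarity to the memory bank.
    z = embed_fn(x)
    z = z / (np.linalg.norm(z) + 1e-8)
    bank = memory_bank / (
        np.linalg.norm(memory_bank, axis=1, keepdims=True) + 1e-8)
    sim = float(np.max(bank @ z))  # in [-1, 1]

    # Less similar inputs receive stronger (higher-entropy) perturbations;
    # in-distribution-like inputs are left nearly untouched.
    sigma = sigma_max * (1.0 - np.clip(sim, 0.0, 1.0))
    x_pert = x + rng.normal(0.0, sigma, size=np.shape(x))

    # Lower log-likelihood after perturbation => more OOD.
    return -log_prob_fn(x_pert)
```

In this reading, the density model is never retrained: the perturbation selectively depresses the likelihood of inputs far from the memory bank, which is what widens the in-distribution vs. OOD log-likelihood gap.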