Out-of-distribution (OOD) detection is indispensable for safely deploying machine learning models in real-world applications. Previous approaches either design better scoring functions or use knowledge of outliers to equip models with OOD detection ability. However, few of them examine the intrinsic OOD detection capability of a given model. In this work, we find that, across different settings, a model trained on in-distribution (ID) data has an intermediate training stage at which its OOD detection performance is higher than at its final stage, and we identify learning from atypical samples as one critical data-level cause. Based on these insights, we propose a novel method, Unleashing Mask, which aims to restore the OOD discriminative capability of a well-trained model using ID data. Our method uses a mask to identify the memorized atypical samples, and then fine-tunes the model, or prunes it with the introduced mask, to forget them. Extensive experiments and analysis demonstrate the effectiveness of our method. The code is available at: https://github.com/tmlr-group/Unleashing-Mask.
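Comparing an intermediate training stage with the final one presupposes some way to measure OOD detection performance. The following is a minimal sketch of one common way to do so, not the evaluation code of this work: it assumes the maximum softmax probability as the scoring function and AUROC as the metric, and the names `msp_score`, `auroc`, and `checkpoint_auroc` are hypothetical helpers introduced only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """Maximum softmax probability; higher means 'more in-distribution'.

    Assumption: MSP is used here only as a familiar baseline score.
    """
    return max(softmax(logits))


def auroc(id_scores, ood_scores):
    """Probability that a random ID sample scores above a random OOD sample.

    Ties count as 0.5. This pairwise form is O(n * m), fine for a sketch.
    """
    wins = 0.0
    for s_id in id_scores:
        for s_ood in ood_scores:
            if s_id > s_ood:
                wins += 1.0
            elif s_id == s_ood:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


def checkpoint_auroc(id_logits, ood_logits):
    """AUROC of one checkpoint, given its logits on ID and OOD inputs."""
    id_scores = [msp_score(z) for z in id_logits]
    ood_scores = [msp_score(z) for z in ood_logits]
    return auroc(id_scores, ood_scores)
```

Under these assumptions, calling `checkpoint_auroc` on the logits produced by each saved checkpoint and comparing the resulting values would show whether an intermediate stage separates ID from OOD inputs better than the final one.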