Deploying machine learning solutions in real-world scenarios often requires addressing out-of-distribution (OOD) detection. While significant effort has been devoted to OOD detection in classical supervised settings, weakly supervised learning, and the Multiple Instance Learning (MIL) framework in particular, remains under-explored. In this study, we tackle this challenge by adapting post-hoc OOD detection methods to the MIL setting and by introducing a novel benchmark specifically designed to assess OOD detection performance in weakly supervised scenarios. Across extensive experiments on diverse public datasets, KNN emerges as the best-performing method overall. However, it exhibits significant shortcomings on some datasets, which highlights the difficulty of the problem. Our findings shed light on the complex nature of OOD detection under the MIL framework and underline the importance of developing novel, robust, and reliable methods that generalize effectively in a weakly supervised context. The code for the paper is available at https://github.com/loic-lb/OOD_MIL.