Detecting out-of-distribution (OOD) samples is important for deploying machine learning models in safety-critical applications such as autonomous driving and robot-assisted surgery. Existing research has mainly focused on unimodal scenarios on image data. However, real-world applications are inherently multimodal, which makes it essential to leverage information from multiple modalities to enhance the efficacy of OOD detection. To establish a foundation for more realistic Multimodal OOD Detection, we introduce the first-of-its-kind benchmark, MultiOOD, characterized by diverse dataset sizes and varying modality combinations. We first evaluate existing unimodal OOD detection algorithms on MultiOOD, observing that the mere inclusion of additional modalities yields substantial improvements. This underscores the importance of utilizing multiple modalities for OOD detection. Based on the observation of Modality Prediction Discrepancy between in-distribution (ID) and OOD data, and its strong correlation with OOD performance, we propose the Agree-to-Disagree (A2D) algorithm to encourage such discrepancy during training. Moreover, we introduce a novel outlier synthesis method, NP-Mix, which explores broader feature spaces by leveraging the information from nearest neighbor classes and complements A2D to strengthen OOD detection performance. Extensive experiments on MultiOOD demonstrate that training with A2D and NP-Mix improves existing OOD detection algorithms by a large margin. Our source code and MultiOOD benchmark are available at https://github.com/donghao51/MultiOOD.
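As a rough illustration of the Modality Prediction Discrepancy mentioned above, the sketch below measures how much two modalities disagree on each sample. The abstract does not define the measure, so the L1 distance between softmax outputs used here is an assumption. The function names, the choice of video and optical flow as modalities, and the toy logits are all hypothetical. This is not the A2D training objective, only a minimal way to quantify disagreement between two prediction heads.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def prediction_discrepancy(logits_a, logits_b):
    """Per-sample L1 distance between two modalities' class probabilities.

    Assumption: L1 distance is one plausible discrepancy measure; the
    abstract does not specify which one is used.
    """
    return np.abs(softmax(logits_a) - softmax(logits_b)).sum(axis=-1)


# Toy logits invented for illustration: 2 samples, 3 classes.
# Sample 0: both modalities agree. Sample 1: they favor different classes.
video_logits = np.array([[4.0, 0.0, 0.0], [2.0, 1.9, 0.1]])
flow_logits = np.array([[4.0, 0.0, 0.0], [0.1, 2.0, 1.9]])

scores = prediction_discrepancy(video_logits, flow_logits)
```

Under this measure a score of 0 means the two modalities predict identical class distributions, and the maximum possible value is 2. Which inputs produce larger scores depends on the trained models and is not something this sketch demonstrates.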