General-Purpose Multi-Modal OOD Detection Framework

Out-of-distribution (OOD) detection identifies test samples that differ from the training data, which is critical to ensuring the safety and reliability of machine learning (ML) systems. While many methods have been developed to detect uni-modal OOD samples, only a few address multi-modal OOD detection. Current contrastive learning-based methods primarily study multi-modal OOD detection in a single scenario, where both a given image and its corresponding textual description come from a new domain. However, real-world deployments of ML systems may face a wider range of anomaly scenarios caused by factors such as sensor faults, bad weather, and environmental changes. Hence, the goal of this work is to detect samples from multiple different OOD scenarios simultaneously and in a fine-grained manner. To reach this goal, we propose a general-purpose weakly-supervised OOD detection framework, called WOOD, that combines a binary classifier and a contrastive learning component to reap the benefits of both. To better separate the latent representations of in-distribution (ID) and OOD samples, we adopt the hinge loss to constrain their similarity. Furthermore, we develop a new scoring metric that integrates the predictions of the binary classifier and the contrastive learning component to identify OOD samples. We evaluate the proposed WOOD model on multiple real-world datasets, and the experimental results demonstrate that it outperforms state-of-the-art methods for multi-modal OOD detection. Importantly, our approach achieves high accuracy in OOD detection in three different OOD scenarios simultaneously. The source code will be made publicly available upon publication.
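
To make the two ingredients named above more concrete, the following is a minimal illustrative sketch, not the WOOD implementation. The abstract does not give the exact loss or scoring formula, so everything here is an assumption: the function names, the `margin` and `weight` parameters, the use of cosine similarity between image and text embeddings, and the weighted-sum form of the score are all hypothetical choices made only to show how a hinge-style constraint on similarity and a combined score could fit together.

```python
import numpy as np


def cosine_similarity(a, b):
    """Row-wise cosine similarity between two (n, d) embedding matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)


def hinge_similarity_loss(sim, is_ood, margin=0.5):
    """Hypothetical hinge-style loss on image-text similarity.

    ID pairs (is_ood == 0) are pushed toward similarity 1.
    OOD pairs (is_ood == 1) are penalized only while their
    similarity exceeds `margin` (an assumed hyperparameter).
    """
    id_loss = (1.0 - sim) * (1 - is_ood)
    ood_loss = np.maximum(0.0, sim - margin) * is_ood
    return float(np.mean(id_loss + ood_loss))


def combined_ood_score(sim, p_ood, weight=0.5):
    """Hypothetical combined score; higher means more likely OOD.

    `p_ood` is the binary classifier's predicted OOD probability.
    Cosine similarity in [-1, 1] is mapped to a dissimilarity in [0, 1]
    and mixed with `p_ood` by an assumed weighted sum.
    """
    dissimilarity = (1.0 - sim) / 2.0
    return weight * p_ood + (1.0 - weight) * dissimilarity


# Toy example: the first pair is aligned, the second pair is orthogonal.
image_emb = np.array([[1.0, 0.0], [1.0, 0.0]])
text_emb = np.array([[1.0, 0.0], [0.0, 1.0]])
sim = cosine_similarity(image_emb, text_emb)

labels = np.array([0, 1])          # weak labels: 0 = ID, 1 = OOD
p_ood = np.array([0.1, 0.9])       # made-up classifier outputs

loss = hinge_similarity_loss(sim, labels)
scores = combined_ood_score(sim, p_ood)
```

In this toy setup the mismatched pair receives the higher score, so thresholding `scores` would flag it as OOD. The actual framework may weight or combine the two signals differently.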
