Handling out-of-distribution (OOD) samples has become a major concern in the real-world deployment of machine learning systems. This work explores the application of self-supervised contrastive learning to the simultaneous detection of two types of OOD samples: samples from unseen classes and adversarially perturbed samples. Because the distribution of such samples is not known in advance in practice, we do not assume access to OOD examples. We first show that similarity functions trained with contrastive learning can be combined with the maximum mean discrepancy (MMD) two-sample test to verify whether two independent sets of samples are drawn from the same distribution. Inspired by this approach, we introduce CADet (Contrastive Anomaly Detection), a method based on contrastive transformations that performs anomaly detection on single samples. CADet compares favorably to adversarial detection methods at detecting adversarially perturbed samples on ImageNet. At the same time, it achieves performance comparable to unseen-label detection methods on two challenging benchmarks: ImageNet-O and iNaturalist. CADet is fully self-supervised and requires neither labels for in-distribution samples nor access to OOD examples.
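To make the two-sample test mentioned above concrete, the following is a minimal sketch of an MMD permutation test over feature embeddings. It is an illustration under stated assumptions, not the procedure used in this work: the kernel is taken to be plain cosine similarity between embeddings (standing in for a similarity function learned with contrastive learning), the embeddings are assumed to be precomputed NumPy arrays, and the function names `cosine_similarity_matrix`, `mmd2_unbiased`, and `mmd_permutation_test` are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def mmd2_unbiased(k, n):
    """Unbiased MMD^2 estimate from a pooled kernel matrix k.

    The first n rows/columns of k belong to the first sample,
    the remaining ones to the second sample.
    """
    m = k.shape[0] - n
    kxx, kyy, kxy = k[:n, :n], k[n:, n:], k[:n, n:]
    # Within-sample terms exclude the diagonal (self-similarities).
    term_x = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_x + term_y - 2.0 * kxy.mean()


def mmd_permutation_test(x, y, n_permutations=500, seed=0):
    """Return (observed MMD^2, permutation p-value) for embeddings x and y."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y], axis=0)
    k = cosine_similarity_matrix(pooled, pooled)
    n = len(x)
    observed = mmd2_unbiased(k, n)
    count = 0
    for _ in range(n_permutations):
        # Shuffling the pooled samples simulates the null hypothesis
        # that both sets come from the same distribution.
        perm = rng.permutation(len(pooled))
        if mmd2_unbiased(k[np.ix_(perm, perm)], n) >= observed:
            count += 1
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value


if __name__ == "__main__":
    # Synthetic embeddings, used only to exercise the functions.
    rng = np.random.default_rng(1)
    reference = rng.normal(size=(40, 8))
    shifted = rng.normal(size=(40, 8)) + 2.0
    print(mmd_permutation_test(reference, shifted, n_permutations=200))
```

A small p-value suggests that the two sets are unlikely to come from the same distribution. Cosine similarity is a valid (positive semi-definite) kernel but not a characteristic one, so this sketch can miss some distribution shifts; it is meant only to show the shape of the test.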