Detecting out-of-distribution (OOD) inputs is critical for safely deploying deep learning models in real-world scenarios. In recent years, many OOD detectors have been developed, and their benchmarking has even been standardized, e.g., by OpenOOD. The number of post-hoc detectors is growing fast; they offer a way to protect a pre-trained classifier against natural distribution shifts and are claimed to be ready for real-world use. However, their efficacy in handling adversarial examples has been neglected in the majority of studies. This paper investigates the adversarial robustness of 16 post-hoc detectors against several evasion attacks and discusses a roadmap towards adversarial defense in OOD detectors.