Out-of-distribution (OOD) detection is critical to ensuring the reliability and safety of machine learning systems. For instance, in autonomous driving, we would like the driving system to issue an alert and hand control over to a human when it detects unusual scenes or objects that it never saw during training and for which it cannot make a safe decision. The term OOD detection first emerged in 2017 and has since received increasing attention from the research community, leading to a plethora of methods, ranging from classification-based to density-based to distance-based ones. Meanwhile, several other problems, including anomaly detection (AD), novelty detection (ND), open set recognition (OSR), and outlier detection (OD), are closely related to OOD detection in terms of motivation and methodology. Despite their common goals, these topics have developed in isolation, and their subtle differences in definition and problem setting often confuse readers and practitioners. In this survey, we first present a unified framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD. Under our framework, these five problems can be seen as special cases or sub-tasks, and are easier to distinguish. We then review each of these five areas by summarizing their recent technical developments, with a special focus on OOD detection methodologies. We conclude this survey with open challenges and potential research directions.