Video anomaly detection (VAD) aims to discover behaviors or events in videos that deviate from normality. As a long-standing task in computer vision, VAD has seen considerable progress. In the era of deep learning, as architectures of continuously growing capability and capacity have proliferated, a great variety of deep learning based methods have emerged for the VAD task, greatly improving the generalization ability of detection algorithms and broadening their application scenarios. This multitude of methods and large body of literature make a comprehensive survey a pressing necessity. In this paper, we present an extensive and comprehensive research review covering five categories, namely semi-supervised, weakly supervised, fully supervised, unsupervised, and open-set supervised VAD. We also delve into the latest VAD works based on pre-trained large models, remedying a limitation of past reviews, which focus only on semi-supervised VAD and small-model-based methods. For the VAD task under different levels of supervision, we construct a well-organized taxonomy, discuss in depth the characteristics of the different types of methods, and compare their performance. In addition, this review covers the public datasets, open-source code, and evaluation metrics for all of the aforementioned VAD tasks. Finally, we provide several important research directions for the VAD community.