Interleaving One-Class and Weakly-Supervised Models with Adaptive Thresholding for Unsupervised Video Anomaly Detection

Video Anomaly Detection (VAD) has been extensively studied under the One-Class Classification (OCC) and Weakly-Supervised (WS) settings, both of which, however, require laborious human annotation of normal/abnormal labels. In this paper, we study Unsupervised VAD (UVAD), which does not depend on any labels, by combining OCC and WS into a unified training framework. Specifically, we extend OCC to weighted OCC (wOCC) and propose a wOCC-WS interleaving training module in which the two models automatically generate pseudo-labels for each other. Two challenges must be addressed to make this combination effective: (1) the models' performance fluctuates occasionally during training because of the inevitable randomness of the pseudo-labels; (2) thresholds are needed to divide the pseudo-labels, which makes training depend on how accurately the user sets them. For the first problem, we use wOCC, which is trained with soft labels, instead of OCC, which is trained with hard zero/one labels, because soft labels remain highly consistent across training cycles whereas hard labels are prone to sudden changes. For the second problem, we repeat the interleaving training module multiple times and propose an adaptive thresholding strategy that progressively refines a rough threshold toward a relatively optimal one, which reduces the influence of user interaction. A further benefit of composing a UVAD method from OCC and WS methods is that the most recent OCC or WS model can be incorporated into our framework. Experiments demonstrate the effectiveness of the proposed UVAD framework.

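The interleaving loop described in the abstract can be illustrated with a minimal toy sketch. The abstract does not specify the wOCC and WS models or the threshold update rule, so everything below is an assumption made for illustration: frames are represented by scalar features, the wOCC model is a weighted mean, the WS model is a pair of class means, and the adaptive step is an ISODATA-style midpoint update substituted for the paper's strategy. All function names (`fit_wocc`, `fit_ws`, `refine_threshold`, `interleave`) are hypothetical.

```python
import random


def normalize(scores):
    """Min-max scale scores to [0, 1] so one threshold is comparable across cycles."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fit_wocc(features, weights):
    """Toy weighted one-class model (assumption): the weighted mean of the features."""
    return sum(w * x for w, x in zip(weights, features)) / sum(weights)


def wocc_scores(features, center):
    """Anomaly score: normalized distance to the weighted mean."""
    return normalize([abs(x - center) for x in features])


def fit_ws(features, labels):
    """Toy stand-in for a WS model (assumption): class means from hard pseudo-labels."""
    normal = [x for x, y in zip(features, labels) if y == 0]
    abnormal = [x for x, y in zip(features, labels) if y == 1]
    return sum(normal) / len(normal), sum(abnormal) / len(abnormal)


def ws_scores(features, model):
    """Anomaly score in [0, 1]: relative closeness to the abnormal class mean."""
    mu_n, mu_a = model
    return [abs(x - mu_n) / (abs(x - mu_n) + abs(x - mu_a) + 1e-12) for x in features]


def refine_threshold(scores, threshold):
    """One ISODATA-style step (assumption): move to the midpoint of the two group means."""
    low = [s for s in scores if s <= threshold]
    high = [s for s in scores if s > threshold]
    if not low or not high:
        return threshold  # degenerate split: keep the previous threshold
    return 0.5 * (sum(low) / len(low) + sum(high) / len(high))


def interleave(features, init_threshold=0.5, cycles=5):
    """Alternate the two toy models; each produces pseudo-labels for the other."""
    weights = [1.0] * len(features)  # first cycle: all frames weighted as equally normal
    threshold = init_threshold       # rough, user-supplied starting point
    scores = [0.0] * len(features)
    for _ in range(cycles):
        # wOCC step: trained with soft labels (weights), not hard 0/1 labels.
        scores = wocc_scores(features, fit_wocc(features, weights))
        # Refine the threshold, then divide wOCC scores into hard pseudo-labels for WS.
        threshold = refine_threshold(scores, threshold)
        labels = [int(s > threshold) for s in scores]
        if len(set(labels)) < 2:
            break  # cannot fit two class means from a single class
        # WS step: its scores become the soft labels for the next wOCC cycle.
        ws = ws_scores(features, fit_ws(features, labels))
        weights = [1.0 - s for s in normalize(ws)]
    return scores, threshold


# Synthetic data (assumption): 200 "normal" frames followed by 20 "abnormal" frames.
rng = random.Random(0)
features = [rng.gauss(0.0, 1.0) for _ in range(200)]
features += [rng.gauss(8.0, 0.5) for _ in range(20)]
scores, threshold = interleave(features, init_threshold=0.5)
print("refined threshold:", round(threshold, 3))
print("frames flagged abnormal:", sum(s > threshold for s in scores))
```

In this sketch the soft weights change gradually between cycles, whereas the hard labels can flip whenever a score crosses the threshold, which is the motivation the abstract gives for using wOCC. Either toy model could be swapped for a real OCC or WS model without changing the loop.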