Weakly supervised video anomaly detection aims to detect anomalies and identify their categories using only video-level labels. We propose CPL-VAD, a dual-branch framework with cross pseudo labeling. The binary anomaly detection branch focuses on snippet-level anomaly localization, while the category classification branch leverages vision-language alignment to recognize abnormal event categories. By exchanging pseudo labels, the two branches transfer their complementary strengths to each other, combining temporal precision with semantic discrimination. Experiments on XD-Violence and UCF-Crime demonstrate that CPL-VAD achieves state-of-the-art performance in both anomaly detection and abnormal category classification.
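The exchange of pseudo labels between the two branches can be illustrated with a minimal sketch. The abstract does not specify how the pseudo labels are generated, so everything below is an assumption made for illustration: the function names, the fixed threshold of 0.5, the convention that class index 0 means "normal", and the rule of gating by the video-level label are all hypothetical and are not taken from CPL-VAD itself.

```python
from typing import List

NORMAL = 0  # hypothetical index of the "normal" class


def pseudo_labels_for_classifier(
    anomaly_scores: List[float], video_label: int, threshold: float = 0.5
) -> List[int]:
    """Detection branch -> classification branch (assumed rule).

    Snippets that the binary branch scores at or above the threshold
    inherit the video-level category; all others are marked normal.
    """
    if video_label == NORMAL:
        return [NORMAL] * len(anomaly_scores)
    return [video_label if s >= threshold else NORMAL for s in anomaly_scores]


def pseudo_labels_for_detector(
    class_probs: List[List[float]], video_label: int, threshold: float = 0.5
) -> List[int]:
    """Classification branch -> detection branch (assumed rule).

    A snippet is marked abnormal (1) when the category branch assigns
    the video-level category a probability at or above the threshold.
    """
    if video_label == NORMAL:
        return [0] * len(class_probs)
    return [1 if p[video_label] >= threshold else 0 for p in class_probs]


# Toy video with 4 snippets and video-level label 2 (an abnormal category).
scores = [0.1, 0.8, 0.9, 0.2]  # binary branch output per snippet
probs = [                       # category branch output per snippet
    [0.90, 0.05, 0.05],
    [0.20, 0.10, 0.70],
    [0.50, 0.10, 0.40],
    [0.80, 0.10, 0.10],
]

to_classifier = pseudo_labels_for_classifier(scores, video_label=2)
to_detector = pseudo_labels_for_detector(probs, video_label=2)
print(to_classifier)  # [0, 2, 2, 0]
print(to_detector)    # [0, 1, 0, 0]
```

In this toy case the two branches disagree on the third snippet, which is the kind of situation where each branch's pseudo labels could carry information the other lacks. In a training loop, each list would serve as the snippet-level target for the opposite branch.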