Recent advances in weakly-supervised video anomaly detection have achieved remarkable performance by applying the multiple-instance learning paradigm on top of multimodal foundation models such as CLIP to highlight anomalous instances and classify their categories. However, these objectives tend to detect only the most salient response segments while neglecting the diverse normal patterns that should be separated from anomalies, and they are prone to category confusion among visually similar classes, leading to unsatisfactory fine-grained classification. We therefore propose a novel Disentangled Semantic Alignment Network (DSANet) that explicitly separates abnormal and normal features at both coarse-grained and fine-grained levels to enhance their distinguishability. At the coarse-grained level, we introduce a self-guided normality modeling branch that reconstructs the input video features under the guidance of learned normal prototypes, encouraging the model to exploit normality cues inherent in the video and thereby improving the temporal separation of normal patterns from anomalous events. At the fine-grained level, we present a decoupled contrastive semantic alignment mechanism, which first temporally decomposes each video into event-centric and background-centric components using frame-level anomaly scores and then applies visual-language contrastive learning to strengthen class-discriminative representations. Comprehensive experiments on two standard benchmarks, XD-Violence and UCF-Crime, demonstrate that DSANet outperforms existing state-of-the-art methods.
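The two branches described above can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the prototype-attention reconstruction, the use of reconstruction error as a normality signal, and the soft score-weighted temporal decomposition are all assumptions about plausible instantiations of the mechanisms the abstract names; all function and variable names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def reconstruct_with_prototypes(feats, prototypes):
    """Coarse-grained branch (sketch): reconstruct each frame feature as an
    attention-weighted mixture of learned normal prototypes. Frames that are
    poorly reconstructed (large error) deviate from normality.
    feats: (T, D) frame features; prototypes: (K, D) normal prototypes."""
    attn = softmax(feats @ prototypes.T / np.sqrt(feats.shape[1]))  # (T, K)
    recon = attn @ prototypes                                       # (T, D)
    frame_error = np.linalg.norm(feats - recon, axis=1)             # (T,)
    return recon, frame_error

def decompose_by_scores(feats, scores, eps=1e-8):
    """Fine-grained branch (sketch): soft temporal decomposition of a video
    into an event-centric and a background-centric feature, weighted by
    frame-level anomaly scores in [0, 1]."""
    w_evt = scores / (scores.sum() + eps)            # high-score frames
    w_bg = (1.0 - scores) / ((1.0 - scores).sum() + eps)  # low-score frames
    return w_evt @ feats, w_bg @ feats               # each (D,)
```

In the described method, the event-centric feature would then be aligned with class-name text embeddings (e.g., from CLIP's text encoder) via a contrastive loss, pulling each event toward its category description and away from confusable classes.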


