Mixture of Experts Guided by Gaussian Splatters Matters: A New Approach to Weakly-Supervised Video Anomaly Detection

Video Anomaly Detection (VAD) is a challenging task due to the variability of anomalous events and the limited availability of labeled data. Under the Weakly-Supervised VAD (WSVAD) paradigm, only video-level labels are provided during training, while predictions are made at the frame level. Although state-of-the-art models perform well on simple anomalies (e.g., explosions), they struggle with complex real-world events (e.g., shoplifting). This difficulty stems from two key issues: (1) the inability of current models to address the diversity of anomaly types, as they process all categories with a shared model, overlooking category-specific features; and (2) the weak supervision signal, which lacks precise temporal information, limiting the ability to capture nuanced anomalous patterns blended with normal events. To address these challenges, we propose Gaussian Splatting-guided Mixture of Experts (GS-MoE), a novel framework that employs a set of expert models, each specialized in capturing specific anomaly types. These experts are guided by a temporal Gaussian splatting loss, enabling the model to leverage temporal consistency and enhance weak supervision. The Gaussian splatting approach encourages a more precise and comprehensive representation of anomalies by focusing on temporal segments most likely to contain abnormal events. The predictions from these specialized experts are integrated through a mixture-of-experts mechanism to model complex relationships across diverse anomaly patterns. Our approach achieves state-of-the-art performance, with a 91.58% AUC on the UCF-Crime dataset, and demonstrates superior results on XD-Violence and MSAD datasets. By leveraging category-specific expertise and temporal guidance, GS-MoE sets a new benchmark for VAD under weak supervision.
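The two ideas in the abstract, temporal Gaussian guidance and expert mixing, can be illustrated with a small sketch. The abstract does not give the kernel form, the peak-selection rule, the loss, or the gating mechanism, so everything below is an assumption made for illustration. The names `find_peaks`, `gaussian_targets`, `mix_experts`, `sigma`, and `threshold` are hypothetical, and this is not the authors' implementation. The sketch places a Gaussian bump at each high-scoring local maximum of a per-snippet anomaly score sequence to form a smooth temporal pseudo-target, and combines per-expert scores with softmax gate weights.

```python
import math


def find_peaks(scores, threshold=0.5):
    """Return indices of local maxima (ties count) scoring at least `threshold`."""
    peaks = []
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i < len(scores) - 1 else float("-inf")
        if s >= threshold and s >= left and s >= right:
            peaks.append(i)
    return peaks


def gaussian_targets(scores, sigma=1.5, threshold=0.5):
    """Assumed pseudo-target: the max over Gaussian bumps centred on each peak."""
    peaks = find_peaks(scores, threshold)
    targets = []
    for t in range(len(scores)):
        bumps = [math.exp(-((t - p) ** 2) / (2 * sigma ** 2)) for p in peaks]
        targets.append(max(bumps) if bumps else 0.0)
    return targets


def mse(pred, target):
    """Mean squared error, used here as a stand-in for the guidance loss."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def mix_experts(expert_scores, gate_logits):
    """Combine per-expert score sequences with softmax gate weights (assumed gating)."""
    weights = softmax(gate_logits)
    n = len(expert_scores[0])
    return [sum(w * e[t] for w, e in zip(weights, expert_scores)) for t in range(n)]


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.9, 0.3, 0.1]          # toy per-snippet anomaly scores
    targets = gaussian_targets(scores)           # smooth pseudo-labels around the peak
    print("targets:", [round(v, 3) for v in targets])
    print("guidance loss:", round(mse(scores, targets), 4))
    experts = [scores, [0.0, 0.1, 0.8, 0.6, 0.2]]  # two hypothetical class experts
    print("mixed:", [round(v, 3) for v in mix_experts(experts, [0.0, 1.0])])
```

In a trained model, the gate logits and expert scores would come from learned networks rather than fixed lists. The plain-Python version only shows how a video-level label can be turned into a temporally localized target.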

