Detecting violence and weaponized violence in closed-circuit television (CCTV) footage requires a comprehensive approach. In this work, we introduce the \emph{Smart-City CCTV Violence Detection (SCVD)} dataset, specifically designed to facilitate learning the distribution of weapons in surveillance videos. To tackle the complexity of analyzing 3D surveillance video for violence recognition, we propose a novel technique called \emph{SSIVD-Net} (\textbf{S}alient-\textbf{S}uper-\textbf{I}mage for \textbf{V}iolence \textbf{D}etection). Our method reduces the complexity and dimensionality of 3D video data and limits information loss, while improving inference, performance, and explainability through salient-super-image representations. Considering the scalability and sustainability requirements of future smart cities, we also introduce the \emph{Salient-Classifier}, a novel architecture that combines a kernelized approach with a residual learning strategy. We evaluate variations of SSIVD-Net and the Salient-Classifier on our SCVD dataset and benchmark them against state-of-the-art (SOTA) models commonly employed in violence detection. Our approach exhibits significant improvements in detecting both weaponized and non-weaponized violence. By advancing the SOTA in violence detection, our work offers a practical and scalable solution suitable for real-world applications. The proposed methodology not only addresses the challenges of violence detection in CCTV footage but also contributes to the understanding of weapon distribution in smart surveillance. Ultimately, our findings should help enable smarter and more secure cities and enhance public safety measures.