Spiking neural networks (SNNs) have gained traction in vision due to their energy efficiency, biological plausibility, and inherent temporal processing. Yet despite this temporal capacity, most progress concentrates on static image benchmarks, and SNNs still underperform artificial neural networks (ANNs) on dynamic video tasks. In this work, we diagnose a fundamental pass-band mismatch: standard spiking dynamics behave as a temporal low-pass filter that emphasizes static content while attenuating the motion-bearing bands, which is where task-relevant information concentrates in dynamic tasks. This phenomenon explains why SNNs can approach ANNs on static tasks yet fall behind on tasks that demand richer temporal understanding. To remedy this, we propose the Pass-Bands Optimizer (PBO), a plug-and-play module that optimizes the temporal pass-band toward task-relevant motion bands. PBO introduces only two learnable parameters and a lightweight consistency constraint that preserves semantics and boundaries; it incurs negligible computational overhead and requires no architectural changes. PBO deliberately suppresses static components that contribute little to discrimination, effectively high-pass filtering the stream so that spiking activity concentrates on motion-bearing content. On UCF101, PBO yields an improvement of over ten percentage points. On more complex multi-modal action recognition and weakly supervised video anomaly detection, PBO delivers consistent and significant gains, offering a new perspective for SNN-based video processing and understanding.
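The low-pass claim can be illustrated with a small numerical sketch. It assumes a linearized leaky membrane update, u[t] = beta*u[t-1] + (1-beta)*x[t], ignoring the spike threshold and reset, and contrasts it with a generic first-order temporal difference, y[t] = x[t] - alpha*x[t-1]. The function names, the values of `beta` and `alpha`, and the difference filter itself are illustrative assumptions; the abstract does not specify PBO's actual form, so this is not the authors' method.

```python
import numpy as np


def leaky_integrator_gain(omega, beta):
    """Magnitude response of u[t] = beta*u[t-1] + (1-beta)*x[t].

    Assumption: a linearized membrane with no threshold or reset.
    """
    return (1.0 - beta) / np.abs(1.0 - beta * np.exp(-1j * omega))


def first_difference_gain(omega, alpha):
    """Magnitude response of y[t] = x[t] - alpha*x[t-1].

    Assumption: a generic high-pass stand-in, not the PBO module.
    """
    return np.abs(1.0 - alpha * np.exp(-1j * omega))


# Normalized temporal frequencies from DC (0) to Nyquist (pi).
omega = np.linspace(0.0, np.pi, 5)
lp = leaky_integrator_gain(omega, beta=0.9)
hp = first_difference_gain(omega, alpha=0.9)

for w, a, b in zip(omega, lp, hp):
    print(f"omega={w:.2f}  leaky={a:.3f}  difference={b:.3f}")
```

Under these assumptions the leaky integrator passes the static (zero-frequency) component at unit gain and attenuates faster temporal variation, while the difference filter does the opposite, which is the kind of shift toward motion-bearing bands the abstract describes.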