Spiking Neural Networks (SNNs) offer energy-efficient computation, but their deployment is constrained by dense connectivity and high spiking-operation costs. Existing magnitude-based pruning strategies, when naively applied to SNNs, fail to account for temporal accumulation, non-uniform timestep contributions, and membrane-potential stability, often leading to severe performance degradation. This paper proposes Spiking Layer-Adaptive Magnitude-based Pruning (SLAMP), a theory-guided pruning framework that generalizes layer-adaptive magnitude pruning to temporal SNNs by explicitly controlling worst-case output distortion across layers and timesteps. SLAMP formulates sparsity allocation as a temporal distortion-constrained optimization problem, yielding time-aware layer importance scores that reduce to conventional layer-adaptive pruning in the single-timestep limit. An efficient two-stage procedure is derived: temporal score estimation with global sparsity allocation, followed by magnitude pruning and retraining for stability recovery. Experiments on CIFAR-10, CIFAR-100, and the event-based CIFAR10-DVS datasets demonstrate that SLAMP achieves substantial reductions in connectivity and spiking operations while preserving accuracy, enabling efficient and deployable SNN inference.
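To make the two-stage flow concrete, the following is a minimal PyTorch sketch. It reuses the published LAMP score (each squared weight magnitude normalized by the sum of all no-smaller squared magnitudes in the same layer) and scales it by a per-layer temporal weighting factor before a single global top-k selection. The temporal factor `temporal_gamma` and all helper names are illustrative assumptions for exposition, not SLAMP's actual scoring rule.

```python
import torch

def lamp_scores(weight: torch.Tensor) -> torch.Tensor:
    """LAMP score (Lee et al., 2021): squared magnitude normalized by the
    sum of all no-smaller squared magnitudes within the same layer."""
    w2 = weight.flatten().pow(2)
    sorted_w2, order = torch.sort(w2, descending=True)
    denom = torch.cumsum(sorted_w2, dim=0)  # sum over weights >= each weight
    scores = torch.empty_like(w2)
    scores[order] = sorted_w2 / denom       # scatter back to original order
    return scores.view_as(weight)

def global_prune(weights: dict, temporal_gamma: dict, sparsity: float) -> dict:
    """Stage 1: score every weight with a time-aware LAMP-style score
    (hypothetical temporal weighting); Stage 2: keep the globally
    top-scoring fraction (1 - sparsity). Assumes 0 <= sparsity < 1."""
    scores = {name: lamp_scores(w) * temporal_gamma[name]
              for name, w in weights.items()}
    flat = torch.cat([s.flatten() for s in scores.values()])
    k = int((1.0 - sparsity) * flat.numel())
    threshold = torch.topk(flat, k).values.min()
    return {name: (s >= threshold).float() for name, s in scores.items()}
```

In this sketch, `temporal_gamma` stands in for the output of the temporal score estimation stage (e.g., a per-layer sensitivity accumulated over timesteps); with all factors equal to one, the procedure collapses to ordinary layer-adaptive magnitude pruning, mirroring the single-timestep limit noted above. The retraining step for stability recovery is omitted.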