As the third generation of neural networks, Spiking Neural Networks (SNNs) offer low power consumption and high energy efficiency, which makes them well suited to edge devices. Spikformer, a recent state-of-the-art SNN, combines the Transformer self-attention module with SNNs and achieves remarkable performance. However, it uses large channel dimensions in its MLP layers, which introduces many redundant parameters. To reduce the computational complexity and the number of weight parameters, we explore the Lottery Ticket Hypothesis (LTH) and find a very sparse ($\ge$90%) subnetwork that achieves performance comparable to the original network. We also design a lightweight token selector module that removes unimportant background information from images based on the average spike firing rate of neurons, so that only essential foreground image tokens participate in the attention calculation. Building on these two components, we present SparseSpikformer, a co-design framework that sparsifies Spikformer through token and weight pruning. Experimental results show that our framework reduces model parameters by 90% and cuts Giga Floating-Point Operations (GFLOPs) by 20% while maintaining the accuracy of the original model.
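To make the token-selection idea concrete, the following is a minimal sketch of ranking image tokens by their average spike firing rate and keeping only the most active ones. It is an illustration under stated assumptions, not the selector module described in the paper: the function name `select_tokens_by_firing_rate`, the `keep_ratio` parameter, the fixed top-k rule, and the nested-list spike layout `spikes[t][n][c]` are all hypothetical.

```python
from typing import List, Tuple


def select_tokens_by_firing_rate(
    spikes: List[List[List[int]]], keep_ratio: float = 0.5
) -> Tuple[List[int], List[float]]:
    """Rank tokens by average firing rate and keep the most active ones.

    spikes[t][n][c] is a binary spike (0 or 1) at time step t, token n,
    channel c. This layout and the top-k rule are assumptions made for
    illustration only.
    """
    num_steps = len(spikes)
    num_tokens = len(spikes[0])
    num_channels = len(spikes[0][0])

    # Average firing rate of each token over all time steps and channels.
    rates = []
    for n in range(num_tokens):
        total = sum(
            spikes[t][n][c]
            for t in range(num_steps)
            for c in range(num_channels)
        )
        rates.append(total / (num_steps * num_channels))

    # Keep the highest-rate tokens; always keep at least one.
    num_keep = max(1, int(keep_ratio * num_tokens))
    ranked = sorted(range(num_tokens), key=lambda n: rates[n], reverse=True)
    kept = sorted(ranked[:num_keep])  # restore spatial order for attention
    return kept, rates


# Toy input: 2 time steps, 4 tokens, 2 channels.
example_spikes = [
    [[1, 1], [0, 0], [1, 0], [0, 0]],
    [[1, 0], [0, 0], [1, 1], [0, 1]],
]
kept_tokens, token_rates = select_tokens_by_firing_rate(example_spikes, 0.5)
```

In this toy input, tokens 0 and 2 fire most often and are the ones retained, while the silent token 1 is treated as background. Only the retained indices would then be passed to the attention computation.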
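The weight-pruning side can be illustrated with iterative magnitude pruning, the procedure commonly used to search for lottery tickets: train, prune the smallest-magnitude surviving weights, rewind the survivors to their initial values, and repeat. The sketch below is a simplified, assumption-laden version over a flat list of weights. The names `magnitude_mask` and `find_ticket`, the `train` callback, and the per-round `prune_fraction` are hypothetical, and the abstract does not state the exact pruning schedule used for SparseSpikformer.

```python
from typing import Callable, List


def magnitude_mask(
    weights: List[float], mask: List[int], prune_fraction: float
) -> List[int]:
    """Zero out the given fraction of still-active weights with smallest |w|."""
    active = [i for i, m in enumerate(mask) if m]
    num_prune = int(prune_fraction * len(active))
    by_magnitude = sorted(active, key=lambda i: abs(weights[i]))
    new_mask = list(mask)
    for i in by_magnitude[:num_prune]:
        new_mask[i] = 0
    return new_mask


def find_ticket(
    init_weights: List[float],
    train: Callable[[List[float], List[int]], List[float]],
    rounds: int,
    prune_fraction: float,
) -> List[int]:
    """Iterative magnitude pruning with rewinding to the initial weights.

    `train` is a hypothetical callback that takes masked starting weights
    and the current mask and returns trained weights.
    """
    mask = [1] * len(init_weights)
    for _ in range(rounds):
        # Rewind: surviving weights restart from their initial values.
        start = [w * m for w, m in zip(init_weights, mask)]
        trained = train(start, mask)
        mask = magnitude_mask(trained, mask, prune_fraction)
    return mask


def sparsity(mask: List[int]) -> float:
    """Fraction of weights removed by the mask."""
    return 1.0 - sum(mask) / len(mask)
```

Because each round removes a fixed fraction of the remaining weights, sparsity compounds: pruning 20% per round leaves about $0.8^{k}$ of the weights after $k$ rounds, so roughly eleven rounds are needed to pass 90% sparsity under that assumed schedule.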