Spiking Neural Networks (SNNs) have emerged as a new generation of energy-efficient neural networks suitable for implementation on neuromorphic hardware. Because neuromorphic hardware has limited memory and computational resources, parameter pruning and quantization have recently been explored to improve the efficiency of SNNs. State-of-the-art SNN pruning/quantization methods employ multiple compression and training iterations, which makes them costly to apply to pre-trained or very large SNNs. In this paper, we propose a novel one-shot post-training compression framework, Spiking Brain Compression (SBC), which extends the classical Optimal Brain Surgeon method to SNNs. SBC replaces the current-based objective used in common layer-wise compression methods with a spike-train-based objective whose Hessian is cheap to compute, allowing a single backward pass to compress parameters and analytically rescale the rest. Applying SBC to SNN pruning and quantization across event-based and static datasets (up to ImageNet), including SEW-ResNet152 and spike-driven Transformers, we achieve state-of-the-art one-shot post-training compression for SNNs, with single- to double-digit accuracy gains over ANN compression baselines ported to SNNs. We further report a synaptic-operation-based energy proxy and a calibration-size ablation, demonstrating robust performance with fewer than one calibration sample per class.
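As background, the following is a minimal sketch of the quantities the abstract alludes to, in notation we introduce for illustration rather than the paper's own. OBC-style layer-wise compression minimizes a current-based reconstruction error per layer,
\[
  \min_{\widehat{W}} \,\bigl\lVert W X - \widehat{W} X \bigr\rVert_F^2,
  \qquad H = 2\, X X^{\top},
\]
whereas SBC, per the abstract, scores weights against spike trains; assuming binary spike matrices $S_t$ over $T$ time steps (our assumption, not the paper's definition), the analogous objective and its cheaply computable Hessian would read
\[
  \min_{\widehat{W}} \,\sum_{t=1}^{T} \bigl\lVert W S_t - \widehat{W} S_t \bigr\rVert_F^2,
  \qquad H = 2 \sum_{t=1}^{T} S_t S_t^{\top}.
\]
The classical Optimal Brain Surgeon step that SBC extends then removes the weight $w_q$ with the smallest saliency and analytically rescales the remaining weights (standard formulas from the OBS literature):
\[
  L_q = \frac{w_q^2}{2\,[H^{-1}]_{qq}},
  \qquad
  \delta w = -\,\frac{w_q}{[H^{-1}]_{qq}}\, H^{-1} e_q .
\]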