Spiking Neural Networks (SNNs) offer a promising and energy-efficient alternative to conventional neural networks, owing to their sparse, binary activations. However, they incur substantial memory and computation overhead due to their complex spatio-temporal dynamics and the need to backpropagate across multiple timesteps during training. To mitigate this overhead, compression techniques such as quantization are applied to SNNs. Yet, naively applying quantization to SNNs introduces a mismatch in the membrane potential, the quantity that governs spike firing, resulting in accuracy degradation. In this paper, we introduce Membrane-aware Distillation on quantized Spiking Neural Network (MD-SNN), which leverages the membrane potential to mitigate the discrepancies introduced by weight, membrane potential, and batch normalization quantization. To our knowledge, this study represents the first application of membrane potential knowledge distillation in SNNs. We validate our approach on various datasets, including CIFAR10, CIFAR100, N-Caltech101, and TinyImageNet, demonstrating its effectiveness for both static and dynamic data. Furthermore, to assess hardware efficiency, we evaluate MD-SNN on the SpikeSim platform, finding that MD-SNNs achieve 14.85X lower energy-delay-area product (EDAP), 2.64X higher TOPS/W, and 6.19X higher TOPS/mm² compared to floating-point SNNs at iso-accuracy on the N-Caltech101 dataset.
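To make the distillation idea concrete, the following is a minimal sketch, assuming a full-precision teacher SNN and a quantized student SNN that both expose their per-timestep membrane potentials. The loss combines a task term, a logit-distillation term, and a membrane-potential matching term; all names (`md_snn_loss`, `alpha`, `beta`, `tau`) are illustrative placeholders, not the paper's actual formulation or hyperparameters.

```python
import torch
import torch.nn.functional as F


def md_snn_loss(student_logits, teacher_logits,
                student_potentials, teacher_potentials,
                labels, alpha=0.5, beta=0.5, tau=4.0):
    """Hypothetical membrane-aware distillation loss for a quantized student SNN."""
    # Standard cross-entropy on the student's (time-averaged) output.
    ce = F.cross_entropy(student_logits, labels)

    # Soft-label distillation between teacher and student logits.
    kd = F.kl_div(
        F.log_softmax(student_logits / tau, dim=1),
        F.softmax(teacher_logits / tau, dim=1),
        reduction="batchmean",
    ) * (tau ** 2)

    # Membrane-potential matching: MSE over the collected potentials
    # (e.g., one tensor per layer/timestep) to reduce the mismatch
    # introduced by quantization.
    mp = sum(
        F.mse_loss(u_s, u_t.detach())
        for u_s, u_t in zip(student_potentials, teacher_potentials)
    ) / len(student_potentials)

    return ce + alpha * kd + beta * mp
```

In this sketch, the teacher's potentials are detached so only the student receives gradients; the relative weighting of the three terms would in practice be tuned per dataset.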