Interest in spiking neural networks (SNNs) has been growing steadily because they promise an energy-efficient alternative to formal neural networks (FNNs), commonly known as artificial neural networks (ANNs). Despite this interest, especially for edge applications, these event-driven networks have been harder to train than FNNs. To alleviate this problem, a number of innovative methods have been developed that bring SNN performance roughly on par with that of FNNs. However, the spiking activity of a network during inference is usually not considered. While SNNs can often match FNN performance, this frequently comes at the cost of increased network activity, which limits their benefit as a more energy-efficient solution. In this paper, we propose to leverage Knowledge Distillation (KD) when training SNNs with surrogate gradient descent in order to optimize the trade-off between performance and spiking activity. Then, after analyzing why KD leads to an increase in sparsity, we explore activation regularization and propose a novel method based on logits regularization. These approaches, validated on several datasets, clearly reduce network spiking activity (-26.73% on GSC and -14.32% on CIFAR-10) while preserving accuracy.
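To make the kind of objective described in the abstract concrete, the following is a minimal sketch of a distillation loss combined with two sparsity-oriented penalties. It is illustrative only and is not the authors' implementation: the function names, the weights `alpha`, `lam_act` and `lam_logit`, the temperature value, the mean firing rate used as the activation penalty, and the mean squared logit used as the logits penalty are all assumptions, since the abstract does not specify the exact form of either regularizer. Surrogate gradient training itself is not shown; only the scalar loss is computed.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of a list of logits at a given temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hinton-style KD loss: cross-entropy on the label plus a
    temperature-softened KL(teacher || student) term."""
    # Hard-label cross-entropy on the student's unsoftened prediction.
    ce = -math.log(softmax(student_logits)[label])
    # Softened distributions for the teacher (FNN) and the student (SNN).
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(q_teacher, q_student) if t > 0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return alpha * ce + (1.0 - alpha) * temperature ** 2 * kl


def spike_rate(spike_trains):
    """Mean firing rate over all neurons and time steps (binary spikes)."""
    spikes = sum(sum(train) for train in spike_trains)
    slots = sum(len(train) for train in spike_trains)
    return spikes / slots


def total_loss(student_logits, teacher_logits, label, spike_trains,
               lam_act=0.1, lam_logit=0.01):
    """KD loss plus two assumed penalties (hypothetical forms):
    mean spike rate, and mean squared output logit."""
    kd = distillation_loss(student_logits, teacher_logits, label)
    act_penalty = spike_rate(spike_trains)
    logit_penalty = sum(z * z for z in student_logits) / len(student_logits)
    return kd + lam_act * act_penalty + lam_logit * logit_penalty
```

With identical student and teacher logits the KL term vanishes, and for a fixed prediction the total loss grows with the number of emitted spikes, which is the trade-off between accuracy and spiking activity that the abstract refers to.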