Brain-inspired Spiking Neural Networks (SNNs) have attracted much attention due to their event-based computation and energy efficiency. However, the all-or-none nature of spikes has prevented direct training of SNNs for various applications. The surrogate gradient (SG) algorithm has recently enabled spiking neural networks to shine on neuromorphic hardware. However, introducing surrogate gradients causes SNNs to lose their original gradient sparsity, leading to potential performance loss. In this paper, we first analyze the problems of direct training with SGs and then propose Masked Surrogate Gradients (MSGs) to balance training effectiveness against gradient sparsity, thereby improving the generalization ability of SNNs. Moreover, we introduce a temporally weighted output (TWO) method to decode the network output, reinforcing the importance of correct timesteps. Extensive experiments on diverse network structures and datasets show that training with MSG and TWO surpasses state-of-the-art techniques.