Neuromorphic computing has recently gained momentum with the emergence of various neuromorphic processors. As the field advances, there is an increasing focus on developing training methods that can effectively leverage the unique properties of spiking neural networks (SNNs). SNNs emulate the temporal dynamics of biological neurons, making them particularly well-suited for real-time, event-driven processing. To fully harness the potential of SNNs across different neuromorphic platforms, effective training methodologies are essential. In SNNs, learning rules are based on neurons' spiking behavior: whether and when a neuron spikes, which occurs when its membrane potential exceeds its spiking threshold, and this spike timing encodes vital information. However, the threshold is generally treated as a hyperparameter, and an incorrect choice can leave neurons silent for large portions of the training process, hindering effective learning. This work focuses on the significance of learning neuron thresholds alongside weights in SNNs. Our results suggest that promoting the threshold from a hyperparameter to a trainable parameter effectively addresses the issue of dead neurons during training. This leads to a more robust training algorithm with improved convergence, higher test accuracy, and a substantial reduction in the number of training epochs required to reach viable accuracy on spatiotemporal datasets such as NMNIST, DVS128, and Spiking Heidelberg Digits (SHD), with up to 30% training speed-up and up to 2% higher accuracy on these datasets.
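To make the dead-neuron problem concrete, the following is a minimal sketch of a leaky integrate-and-fire (LIF) neuron, not the paper's exact model: the function name `lif_run`, the decay constant, and the reset-to-zero rule are all illustrative assumptions. It shows how a well-chosen threshold yields regular spiking, while an overly high threshold silences the neuron entirely, so no spike-timing information reaches the learning rule.

```python
# Illustrative LIF sketch (assumed dynamics; names and constants are hypothetical,
# not taken from the paper).
def lif_run(inputs, threshold, decay=0.9):
    """Simulate a leaky integrate-and-fire neuron; return the spike times."""
    v = 0.0
    spikes = []
    for t, x in enumerate(inputs):
        v = decay * v + x      # leaky integration of the input current
        if v >= threshold:     # spike when membrane potential crosses threshold
            spikes.append(t)
            v = 0.0            # reset membrane potential after a spike
    return spikes

inputs = [0.5] * 20
print(lif_run(inputs, threshold=1.0))   # fires regularly: timing carries information
print(lif_run(inputs, threshold=10.0))  # "dead" neuron: never spikes, nothing to learn from
```

With constant input 0.5 and decay 0.9, the membrane potential saturates near 0.5 / (1 - 0.9) = 5, so any fixed threshold above 5 produces a permanently silent neuron; treating the threshold as a trainable parameter lets gradient updates pull it back into the responsive range.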