Equilibrium Propagation (EP) is a biologically plausible local learning algorithm initially developed for convergent recurrent neural networks (RNNs), in which weight updates rely solely on the states of the connected neurons across two phases. The gradient estimates in EP have been shown to approximate the gradients computed by Backpropagation Through Time (BPTT) when an infinitesimally small nudge factor is used. This property makes EP a strong candidate for training Spiking Neural Networks (SNNs), which are commonly trained with BPTT. In the spiking domain, however, previous studies on EP have been limited to architectures with a few linear layers. In this work, we provide, for the first time, a formulation for training convolutional spiking convergent RNNs with EP, bridging the gap between spiking and non-spiking convergent RNNs. We show that for spiking convergent RNNs, a mismatch between max pooling and its inverse (unpooling) operation leads to inaccurate gradient estimates in EP. Replacing max pooling with average pooling resolves this issue and enables accurate gradient estimation for spiking convergent RNNs. We also highlight the memory efficiency of EP relative to BPTT. Among SNNs trained with EP, our experimental results set the state of the art on the MNIST and FashionMNIST datasets, with test errors of 0.97% and 8.89%, respectively; these results are comparable to those of convergent RNNs and of SNNs trained with BPTT. These findings underscore EP as an optimal choice for on-chip training and as a biologically plausible method for computing error gradients.
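The pooling mismatch can be made concrete with a minimal NumPy sketch (not from the paper; the toy inputs and helper names are illustrative). EP compares two equilibria, a free phase and a nudged phase, and the small perturbation between them can flip which element of a pooling window is the maximum. Max unpooling then routes the error signal through stale indices, whereas the adjoint of average pooling is a fixed linear map, independent of the input:

```python
import numpy as np

def avg_pool(x, k=2):
    # k x k average pooling; its adjoint spreads error uniformly over each
    # window and does not depend on the input, so EP's two phases agree.
    h, w = x.shape
    return x.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def max_pool_indices(x, k=2):
    # Return the argmax index inside each k x k window; max unpooling routes
    # the error signal back through exactly these indices.
    h, w = x.shape
    blocks = x.reshape(h // k, k, w // k, k).transpose(0, 2, 1, 3).reshape(-1, k * k)
    return blocks.argmax(axis=1)

# Two nearby states, standing in for EP's free vs. nudged phase of a spiking net.
x_free = np.array([[0.9, 1.0],
                   [0.2, 0.1]])
x_nudged = np.array([[1.0, 0.9],
                     [0.2, 0.1]])  # a small nudge flips the window's argmax

idx_free = max_pool_indices(x_free)
idx_nudged = max_pool_indices(x_nudged)
print(idx_free, idx_nudged)   # differing indices -> mismatched unpooling paths
print(avg_pool(x_free))       # avg pooling's backward map is the same in both phases
```

Because the argmax indices differ between the two phases, the max-unpooling path used in one phase no longer matches the forward path of the other, corrupting the EP gradient estimate; average pooling avoids this entirely since its forward and adjoint maps are state-independent.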