In computational neuroscience, fixed points of recurrent neural networks are commonly used to model neural responses to static or slowly changing stimuli. These applications raise the question of how to train the weights in a recurrent neural network to minimize a loss function evaluated on fixed points. A natural approach is to use gradient descent on the Euclidean space of synaptic weights. We show that this approach can lead to poor learning performance due, in part, to singularities that arise in the loss surface. We use a reparameterization of the recurrent network model to derive two alternative learning rules that produce more robust learning dynamics. We show that these learning rules can be interpreted as steepest descent and gradient descent, respectively, under a non-Euclidean metric on the space of recurrent weights. Our results question the common, implicit assumption that learning in the brain should be expected to follow the negative Euclidean gradient of synaptic weights.
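
To make the setting concrete, the following is a minimal illustrative sketch and not the model or learning rules of this work. It assumes a linear rate network whose fixed point satisfies r = W r + x, so that r* = (I - W)^{-1} x, together with a squared-error loss against a target y. The function names `fixed_point` and `loss_and_grad` and the choice of loss are assumptions made for illustration. Under these assumptions, the Euclidean gradient with respect to W follows from implicit differentiation, and it grows without bound as I - W approaches a singular matrix, which is one simple way a singularity can appear in a loss evaluated on fixed points.

```python
import numpy as np

def fixed_point(W, x):
    """Fixed point r* of the assumed linear model r = W r + x."""
    n = W.shape[0]
    # Solve (I - W) r = x; this fails or becomes ill-conditioned
    # when I - W is (nearly) singular.
    return np.linalg.solve(np.eye(n) - W, x)

def loss_and_grad(W, x, y):
    """Squared-error loss at the fixed point and its Euclidean gradient in W."""
    n = W.shape[0]
    r = fixed_point(W, x)
    err = r - y
    loss = 0.5 * float(err @ err)
    # Implicit differentiation: dL/dW = (I - W)^{-T} (r* - y) r*^T
    adj = np.linalg.solve((np.eye(n) - W).T, err)
    grad = np.outer(adj, r)
    return loss, grad

if __name__ == "__main__":
    # One-neuron example: the gradient magnitude grows sharply as w -> 1.
    x, y = np.array([1.0]), np.array([0.0])
    for w in (0.5, 0.9, 0.99):
        loss, grad = loss_and_grad(np.array([[w]]), x, y)
        print(w, loss, grad[0, 0])
```

In this one-neuron case the fixed point is x / (1 - w), so both the loss and the gradient diverge as w approaches 1. A plain gradient step on w taken near that point can therefore be arbitrarily large.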