In computational neuroscience, fixed points of recurrent neural network models are commonly used to model neural responses to static or slowly changing stimuli. These applications raise the question of how to train the weights in a recurrent neural network to minimize a loss function evaluated on fixed points. A natural approach is to use gradient descent on the Euclidean space of synaptic weights. We show that this approach can lead to poor learning performance due, in part, to singularities that arise in the loss surface. We use a re-parameterization of the recurrent network model to derive two alternative learning rules that produce more robust learning dynamics. We show that these learning rules can be interpreted as steepest descent and gradient descent, respectively, under a non-Euclidean metric on the space of recurrent weights. Our results question the common, implicit assumption that learning in the brain should necessarily follow the negative Euclidean gradient of synaptic weights.
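As a rough illustration of why the Euclidean gradient can behave poorly, consider a minimal sketch under assumptions that are not stated in the abstract: a linear rate model whose fixed point satisfies r = W r + x, so that r = (I - W)^{-1} x, with a squared-error loss against a target y. The function names (`fixed_point`, `loss`, `euclidean_grad`) and the toy inputs are hypothetical and chosen only for this example. Under these assumptions the gradient with respect to W is (I - W)^{-T} (r - y) r^T, which grows without bound as I - W approaches a singular matrix.

```python
import numpy as np

def fixed_point(W, x):
    """Fixed point of the assumed linear model r = W r + x."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - W, x)

def loss(W, x, y):
    """Squared-error loss evaluated at the fixed point (an assumed loss)."""
    r = fixed_point(W, x)
    return 0.5 * np.sum((r - y) ** 2)

def euclidean_grad(W, x, y):
    """Gradient of the loss with respect to W in the ordinary Euclidean sense."""
    n = W.shape[0]
    r = fixed_point(W, x)
    # Solve (I - W)^T v = (r - y) rather than forming an explicit inverse.
    v = np.linalg.solve((np.eye(n) - W).T, r - y)
    return np.outer(v, r)

# Toy input and target, chosen only for illustration.
x = np.array([1.0, 0.5])
y = np.array([0.0, 0.0])

# As W = c * I approaches the identity, I - W approaches a singular matrix
# and the Euclidean gradient norm grows rapidly.
for c in (0.0, 0.9, 0.99):
    W = c * np.eye(2)
    g = euclidean_grad(W, x, y)
    print(f"c = {c:4.2f}   ||grad|| = {np.linalg.norm(g):.3e}")
```

In this toy setting the gradient norm scales like 1/(1 - c)^3, so a fixed learning rate that is stable far from the singularity can produce very large updates near it. This sketch only illustrates the kind of singularity the abstract refers to; it does not implement the re-parameterized learning rules themselves.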