Backpropagation (BP) has been pivotal in advancing machine learning and remains essential in computational applications and in comparative studies of biological and artificial neural networks. Despite its widespread use, how BP could be implemented in the brain remains elusive, and its biological plausibility is often questioned because of inherent issues such as the need for symmetric weights between forward and backward connections and the requirement for distinct forward and backward phases of computation. Here, we introduce a novel neuroplasticity rule that offers a potential mechanism for implementing BP in the brain. Similar in general form to the classical Hebbian rule, this rule rests on two core principles, maintaining the balance of excitatory and inhibitory (E-I) inputs and retrograde signaling, and operates over three progressively slower timescales: neural firing, retrograde signaling, and neural plasticity. We hypothesize that each neuron possesses an internal state, termed credit, in addition to its firing rate. Once firing rates reach equilibrium, neurons receive credits through retrograde signaling, based on their contribution to the E-I balance of their postsynaptic neurons. As the network's credit distribution stabilizes, connections from presynaptic neurons that contribute significantly to the balance of postsynaptic neurons are strengthened. We demonstrate mathematically that our learning rule exactly replicates BP in layered neural networks, without any approximation. Simulations on artificial neural networks reveal that this rule induces different community structures in networks, depending on the learning rate. This simple theoretical framework presents a biologically plausible implementation of BP, with testable assumptions and predictions that may be evaluated through biological experiments.
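The three-timescale structure described above can be illustrated with a minimal sketch. It is not the rule proposed here: the abstract does not give the E-I balance or retrograde-signaling equations, so the credit computation below is simply the standard BP error signal for a two-layer network with tanh hidden units, a linear output, and squared-error loss. All function names (`forward`, `credits`, `plasticity`, `train_step`, `loss`) are hypothetical. What the sketch does show is that a BP weight update already has a Hebbian-like form, a postsynaptic credit multiplied by a presynaptic firing rate, once each neuron carries a credit in addition to its rate.

```python
import math


def forward(W1, W2, x):
    """Fast timescale: firing rates settle (here, one feedforward sweep)."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    y = [sum(w * hj for w, hj in zip(row, h)) for row in W2]
    return h, y


def credits(W2, h, y, target):
    """Intermediate timescale: credit flows from post- to presynaptic units.

    Assumption: this is the ordinary BP error signal, standing in for the
    retrograde-signaling mechanism, whose equations are not given in the text.
    """
    c_out = [t - yk for t, yk in zip(target, y)]
    c_hid = [
        (1.0 - hj * hj) * sum(W2[k][j] * c_out[k] for k in range(len(c_out)))
        for j, hj in enumerate(h)
    ]
    return c_hid, c_out


def plasticity(W1, W2, x, h, c_hid, c_out, lr):
    """Slow timescale: Hebbian-form update, post credit times pre rate."""
    new_W1 = [[w + lr * c_hid[j] * x[i] for i, w in enumerate(row)]
              for j, row in enumerate(W1)]
    new_W2 = [[w + lr * c_out[k] * h[j] for j, w in enumerate(row)]
              for k, row in enumerate(W2)]
    return new_W1, new_W2


def loss(W1, W2, x, target):
    """Squared-error loss, used only to check the update direction."""
    _, y = forward(W1, W2, x)
    return 0.5 * sum((t - yk) ** 2 for t, yk in zip(target, y))


def train_step(W1, W2, x, target, lr=0.01):
    """Run the three phases in order of increasing timescale."""
    h, y = forward(W1, W2, x)
    c_hid, c_out = credits(W2, h, y, target)
    return plasticity(W1, W2, x, h, c_hid, c_out, lr)
```

In this sketch the product `credit * rate` equals the negative gradient of the loss with respect to the corresponding weight, so repeated calls to `train_step` perform plain gradient descent. The separation into three functions mirrors the ordering of timescales only loosely, since each phase here completes in a single pass rather than relaxing to an equilibrium.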