Backpropagation, the foundational algorithm for training neural networks, is typically understood as a symbolic computation that recursively applies the chain rule. We show that it emerges exactly as the finite-time relaxation of a physical dynamical system. By formulating feedforward inference as a continuous-time process and applying the Lagrangian theory of non-conservative systems to handle asymmetric interactions, we derive a global energy functional on a doubled state space encoding both activations and sensitivities. The saddle-point dynamics of this energy perform inference and credit assignment simultaneously through local interactions. We term this framework "Dyadic Backpropagation". Crucially, we prove that unit-step Euler discretization, performed at the natural timescale of layer transitions, recovers standard backpropagation exactly in 2L steps for an L-layer network, with no approximations. Unlike prior energy-based methods, which require symmetric weights, asymptotic convergence, or vanishing perturbations, our framework guarantees exact gradients in finite time. This establishes backpropagation as the digitally optimized shadow of a continuous physical relaxation, and provides a rigorous foundation for exact gradient computation in analog and neuromorphic substrates, where continuous dynamics are native.
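To make the 2L-step claim concrete, the following NumPy sketch illustrates one way the finite-time relaxation could play out; the tanh nonlinearity, squared-error loss, layer sizes, and the names `a`, `d`, `gW` are our own illustrative assumptions, not notation from the paper. It applies L unit-step Euler updates to the activation state, then L to the sensitivity state, and checks that the resulting gradients coincide with textbook backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = np.tanh
dsigma = lambda z: 1.0 - np.tanh(z) ** 2  # derivative of tanh

# Hypothetical small network: a_{l+1} = sigma(W_l a_l), squared-error loss.
L, dims = 3, [4, 5, 5, 3]
W = [rng.standard_normal((dims[l + 1], dims[l])) for l in range(L)]
x, y = rng.standard_normal(dims[0]), rng.standard_normal(dims[-1])

# Doubled state space: activations a[l] and sensitivities d[l].
a = [x] + [np.zeros(dims[l]) for l in range(1, L + 1)]
d = [np.zeros(dims[l]) for l in range(L + 1)]

# Steps 1..L: unit-step Euler on the activation flow
#   da_l/dt = sigma(W_{l-1} a_{l-1}) - a_l,
# which makes each a_l stationary in a single step.
for l in range(1, L + 1):
    a[l] = sigma(W[l - 1] @ a[l - 1])

# Steps L+1..2L: unit-step Euler on the sensitivity flow, seeded by the
# loss gradient at the output boundary.
for l in range(L, 0, -1):
    if l == L:
        d[l] = a[L] - y
    else:
        d[l] = W[l].T @ (d[l + 1] * dsigma(W[l] @ a[l]))

# Weight gradients read off from the relaxed (a, d) pair.
gW = [np.outer(d[l + 1] * dsigma(W[l] @ a[l]), a[l]) for l in range(L)]

# Reference: textbook backpropagation on the same network.
def backprop(W, x, y):
    zs, acts = [], [x]
    for Wl in W:
        zs.append(Wl @ acts[-1])
        acts.append(sigma(zs[-1]))
    delta, grads = acts[-1] - y, [None] * len(W)
    for l in reversed(range(len(W))):
        g = delta * dsigma(zs[l])
        grads[l] = np.outer(g, acts[l])
        delta = W[l].T @ g
    return grads

assert all(np.allclose(g, r) for g, r in zip(gW, backprop(W, x, y)))
print("2L relaxation steps reproduce standard backprop gradients exactly.")
```

Under these assumptions, each unit Euler step drives exactly one residual of the flow to zero, so L forward steps plus L backward steps suffice, matching the 2L-step count stated above.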