The search for "biologically plausible" learning algorithms has converged on the idea of representing gradients as activity differences. However, most approaches require a high degree of synchronization (distinct phases during learning) and introduce substantial computational overhead, which raises doubts about their biological plausibility as well as their potential utility for neuromorphic computing. Furthermore, they commonly rely on applying infinitesimal perturbations (nudges) to output units, which is impractical in noisy environments. Recently, it has been shown that by modelling artificial neurons as dyads with two oppositely nudged compartments, a fully local learning algorithm named "dual propagation" can bridge the performance gap to backpropagation without requiring separate learning phases or infinitesimal nudging. However, the algorithm has the drawback that its numerical stability relies on symmetric nudging, which may be restrictive in biological and analog implementations. In this work, we first provide a solid foundation for the objective underlying the dual propagation method, which also reveals a surprising connection with adversarial robustness. Second, we demonstrate how dual propagation is related to a particular adjoint state method, which remains stable even when the nudging is asymmetric.