Training deep neural networks typically relies on backpropagating high-dimensional error signals, a computationally intensive process with little evidence supporting its implementation in the brain. However, since most tasks involve low-dimensional outputs, we propose that low-dimensional error signals may suffice for effective learning. To test this hypothesis, we introduce a novel local learning rule based on Feedback Alignment that leverages indirect, low-dimensional error feedback to train large networks. Our method decouples the backward pass from the forward pass, enabling precise control over error-signal dimensionality while maintaining high-dimensional representations. We begin with a detailed theoretical derivation for linear networks, which forms the foundation of our learning framework, and extend our approach to nonlinear, convolutional, and transformer architectures. Remarkably, we demonstrate that an error dimensionality as small as the task dimensionality can achieve performance matching that of traditional backpropagation. Furthermore, our rule enables efficient training of convolutional networks with minimal error dimensionality, architectures that have previously been resistant to Feedback Alignment methods. This breakthrough not only paves the way toward more biologically accurate models of learning but also challenges the conventional reliance on high-dimensional gradient signals in neural network training. Our findings suggest that low-dimensional error signals can be as effective as high-dimensional ones, prompting a reevaluation of gradient-based learning in high-dimensional systems. Ultimately, our work offers a fresh perspective on neural network optimization and contributes to understanding learning mechanisms in both artificial and biological systems.
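To make the core idea concrete, the following is a minimal sketch of classical Feedback Alignment on a linear network, where the error is propagated backward through a fixed random matrix `B` rather than the transpose of the forward weights. Because the output (and hence the error signal) is one-dimensional here, it also illustrates how a low-dimensional error can drive learning in a high-dimensional hidden layer. This toy example is an assumption-laden illustration of the general principle, not the specific learning rule derived in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear regression: 20-dim input, 1-dim target (task dimensionality = 1).
n_in, n_hid, n_out = 20, 64, 1
X = rng.standard_normal((500, n_in))
w_true = rng.standard_normal((n_in, n_out))
Y = X @ w_true

# Two-layer linear network with a 64-dim hidden layer.
W1 = rng.standard_normal((n_in, n_hid)) * 0.1
W2 = rng.standard_normal((n_hid, n_out)) * 0.1

# Fixed random feedback matrix: replaces W2.T in the backward pass,
# so the hidden layer never needs access to the forward weights.
B = rng.standard_normal((n_out, n_hid))

loss_before = float(np.mean((X @ W1 @ W2 - Y) ** 2))

lr = 1e-3
for _ in range(500):
    H = X @ W1          # forward pass, hidden activations
    P = H @ W2          # prediction
    E = P - Y           # low-dimensional error (n_out = 1)
    # Feedback Alignment updates: the 1-dim error E is sent backward
    # through the fixed random matrix B instead of W2.T.
    W2 -= lr * H.T @ E / len(X)
    W1 -= lr * X.T @ (E @ B) / len(X)

loss_after = float(np.mean((X @ W1 @ W2 - Y) ** 2))
```

In linear networks the forward weights tend to align with the fixed feedback matrix over training, which is why this scheme can approximate gradient descent without a symmetric backward pass.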