Backpropagation, which relies on the chain rule, is the de facto standard algorithm for optimizing neural networks. Recently, Hinton (2022) proposed the forward-forward algorithm, a promising alternative that optimizes neural networks layer by layer without propagating gradients through the network. Although this approach has several advantages over backpropagation and shows promising results, training each layer independently limits the optimization process. Specifically, it prevents the network's layers from collaborating to learn complex and rich features. In this work, we study layer collaboration in the forward-forward algorithm. We show that the current version of the forward-forward algorithm is suboptimal with respect to information flow in the network, resulting in a lack of collaboration between layers. We propose an improved version that supports layer collaboration to better utilize the network structure, without requiring any additional assumptions or computations. We empirically demonstrate the efficacy of the proposed version in terms of both information flow and objective metrics. Additionally, we provide a theoretical motivation for the proposed method, inspired by functional entropy theory.
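To make the layer-by-layer training concrete, the following is a minimal sketch of the per-layer objective in the baseline forward-forward setup, not the improved version proposed here. It assumes the formulation described by Hinton (2022): each layer's "goodness" is the sum of its squared activations, compared against a threshold through a logistic loss, with activations length-normalized before they reach the next layer. The function names, the threshold value `theta=2.0`, and the toy network are hypothetical choices for illustration only.

```python
import math
import random

def relu_layer(x, weights):
    """Apply one fully connected ReLU layer; weights is a list of rows."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]

def goodness(h):
    """Goodness of a layer: sum of squared activations (assumed form)."""
    return sum(v * v for v in h)

def normalize(h, eps=1e-8):
    """Length-normalize activations before passing them to the next layer."""
    norm = math.sqrt(sum(v * v for v in h)) + eps
    return [v / norm for v in h]

def local_loss(g, is_positive, theta):
    """Logistic loss on (goodness - theta), via a numerically stable softplus.

    Positive samples are pushed toward goodness above theta,
    negative samples toward goodness below theta.
    """
    z = g - theta if is_positive else theta - g
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))  # log(1 + exp(-z))

def layer_losses(x, layers, is_positive, theta=2.0):
    """Return one local loss per layer for a single input.

    In training, each layer's weights would be updated using only its own
    loss, with its (normalized) input treated as a constant. No gradient
    flows between layers, so each loss ignores the goodness of the others.
    """
    losses = []
    h = x
    for weights in layers:
        h = relu_layer(h, weights)
        losses.append(local_loss(goodness(h), is_positive, theta))
        h = normalize(h)
    return losses

# Toy network with random weights: 4 inputs -> 3 hidden units -> 2 hidden units.
random.seed(0)
layers = [
    [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)],
    [[random.gauss(0, 1) for _ in range(3)] for _ in range(2)],
]
x = [0.5, -1.0, 2.0, 0.1]
pos_losses = layer_losses(x, layers, is_positive=True)
neg_losses = layer_losses(x, layers, is_positive=False)
print(pos_losses, neg_losses)
```

Because each entry of `layer_losses` depends only on that layer's own goodness, no layer's objective reflects how well the others are doing. That independence is the limitation on layer collaboration discussed above.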