Recent years have witnessed the outstanding success of deep learning in various fields such as vision and natural language processing. This success owes much to the massive size of deep learning models, which is expected to keep increasing. This growth is accompanied by issues related to the models' considerable energy consumption, during both the training and inference phases, as well as their scalability. Although a number of works based on unconventional physical systems have been proposed to address energy efficiency in the inference phase, efficient training of deep learning models has remained unaddressed. So far, training of digital deep learning models has relied mainly on backpropagation, which is not suitable for physical implementation because it requires perfect knowledge of the computation performed in the so-called forward pass of the neural network. Here, we tackle this issue by proposing a simple deep neural network architecture augmented by a biologically plausible learning algorithm, referred to as "model-free forward-forward training". The proposed architecture enables training of deep physical neural networks consisting of layers of physical nonlinear systems, without requiring detailed knowledge of the properties of the nonlinear physical layers. We show that our method outperforms state-of-the-art hardware-aware training methods by improving training speed, decreasing digital computations, and reducing power consumption in physical systems. We demonstrate the adaptability of the proposed method, even in systems exposed to dynamic or unpredictable external perturbations. To showcase the universality of our approach, we experimentally train diverse wave-based physical neural networks, which vary in the underlying wave phenomenon and the type of nonlinearity they use, to perform vowel and image classification tasks.
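The idea of training without a model of the physical layer can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a goodness function (sum of squared activities) and a threshold `theta` in the style of the forward-forward algorithm, and it assumes that a trainable linear map `W` is applied to the measured output of the physical system. The names `physical_layer`, `local_ff_step`, `make_batch`, and `run_demo`, as well as the toy data, are hypothetical. The key point is that the update uses only the measured outputs of `physical_layer`, never its derivative.

```python
import numpy as np

def physical_layer(x):
    # Hypothetical stand-in for a black-box physical nonlinear system.
    # In an experiment this would be a measurement; training never differentiates it.
    return np.tanh(1.5 * x) + 0.1 * x**2

def goodness(h):
    # Forward-forward style "goodness": sum of squared activities per sample.
    return np.sum(h**2, axis=1)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def local_ff_step(W, x_pos, x_neg, theta=2.0, lr=0.05):
    # One local update of the trainable weights W (assumed arrangement: h = f(x) @ W).
    z_pos, z_neg = physical_layer(x_pos), physical_layer(x_neg)  # measured outputs
    h_pos, h_neg = z_pos @ W, z_neg @ W
    # Loss: softplus(theta - g_pos) + softplus(g_neg - theta), averaged over the batch.
    d_pos = -sigmoid(theta - goodness(h_pos))  # dLoss/dg for positive samples
    d_neg = sigmoid(goodness(h_neg) - theta)   # dLoss/dg for negative samples
    # dg/dW = 2 z^T h, so the gradient needs only z and h, not d(physical_layer)/dx.
    grad = (z_pos.T @ (2.0 * h_pos * d_pos[:, None])
            + z_neg.T @ (2.0 * h_neg * d_neg[:, None])) / len(x_pos)
    return W - lr * grad

def make_batch(rng, n=64, dim=8):
    # Toy data (assumption): positives carry signal in the first half of the
    # features, negatives in the second half.
    half = dim // 2
    x_pos = 0.1 * rng.standard_normal((n, dim))
    x_neg = 0.1 * rng.standard_normal((n, dim))
    x_pos[:, :half] += rng.standard_normal((n, half))
    x_neg[:, half:] += rng.standard_normal((n, half))
    return x_pos, x_neg

def run_demo(steps=300, seed=0):
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((8, 8))
    for _ in range(steps):
        x_pos, x_neg = make_batch(rng)
        W = local_ff_step(W, x_pos, x_neg)
    # Evaluate mean goodness on fresh positive and negative samples.
    x_pos, x_neg = make_batch(rng, n=512)
    g_pos = goodness(physical_layer(x_pos) @ W).mean()
    g_neg = goodness(physical_layer(x_neg) @ W).mean()
    return g_pos, g_neg

g_pos, g_neg = run_demo()
print(f"mean goodness: positive={g_pos:.3f}, negative={g_neg:.3f}")
```

Because the loss is local to the layer, stacking several such layers would only require repeating the same update per layer, with each layer fed the (measured) output of the previous one. Swapping `physical_layer` for a different nonlinearity requires no change to `local_ff_step`.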