Optics is an exciting route to the next generation of computing hardware for machine learning, promising improvements of several orders of magnitude in both computational speed and energy efficiency. However, to reach the full capacity of an optical neural network, not only inference but also training must be implemented optically. The primary algorithm for training a neural network is backpropagation, in which the calculation proceeds in the direction opposite to the information flow during inference. While backpropagation is straightforward on a digital computer, its optical implementation has so far remained elusive, particularly because of the conflicting requirements on the optical element that implements the nonlinear activation function. In this work, we address this challenge for the first time with a surprisingly simple and generic scheme. Saturable absorbers serve as the activation units, and the required properties are achieved through a pump-probe process in which the forward-propagating signal acts as the pump and the backward-propagating signal acts as the probe. Our approach is adaptable to various analog platforms, materials, and network structures, and it demonstrates the possibility of constructing neural networks that rely entirely on analog optical processes for both training and inference.
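To illustrate why a pump-probe arrangement can stand in for the derivative of the activation function, the following is a minimal numerical sketch. It is not taken from the work itself: it assumes a simple textbook-style saturable-absorber model in which the absorption seen by any field is set by the pump intensity, and the names `ALPHA0`, `sa_forward`, `sa_probe_gain`, and `sa_exact_derivative` are hypothetical. The sketch compares the transmission experienced by a weak backward probe with the exact derivative that ideal backpropagation would require.

```python
import math

# Hypothetical unsaturated optical depth of the absorber (an assumption,
# chosen only for illustration; the abstract gives no numerical values).
ALPHA0 = 1.0


def sa_forward(e_pump, alpha0=ALPHA0):
    """Assumed activation: field transmitted for a forward (pump) amplitude.

    The absorption is reduced as the pump intensity e_pump**2 grows,
    with intensity measured in units of the saturation intensity.
    """
    return e_pump * math.exp(-0.5 * alpha0 / (1.0 + e_pump ** 2))


def sa_probe_gain(e_pump, alpha0=ALPHA0):
    """Transmission seen by a weak backward probe.

    The probe is assumed too weak to saturate the medium itself, so it
    experiences the absorption level set by the forward pump.
    """
    return math.exp(-0.5 * alpha0 / (1.0 + e_pump ** 2))


def sa_exact_derivative(e_pump, alpha0=ALPHA0):
    """Analytic derivative of sa_forward with respect to e_pump."""
    s = 1.0 + e_pump ** 2
    return sa_probe_gain(e_pump, alpha0) * (1.0 + alpha0 * e_pump ** 2 / s ** 2)


if __name__ == "__main__":
    # Compare the optically available factor with the exact derivative.
    for e in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(e, sa_probe_gain(e), sa_exact_derivative(e))
```

Under this assumed model the probe transmission and the exact derivative coincide at zero pump amplitude and differ by at most a factor of `1 + ALPHA0 / 4` elsewhere, which suggests how a purely optical backward pass could approximate the gradient factor. How closely a real device follows this behavior depends on the material and operating regime.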