The ultimate goal of artificial intelligence is to mimic the human brain in performing decision-making and control directly from high-dimensional sensory input. Diffractive optical networks provide a promising route to implementing artificial intelligence with high speed and low power consumption. Most reported diffractive optical networks focus on single or multiple tasks that do not involve environmental interaction, such as object recognition and image classification. In contrast, to our knowledge, networks capable of performing decision-making and control have not yet been developed. Here, we propose using deep reinforcement learning to implement diffractive optical networks that imitate human-level decision-making and control capability. Such networks, which take advantage of a residual architecture, allow optimal control policies to be found through interaction with the environment and can be readily implemented with existing optical devices. The superior performance of these networks is verified on three classic games: Tic-Tac-Toe, Super Mario Bros., and Car Racing. Finally, we present an experimental demonstration of playing Tic-Tac-Toe using diffractive optical networks based on a spatial light modulator. Our work represents a solid step forward for diffractive optical networks, promising a fundamental shift from the target-driven control of a pre-designed state for simple recognition or classification tasks to the high-level sensory capability of artificial intelligence. It may find exciting applications in autonomous driving, intelligent robots, and intelligent manufacturing.