As machine learning models continue to grow in size and complexity, the need for specialized hardware to run them efficiently is growing rapidly. To address this need, silicon-photonic-based neural network (SP-NN) accelerators have recently emerged as a promising alternative to electronic accelerators because of their lower latency and higher energy efficiency. Not only can SP-NNs alleviate the fan-in and fan-out problem of linear algebra processors, but their operational bandwidth can also match the photodetection rate (typically 100 GHz), which is more than an order of magnitude faster than electronic counterparts restricted to a clock rate of a few GHz. Unfortunately, the underlying silicon photonic devices in SP-NNs suffer from inherent optical losses and crosstalk noise, which originate from fabrication imperfections and undesired optical couplings and whose impact accumulates as the network scales up. Consequently, the inferencing accuracy of an SP-NN can degrade severely (e.g., it can drop below 10%), yet the impact of such losses and noise has not been fully studied. In this paper, we comprehensively model the optical loss and crosstalk noise using a bottom-up approach, from the device level to the system level, in coherent SP-NNs built from Mach-Zehnder interferometer (MZI) devices. The proposed models can be applied to any SP-NN architecture, in different configurations, to analyze the effect of loss and crosstalk. Such an analysis is important when an SP-NN design must meet inferencing accuracy and scalability requirements. Using the proposed analytical framework, we show a high power penalty and a catastrophic inferencing accuracy drop of up to 84% for SP-NNs of different scales with three known MZI mesh configurations (i.e., Reck, Clements, and Diamond) due to accumulated optical loss and crosstalk noise.
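To illustrate why loss accumulates as the network scales up, the following is a minimal Python sketch and not the paper's model. It assumes an idealized MZI (two ideal 50:50 couplers and two phase shifters) with a single uniform insertion loss per device, ignores crosstalk entirely, and uses the commonly cited optical depths of 2N-3 for a Reck mesh and N for a Clements mesh (for N >= 3). The function names and the 0.5 dB per-MZI loss in the usage note are hypothetical values chosen for illustration only.

```python
import numpy as np


def phase_shifter(angle):
    """Phase shifter acting on the upper arm only (idealized, lossless)."""
    return np.diag([np.exp(1j * angle), 1.0])


def mzi_transfer(theta, phi, loss_db=0.0):
    """2x2 field transfer matrix of an idealized MZI.

    Assumption: the insertion loss is uniform across both arms and is
    applied as one scalar attenuation; crosstalk is not modeled.
    """
    coupler = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)  # ideal 50:50 coupler
    amplitude = 10 ** (-loss_db / 20)  # field attenuation for a power loss in dB
    return amplitude * (coupler @ phase_shifter(theta) @ coupler @ phase_shifter(phi))


def worst_case_depth(n, mesh):
    """Largest number of MZIs a signal can traverse (assumed depths, n >= 3)."""
    if n < 3:
        raise ValueError("depth formulas are assumed for n >= 3")
    depths = {"reck": 2 * n - 3, "clements": n}
    return depths[mesh]


def worst_case_loss_db(n, mesh, loss_db_per_mzi):
    """Accumulated insertion loss in dB along the longest path of the mesh."""
    return worst_case_depth(n, mesh) * loss_db_per_mzi
```

Under these assumptions, `worst_case_loss_db(8, "reck", 0.5)` and `worst_case_loss_db(8, "clements", 0.5)` differ only because the longest path crosses a different number of MZIs, which is one reason the mesh configuration matters. A realistic analysis would also need per-component losses and crosstalk terms, which this sketch omits.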