Binary Neural Networks (BNNs) are increasingly preferred over full-precision Convolutional Neural Networks (CNNs) because they reduce the memory and computational requirements of inference with minimal accuracy drop. BNNs convert CNN model parameters to 1-bit precision, so BNN inference can be processed with simple XNOR and bitcount operations. This makes BNNs amenable to hardware acceleration. Several BNN accelerators based on photonic integrated circuits (PICs) have been proposed. Although these accelerators provide remarkably higher throughput and energy efficiency than their electronic counterparts, their XNOR and bitcount circuits need further enhancement to improve area, energy efficiency, and throughput. This paper aims to fulfill this need. To that end, we invent an optical XNOR gate (OXG) based on a single microring resonator (MRR). Moreover, we present a novel bitcount circuit design, which we refer to as the Photo-Charge Accumulator (PCA). We employ multiple OXGs in a cascaded manner using dense wavelength division multiplexing (DWDM) and connect them to the PCA to forge a novel Optical XNOR-Bitcount based Binary Neural Network Accelerator (OXBNN). Our evaluation for the inference of four modern BNNs indicates that OXBNN provides improvements of up to 62x and 7.6x in frames per second (FPS) and FPS/W (energy efficiency), respectively, on geometric mean, over two PIC-based BNN accelerators from prior work. We developed a transaction-level, event-driven, Python-based simulator for evaluating the accelerators (https://github.com/uky-UCAT/B_ONN_SIM).
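To make the XNOR and bitcount formulation concrete, the following is a minimal software sketch of a binarized dot product, not a model of the OXG or PCA hardware. It assumes the common BNN encoding in which bit 1 stands for +1 and bit 0 stands for -1, so that the dot product of two length-N vectors equals 2 × bitcount(XNOR(a, w)) − N. The function names `pack_bits` and `xnor_bitcount_dot` are hypothetical and chosen only for illustration.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into one integer, first element as the MSB."""
    word = 0
    for b in bits:
        word = (word << 1) | (b & 1)
    return word


def xnor_bitcount_dot(a_word, w_word, n):
    """Dot product of two {-1, +1} vectors of length n stored as packed bits.

    Assumed encoding (illustrative): bit 1 means +1, bit 0 means -1.
    """
    mask = (1 << n) - 1                 # keep only the n valid bit positions
    xnor = ~(a_word ^ w_word) & mask    # 1 wherever the two bits agree
    matches = bin(xnor).count("1")      # bitcount (population count)
    return 2 * matches - n              # agreements add +1, disagreements add -1


# Usage: activations (+1, -1, +1, -1) against weights (+1, +1, -1, -1).
activations = pack_bits([1, 0, 1, 0])
weights = pack_bits([1, 1, 0, 0])
result = xnor_bitcount_dot(activations, weights, 4)
```

In this sketch the XNOR step plays the role that the abstract assigns to the OXGs, and the population count plays the role assigned to the PCA; how the optical circuits realize these steps is not modeled here.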