Scalable isomorphic physical neural networks (PNNs) are an emerging neural-network acceleration paradigm, owing to their high-bandwidth, in-propagation computation. Although backpropagation (BP)-based training is the industry standard for its robustness and fast gradient convergence, existing BP-PNN training methods must truncate the analogue signal propagation at each layer and acquire accurate hidden-neuron readouts throughout a deep network. This undermines the central appeal of PNNs: fast in-propagation processing. In addition, the required readouts create severe bottlenecks at the analogue-digital interfaces across which information must be shuttled. Together, these factors limit both the time and the energy efficiency of training. Here we introduce the asymmetrical training (AT) method, a BP-based method that trains an encapsulated deep network in which information propagation remains in the analogue domain until the output layer. AT's minimal information access bypasses the analogue-digital interface bottleneck wherever possible. For any deep network structure, AT offers significantly improved time and energy efficiency over existing BP-PNN methods and scales well to large network sizes. We demonstrate AT's error-tolerant, calibration-free training of encapsulated integrated photonic deep networks, achieving near-ideal BP performance. AT's well-behaved training is reproduced across different datasets and network structures.
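The abstract does not specify AT's mechanics, but one plausible reading is that the forward pass runs uninterrupted through the physical device with a single readout at the output layer, while gradients are estimated by backpropagating through an imperfect digital surrogate (hence "asymmetrical": physical forward, digital backward). The sketch below illustrates that reading in numpy; the PhysicalNetwork class, the surrogate helpers, the mismatch level, and all hyperparameters are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for an encapsulated analogue deep network: hidden
# activations never leave the analogue domain; only the output is read out.
class PhysicalNetwork:
    def __init__(self, sizes, noise=0.05):
        self.weights = [rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, m))
                        for n, m in zip(sizes[:-1], sizes[1:])]
        self.noise = noise

    def forward(self, x):
        # Uninterrupted analogue propagation with device noise; a single
        # analogue-to-digital conversion happens at the output layer.
        for W in self.weights:
            x = np.tanh(x @ W + self.noise * rng.normal(size=W.shape[1]))
        return x

def surrogate_forward(x, weights):
    # Digital replay of the forward pass, used only to estimate gradients.
    acts = [x]
    for W in weights:
        x = np.tanh(x @ W)
        acts.append(x)
    return acts

def surrogate_backward(acts, weights, delta):
    # Ordinary backprop through the (imperfect) digital surrogate.
    grads = []
    for W, a_in, a_out in zip(reversed(weights),
                              reversed(acts[:-1]), reversed(acts[1:])):
        delta = delta * (1.0 - a_out ** 2)   # tanh derivative
        grads.append(a_in.T @ delta)
        delta = delta @ W.T
    return list(reversed(grads))

sizes = [8, 16, 16, 4]
phys = PhysicalNetwork(sizes)
# Deliberately mismatched digital twin: no calibration to the device.
digital = [W + 0.1 * rng.normal(size=W.shape) for W in phys.weights]
teacher = rng.normal(size=(sizes[0], sizes[-1]))   # fixed toy regression task
lr = 0.05

for step in range(500):
    x = rng.normal(size=(32, sizes[0]))
    target = np.tanh(x @ teacher)
    y_phys = phys.forward(x)             # physical forward, output readout only
    delta = (y_phys - target) / len(x)   # MSE error from the *physical* output
    acts = surrogate_forward(x, digital) # digital backward, no hidden readouts
    grads = surrogate_backward(acts, digital, delta)
    for W_p, W_d, g in zip(phys.weights, digital, grads):
        W_p -= lr * g                    # update applied to the device...
        W_d -= lr * g                    # ...and mirrored in the surrogate
    if step % 100 == 0:
        print(f"step {step:3d}  mse={np.mean((y_phys - target) ** 2):.4f}")
```

Under this reading, each training step incurs exactly one analogue-to-digital conversion (the output readout) rather than one per layer, which is the mechanism by which the abstract's claimed interface-bottleneck savings would arise; the error tolerance comes from driving the output-layer error with the physical measurement even though the backward pass uses an uncalibrated surrogate.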