Scalable isomorphic physical neural networks (PNNs) are an emerging neural-network acceleration paradigm, owing to their high-bandwidth, in-propagation computation. Although backpropagation (BP)-based training is the industry standard for its robustness and fast gradient convergence, existing BP-PNN training methods must truncate the analogue signal propagation at each layer and acquire accurate hidden-neuron readouts throughout a deep network. This undermines the central appeal of PNNs: fast in-propagation processing. In addition, the required readouts create severe bottlenecks at the analogue-digital interfaces across which information must be shuttled. Together, these factors limit both the time and the energy efficiency of training. Here we introduce the asymmetrical training (AT) method, a BP-based method that trains an encapsulated deep network in which information propagation remains in the analogue domain until the output layer. AT's minimal information access bypasses the analogue-digital interface bottleneck wherever possible. For any deep network structure, AT offers significantly improved time and energy efficiency over existing BP-PNN methods and scales well to large network sizes. We demonstrate AT's error-tolerant, calibration-free training of encapsulated integrated photonic deep networks, achieving near-ideal BP performance. AT's well-behaved training is reproduced across different datasets and network structures.
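The abstract does not specify AT's mechanics, but one plausible reading is that the forward pass runs uninterrupted through the physical device with a single readout at the output layer, while gradients are estimated by backpropagating through an imperfect digital surrogate (hence "asymmetrical": physical forward, digital backward). The sketch below illustrates that reading in numpy; the PhysicalNetwork class, the surrogate helpers, the mismatch level, and all hyperparameters are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for an encapsulated analogue deep network: hidden
# activations never leave the analogue domain; only the output is read out.
class PhysicalNetwork:
    def __init__(self, sizes, noise=0.05):
        self.weights = [rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, m))
                        for n, m in zip(sizes[:-1], sizes[1:])]
        self.noise = noise

    def forward(self, x):
        # Uninterrupted analogue propagation with device noise; a single
        # analogue-to-digital conversion happens at the output layer.
        for W in self.weights:
            x = np.tanh(x @ W + self.noise * rng.normal(size=W.shape[1]))
        return x

def surrogate_forward(x, weights):
    # Digital replay of the forward pass, used only to estimate gradients.
    acts = [x]
    for W in weights:
        x = np.tanh(x @ W)
        acts.append(x)
    return acts

def surrogate_backward(acts, weights, delta):
    # Ordinary backprop through the (imperfect) digital surrogate.
    grads = []
    for W, a_in, a_out in zip(reversed(weights),
                              reversed(acts[:-1]), reversed(acts[1:])):
        delta = delta * (1.0 - a_out ** 2)   # tanh derivative
        grads.append(a_in.T @ delta)
        delta = delta @ W.T
    return list(reversed(grads))

sizes = [8, 16, 16, 4]
phys = PhysicalNetwork(sizes)
# Deliberately mismatched digital twin: no calibration to the device.
digital = [W + 0.1 * rng.normal(size=W.shape) for W in phys.weights]
teacher = rng.normal(size=(sizes[0], sizes[-1]))   # fixed toy regression task
lr = 0.05

for step in range(500):
    x = rng.normal(size=(32, sizes[0]))
    target = np.tanh(x @ teacher)
    y_phys = phys.forward(x)             # physical forward, output readout only
    delta = (y_phys - target) / len(x)   # MSE error from the *physical* output
    acts = surrogate_forward(x, digital) # digital backward, no hidden readouts
    grads = surrogate_backward(acts, digital, delta)
    for W_p, W_d, g in zip(phys.weights, digital, grads):
        W_p -= lr * g                    # update applied to the device...
        W_d -= lr * g                    # ...and mirrored in the surrogate
    if step % 100 == 0:
        print(f"step {step:3d}  mse={np.mean((y_phys - target) ** 2):.4f}")
```

Under this reading, each training step incurs exactly one analogue-to-digital conversion (the output readout) rather than one per layer, which is the mechanism by which the abstract's claimed interface-bottleneck savings would arise; the error tolerance comes from driving the output-layer error with the physical measurement even though the backward pass uses an uncalibrated surrogate.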