Analog in-memory computing (AIMC), a promising approach for energy-efficient acceleration of deep learning workloads, computes matrix-vector multiplications (MVMs) only approximately, because of nonidealities that are often non-deterministic or nonlinear. This can reduce the achievable deep neural network (DNN) inference accuracy relative to a conventional floating-point (FP) implementation. While retraining has previously been suggested to improve robustness, prior work has explored only a few DNN topologies, using disparate and overly simplified AIMC hardware models. Here, we use hardware-aware (HWA) training to systematically examine the accuracy of AIMC for multiple common artificial intelligence (AI) workloads across multiple DNN topologies, and investigate sensitivity and robustness to a broad set of nonidealities. By introducing a new and highly realistic AIMC crossbar model, we improve significantly on earlier retraining approaches. We show that many large-scale DNNs of various topologies, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers, can in fact be successfully retrained to show iso-accuracy on AIMC. Our results further suggest that AIMC nonidealities that add noise to the inputs or outputs, rather than to the weights, have the largest impact on DNN accuracy, and that RNNs are particularly robust to all nonidealities.
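To make the notion of an approximate MVM concrete, the following is a minimal sketch of where input, weight, and output perturbations would enter the computation. It is not the crossbar model introduced in this work: the function name `noisy_mvm`, the additive Gaussian noise, and the noise scales are all assumptions chosen purely for illustration.

```python
import numpy as np


def noisy_mvm(W, x, rng, in_noise=0.0, w_noise=0.0, out_noise=0.0):
    """Toy approximate MVM (hypothetical helper, not the paper's model).

    Computes y = (W + dW) @ (x + dx) + dy, where dx, dW and dy are
    zero-mean Gaussian perturbations with the given standard deviations.
    """
    x_noisy = x + in_noise * rng.standard_normal(x.shape)
    W_noisy = W + w_noise * rng.standard_normal(W.shape)
    y = W_noisy @ x_noisy
    return y + out_noise * rng.standard_normal(y.shape)


def relative_error(y, y_ref):
    """Relative L2 deviation of y from the ideal result y_ref."""
    return np.linalg.norm(y - y_ref) / np.linalg.norm(y_ref)


rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128)) / np.sqrt(128)  # arbitrary example weights
x = rng.standard_normal(128)                       # arbitrary example input
y_ref = W @ x                                      # ideal floating-point MVM

# Assumed, illustrative noise levels; one noise source enabled at a time.
settings = {
    "input noise": dict(in_noise=0.05),
    "weight noise": dict(w_noise=0.05),
    "output noise": dict(out_noise=0.05),
}
for name, kwargs in settings.items():
    y = noisy_mvm(W, x, rng, **kwargs)
    print(f"{name}: relative error {relative_error(y, y_ref):.3f}")
```

The relative errors printed by this sketch depend entirely on the assumed noise scales and matrix sizes, so they say nothing about which nonideality matters most in a real network; that question is what the HWA training study addresses.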