Learning in deep neural networks (DNNs) with the Resistive Processing Unit (RPU) architecture is energy-efficient because it uses dedicated neuromorphic hardware and stochastic computation of the weight updates for in-memory computing. Charge Trap Flash (CTF) devices can implement RPU-based weight updates in DNNs. However, prior work has shown that the weight updates (threshold-voltage shifts, V_T) in CTF-based RPUs are affected by the non-ideal program time of the CTF. This non-ideal program time depends on two factors: first, the number of input pulses (N) or the pulse width (pw), and second, the gap between successive update pulses (t_gap) used in the stochastic computation of the weight updates. The impact of this non-ideal program time must therefore be studied in neural-network training simulations. In this study, we first propose a pulse-train design compensation technique that reduces the total error caused by the non-ideal program time of the CTF and by the stochastic variance of the network. Second, we simulate RPU-based DNNs with the non-ideal program time of the CTF on the MNIST and Fashion-MNIST datasets. We find that for large N (~1000), the learning performance approaches the ideal (software-level) training level and is therefore largely insensitive to the choice of t_gap used to implement the RPU-based weight updates. For smaller N (<500), however, the learning performance depends on the t_gap of the pulses. Finally, we perform an ablation study to isolate the causal factor behind the improved learning performance and conclude that a lower noise level in the weight updates is most likely the dominant contributor. Our study thus compensates for the error caused by the non-ideal program time and works toward standardizing the number of pulses (N) and the pulse gap (t_gap) for CTF-based RPUs to enable accurate system-level on-chip training.
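To make the stochastic update scheme referenced above concrete, the following is a minimal sketch of an RPU-style coincidence-based weight update with a hook for a program-time non-ideality. It assumes the standard Bernoulli pulse-train encoding used in RPU proposals; `dw_min`, `gain_fn`, and the function signature are illustrative placeholders and are not taken from the paper.

```python
import numpy as np

def stochastic_rpu_update(x, delta, N=1000, dw_min=0.001,
                          pw=1e-6, t_gap=1e-6,
                          gain_fn=None, rng=None):
    """Sketch of an RPU-style stochastic weight update.

    x, delta : activation and error magnitudes scaled to [0, 1].
    N        : number of stochastic pulses per update.
    dw_min   : ideal weight change per coincident pulse pair.
    pw, t_gap: pulse width and gap between successive pulses;
               gain_fn(pw, t_gap) is a hypothetical hook that
               models the non-ideal program time of the CTF cell.
    """
    rng = rng or np.random.default_rng()
    # Encode x and delta as Bernoulli pulse trains of length N.
    a = rng.random(N) < x
    b = rng.random(N) < delta
    # A weight update occurs only where both pulses coincide,
    # so the expected update is dw_min * N * x * delta.
    coincidences = np.count_nonzero(a & b)
    dw_ideal = dw_min * coincidences
    # The non-ideal program time scales the per-pulse V_T shift.
    gain = gain_fn(pw, t_gap) if gain_fn else 1.0
    return gain * dw_ideal

# Example: with N=1000, x=0.5, delta=0.2, the expected ideal
# update is 0.001 * 1000 * 0.5 * 0.2 = 0.1.
dw = stochastic_rpu_update(0.5, 0.2, N=1000)
```

The sketch also makes the abstract's trade-off visible: the variance of the coincidence count shrinks with larger N, which is consistent with learning performance becoming insensitive to t_gap at N ~ 1000.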