Memristor-based hardware offers new possibilities for energy-efficient machine learning (ML) by providing analog in-memory matrix multiplication. However, current hardware prototypes cannot fit large neural networks, and the related literature covers only small ML models for tasks such as MNIST or single-word recognition. Simulation can be used to explore how hardware properties affect larger models, but existing software assumes simplified hardware. We propose a PyTorch-based library built on "Synaptogen" that simulates neural network execution with accurately captured memristor hardware properties. For the first time, we show how an ML system with millions of parameters would behave on memristor hardware, using a Conformer trained on the speech recognition task TED-LIUMv2 as an example. With adjusted quantization-aware training, we limit the relative degradation in word error rate to 25% when using a 3-bit weight precision to execute linear operations via simulated analog computation.
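
The abstract refers to quantization-aware training (QAT) with 3-bit weight precision. The paper's Synaptogen-based simulation library is not reproduced here, but as a rough, generic illustration of what 3-bit weight QAT can look like in PyTorch, the sketch below fake-quantizes weights onto a symmetric 3-bit grid and passes gradients through with a straight-through estimator. The names `FakeQuant3Bit` and `QATLinear` and all implementation details are illustrative assumptions, not the authors' actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FakeQuant3Bit(torch.autograd.Function):
    """Symmetric 3-bit fake weight quantization with a straight-through estimator.

    Illustrative assumption only; the paper's library may quantize differently.
    """

    @staticmethod
    def forward(ctx, w):
        qmax = 2 ** (3 - 1) - 1                       # 3-bit signed grid: {-3, ..., 3}
        scale = w.abs().max().clamp(min=1e-8) / qmax  # per-tensor scale onto the grid
        return torch.clamp(torch.round(w / scale), -qmax, qmax) * scale

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out                               # straight-through estimator

class QATLinear(nn.Linear):
    """Linear layer whose weights are fake-quantized to 3 bits in the forward pass."""

    def forward(self, x):
        return F.linear(x, FakeQuant3Bit.apply(self.weight), self.bias)

# Usage: a drop-in replacement for nn.Linear inside a larger model.
layer = QATLinear(256, 256)
y = layer(torch.randn(8, 256))  # forward pass uses the quantized weights
```

In a setup like the one the abstract describes, such quantized linear operations would then be executed via the simulated analog computation rather than a digital matrix multiply.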