In-memory computing for Machine Learning (ML) applications mitigates the von Neumann bottleneck by organizing computation to exploit parallelism and locality. Non-volatile memory devices such as Resistive RAM (ReRAM) integrate switching and storage in a single device and show promising performance for ML applications. However, ReRAM-based designs face challenges such as non-linear digital-to-analog conversion and the associated circuit overheads. This paper proposes an In-Memory Boolean-to-Current Inference Architecture (IMBUE) that uses ReRAM-transistor cells to eliminate the need for such conversions. IMBUE processes Boolean feature inputs expressed as digital voltages and generates parallel current paths based on the resistive memory states. The resulting proportional column current is then translated back to the Boolean domain for further digital processing. The IMBUE architecture is inspired by the Tsetlin Machine (TM), an emerging ML algorithm based on intrinsically Boolean logic. IMBUE demonstrates significant performance improvements over binarized convolutional neural networks and digital TM in-memory implementations, achieving gains of up to 12.99x and 5.28x, respectively.
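The Boolean-to-current-to-Boolean flow described above can be illustrated with a small behavioral sketch of one TM clause mapped onto one memory column. This is not the paper's circuit. It assumes that an included literal is stored as a low-resistance cell and an excluded literal as a high-resistance cell, that a row is driven with the read voltage only when its literal is 0, and that a comparator reads the clause as 1 when the column current stays below a reference. The names `V_READ`, `G_LRS`, `G_HRS`, `column_current`, and `clause_output`, and all numeric values, are hypothetical.

```python
from typing import List

# Hypothetical device parameters, chosen only for illustration.
V_READ = 0.2   # read voltage in volts
G_LRS = 1e-4   # conductance of a low-resistance cell (literal included), in siemens
G_HRS = 1e-7   # conductance of a high-resistance cell (literal excluded), in siemens


def literals(features: List[int]) -> List[int]:
    """Return the TM literal vector: the features followed by their negations."""
    return features + [1 - x for x in features]


def column_current(lits: List[int], include: List[int]) -> float:
    """Sum the parallel cell currents on one column (Ohm's law per cell).

    Assumption: a row is driven with V_READ only when its literal is 0,
    so only a violated included literal contributes a large current.
    """
    current = 0.0
    for lit, inc in zip(lits, include):
        v = V_READ if lit == 0 else 0.0
        g = G_LRS if inc else G_HRS
        current += v * g
    return current


def clause_output(features: List[int], include: List[int]) -> int:
    """Translate the column current back to a Boolean clause output."""
    i_col = column_current(literals(features), include)
    # Reference set at half the current of one conducting low-resistance cell.
    i_ref = 0.5 * V_READ * G_LRS
    return 1 if i_col < i_ref else 0


# Clause "x0 AND NOT x1" over three features: include literal x0 and literal NOT x1.
INCLUDE = [1, 0, 0, 0, 1, 0]
```

In this sketch, `clause_output([1, 0, 1], INCLUDE)` evaluates the clause for one input. The reference current separates the leakage through high-resistance cells from the current of a single low-resistance cell only while the number of literals stays well below `G_LRS / G_HRS`, which a real design would have to verify against device variation.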