The increasing demand for processing large volumes of data in machine learning has pushed data bandwidth requirements beyond the capability of traditional von Neumann architectures. In-memory computing (IMC) has recently emerged as a promising solution to this gap, enabling distributed data storage and processing at the micro-architectural level and significantly reducing both latency and energy consumption. In this paper, we present IMPACT: InMemory ComPuting Architecture Based on Y-FlAsh Technology for Coalesced Tsetlin Machine Inference, underpinned by a cutting-edge memory device, Y-Flash, fabricated in a 180 nm CMOS process. Y-Flash devices have recently been demonstrated for digital and analog memory applications, offering high yield, non-volatility, and low power consumption. IMPACT leverages the Y-Flash array to implement inference for a novel machine learning algorithm based on propositional logic: the coalesced Tsetlin machine (CoTM). CoTM uses Tsetlin automata (TA) to perform stochastic Boolean feature selection across parallel clauses. IMPACT is organized into two computational crossbars that store the TA states and the clause weights, respectively. Validated on the MNIST dataset, IMPACT achieves 96.3% accuracy and demonstrates energy-efficiency improvements of, e.g., 2.23X over a CNN-based ReRAM design, 2.46X over a NOR-Flash neuromorphic design, and 2.06X over a DNN-based PCM design, making it well suited to modern ML inference applications.
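To make the CoTM inference flow described above concrete, the following is a minimal software sketch of the computation the two crossbars implement: shared clauses built from TA-selected literals, and signed per-class weights over the clause outputs. All names, shapes, and the empty-clause convention here are illustrative assumptions, not the paper's hardware mapping.

```python
import numpy as np

def cotm_predict(x, include, weights):
    """Illustrative coalesced Tsetlin machine (CoTM) inference sketch.

    x:       (n_features,) 0/1 input vector.
    include: (n_clauses, 2*n_features) 0/1 mask of the literals
             (x and NOT x) each clause's Tsetlin automata chose to include.
    weights: (n_classes, n_clauses) signed integer clause weights
             (coalesced: clauses are shared across classes; only the
             weights are per class).
    """
    x = np.asarray(x, dtype=bool)
    literals = np.concatenate([x, ~x])        # the literal vector [x, NOT x]
    include = np.asarray(include, dtype=bool)
    # A clause fires (outputs 1) when every literal it includes is 1;
    # a clause that includes no literals is treated as firing here.
    fired = np.all(literals | ~include, axis=1).astype(int)
    # Per-class vote totals: weighted sum of shared clause outputs.
    class_sums = np.asarray(weights) @ fired
    return int(np.argmax(class_sums))

# Hypothetical toy instance: 2 features, 2 clauses, 2 classes.
x = [1, 0]
include = [[1, 0, 0, 0],   # clause 0 includes literal x0 -> fires (x0 = 1)
           [0, 1, 0, 0]]   # clause 1 includes literal x1 -> silent (x1 = 0)
weights = [[2, 3],         # class 0 weights
           [-1, 4]]        # class 1 weights
print(cotm_predict(x, include, weights))  # class sums are [2, -1] -> class 0
```

In the IMPACT architecture, the two matrices above correspond to the two computational crossbars: one holding the TA include/exclude decisions and one holding the clause weights, so that the AND-and-accumulate structure maps directly onto analog array operations.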