Compute-In-Memory (CiM) is a promising solution to accelerate Deep Neural Networks (DNNs) as it can avoid energy-intensive DNN weight movement and use memory arrays to perform low-energy, high-density computations. These benefits have inspired research across the CiM stack, but CiM research often focuses on only one level of the stack (i.e., devices, circuits, architecture, workload, or mapping) or only one design point (e.g., one fabricated chip). There is a need for a full-stack modeling tool to evaluate design decisions in the context of full systems (e.g., see how a circuit impacts system energy) and to perform rapid early-stage exploration of the CiM co-design space. To address this need, we propose CiMLoop: an open-source tool to model diverse CiM systems and explore decisions across the CiM stack. CiMLoop introduces (1) a flexible specification that lets users describe, model, and map workloads to both circuits and architecture, (2) an accurate energy model that captures the interaction between DNN operand values, hardware data representations, and analog/digital values propagated by circuits, and (3) a fast statistical model that can explore the design space orders-of-magnitude more quickly than other high-accuracy models. Using CiMLoop, researchers can evaluate design choices at different levels of the CiM stack, co-design across all levels, fairly compare different implementations, and rapidly explore the design space.