Compute-In-Memory (CiM) is a promising approach to accelerating Deep Neural Networks (DNNs) because it avoids energy-intensive movement of DNN weights and uses memory arrays to perform low-energy, high-density computation. These benefits have inspired research across the CiM stack, but CiM research often focuses on only one level of the stack (i.e., devices, circuits, architecture, workload, or mapping) or on only one design point (e.g., one fabricated chip). A full-stack modeling tool is needed to evaluate design decisions in the context of full systems (e.g., to see how a circuit affects system energy) and to perform rapid early-stage exploration of the CiM co-design space. To address this need, we propose CiMLoop: an open-source tool to model diverse CiM systems and explore decisions across the CiM stack. CiMLoop introduces (1) a flexible specification that lets users describe, model, and map workloads to both circuits and architecture, (2) an accurate energy model that captures the interaction between DNN operand values, hardware data representations, and the analog/digital values propagated by circuits, and (3) a fast statistical model that can explore the design space orders of magnitude faster than other high-accuracy models. Using CiMLoop, researchers can evaluate design choices at different levels of the CiM stack, co-design across all levels, fairly compare different implementations, and rapidly explore the design space.
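To illustrate why a statistical model can be much faster than a per-value model while still accounting for operand values, the following is a minimal sketch and not CiMLoop's actual model or API. It assumes a made-up, value-dependent energy function (`toy_energy`) for a single analog operation on 4-bit operands; the function names, the quadratic energy form, and the constants are all hypothetical. The per-value path evaluates the energy function once per operand, whereas the statistical path evaluates it once per distinct value in the operand distribution and scales by the operation count.

```python
from collections import Counter
import random


def toy_energy(value: int) -> float:
    """Hypothetical energy (arbitrary units) of one analog operation.

    Assumption for illustration only: energy grows quadratically with
    the operand value on top of a fixed cost.
    """
    return 1.0 + 0.01 * value * value


def per_value_energy(values):
    """Evaluate the energy function once for every operand."""
    return sum(toy_energy(v) for v in values)


def statistical_energy(values):
    """Evaluate the energy function once per distinct operand value.

    Builds the empirical distribution of operand values, computes the
    expected energy per operation, and scales by the operation count.
    """
    histogram = Counter(values)
    n = len(values)
    expected = sum(toy_energy(v) * count / n for v, count in histogram.items())
    return expected * n


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical workload: 10,000 unsigned 4-bit operands.
    operands = [random.randint(0, 15) for _ in range(10_000)]
    print(per_value_energy(operands))
    print(statistical_energy(operands))
```

In this toy setting the two totals agree up to floating-point rounding, but the statistical path calls the energy function at most 16 times instead of 10,000. A real tool would additionally need to model how the hardware data representation changes the value distribution seen by each circuit, which this sketch does not attempt.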