In-memory computing (IMC) is emerging as an efficient hardware paradigm for deep neural network accelerators at the edge, breaking the memory wall and exploiting massive computational parallelism. Two design styles have emerged: analog in-memory computing (AIMC) and digital in-memory computing (DIMC), each offering a different design space in terms of accuracy, efficiency, and dataflow flexibility. This paper targets a fair comparison and benchmarking of both approaches to guide future designs, through (a) an overview of published architectures; (b) an analytical cost model for energy and throughput; and (c) scheduling of workloads on a variety of modeled IMC architectures for end-to-end network efficiency analysis, offering valuable workload-hardware co-design insights.