Executing data-intensive tasks on a von Neumann architecture with both high performance and high power efficiency is difficult because of the memory wall bottleneck. Computing-in-memory (CiM) is a promising mitigation: it enables parallel in-situ multiply-accumulate (MAC) operations within the memory, with support from the peripheral interface and datapath. SRAM-based charge-domain CiM (CD-CiM) has shown potential for improved power efficiency and computing accuracy. However, existing SRAM-based CD-CiM designs face scaling challenges in meeting the throughput requirements of high-performance applications with multi-bit quantization. This paper presents an SRAM-based, high-throughput, ReLU-optimized CD-CiM macro. It completes the MAC and ReLU of two signed 8b vectors in one CiM cycle with only one A/D conversion. With non-linearity compensation for the analog computing and A/D conversion interfaces, this work achieves 51.2 GOPS throughput and 10.3 TOPS/W energy efficiency while reaching 88.6% accuracy on the CIFAR-10 dataset.
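As a purely functional illustration of the operation the abstract describes (the MAC of two signed 8b vectors followed by ReLU), the following Python sketch gives an ideal digital reference. It is not a model of the macro's circuit: the function name `mac_relu_int8` is hypothetical, and charge-domain effects, non-linearity, and A/D quantization are all ignored.

```python
def mac_relu_int8(x, w):
    """Ideal reference: dot product of two signed 8-bit vectors, then ReLU.

    Hypothetical helper for illustration only; it does not model the
    analog computation or the A/D conversion of a CD-CiM macro.
    """
    if len(x) != len(w):
        raise ValueError("vectors must have the same length")
    for v in (*x, *w):
        # Signed 8-bit range is [-128, 127].
        if not -128 <= v <= 127:
            raise ValueError(f"value {v} is outside the signed 8-bit range")
    # Multiply-accumulate over all element pairs.
    acc = sum(a * b for a, b in zip(x, w))
    # ReLU: negative sums are clamped to zero.
    return max(acc, 0)


if __name__ == "__main__":
    print(mac_relu_int8([1, -2, 3], [4, 5, -6]))      # negative sum, clamped
    print(mac_relu_int8([127, -128], [127, -128]))    # extreme operand values
```

A reference of this kind is typically useful as a golden model when checking the digitized output of an analog MAC against the exact integer result.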