Computing-in-Memory (CIM) has shown great potential for improving the efficiency and performance of deep neural networks (DNNs). However, the inflexibility of existing CIM designs wastes computational resources on less critical operations and lowers the Signal-to-Noise Ratio (SNR) on more complex tasks, which significantly limits overall performance. We therefore focus on integrating CIM with Saliency-Aware Computing, a paradigm that dynamically tailors computing precision to the importance of each input. We propose On-the-fly Saliency-Aware Hybrid CIM (OSA-HCIM), which offers three primary contributions: (1) an On-the-fly Saliency-Aware (OSA) precision configuration scheme, which dynamically sets the precision of each multiply-accumulate (MAC) operation based on its saliency; (2) a Hybrid CIM Array (HCIMA), which enables simultaneous operation of digital-domain CIM (DCIM) and analog-domain CIM (ACIM) via split-port 6T SRAM; and (3) an integrated framework combining OSA and HCIMA to meet diverse accuracy and power demands. Implemented in a 65 nm CMOS process, OSA-HCIM demonstrates an exceptional balance between accuracy and resource utilization. Notably, it is the first CIM design to incorporate a dynamic digital-to-analog boundary, providing unprecedented flexibility for saliency-aware computing. OSA-HCIM achieves a 1.95x improvement in energy efficiency while maintaining minimal accuracy loss compared to DCIM when tested on the CIFAR100 dataset.
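The idea of a saliency-dependent digital-to-analog boundary can be illustrated with a small behavioral sketch. It is not the OSA-HCIM circuit or the authors' algorithm. It assumes, purely for illustration, that a MAC over unsigned 4-bit operands is decomposed into bit-level partial sums, that partial sums of order `p + q` at or above a boundary are computed exactly (standing in for DCIM), and that those below the boundary are perturbed by Gaussian noise (standing in for ACIM). The names `choose_boundary` and `hybrid_mac`, the linear saliency-to-boundary policy, and the noise model are all hypothetical.

```python
import random


def bit(value, position):
    """Return the bit of `value` at `position` (0 = least significant)."""
    return (value >> position) & 1


def choose_boundary(saliency, max_order):
    """Hypothetical policy: map a saliency score in [0, 1] to a boundary.

    Higher saliency gives a lower boundary, so more bit orders are
    handled in the exact (digital) path.
    """
    saliency = min(max(saliency, 0.0), 1.0)
    return round((1.0 - saliency) * (max_order + 1))


def hybrid_mac(inputs, weights, boundary, bits=4, noise_sigma=0.5, rng=None):
    """Bit-level MAC with an assumed digital/analog split.

    Partial sums of order p + q >= boundary are exact (digital path).
    Partial sums of order p + q < boundary get additive Gaussian noise,
    an assumed stand-in for analog non-idealities.
    """
    rng = rng or random.Random(0)
    total = 0
    for p in range(bits):          # input bit position
        for q in range(bits):      # weight bit position
            count = sum(bit(x, p) & bit(w, q) for x, w in zip(inputs, weights))
            if p + q < boundary:   # low-order term: modelled analog path
                count = count + rng.gauss(0.0, noise_sigma)
            total += count * (1 << (p + q))
    return total


if __name__ == "__main__":
    xs = [3, 7, 12, 15]
    ws = [5, 0, 9, 14]
    exact = sum(x * w for x, w in zip(xs, ws))
    max_order = 2 * (4 - 1)        # highest p + q for 4-bit operands
    for saliency in (1.0, 0.5, 0.0):
        boundary = choose_boundary(saliency, max_order)
        result = hybrid_mac(xs, ws, boundary)
        print(f"saliency={saliency} boundary={boundary} "
              f"result={result:.2f} exact={exact}")
```

With a boundary of 0 every partial sum takes the exact path and the result equals the ordinary dot product. Raising the boundary moves more low-order terms into the noisy path, which is the accuracy side of the trade-off the abstract describes. The sketch does not model energy.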