Spiking Neural Networks (SNNs) have demonstrated capabilities for solving diverse machine learning tasks with ultra-low power/energy consumption. To maximize the performance and efficiency of SNN inference, Compute-in-Memory (CIM) hardware accelerators with emerging device technologies (e.g., RRAM) have been employed. However, SNN architectures are typically developed without considering constraints from the application and the underlying CIM hardware, thereby hindering SNNs from reaching their full potential in accuracy and efficiency. To address this, we propose NeuroNAS, a novel framework for developing energy-efficient neuromorphic CIM systems using a hardware-aware spiking neural architecture search (NAS), i.e., by quickly finding an SNN architecture that offers high accuracy under the given constraints (e.g., memory, area, latency, and energy consumption). NeuroNAS employs the following key steps: (1) optimizing SNN operations to enable efficient NAS, (2) employing quantization to minimize the memory footprint, (3) developing an SNN architecture that facilitates effective learning, and (4) devising a systematic hardware-aware search algorithm to meet the constraints. Compared to the state-of-the-art, NeuroNAS with 8-bit weight precision quickly finds SNNs that maintain high accuracy with up to 6.6x search-time speedup, while achieving up to 92% area savings, 1.2x latency speedup, and 84% energy savings across the CIFAR-10, CIFAR-100, and TinyImageNet-200 datasets; state-of-the-art methods, in contrast, fail to meet all constraints at once. In this manner, NeuroNAS enables efficient design automation in developing energy-efficient neuromorphic CIM systems for diverse ML-based applications.
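The constraint-handling idea in step (4) can be illustrated with a minimal sketch. This is not the actual NeuroNAS algorithm; all names (`Candidate`, `Constraints`, `select`) and cost fields are hypothetical, and the per-candidate hardware estimates are assumed to come from an external CIM cost model. The sketch only shows the selection principle: discard any candidate SNN architecture that violates a memory, area, latency, or energy budget, then keep the most accurate feasible one.

```python
# Hypothetical sketch of hardware-aware candidate selection in a NAS loop.
# Not the NeuroNAS implementation; cost numbers would come from a CIM
# hardware cost model in a real system.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    accuracy: float    # validation accuracy (fraction)
    memory_mb: float   # quantized weight footprint
    area_mm2: float    # estimated CIM crossbar area
    latency_ms: float  # inference latency per sample
    energy_mj: float   # energy per inference

@dataclass
class Constraints:
    memory_mb: float
    area_mm2: float
    latency_ms: float
    energy_mj: float

def feasible(c: Candidate, limit: Constraints) -> bool:
    # A candidate must satisfy every hardware constraint at once.
    return (c.memory_mb <= limit.memory_mb
            and c.area_mm2 <= limit.area_mm2
            and c.latency_ms <= limit.latency_ms
            and c.energy_mj <= limit.energy_mj)

def select(candidates, limit):
    # Keep only feasible architectures, then maximize accuracy.
    ok = [c for c in candidates if feasible(c, limit)]
    return max(ok, key=lambda c: c.accuracy) if ok else None
```

For example, a highly accurate candidate that exceeds the memory budget is rejected in favor of a slightly less accurate one that fits all budgets; if no candidate fits, `select` returns `None` and the search would continue with new candidates.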