This paper introduces a 71.2-$\mu$W speech recognition accelerator designed for real-time applications on edge devices, with an emphasis on ultra-low-power design. Through algorithm-hardware co-optimization, we propose a compact recurrent spiking neural network with two recurrent layers, one fully connected layer, and a low time step (1 or 2). Pruning and 4-bit fixed-point quantization shrink the 2.79-MB model by 96.42\% to 0.1 MB. On the hardware front, we exploit \textit{mixed-level pruning}, \textit{zero skipping}, and \textit{merged spike} techniques, reducing computational complexity by 90.49\% to 13.86 MMAC/s. \textit{Parallel time-step execution} resolves inter-time-step data dependencies and enables weight buffer power savings through weight sharing. Capitalizing on sparse spike activity, an input broadcasting scheme eliminates zero computations, further saving power. Implemented in a TSMC 28-nm process, the design operates in real time at 100 kHz while consuming 71.2 $\mu$W, surpassing state-of-the-art designs. At 500 MHz, it achieves an energy efficiency of 28.41 TOPS/W and an area efficiency of 1903.11 GOPS/mm$^2$.