DPSNN: Spiking Neural Network for Low-Latency Streaming Speech Enhancement

Speech enhancement (SE) improves communication in noisy environments, affecting areas such as automatic speech recognition, hearing aids, and telecommunications. With these domains typically being power-constrained and event-based while requiring low latency, neuromorphic algorithms in the form of spiking neural networks (SNNs) have great potential. Yet, current effective SNN solutions require a contextual sampling window imposing substantial latency, typically around 32ms, too long for many applications. Inspired by Dual-Path Spiking Neural Networks (DPSNNs) in classical neural networks, we develop a two-phase time-domain streaming SNN framework -- the Dual-Path Spiking Neural Network (DPSNN). In the DPSNN, the first phase uses Spiking Convolutional Neural Networks (SCNNs) to capture global contextual information, while the second phase uses Spiking Recurrent Neural Networks (SRNNs) to focus on frequency-related features. In addition, the regularizer suppresses activation to further enhance energy efficiency of our DPSNNs. Evaluating on the VCTK and Intel DNS Datasets, we demonstrate that our approach achieves the very low latency (approximately 5ms) required for applications like hearing aids, while demonstrating excellent signal-to-noise ratio (SNR), perceptual quality, and energy efficiency.

翻译：语音增强（SE）旨在改善嘈杂环境中的通信质量，其应用领域包括自动语音识别、助听器和电信等。这些领域通常受限于功耗且基于事件驱动，同时要求低延迟，因此以脉冲神经网络（SNNs）形式存在的神经形态算法展现出巨大潜力。然而，当前有效的SNN解决方案需要依赖上下文采样窗口，这会引入显著延迟（通常约为32毫秒），对于许多应用而言过长。受经典神经网络中双路径脉冲神经网络（DPSNNs）的启发，我们开发了一种双阶段时域流式SNN框架——双路径脉冲神经网络（DPSNN）。在DPSNN中，第一阶段使用脉冲卷积神经网络（SCNNs）捕捉全局上下文信息，而第二阶段使用脉冲循环神经网络（SRNNs）聚焦于频率相关特征。此外，通过正则化器抑制激活以进一步提升DPSNN的能量效率。在VCTK和Intel DNS数据集上的评估表明，我们的方法实现了助听器等应用所需的极低延迟（约5毫秒），同时展现出优异的信噪比（SNR）、感知质量和能量效率。

相关内容

Neural Networks

关注 1654

神经网络（Neural Networks）是世界上三个最古老的神经建模学会的档案期刊:国际神经网络学会(INNS)、欧洲神经网络学会(ENNS)和日本神经网络学会(JNNS)。神经网络提供了一个论坛，以发展和培育一个国际社会的学者和实践者感兴趣的所有方面的神经网络和相关方法的计算智能。神经网络欢迎高质量论文的提交，有助于全面的神经网络研究，从行为和大脑建模，学习算法，通过数学和计算分析，系统的工程和技术应用，大量使用神经网络的概念和技术。这一独特而广泛的范围促进了生物和技术研究之间的思想交流，并有助于促进对生物启发的计算智能感兴趣的跨学科社区的发展。因此，神经网络编委会代表的专家领域包括心理学，神经生物学，计算机科学，工程，数学，物理。该杂志发表文章、信件和评论以及给编辑的信件、社论、时事、软件调查和专利信息。文章发表在五个部分之一:认知科学，神经科学，学习系统，数学和计算分析、工程和应用。官网地址：http://dblp.uni-trier.de/db/journals/nn/

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日