speech and noise dual-stream spectrogram refine network with speech distortion loss for robust speech recognition - 专知论文

会员服务 ·

0

噪声 · 语音识别 · 稳健性 · Networking · 语音增强 ·

2023 年 5 月 29 日

speech and noise dual-stream spectrogram refine network with speech distortion loss for robust speech recognition

翻译：面向鲁棒语音识别任务中结合语音失真损失的语音与噪声双流语谱图精炼网络

Haoyu Lu,Nan Li,Tongtong Song,Longbiao Wang,Jianwu Dang,Xiaobao Wang,Shiliang Zhang

In recent years, the joint training of speech enhancement front-end and automatic speech recognition (ASR) back-end has been widely used to improve the robustness of ASR systems. Traditional joint training methods only use enhanced speech as input for the backend. However, it is difficult for speech enhancement systems to directly separate speech from input due to the diverse types of noise with different intensities. Furthermore, speech distortion and residual noise are often observed in enhanced speech, and the distortion of speech and noise is different. Most existing methods focus on fusing enhanced and noisy features to address this issue. In this paper, we propose a dual-stream spectrogram refine network to simultaneously refine the speech and noise and decouple the noise from the noisy input. Our proposed method can achieve better performance with a relative 8.6% CER reduction.

翻译：近年来，语音增强前端与自动语音识别（ASR）后端联合训练被广泛用于提升ASR系统的鲁棒性。传统联合训练方法仅将增强后的语音作为后端输入，但由于噪声类型多样且强度各异，语音增强系统难以直接分离输入中的语音成分。此外，增强后的语音常存在语音失真与残留噪声现象，且语音与噪声的失真特性不尽相同。现有方法多通过融合增强特征与带噪特征来应对该问题。本文提出一种双流语谱图精炼网络，同步精炼语音与噪声，并从带噪输入中解耦噪声。实验表明，所提方法可实现相对8.6%的字错误率（CER）降低。

0

相关内容

【2022新书】谱图理论，Spectral Graph Theory，100页pdf

【2022新书】谱图理论，Spectral Graph Theory，100页pdf

专知会员服务

76+阅读 · 2022年4月15日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

面向无机炸药痕量检测的ZnS量子点气敏性能的掺杂调控及机理研究

国家自然科学基金

0+阅读 · 2015年12月31日

输入/输出时滞Markov跳变随机系统的最优控制与滤波

国家自然科学基金

0+阅读 · 2014年12月31日

基于流动体系的再生水管道腐蚀垢层形成机理及动力学研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于散射SNOM的超分辨成像光场的检测方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

Partial Spread Bent函数与Bent-Negabent函数的构造及密码学性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

肽核酸与蛋白质互作机理及肽核酸适配体筛选的毛细管电泳方法学研究

国家自然科学基金

0+阅读 · 2012年12月31日

土中螺旋驱动自行走式隧道掘进机合理螺旋参数的研究

国家自然科学基金

0+阅读 · 2012年12月31日

Lorenz-like系统族的等价性和混沌吸引子几何结构

国家自然科学基金

0+阅读 · 2011年12月31日

多自由度哈密顿系统的动力学不稳定性研究

国家自然科学基金

0+阅读 · 2011年12月31日

稀土元素改善快淬非晶Mg2Ni型合金电极过程动力学性能的机理研究

国家自然科学基金

0+阅读 · 2009年12月31日

LA-Net: Landmark-Aware Learning for Reliable Facial Expression Recognition under Label Noise

Arxiv

0+阅读 · 2023年7月18日

Neural Network Pruning as Spectrum Preserving Process

Arxiv

0+阅读 · 2023年7月18日

Semi-supervised cross-lingual speech emotion recognition

Arxiv

0+阅读 · 2023年7月17日

Noise-aware Speech Enhancement using Diffusion Probabilistic Model

Arxiv

0+阅读 · 2023年7月16日

On decoder-only architecture for speech-to-text and large language model integration

Arxiv

0+阅读 · 2023年7月14日

Implicit Neural Feature Fusion Function for Multispectral and Hyperspectral Image Fusion

Arxiv

0+阅读 · 2023年7月14日

Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition

Arxiv

0+阅读 · 2023年7月14日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

Self-Supervised Learning For Few-Shot Image Classification

Self-Supervised Learning For Few-Shot Image Classification

Arxiv

19+阅读 · 2019年11月14日

Phase-aware Speech Enhancement with Deep Complex U-Net

Phase-aware Speech Enhancement with Deep Complex U-Net

Arxiv

15+阅读 · 2019年3月7日

VIP会员

文章信息

相关主题

最新内容

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

专知会员服务

3+阅读 · 6月22日

综述 | 3D场景图：开放挑战与未来方向

综述 | 3D场景图：开放挑战与未来方向

专知会员服务

4+阅读 · 6月22日

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

专知会员服务

5+阅读 · 6月22日

21世纪的无人机战争

21世纪的无人机战争

专知会员服务

4+阅读 · 6月22日

《伊朗与以色列-美国热战及其对数字技术的影响》

《伊朗与以色列-美国热战及其对数字技术的影响》

专知会员服务

4+阅读 · 6月22日

《量子技术的军事任务技术适配与利用》

《量子技术的军事任务技术适配与利用》

专知会员服务

4+阅读 · 6月22日

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

专知会员服务

4+阅读 · 6月22日

美国从乌克兰无人机战争中学习经验

美国从乌克兰无人机战争中学习经验

专知会员服务

7+阅读 · 6月21日

ICML 2026 | 面向视觉语言模型的语义鲁棒性认证

ICML 2026 | 面向视觉语言模型的语义鲁棒性认证

专知会员服务

5+阅读 · 6月21日

综述 | 智能体电子设计自动化：从“交接有效性”重新理解Agentic EDA

综述 | 智能体电子设计自动化：从“交接有效性”重新理解Agentic EDA

专知会员服务

8+阅读 · 6月21日

深入解读 Palantir AIP：全球最具争议的人工智能平台究竟如何运作

深入解读 Palantir AIP：全球最具争议的人工智能平台究竟如何运作

专知会员服务

21+阅读 · 6月20日

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

专知会员服务

5+阅读 · 6月19日

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

专知会员服务

8+阅读 · 6月19日

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

专知会员服务

7+阅读 · 6月18日

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

专知会员服务

9+阅读 · 6月18日

相关VIP内容

【2022新书】谱图理论，Spectral Graph Theory，100页pdf

【2022新书】谱图理论，Spectral Graph Theory，100页pdf

专知会员服务

76+阅读 · 2022年4月15日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

纽约大学最新《语音识别Speech Recognition》2020课程，不可错过！

专知会员服务

44+阅读 · 2020年11月2日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

综述 | 3D场景图：开放挑战与未来方向

21世纪的无人机战争

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

相关资讯

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

直播 | Interpretable and Trustworthy Graph Geometric Deep Learning

图与推荐

2+阅读 · 2022年11月2日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

【论文推荐】最新八篇情感分析相关论文—Pair-wise判别器、多模态情感分析、上下文语境、Gated 卷积网络

专知

20+阅读 · 2018年6月29日

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

最新5篇生成对抗网络相关论文推荐—FusedGAN、DeblurGAN、AdvGAN、CipherGAN、MMD GANS

专知

23+阅读 · 2018年1月18日

相关论文

LA-Net: Landmark-Aware Learning for Reliable Facial Expression Recognition under Label Noise

Arxiv

0+阅读 · 2023年7月18日

Neural Network Pruning as Spectrum Preserving Process

Arxiv

0+阅读 · 2023年7月18日

Semi-supervised cross-lingual speech emotion recognition

Arxiv

0+阅读 · 2023年7月17日

Noise-aware Speech Enhancement using Diffusion Probabilistic Model

Arxiv

0+阅读 · 2023年7月16日

On decoder-only architecture for speech-to-text and large language model integration

Arxiv

0+阅读 · 2023年7月14日

Implicit Neural Feature Fusion Function for Multispectral and Hyperspectral Image Fusion

Arxiv

0+阅读 · 2023年7月14日

Complementary Frequency-Varying Awareness Network for Open-Set Fine-Grained Image Recognition

Arxiv

0+阅读 · 2023年7月14日

Faster Meta Update Strategy for Noise-Robust Deep Learning

Arxiv

11+阅读 · 2021年4月30日

Self-Supervised Learning For Few-Shot Image Classification

Self-Supervised Learning For Few-Shot Image Classification

Arxiv

19+阅读 · 2019年11月14日

Phase-aware Speech Enhancement with Deep Complex U-Net

Phase-aware Speech Enhancement with Deep Complex U-Net

Arxiv

15+阅读 · 2019年3月7日

相关基金

面向无机炸药痕量检测的ZnS量子点气敏性能的掺杂调控及机理研究

国家自然科学基金

0+阅读 · 2015年12月31日

输入/输出时滞Markov跳变随机系统的最优控制与滤波

国家自然科学基金

0+阅读 · 2014年12月31日

基于流动体系的再生水管道腐蚀垢层形成机理及动力学研究

国家自然科学基金

0+阅读 · 2014年12月31日

基于散射SNOM的超分辨成像光场的检测方法研究

国家自然科学基金

0+阅读 · 2013年12月31日

Partial Spread Bent函数与Bent-Negabent函数的构造及密码学性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

肽核酸与蛋白质互作机理及肽核酸适配体筛选的毛细管电泳方法学研究

国家自然科学基金

0+阅读 · 2012年12月31日

土中螺旋驱动自行走式隧道掘进机合理螺旋参数的研究

国家自然科学基金

0+阅读 · 2012年12月31日

Lorenz-like系统族的等价性和混沌吸引子几何结构

国家自然科学基金

0+阅读 · 2011年12月31日

多自由度哈密顿系统的动力学不稳定性研究

国家自然科学基金

0+阅读 · 2011年12月31日

稀土元素改善快淬非晶Mg2Ni型合金电极过程动力学性能的机理研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员