Metric-oriented Speech Enhancement using Diffusion Probabilistic Model


February 23, 2023


Chen Chen, Yuchen Hu, Weiwei Weng, Eng Siong Chng

From arXiv; accepted by ICASSP 2023

Deep neural network based speech enhancement techniques focus on learning a noisy-to-clean transformation supervised by paired training data. However, the task-specific evaluation metric (e.g., PESQ) is usually non-differentiable and cannot be used directly in the training criterion. This mismatch between the training objective and the evaluation metric likely results in sub-optimal performance. To alleviate it, we propose a metric-oriented speech enhancement method (MOSE), which leverages recent advances in the diffusion probabilistic model and integrates a metric-oriented training strategy into its reverse process. Specifically, we design an actor-critic based framework that treats the evaluation metric as a posterior reward, thus guiding the reverse process in the metric-increasing direction. The experimental results demonstrate that MOSE clearly benefits from metric-oriented training and surpasses the generative baselines on all evaluation metrics.
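The idea of steering a model with a non-differentiable metric through a learned critic can be illustrated with a toy sketch. The code below is not the authors' implementation and uses no diffusion network: the scalar parameter `w` (a stand-in "actor" that subtracts a scaled noise estimate, loosely resembling one denoising step), the quantized SNR used in place of PESQ, the linear "critic" fitted to sampled rewards, and all constants are assumptions chosen only to keep the example small and runnable.

```python
import math
import random

random.seed(0)
N = 256
clean = [math.sin(8 * math.pi * i / N) for i in range(N)]
noise = [random.gauss(0.0, 0.5) for _ in range(N)]
noisy = [c + n for c, n in zip(clean, noise)]
# Hypothetical imperfect noise estimate, standing in for a network's prediction.
noise_est = [n + random.gauss(0.0, 0.1) for n in noise]


def enhance(w):
    """Toy 'actor': subtract a w-scaled noise estimate from the noisy signal."""
    return [y - w * e for y, e in zip(noisy, noise_est)]


def metric(w):
    """Non-differentiable stand-in for PESQ: SNR in dB, quantized to 0.5 dB."""
    est = enhance(w)
    sig = sum(c * c for c in clean)
    err = sum((c - x) ** 2 for c, x in zip(clean, est))
    snr_db = 10.0 * math.log10(sig / err)
    return math.floor(snr_db * 2.0) / 2.0


def critic_slope(w, n_samples=32, sigma=0.2):
    """Toy 'critic': least-squares line fitted to (perturbation, reward) pairs.

    The slope is a differentiable surrogate for how the reward changes with w.
    """
    deltas = [random.gauss(0.0, sigma) for _ in range(n_samples)]
    rewards = [metric(w + d) for d in deltas]
    d_mean = sum(deltas) / n_samples
    r_mean = sum(rewards) / n_samples
    cov = sum((d - d_mean) * (r - r_mean) for d, r in zip(deltas, rewards))
    var = sum((d - d_mean) ** 2 for d in deltas)
    return cov / var


def train(w=0.0, steps=60, lr=0.005):
    """Move the actor parameter in the direction the critic says raises the metric."""
    for _ in range(steps):
        w += lr * critic_slope(w)
    return w


initial_score = metric(0.0)
w_final = train()
final_score = metric(w_final)
print(initial_score, w_final, final_score)
```

The quantized metric has zero gradient almost everywhere, so the actor cannot be trained on it directly. The fitted critic supplies a usable update direction instead, which is the role the abstract assigns to the critic in the actor-critic framework.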


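Speech enhancement

Speech enhancement is the technique of extracting the useful speech signal from a noisy background, and suppressing or reducing the noise, when speech has been corrupted or even drowned out by various kinds of noise. In short, it recovers the original speech from a noisy recording as cleanly as possible.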
