September 4, 2023

Single-Channel Speech Enhancement with Deep Complex U-Networks and Probabilistic Latent Space Models

Eike J. Nustede, Jörn Anemüller

In this paper, we propose to extend the deep complex U-Network architecture for speech enhancement by incorporating a probabilistic (i.e., variational) latent space model. The proposed model is evaluated against several ablated versions of itself in order to study the effects of the variational latent space model, complex-valued processing, and self-attention. Evaluation on the MS-DNS 2020 and Voicebank+Demand datasets yields consistently high performance. For example, the proposed model achieves an SI-SDR of up to 20.2 dB, about 0.5 to 1.4 dB higher than its ablated version without the probabilistic latent space, 2 to 2.4 dB higher than WaveUNet, and 6.7 dB above PHASEN. Compared to real-valued magnitude spectrogram processing with a variational U-Net, the complex U-Net achieves an improvement of up to 4.5 dB SI-SDR. Encoding the complex spectrum as magnitude and phase yields the best performance in anechoic conditions, whereas the real and imaginary part representation generalizes better to (novel) reverberation conditions, possibly due to the underlying physics of sound.
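
The abstract reports results in SI-SDR and contrasts two encodings of the complex spectrum. The sketch below illustrates both ideas in NumPy, and is not the authors' implementation: it uses the commonly cited scale-invariant SDR definition (projecting the estimate onto the reference), and the function names `si_sdr`, `to_mag_phase`, and `to_real_imag`, the mean removal step, and the `eps` stabilizer are all assumptions made for illustration.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB for two 1-D signals of equal length.

    Assumed definition: 10 * log10(||a*s||^2 / ||a*s - s_hat||^2),
    where a = <s_hat, s> / ||s||^2 projects the estimate onto the reference.
    """
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Assumption: remove the mean so a DC offset does not change the score.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Optimal scaling of the reference towards the estimate.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference   # part of the estimate explained by the reference
    residual = estimate - target  # everything else counts as distortion
    return 10.0 * np.log10((np.sum(target ** 2) + eps) / (np.sum(residual ** 2) + eps))


def to_mag_phase(spec):
    """Encode a complex spectrum as (magnitude, phase in radians)."""
    return np.abs(spec), np.angle(spec)


def to_real_imag(spec):
    """Encode a complex spectrum as (real part, imaginary part)."""
    return spec.real, spec.imag


if __name__ == "__main__":
    # Toy signals: the added error is orthogonal to the reference.
    clean = np.array([1.0, -1.0, 1.0, -1.0])
    error = 0.1 * np.array([1.0, 1.0, -1.0, -1.0])
    print("SI-SDR (dB):", si_sdr(clean + error, clean))
    # Rescaling the estimate leaves the score essentially unchanged.
    print("SI-SDR of rescaled estimate (dB):", si_sdr(3.0 * (clean + error), clean))

    # Both encodings carry the same information about the spectrum.
    spec = np.array([1 + 1j, -2 + 0.5j, 0 - 3j])
    mag, phase = to_mag_phase(spec)
    real, imag = to_real_imag(spec)
    print(np.allclose(mag * np.exp(1j * phase), real + 1j * imag))
```

Because the reference is rescaled before the residual is computed, a model is not rewarded or penalized for changing the overall output gain. The two encodings are mathematically equivalent, so any performance difference such as the one reported above would come from how a network learns on each representation, not from the information they contain.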

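Speech enhancement

Speech enhancement refers to techniques that, when a speech signal is corrupted or even drowned out by various kinds of noise, extract the useful speech signal from the noisy background and suppress or reduce the interference. In short, the goal is to recover the cleanest possible original speech from noisy speech.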
