Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection - 专知论文

会员服务 ·

0

稳健性 · Learning · Automator · 基准 · 损失 ·

2023 年 1 月 26 日

Robust Vocal Quality Feature Embeddings for Dysphonic Voice Detection

翻译：鲁棒性嗓音质量特征嵌入用于嘶哑语音检测

Jianwei Zhang,Julie Liss,Suren Jayasuriya,Visar Berisha

from arxiv, This manuscript is submitted on July 06, 2022 to IEEE/ACM Transactions on Audio, Speech, and Language Processing for peer-review

Approximately 1.2% of the world's population has impaired voice production. As a result, automatic dysphonic voice detection has attracted considerable academic and clinical interest. However, existing methods for automated voice assessment often fail to generalize outside the training conditions or to other related applications. In this paper, we propose a deep learning framework for generating acoustic feature embeddings sensitive to vocal quality and robust across different corpora. A contrastive loss is combined with a classification loss to train our deep learning model jointly. Data warping methods are used on input voice samples to improve the robustness of our method. Empirical results demonstrate that our method not only achieves high in-corpus and cross-corpus classification accuracy but also generates good embeddings sensitive to voice quality and robust across different corpora. We also compare our results against three baseline methods on clean and three variations of deteriorated in-corpus and cross-corpus datasets and demonstrate that the proposed model consistently outperforms the baseline methods.

翻译：全球约有1.2%的人口存在发声障碍，因此自动嘶哑语音检测引起了学术界和临床领域的广泛关注。然而，现有的自动语音评估方法往往难以在训练条件之外或面向其他相关应用时实现泛化。本文提出一种深度学习框架，用于生成对嗓音质量敏感且跨语料库具有鲁棒性的声学特征嵌入。我们将对比损失与分类损失相结合，以联合训练深度学习模型。通过在输入语音样本上采用数据扭曲方法，提升了模型的鲁棒性。实验结果表明，我们的方法不仅在语料库内和跨语料库分类中均取得了高准确率，还生成了对嗓音质量敏感且跨语料库鲁棒性良好的嵌入特征。此外，我们在干净语料库以及三种变异的退化语料库（含语料库内和跨语料库）上，将我们的结果与三种基线方法进行了对比，证明所提模型始终优于基线方法。

0

相关内容

稳健性

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Volterra积分微分方程的多区间Chebyshev和Legendre谱配置法

国家自然科学基金

0+阅读 · 2015年12月31日

基于特征骨架质谱定位法快速发现海绵中aaptamine生物碱类抗肿瘤先导化合物

国家自然科学基金

0+阅读 · 2015年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

CuO/Cu2O纳米线与银混合结构在光电转换中激子输运特性研究

国家自然科学基金

0+阅读 · 2012年12月31日

Cu基类金刚石结构新型热电化合物的设计与优化

国家自然科学基金

0+阅读 · 2012年12月31日

低维纳米尺度金属（Cu、Ni）的腐蚀行为及电化学特征研究

国家自然科学基金

0+阅读 · 2012年12月31日

稀土离子掺杂铌酸锂晶体薄膜波导的研究

国家自然科学基金

0+阅读 · 2012年12月31日

磷石膏低温催化分解及过程物相迁移研究

国家自然科学基金

0+阅读 · 2011年12月31日

《计算机研究与发展》学术期刊

国家自然科学基金

1+阅读 · 2011年12月31日

异质结构纳米带的界面及光电性质研究

国家自然科学基金

0+阅读 · 2009年12月31日

Refinement for Absolute Pose Regression with Neural Feature Synthesis

Arxiv

0+阅读 · 2023年3月17日

Analysis of Utterance Embeddings and Clustering Methods Related to Intent Induction for Task-Oriented Dialogue

Arxiv

0+阅读 · 2023年3月17日

Measuring Improvement of F$_1$-Scores in Detection of Self-Admitted Technical Debt

Arxiv

0+阅读 · 2023年3月16日

Feature Alignment by Uncertainty and Self-Training for Source-Free Unsupervised Domain Adaptation

Arxiv

0+阅读 · 2023年3月16日

Strong Baseline and Bag of Tricks for COVID-19 Detection of CT Scans

Arxiv

0+阅读 · 2023年3月15日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

Multi-view Knowledge Graph Embedding for Entity Alignment

Arxiv

37+阅读 · 2019年6月6日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Causal Embeddings for Recommendation

Arxiv

23+阅读 · 2018年8月3日

Zero-Shot Object Detection by Hybrid Region Embedding

Arxiv

19+阅读 · 2018年5月17日

VIP会员

文章信息

相关主题

最新内容

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

专知会员服务

1+阅读 · 今天15:02

综述 | 3D场景图：开放挑战与未来方向

综述 | 3D场景图：开放挑战与未来方向

专知会员服务

1+阅读 · 今天15:00

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

专知会员服务

2+阅读 · 今天14:30

21世纪的无人机战争

21世纪的无人机战争

专知会员服务

2+阅读 · 今天14:05

《伊朗与以色列-美国热战及其对数字技术的影响》

《伊朗与以色列-美国热战及其对数字技术的影响》

专知会员服务

2+阅读 · 今天13:55

《量子技术的军事任务技术适配与利用》

《量子技术的军事任务技术适配与利用》

专知会员服务

2+阅读 · 今天13:51

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

《美国陆军军官学校（西点军校）本科生科研中生成式人工智能的使用》

专知会员服务

2+阅读 · 今天13:48

美国从乌克兰无人机战争中学习经验

美国从乌克兰无人机战争中学习经验

专知会员服务

7+阅读 · 6月21日

ICML 2026 | 面向视觉语言模型的语义鲁棒性认证

ICML 2026 | 面向视觉语言模型的语义鲁棒性认证

专知会员服务

5+阅读 · 6月21日

综述 | 智能体电子设计自动化：从“交接有效性”重新理解Agentic EDA

综述 | 智能体电子设计自动化：从“交接有效性”重新理解Agentic EDA

专知会员服务

7+阅读 · 6月21日

深入解读 Palantir AIP：全球最具争议的人工智能平台究竟如何运作

深入解读 Palantir AIP：全球最具争议的人工智能平台究竟如何运作

专知会员服务

20+阅读 · 6月20日

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

ICML 2026 | 多任务贝叶斯上下文学习：让 Transformer 在测试时显式适应新先验

专知会员服务

5+阅读 · 6月19日

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

ACL 2026综述 | 大规模手语数据集：资源、基准与标注标准

专知会员服务

8+阅读 · 6月19日

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

ICML 2026 Spotlight | SmoothSMoE：解析稀疏 MoE 路由不连续

专知会员服务

7+阅读 · 6月18日

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

综述 | 周期表视角下的大模型推理：范式、方法与失败模式

专知会员服务

9+阅读 · 6月18日

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

史上最全！358篇机器学习&自然语言处理综述论文！都这儿了

专知会员服务

129+阅读 · 2020年7月18日

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

【新书】数字图像(影像)处理手第二版，2176pdf，Mathematical Methods in Imaging

专知会员服务

93+阅读 · 2020年2月12日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

Aspect-Oriented Syntax Network for Aspect-Based Sentiment Analysis，中山大学数据科学与计算机学院权小军教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

19+阅读 · 2019年10月22日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

84+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

综述 | 3D场景图：开放挑战与未来方向

21世纪的无人机战争

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

《国防工业6.0：全自主作战系统、量子-人工智能融合与新一代战略威慑》

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

VCIP 2022 Call for Special Session Proposals

VCIP 2022 Call for Special Session Proposals

CCF多媒体专委会

1+阅读 · 2022年4月1日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

IEEE TII Call For Papers

IEEE TII Call For Papers

CCF多媒体专委会

3+阅读 · 2022年3月24日

ACM TOMM Call for Papers

ACM TOMM Call for Papers

CCF多媒体专委会

2+阅读 · 2022年3月23日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Refinement for Absolute Pose Regression with Neural Feature Synthesis

Arxiv

0+阅读 · 2023年3月17日

Analysis of Utterance Embeddings and Clustering Methods Related to Intent Induction for Task-Oriented Dialogue

Arxiv

0+阅读 · 2023年3月17日

Measuring Improvement of F$_1$-Scores in Detection of Self-Admitted Technical Debt

Arxiv

0+阅读 · 2023年3月16日

Feature Alignment by Uncertainty and Self-Training for Source-Free Unsupervised Domain Adaptation

Arxiv

0+阅读 · 2023年3月16日

Strong Baseline and Bag of Tricks for COVID-19 Detection of CT Scans

Arxiv

0+阅读 · 2023年3月15日

Attention Bottlenecks for Multimodal Fusion

Arxiv

31+阅读 · 2021年6月30日

Multi-view Knowledge Graph Embedding for Entity Alignment

Arxiv

37+阅读 · 2019年6月6日

Reverse Attention for Salient Object Detection

Arxiv

11+阅读 · 2019年4月15日

Causal Embeddings for Recommendation

Arxiv

23+阅读 · 2018年8月3日

Zero-Shot Object Detection by Hybrid Region Embedding

Arxiv

19+阅读 · 2018年5月17日

相关基金

Volterra积分微分方程的多区间Chebyshev和Legendre谱配置法

国家自然科学基金

0+阅读 · 2015年12月31日

基于特征骨架质谱定位法快速发现海绵中aaptamine生物碱类抗肿瘤先导化合物

国家自然科学基金

0+阅读 · 2015年12月31日

基于SURE/PURE准则的图像盲反卷积算法研究

国家自然科学基金

3+阅读 · 2013年12月31日

CuO/Cu2O纳米线与银混合结构在光电转换中激子输运特性研究

国家自然科学基金

0+阅读 · 2012年12月31日

Cu基类金刚石结构新型热电化合物的设计与优化

国家自然科学基金

0+阅读 · 2012年12月31日

低维纳米尺度金属（Cu、Ni）的腐蚀行为及电化学特征研究

国家自然科学基金

0+阅读 · 2012年12月31日

稀土离子掺杂铌酸锂晶体薄膜波导的研究

国家自然科学基金

0+阅读 · 2012年12月31日

磷石膏低温催化分解及过程物相迁移研究

国家自然科学基金

0+阅读 · 2011年12月31日

《计算机研究与发展》学术期刊

国家自然科学基金

1+阅读 · 2011年12月31日

异质结构纳米带的界面及光电性质研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员