Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features - 专知论文

会员服务 ·

0

MOS · 语音合成 · MoDELS · state-of-the-art · 分解的 ·

2023 年 5 月 7 日

Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features

翻译：研究基于韵律和语言特征的内容感知神经文本转语音MOS预测

Alexandra Vioni,Georgia Maniati,Nikolaos Ellinas,June Sig Sung,Inchul Hwang,Aimilios Chalamandaris,Pirros Tsiakoulis

from arxiv, Proceedings of ICASSP 2023

Current state-of-the-art methods for automatic synthetic speech evaluation are based on MOS prediction neural models. Such MOS prediction models include MOSNet and LDNet that use spectral features as input, and SSL-MOS that relies on a pretrained self-supervised learning model that directly uses the speech signal as input. In modern high-quality neural TTS systems, prosodic appropriateness with regard to the spoken content is a decisive factor for speech naturalness. For this reason, we propose to include prosodic and linguistic features as additional inputs in MOS prediction systems, and evaluate their impact on the prediction outcome. We consider phoneme level F0 and duration features as prosodic inputs, as well as Tacotron encoder outputs, POS tags and BERT embeddings as higher-level linguistic inputs. All MOS prediction systems are trained on SOMOS, a neural TTS-only dataset with crowdsourced naturalness MOS evaluations. Results show that the proposed additional features are beneficial in the MOS prediction task, by improving the predicted MOS scores' correlation with the ground truths, both at utterance-level and system-level predictions.

翻译：当前最先进的自动合成语音评估方法基于MOS预测神经模型。这类MOS预测模型包括使用频谱特征作为输入的MOSNet和LDNet，以及依赖预训练自监督学习模型直接使用语音信号作为输入的SSL-MOS。在现代高质量神经TTS系统中，关于语音内容的韵律适当性是影响语音自然度的关键因素。为此，我们提出在MOS预测系统中引入韵律和语言特征作为额外输入，并评估其对预测结果的影响。我们考虑音素级别的基频和时长特征作为韵律输入，同时采用Tacotron编码器输出、词性标注标签和BERT嵌入作为高层语言输入。所有MOS预测系统均在SOMOS（一个仅包含神经TTS系统且具有众包自然度MOS评估的数据集）上进行训练。结果表明，所提出的额外特征在MOS预测任务中具有积极作用，能够提升预测MOS分数与真实值之间的相关性，无论是在语句级别还是系统级别的预测中。

0

相关内容

MOS

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

61+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

80+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【论文推荐】最新六篇推荐系统相关论文—注意力机制、多任务、协同跨网络、非结构化文本、TransRev、章节推荐

【论文推荐】最新六篇推荐系统相关论文—注意力机制、多任务、协同跨网络、非结构化文本、TransRev、章节推荐

专知

12+阅读 · 2018年4月26日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

炎症介导的EGFR由内而外调节肺腺癌侵润网络构建及对抗策略

国家自然科学基金

1+阅读 · 2015年12月31日

利用诱饵多肽分离和鉴定人外周血单个核细胞的HBV受体

国家自然科学基金

0+阅读 · 2015年12月31日

Orexin/OX1R激动FOXO1/Atg7干预胰岛β细胞自噬的机制及其在胰岛功能缺陷中的意义

国家自然科学基金

0+阅读 · 2014年12月31日

压缩感知与稀疏信号恢复

国家自然科学基金

2+阅读 · 2014年12月31日

CTHRC1基因遗传多态性与原发性胆汁性肝硬化的相关性研究及其功能鉴定

国家自然科学基金

0+阅读 · 2013年12月31日

基于HPLC-TPMT EAD-MS联用技术的天芪降糖胶囊中TPMT酶亲和活性成分研究

国家自然科学基金

0+阅读 · 2013年12月31日

肝细胞膜上OATP和MRP载体与肝功能评价的相关性MRI实验研究

国家自然科学基金

0+阅读 · 2012年12月31日

遗传性慢性胰腺炎致病基因定位及突变鉴定研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

高分辨大视野的多针孔SPECT成像研究

国家自然科学基金

0+阅读 · 2008年12月31日

Feature Interactions Reveal Linguistic Structure in Language Models

Arxiv

0+阅读 · 2023年6月21日

The Dual PC Algorithm and the Role of Gaussianity for Structure Learning of Bayesian Networks

Arxiv

0+阅读 · 2023年6月20日

On the Robustness of Dataset Inference

Arxiv

0+阅读 · 2023年6月19日

Can predictive models be used for causal inference?

Arxiv

0+阅读 · 2023年6月18日

MOSPC: MOS Prediction Based on Pairwise Comparison

Arxiv

0+阅读 · 2023年6月18日

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding

Arxiv

0+阅读 · 2023年6月18日

GlyphNet: Homoglyph domains dataset and detection using attention-based Convolutional Neural Networks

Arxiv

0+阅读 · 2023年6月17日

On the Opportunities and Risks of Foundation Models

Arxiv

30+阅读 · 2021年8月18日

Causal Embeddings for Recommendation

Arxiv

23+阅读 · 2018年8月3日

Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling

Arxiv

11+阅读 · 2018年6月16日

VIP会员

文章信息

相关主题

state-of-the-art

最新内容

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

专知会员服务

8+阅读 · 7月18日

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

专知会员服务

6+阅读 · 7月18日

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

专知会员服务

8+阅读 · 7月18日

综述 | 长程智能体研究全景：基础、演化、框架、优化与前沿

综述 | 长程智能体研究全景：基础、演化、框架、优化与前沿

专知会员服务

5+阅读 · 7月18日

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

DARPA拟打造十万规模自主思考作战的AI智能体集群：“受控涌现式分布式人工智能”（DICE）项目

专知会员服务

8+阅读 · 7月17日

《边缘端实时无线感知赋能现场多机器人部署》200页

《边缘端实时无线感知赋能现场多机器人部署》200页

专知会员服务

8+阅读 · 7月17日

战力倍增器：自主武器系统与乌克兰及加沙冲突

战力倍增器：自主武器系统与乌克兰及加沙冲突

专知会员服务

4+阅读 · 7月17日

人工智能赋能战场情报：提速决策进程

人工智能赋能战场情报：提速决策进程

专知会员服务

2+阅读 · 7月17日

《拥抱新兴技术：面向未来军官的教育革新》

《拥抱新兴技术：面向未来军官的教育革新》

专知会员服务

6+阅读 · 7月17日

ACM MM 2026 | MAR-GRPO：稳定混合图像生成的强化学习训练

ACM MM 2026 | MAR-GRPO：稳定混合图像生成的强化学习训练

专知会员服务

4+阅读 · 7月17日

综述 | 大模型水印理论与部署：来源追踪、攻击鲁棒与可信治理

综述 | 大模型水印理论与部署：来源追踪、攻击鲁棒与可信治理

专知会员服务

5+阅读 · 7月17日

《火线上的后勤保障：对抗环境下的随机规划模型研究——俄乌场景案例分析》99页

《火线上的后勤保障：对抗环境下的随机规划模型研究——俄乌场景案例分析》99页

专知会员服务

13+阅读 · 7月16日

《无人地面战车（UGV）的崛起》报告

《无人地面战车（UGV）的崛起》报告

专知会员服务

8+阅读 · 7月16日

《无人机参数化与集群飞行创新项目的监控流程管理：模型、策略及自适应解决方案》

《无人机参数化与集群飞行创新项目的监控流程管理：模型、策略及自适应解决方案》

专知会员服务

7+阅读 · 7月16日

《美军开放式任务系统（OMS）定义与文档（D&D）——Java关键抽象层（CAL）接口生成规范》47页标准

《美军开放式任务系统（OMS）定义与文档（D&D）——Java关键抽象层（CAL）接口生成规范》47页标准

专知会员服务

15+阅读 · 7月16日

相关VIP内容

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

不可错过！《机器学习100讲》课程，UBC Mark Schmidt讲授

专知会员服务

76+阅读 · 2022年6月28日

50+篇《神经架构搜索NAS》2020论文合集

专知会员服务

61+阅读 · 2020年3月19日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

【跨语言BERT模型大集合】Transfer learning is increasingly going multilingual with language-specific BERT models

专知会员服务

54+阅读 · 2020年1月30日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

61+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

60+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

80+阅读 · 2019年10月10日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

欧洲智能弹药战略创新管理：迈向制导弹药、巡飞系统与自主无人机蜂群的技术主权研究路线图

综述 | 长程智能体研究全景：基础、演化、框架、优化与前沿

《基于智能体建模与仿真的无人机蜂群模型目标定位涌现行为比较分析》360页

从领域适配到部署与可解释：Berkeley博士论文解析大语言模型真实落地

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

无监督元学习表示学习

无监督元学习表示学习

CreateAMind

27+阅读 · 2019年1月4日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

44+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

LibRec 精选：推荐系统的论文与源码

LibRec 精选：推荐系统的论文与源码

LibRec智能推荐

14+阅读 · 2018年11月29日

【SIGIR2018】五篇对抗训练文章

【SIGIR2018】五篇对抗训练文章

专知

12+阅读 · 2018年7月9日

【论文推荐】最新六篇推荐系统相关论文—注意力机制、多任务、协同跨网络、非结构化文本、TransRev、章节推荐

【论文推荐】最新六篇推荐系统相关论文—注意力机制、多任务、协同跨网络、非结构化文本、TransRev、章节推荐

专知

12+阅读 · 2018年4月26日

【推荐】深度学习目标检测全面综述

【推荐】深度学习目标检测全面综述

机器学习研究会

21+阅读 · 2017年9月13日

相关论文

Feature Interactions Reveal Linguistic Structure in Language Models

Arxiv

0+阅读 · 2023年6月21日

The Dual PC Algorithm and the Role of Gaussianity for Structure Learning of Bayesian Networks

Arxiv

0+阅读 · 2023年6月20日

On the Robustness of Dataset Inference

Arxiv

0+阅读 · 2023年6月19日

Can predictive models be used for causal inference?

Arxiv

0+阅读 · 2023年6月18日

MOSPC: MOS Prediction Based on Pairwise Comparison

Arxiv

0+阅读 · 2023年6月18日

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding

Arxiv

0+阅读 · 2023年6月18日

GlyphNet: Homoglyph domains dataset and detection using attention-based Convolutional Neural Networks

Arxiv

0+阅读 · 2023年6月17日

On the Opportunities and Risks of Foundation Models

Arxiv

30+阅读 · 2021年8月18日

Causal Embeddings for Recommendation

Arxiv

23+阅读 · 2018年8月3日

Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling

Arxiv

11+阅读 · 2018年6月16日

相关基金

炎症介导的EGFR由内而外调节肺腺癌侵润网络构建及对抗策略

国家自然科学基金

1+阅读 · 2015年12月31日

利用诱饵多肽分离和鉴定人外周血单个核细胞的HBV受体

国家自然科学基金

0+阅读 · 2015年12月31日

Orexin/OX1R激动FOXO1/Atg7干预胰岛β细胞自噬的机制及其在胰岛功能缺陷中的意义

国家自然科学基金

0+阅读 · 2014年12月31日

压缩感知与稀疏信号恢复

国家自然科学基金

2+阅读 · 2014年12月31日

CTHRC1基因遗传多态性与原发性胆汁性肝硬化的相关性研究及其功能鉴定

国家自然科学基金

0+阅读 · 2013年12月31日

基于HPLC-TPMT EAD-MS联用技术的天芪降糖胶囊中TPMT酶亲和活性成分研究

国家自然科学基金

0+阅读 · 2013年12月31日

肝细胞膜上OATP和MRP载体与肝功能评价的相关性MRI实验研究

国家自然科学基金

0+阅读 · 2012年12月31日

遗传性慢性胰腺炎致病基因定位及突变鉴定研究

国家自然科学基金

0+阅读 · 2011年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

高分辨大视野的多针孔SPECT成像研究

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员