This paper introduces FLEURS-R, a version of the Few-shot Learning Evaluation of Universal Representations of Speech (FLEURS) corpus to which speech restoration has been applied. Like FLEURS, FLEURS-R is an N-way parallel speech corpus in 102 languages, with audio quality and fidelity improved by the speech restoration model Miipher. The aim of FLEURS-R is to advance speech technology in more languages and to catalyze research, including text-to-speech (TTS) and other speech generation tasks, in low-resource languages. Comprehensive evaluations of the restored speech and of TTS baseline models trained on the new corpus show that it achieves significantly improved speech quality while preserving the semantic content of the speech. The corpus is publicly released via Hugging Face.