Personalized TTS services for Mandarin speakers with speech impairments are rarely reported. Taiwan started the VoiceBanking project in 2020, aiming to build a complete set of services that deliver personalized Mandarin TTS systems to patients with amyotrophic lateral sclerosis. This paper reports the project's corpus design, corpus recording, data purging and correction, and evaluations of the developed personalized TTS systems. The resulting corpus is named the VoiceBank-2023 speech corpus after its release year. It contains 29.78 hours of utterances, with prompts consisting of short paragraphs and common phrases, spoken by 111 native Mandarin speakers. The corpus is labeled with gender, degree of speech impairment, user type, transcription, SNR, and speaking rate. VoiceBank-2023 is available on request for non-commercial use, and all parties are welcome to join the VoiceBanking project to improve services for the speech impaired.