The scarcity of high-quality, multi-task singing datasets significantly hinders the development of diverse, controllable, and personalized singing tasks: existing singing datasets suffer from low quality, limited diversity of languages and singers, the absence of multi-technique information and realistic music scores, and poor task suitability. To tackle these problems, we present GTSinger, a large, global, multi-technique, free-to-use, high-quality singing corpus with realistic music scores, designed for all singing tasks, along with its benchmarks. Specifically, (1) we collect 80.59 hours of high-quality singing voices, forming the largest recorded singing dataset; (2) 20 professional singers across nine widely spoken languages offer diverse timbres and styles; (3) we provide controlled comparisons and phoneme-level annotations of six commonly used singing techniques, supporting technique modeling and control; (4) GTSinger offers realistic music scores, assisting real-world musical composition; (5) the singing voices are accompanied by manual phoneme-to-audio alignments, global style labels, and 16.16 hours of paired speech for various singing tasks. Moreover, to facilitate the use of GTSinger, we conduct four benchmark experiments: technique-controllable singing voice synthesis, technique recognition, style transfer, and speech-to-singing conversion. The corpus and demos can be found at http://gtsinger.github.io. We provide the dataset at https://huggingface.co/datasets/GTSinger/GTSinger and the code for processing data and conducting benchmarks at https://github.com/GTSinger/GTSinger.