In symbolic music research, progress toward scalable systems has been notably hindered by the scarcity of available training data and the reliance on models tailored to specific tasks. To address these issues, we propose MelodyT5, a novel unified framework that leverages an encoder-decoder architecture tailored for symbolic music processing in ABC notation. This framework challenges the conventional task-specific approach by treating diverse symbolic music tasks as score-to-score transformations. Consequently, it integrates seven melody-centric tasks, from generation to harmonization and segmentation, within a single model. Pre-trained on MelodyHub, a newly curated collection featuring over 261K unique melodies encoded in ABC notation and encompassing more than one million task instances, MelodyT5 demonstrates superior performance in symbolic music processing via multi-task transfer learning. Our findings highlight the efficacy of multi-task transfer learning in symbolic music processing, particularly for data-scarce tasks, challenging the prevailing task-specific paradigms and offering a comprehensive dataset and framework for future explorations in this domain.
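To make the score-to-score formulation concrete, the sketch below wraps a task tag together with ABC-notation input and output scores into a single text-to-text training instance for an encoder-decoder model. The task tag syntax, field names, and toy melody are illustrative assumptions, not the exact MelodyHub/MelodyT5 data format.

```python
# A minimal sketch of framing symbolic music tasks as score-to-score
# (text-to-text) transformations over ABC notation. The "%%task:" tag
# and record layout are illustrative assumptions, not the exact
# MelodyHub/MelodyT5 format.

def make_instance(task: str, source_abc: str, target_abc: str) -> dict:
    """Wrap one (task, input score, output score) triple as a single
    seq2seq training instance for an encoder-decoder model."""
    return {
        "encoder_input": f"%%task:{task}\n{source_abc}",
        "decoder_target": target_abc,
    }

# A toy melody in ABC notation: header fields (index, note length,
# meter, key) followed by one bar of notes.
melody = "X:1\nL:1/8\nM:4/4\nK:C\nCDEF GABc|"

# Harmonization as score-to-score: the same melody goes in, and the
# melody annotated with chord symbols comes out.
instance = make_instance(
    "harmonization",
    melody,
    'X:1\nL:1/8\nM:4/4\nK:C\n"C"CDEF "G"GABc|',
)
```

Under this framing, generation, harmonization, and segmentation all share one interface: only the task tag and the target score change, which is what lets a single encoder-decoder model serve all seven tasks.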