This paper explores transformer-based models for music overpainting, focusing on jazz piano variations. Music overpainting generates new variations of a piece while preserving the melodic and harmonic structure of the input. Existing approaches are limited by small datasets, which restrict scalability and diversity. We introduce VAR4000, a subset of a larger jazz piano performance dataset, consisting of 4,352 training pairs curated with a semi-automatic pipeline. We evaluate two transformer configurations on VAR4000 and compare their performance against the smaller JAZZVAR dataset. Preliminary results show promising improvements in generalisation and performance with the larger dataset configuration, highlighting the potential of transformer models to scale effectively for music overpainting on larger and more diverse datasets.