The Charlie Parker Omnibook is a cornerstone of jazz music education, described by pianist Ethan Iverson as "the most important jazz education text ever published". In this work we propose a new transcription pipeline and explore the extent to which state-of-the-art music technology can reconstruct these scores directly from the audio, without human intervention. Our pipeline includes a newly trained source separation model for saxophone, a new MIDI transcription model for solo saxophone, and an adaptation of an existing MIDI-to-score method for monophonic instruments. To assess this pipeline, we also provide an enhanced dataset of Charlie Parker transcriptions as score-audio pairs with accurate MIDI alignments and downbeat annotations. This dataset represents a challenging new benchmark for automatic audio-to-score transcription, which we hope will advance research beyond audio-to-MIDI transcription alone. Together, these contributions form another step towards producing scores that musicians can use directly, without the need for onerous corrections or revisions. To facilitate future research, all model checkpoints and data are available to download, along with code for the transcription pipeline. Improvements to our modular pipeline could one day make the automatic transcription of complex jazz solos routine, thereby enriching the resources available for music education and preservation.
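As a rough illustration of the modular design described in the abstract, the three stages (source separation, MIDI transcription, MIDI-to-score conversion) can be thought of as interchangeable functions chained in sequence. The sketch below is only an assumption about how such a pipeline might be organised: the names `TranscriptionPipeline`, `Note`, and the three toy stage functions are hypothetical and are not taken from the released code, and the stand-in stages do no real signal processing.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Hypothetical types for illustration only; not the authors' API.
Audio = Sequence[float]  # mono samples


@dataclass(frozen=True)
class Note:
    onset: float     # seconds
    duration: float  # seconds
    pitch: int       # MIDI pitch number


@dataclass
class TranscriptionPipeline:
    """Chains three swappable stages: separation, transcription, score conversion."""
    separate: Callable[[Audio], Audio]         # mixture -> saxophone stem
    transcribe: Callable[[Audio], List[Note]]  # stem -> MIDI-like notes
    to_score: Callable[[List[Note]], str]      # notes -> score representation

    def run(self, mixture: Audio) -> str:
        stem = self.separate(mixture)
        notes = self.transcribe(stem)
        return self.to_score(notes)


# Toy stand-ins so the sketch runs; real stages would wrap trained models.
def passthrough_separation(mixture: Audio) -> Audio:
    return list(mixture)


def fixed_transcription(stem: Audio) -> List[Note]:
    return [Note(0.0, 0.5, 65), Note(0.5, 0.25, 67)]


def pitches_as_text(notes: List[Note]) -> str:
    return " ".join(str(n.pitch) for n in notes)


pipeline = TranscriptionPipeline(passthrough_separation, fixed_transcription, pitches_as_text)
score = pipeline.run([0.0, 0.1, -0.1])
```

Because each stage is an ordinary callable, any one of them could in principle be replaced by an improved model without touching the other two, which is the sense in which such a pipeline is modular.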