Deep learning-based probabilistic models of musical data are producing increasingly realistic results and promise to enter creative workflows of many kinds. Yet they have been little studied in a performance setting, where the results of user actions typically ought to feel instantaneous. To enable such study, we designed Notochord, a deep probabilistic model for sequences of structured events, and trained an instance of it on the Lakh MIDI dataset. Our probabilistic formulation allows interpretable interventions at a sub-event level, which enables one model to act as a backbone for diverse interactive musical functions including steerable generation, harmonization, machine improvisation, and likelihood-based interfaces. Notochord can generate polyphonic and multi-track MIDI, and respond to inputs with latency below ten milliseconds. Training code, model checkpoints, and interactive examples are provided as open-source software.
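As a rough illustration of what an intervention at the sub-event level can look like, the following minimal sketch restricts the pitch of each generated event to a chosen scale and renormalizes the remaining probabilities. It is a toy under stated assumptions and not Notochord's API: `toy_pitch_weights`, `sample_pitch`, and `C_MAJOR` are hypothetical names, and a hand-written distance heuristic stands in for the trained model's predicted pitch distribution.

```python
import random

PITCHES = list(range(128))  # MIDI pitch numbers 0..127


def toy_pitch_weights(history):
    """Hypothetical stand-in for a learned model: favor pitches near the last one."""
    last = history[-1] if history else 60
    return [1.0 / (1 + abs(p - last)) for p in PITCHES]


def sample_pitch(history, allowed=None, rng=random):
    """Sample the pitch sub-event, optionally restricted to an allowed set.

    Zeroing the disallowed pitches and sampling from what remains is the same
    as conditioning the predicted distribution on "pitch is in `allowed`".
    """
    weights = toy_pitch_weights(history)
    if allowed is not None:
        weights = [w if p in allowed else 0.0 for p, w in zip(PITCHES, weights)]
    if sum(weights) == 0:
        raise ValueError("no allowed pitch has nonzero probability")
    # random.choices normalizes the weights internally
    return rng.choices(PITCHES, weights=weights, k=1)[0]


# Steering example: keep every generated pitch inside the C major scale.
C_MAJOR = {p for p in PITCHES if p % 12 in {0, 2, 4, 5, 7, 9, 11}}

rng = random.Random(0)  # fixed seed so the run is repeatable
history = [60]          # start from middle C
for _ in range(8):
    history.append(sample_pitch(history, allowed=C_MAJOR, rng=rng))
print(history)
```

The same masking idea could in principle be applied to other parts of an event, such as its timing or velocity, which is one way a single model might serve several of the interactive functions listed above.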