Denoising diffusion probabilistic models (DDPMs) have emerged as simple yet powerful generative models. Unlike some other generative models, diffusion models do not suffer from mode collapse and do not require a discriminator to generate high-quality samples. This paper proposes a diffusion model that uses a binomial prior distribution to generate piano rolls, together with an efficient method for training the model and generating samples. The generated music is coherent at time scales up to the length of the training piano roll segments. The paper also demonstrates how the model can be conditioned on an input and used to harmonize a given melody, complete an incomplete piano roll, or generate a variation of a given piece. The code is publicly shared to encourage the community to use and develop the method.
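To make the idea of a binomial prior concrete, the following is a minimal sketch of a forward (noising) process on a binary piano roll. It is not the paper's implementation. It assumes the standard Bernoulli diffusion step, in which each cell keeps its value with probability 1 - beta_t and is otherwise resampled from a Bernoulli prior. The prior probability `prior_p = 0.5`, the noise schedule, and the function names `cumulative_keep`, `noisy_roll_probs`, and `sample_noisy_roll` are all illustrative assumptions.

```python
import numpy as np


def cumulative_keep(betas):
    """Return alpha_bar_t = prod_{s<=t} (1 - beta_s) for an assumed noise schedule."""
    return np.cumprod(1.0 - np.asarray(betas, dtype=float))


def noisy_roll_probs(x0, alpha_bar_t, prior_p=0.5):
    """Per-cell probability that a note is 'on' after t noising steps.

    Under the assumed Bernoulli step, the marginal q(x_t | x_0) is Bernoulli
    with mean alpha_bar_t * x0 + (1 - alpha_bar_t) * prior_p.
    """
    x0 = np.asarray(x0, dtype=float)
    return alpha_bar_t * x0 + (1.0 - alpha_bar_t) * prior_p


def sample_noisy_roll(x0, alpha_bar_t, rng, prior_p=0.5):
    """Draw a binary noisy piano roll x_t directly from x_0 in one step."""
    probs = noisy_roll_probs(x0, alpha_bar_t, prior_p)
    return (rng.random(probs.shape) < probs).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Hypothetical piano roll: 88 pitches x 64 time steps, one sustained note.
    roll = np.zeros((88, 64), dtype=np.uint8)
    roll[40, :16] = 1

    # Hypothetical linear schedule; the paper's schedule may differ.
    alpha_bar = cumulative_keep(np.linspace(1e-3, 0.2, 100))

    slightly_noisy = sample_noisy_roll(roll, alpha_bar[0], rng)
    nearly_prior = sample_noisy_roll(roll, alpha_bar[-1], rng)
    print(slightly_noisy.mean(), nearly_prior.mean())
```

As `alpha_bar_t` approaches zero, every cell approaches the prior regardless of the original roll, which is what lets generation start from pure binomial noise. A conditional task such as melody harmonization could in principle hold the known cells fixed during sampling, though how the paper implements conditioning is not stated in the abstract.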