Denoising diffusion probabilistic models have emerged as simple yet very powerful generative models. Unlike other generative models, diffusion models neither suffer from mode collapse nor require a discriminator to generate high-quality samples. In this paper, we propose a diffusion model that uses a binomial prior distribution to generate piano-rolls. We also propose an efficient method to train the model and generate samples. The generated music is coherent at time scales up to the length of the training piano-roll segments. We show how such a model can be conditioned on an input and used to harmonize a given melody, complete an incomplete piano-roll, or generate a variation of a given piece. The code is shared publicly to encourage the community to use and develop the method.
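As a rough illustration of the two ideas named above (binomial noising of a binary piano-roll, and conditioning on a known part such as a melody), the following minimal sketch may help. It is not the paper's implementation. It assumes the standard binomial diffusion kernel, in which each cell is resampled as 1 with probability x(1 - beta) + 0.5 beta, and it assumes conditioning is done by overwriting the known cells through a mask. The names `forward_step`, `clamp_known`, `roll`, and `mask` are hypothetical.

```python
import random


def forward_step(roll, beta, rng):
    """Apply one binomial noising step to a binary piano-roll.

    Each cell x (0 or 1) is resampled as 1 with probability
    x * (1 - beta) + 0.5 * beta, so beta = 0 leaves the roll unchanged
    and beta = 1 replaces it with fair coin flips.
    (Assumed kernel; the paper's exact schedule is not given here.)
    """
    return [
        [1 if rng.random() < x * (1.0 - beta) + 0.5 * beta else 0 for x in row]
        for row in roll
    ]


def clamp_known(sample, known, mask):
    """Copy `known` into `sample` wherever `mask` is 1 (hypothetical helper)."""
    return [
        [k if m else s for s, k, m in zip(s_row, k_row, m_row)]
        for s_row, k_row, m_row in zip(sample, known, mask)
    ]


# Toy 2-pitch x 4-step piano-roll (rows are pitches, columns are time steps).
rng = random.Random(0)
roll = [[1, 0, 0, 1],
        [0, 1, 1, 0]]
noisy = forward_step(roll, beta=0.3, rng=rng)

# Keep the first row (the "melody") fixed and let the second row vary.
mask = [[1, 1, 1, 1],
        [0, 0, 0, 0]]
conditioned = clamp_known(noisy, roll, mask)
```

In a full sampler, a clamp of this kind would presumably be applied after each reverse step, so that the model only fills in the unmasked cells. Harmonization, completion, and variation would then differ mainly in how `mask` is chosen.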