Flow-based generative modeling in continuous spaces exploits Tweedie's formula to express the denoiser (learned in training) as a score function (used in sampling). In contrast, this relation has been largely missing in the discrete setting, where common approaches focus on learning discrete scores and rates. In this work, we close this gap for discrete non-negative ordinal data by introducing Binomial flows. Our framework provides a simple recipe for training a discrete diffusion model that simultaneously denoises, samples, and estimates exact likelihoods. We verify our methodology on synthetic examples and obtain competitive results on real-world data sets.
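For readers unfamiliar with the continuous-space relation mentioned above, the following minimal sketch checks Tweedie's formula numerically in the simplest Gaussian case. It is an illustration only and does not implement Binomial flows. It assumes a scalar Gaussian prior x0 ~ N(mu, s2) and additive noise x_t = x0 + sigma * eps, and all function names (`gaussian_score`, `tweedie_denoiser`, `posterior_mean`) are hypothetical, chosen for this example.

```python
# Tweedie's formula for additive Gaussian noise (continuous case only):
#   E[x0 | x_t] = x_t + sigma^2 * d/dx log p_t(x_t)
# Assumption: scalar Gaussian prior x0 ~ N(mu, s2), so the noisy marginal
# p_t is N(mu, s2 + sigma^2) and everything is available in closed form.


def gaussian_score(x, mu, var):
    """Score (derivative of the log-density) of N(mu, var) at x."""
    return -(x - mu) / var


def tweedie_denoiser(x_t, sigma, score):
    """Denoiser obtained from a score function via Tweedie's formula."""
    return x_t + sigma ** 2 * score(x_t)


def posterior_mean(x_t, mu, s2, sigma):
    """Closed-form E[x0 | x_t] for the Gaussian prior and Gaussian noise."""
    return (s2 * x_t + sigma ** 2 * mu) / (s2 + sigma ** 2)


# Illustrative values (assumptions, not taken from the paper).
mu, s2, sigma = 1.0, 4.0, 0.5
x_t = 2.3

marginal_var = s2 + sigma ** 2
denoised = tweedie_denoiser(
    x_t, sigma, lambda x: gaussian_score(x, mu, marginal_var)
)
exact = posterior_mean(x_t, mu, s2, sigma)

# The two routes to the posterior mean agree up to floating-point error.
print(abs(denoised - exact) < 1e-12)
```

In this toy setting the score is known exactly, so the denoiser and the score carry the same information. The abstract's point is that an analogous denoiser-to-score link has been largely missing for discrete data.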