Variational flows allow practitioners to learn complex continuous distributions, but approximating discrete distributions remains a challenge. Current methodologies typically embed the discrete target in a continuous space (usually via continuous relaxation or dequantization) and then apply a continuous flow. These approaches rely on a surrogate target that may not capture the original discrete target, may have biased or unstable gradients, and can create a difficult optimization problem. In this work, we develop a variational flow family for discrete distributions without any continuous embedding. First, we develop a measure-preserving and discrete (MAD) invertible map that leaves the discrete target invariant, and then create a mixed variational flow (MAD Mix) based on that map. We also develop an extension to MAD Mix that handles joint discrete and continuous models. Our experiments suggest that MAD Mix produces more reliable approximations than continuous-embedding flows while being significantly faster to train.