This paper builds bridges between two families of probabilistic algorithms: (hierarchical) variational inference (VI), which is typically used to model distributions over continuous spaces, and generative flow networks (GFlowNets), which have been used for distributions over discrete structures such as graphs. We demonstrate that, in certain cases, VI algorithms are equivalent to special cases of GFlowNets in the sense of equality of expected gradients of their learning objectives. We then point out the differences between the two families and show how these differences emerge experimentally. Notably, GFlowNets, which borrow ideas from reinforcement learning, are more amenable than VI to off-policy training without the cost of high gradient variance induced by importance sampling. We argue that this property of GFlowNets can provide advantages for capturing diversity in multimodal target distributions.