GFlowNets are a novel flow-based method for learning a stochastic policy that generates objects through a sequence of actions, with probability proportional to a given positive reward. We contribute to relaxing the hypotheses that limit the range of application of GFlowNets, in particular acyclicity. To this end, we extend the theory of GFlowNets to measurable spaces, which include continuous state spaces, without cycle restrictions, and we generalize the notion of cycles to this setting. We show that the losses used so far push flows to get stuck in cycles, and we define a family of losses that solves this issue. Experiments on graphs and continuous tasks validate these principles.
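
As a rough illustration of why cycles are delicate, the sketch below uses a hypothetical three-state toy graph (not taken from the paper) with a 2-cycle between states `a` and `b`. It assumes the standard flow-matching condition (inflow equals outflow at every interior state) as a stand-in for the constraint that existing losses enforce; the abstract does not name specific losses, so this is an assumption. All function names and the graph itself are illustrative. Adding `c` units of flow around the cycle leaves the condition satisfied and the terminal distribution proportional to the reward, yet makes trajectories longer.

```python
from fractions import Fraction


def build_flows(c):
    """Edge flows on a toy graph s -> a <-> b, with c extra units circulating on the cycle."""
    c = Fraction(c)
    return {
        ("s", "a"): Fraction(4),      # total flow Z = R(a) + R(b) = 4
        ("a", "b"): Fraction(3) + c,
        ("b", "a"): c,
        ("a", "exit"): Fraction(1),   # terminating at a, assumed reward R(a) = 1
        ("b", "exit"): Fraction(3),   # terminating at b, assumed reward R(b) = 3
    }


def flow_matching_residuals(flows, interior=("a", "b")):
    """Return inflow minus outflow for each interior state (all zeros means matched)."""
    residuals = {}
    for node in interior:
        inflow = sum(f for (u, v), f in flows.items() if v == node)
        outflow = sum(f for (u, v), f in flows.items() if u == node)
        residuals[node] = inflow - outflow
    return residuals


def forward_policy(flows):
    """Forward policy induced by the flows: P(v | u) = F(u, v) / total outflow of u."""
    out_total = {}
    for (u, _), f in flows.items():
        out_total[u] = out_total.get(u, 0) + f
    return {(u, v): f / out_total[u] for (u, v), f in flows.items()}


def rollout_statistics(policy, steps=2000):
    """Propagate probability mass from s; return (terminal distribution, expected length)."""
    mass = {"s": 1.0}
    terminal = {"a": 0.0, "b": 0.0}
    expected_length = 0.0
    for _ in range(steps):
        new_mass = {}
        for (u, v), p in policy.items():
            m = mass.get(u, 0.0) * float(p)
            if m == 0.0:
                continue
            expected_length += m          # each transition taken adds one step
            if v == "exit":
                terminal[u] += m          # trajectory terminates at state u
            else:
                new_mass[v] = new_mass.get(v, 0.0) + m
        mass = new_mass
    return terminal, expected_length


if __name__ == "__main__":
    for c in (0, 1, 100):
        flows = build_flows(c)
        terminal, length = rollout_statistics(forward_policy(flows))
        print(c, flow_matching_residuals(flows), terminal, length)
```

In this toy setting the residuals are zero for every `c` and the terminal distribution stays at 1/4 and 3/4, while the expected trajectory length is (11 + 2c)/4. The flow-matching condition alone therefore does not penalize flow trapped in the cycle, which is consistent with the need for losses that do.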