Generative Flow Networks (GFlowNets) are amortized sampling methods that learn a distribution over discrete objects proportional to their rewards. GFlowNets generate highly diverse samples, yet they sometimes struggle to consistently produce high-reward samples because they over-explore a large sample space. To resolve this issue, this paper proposes training GFlowNets with local search, which focuses on exploiting high-reward regions of the sample space. Our main idea is to explore the local neighborhood of a sample via destruction and reconstruction, guided by the backward and forward policies, respectively. This biases the samples toward high-reward solutions, which is not possible in the typical GFlowNet generation scheme, where the forward policy generates each solution from scratch. Extensive experiments demonstrate a remarkable performance improvement on several biochemical tasks. Source code is available at \url{https://github.com/dbsxodud-11/ls_gfn}.
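The destruction and reconstruction idea can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: objects are fixed-length strings built by appending tokens, so the backward policy reduces to removing the last `k` tokens; a uniform random choice stands in for the learned forward policy; `toy_reward` is a made-up reward; and a candidate is kept only if its reward is strictly higher, an acceptance rule the abstract does not specify.

```python
import random

# All names below are hypothetical and chosen only for illustration.
ALPHABET = "ACGT"   # assumed token set
LENGTH = 8          # assumed fixed sequence length


def toy_reward(seq):
    """Made-up reward: one plus the number of 'A' tokens."""
    return 1 + seq.count("A")


def forward_policy_sample(prefix, rng):
    """Stand-in for a learned forward policy: picks the next token uniformly."""
    return rng.choice(ALPHABET)


def destroy(seq, k):
    """Backtrack k steps. For append-only sequences the backward policy is
    deterministic: it removes the last k tokens."""
    return seq[:len(seq) - k]


def reconstruct(prefix, rng):
    """Complete a partial sequence by sampling tokens from the forward policy."""
    seq = prefix
    while len(seq) < LENGTH:
        seq += forward_policy_sample(seq, rng)
    return seq


def local_search(seq, k, n_iters, rng):
    """Repeatedly destroy and reconstruct, keeping a candidate only when its
    reward is strictly higher (an assumed acceptance rule)."""
    best = seq
    for _ in range(n_iters):
        candidate = reconstruct(destroy(best, k), rng)
        if toy_reward(candidate) > toy_reward(best):
            best = candidate
    return best


rng = random.Random(0)
start = reconstruct("", rng)               # generate from scratch
refined = local_search(start, 4, 20, rng)  # refine within the local neighborhood
print(start, toy_reward(start))
print(refined, toy_reward(refined))
```

Because a candidate replaces the current sample only when its reward is strictly higher, the reward of `refined` can never fall below that of `start`, and the first `LENGTH - k` tokens are shared between the two. In an actual GFlowNet the refined samples would also be used to train the policies, a step this sketch omits.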