Combinatorial optimization (CO) problems are often NP-hard and thus out of reach for exact algorithms, which makes them a tempting domain for machine learning methods. The highly structured constraints in these problems can hinder either optimization or sampling directly in the solution space. GFlowNets, on the other hand, have recently emerged as powerful machinery for efficiently and sequentially sampling from composite unnormalized densities; they have the potential to amortize such solution-searching processes in CO and to generate diverse solution candidates. In this paper, we design Markov decision processes (MDPs) for different combinatorial problems and propose training conditional GFlowNets to sample from the solution space. We also develop efficient training techniques that improve long-range credit assignment. Through extensive experiments on a variety of CO tasks with synthetic and realistic data, we demonstrate that GFlowNet policies can efficiently find high-quality solutions. Our implementation is open-sourced at https://github.com/zdhNarsil/GFlowNet-CombOpt.