Combinatorial optimization (CO) problems are often NP-hard and thus out of reach for exact algorithms, making them a tempting domain for machine learning methods. The highly structured constraints in these problems can hinder either optimization or sampling directly in the solution space. GFlowNets, on the other hand, have recently emerged as a powerful machinery for efficiently sampling from composite unnormalized densities sequentially; they have the potential to amortize such solution-searching processes in CO and to generate diverse solution candidates. In this paper, we design Markov decision processes (MDPs) for different combinatorial problems and propose to train conditional GFlowNets to sample from the solution space. We also develop efficient training techniques that benefit long-range credit assignment. Through extensive experiments on a variety of CO tasks with synthetic and realistic data, we demonstrate that GFlowNet policies can efficiently find high-quality solutions. Our implementation is open-sourced at https://github.com/zdhNarsil/GFlowNet-CombOpt.