Combinatorial optimization (CO) problems are often NP-hard and thus out of reach for exact algorithms, making them a tempting domain for machine learning methods. The highly structured constraints in these problems can hinder either optimization or sampling directly in the solution space. GFlowNets, on the other hand, have recently emerged as a powerful mechanism for efficiently sampling from composite unnormalized densities in a sequential manner; they have the potential to amortize such solution-searching processes in CO and to generate diverse solution candidates. In this paper, we design Markov decision processes (MDPs) for different combinatorial problems and propose training conditional GFlowNets to sample from the solution space. We also develop efficient training techniques that improve long-range credit assignment. Through extensive experiments on a variety of CO tasks with synthetic and realistic data, we demonstrate that GFlowNet policies can efficiently find high-quality solutions.