Neural Combinatorial Optimization (NCO) has mostly focused on learning policies, typically neural networks, that operate on a single candidate solution at a time, either constructing one from scratch or iteratively improving it. In contrast, decades of work in metaheuristics have shown that maintaining and evolving populations of solutions improves robustness and exploration, and often leads to stronger performance. To close this gap, we study how to make NCO explicitly population-based by learning policies that act on sets of candidate solutions. We first propose a simple taxonomy of population-awareness levels and use it to highlight two key design challenges: (i) how to represent a whole population inside a neural network, and (ii) how to learn population dynamics that balance intensification (generating good solutions) and diversification (maintaining variety). We make these ideas concrete with two complementary tools: one improves existing solutions using information shared across the whole population, and the other generates new candidate solutions that explicitly balance quality with diversity. Experimental results on Maximum Cut and Maximum Independent Set indicate that incorporating population structure is advantageous for learned optimization methods and opens new connections between NCO and classical population-based search.
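To make the intensification/diversification trade-off concrete, the sketch below selects a small population of Maximum Cut solutions by greedily scoring each candidate on both its cut value and its distance to the solutions already chosen. This is a hand-coded illustration in the spirit of classical population management, not the learned policies studied in this work; the function names, the normalization, and the `weight` parameter are all assumptions introduced for the example.

```python
def cut_value(edges, assignment):
    """Number of edges whose endpoints lie on opposite sides of the cut."""
    return sum(1 for u, v in edges if assignment[u] != assignment[v])


def cut_distance(a, b):
    """Hamming distance between two cuts, ignoring the global label flip
    (swapping the two sides everywhere describes the same cut)."""
    d = sum(x != y for x, y in zip(a, b))
    return min(d, len(a) - d)


def select_population(edges, candidates, k, weight=0.5):
    """Greedily pick up to k candidates (each with at least one node).

    `weight` is a hypothetical trade-off knob: 0 scores on cut value only
    (pure intensification); larger values reward distance to the
    already-selected solutions (diversification).
    """
    n = len(candidates[0])
    m = max(len(edges), 1)
    chosen = []
    remaining = list(candidates)
    while remaining and len(chosen) < k:

        def score(c):
            quality = cut_value(edges, c) / m
            if not chosen:
                return quality
            # Distance to the nearest selected solution, scaled to [0, 1].
            diversity = min(cut_distance(c, s) for s in chosen) / (n / 2)
            return (1 - weight) * quality + weight * diversity

        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

On a 4-cycle, for instance, the assignments `(0, 1, 0, 1)` and `(1, 0, 1, 0)` both cut every edge but describe the same partition. With `weight=0` both are selected; with `weight=0.5` the second slot goes to a lower-value but structurally different cut instead.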