Motivated by developments in machine learning, unsupervised learning (UL)-based solvers for combinatorial optimization (CO) problems have recently been proposed. These solvers train a neural network that outputs a solution by optimizing the CO objective directly. UL-based solvers have several advantages over traditional methods. However, various studies have shown that these solvers underperform greedy algorithms on complex CO problems. In addition, these solvers employ a continuous relaxation strategy; thus, post-learning rounding from the continuous space back to the original discrete space is required, which undermines the robustness of the results. To address these problems, we propose the continuous relaxation annealing (CRA) strategy. CRA introduces a penalty term that controls the continuity and discreteness of the relaxed variables and eliminates local optima. In addition, CRA anneals this penalty term: it initially prioritizes continuous solutions and progressively transitions towards discrete solutions until the relaxed variables become nearly discrete, eliminating the need for artificial rounding. Experimental results demonstrate that CRA significantly enhances UL-based solvers, outperforming both existing UL-based solvers and greedy algorithms on complex CO problems.
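The annealed penalty described above can be illustrated with a minimal sketch. The abstract does not give the penalty's functional form, the schedule, or the network, so everything below is an assumption made for illustration: the penalty is taken to be gamma * sum_i (1 - (2 p_i - 1)^2), which is zero at binary values and largest at p_i = 0.5; gamma is annealed linearly from a negative value (favoring continuous solutions) to a positive value (favoring discrete ones); the neural network is replaced by free logits; and the CO problem is a relaxed maximum independent set objective with a hypothetical edge-penalty weight `lam`. All function names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cra_penalty(p, gamma):
    """Assumed penalty form: gamma * sum_i (1 - (2 p_i - 1)^2).

    Zero when every p_i is 0 or 1, largest when every p_i is 0.5.
    """
    return gamma * sum(1.0 - (2.0 * pi - 1.0) ** 2 for pi in p)


def gamma_schedule(step, anneal_steps, gamma_init=-1.0, gamma_final=0.2):
    """Assumed linear annealing from gamma_init (< 0) to gamma_final (> 0),
    then held constant at gamma_final."""
    frac = min(1.0, step / anneal_steps)
    return gamma_init + frac * (gamma_final - gamma_init)


def relaxed_mis_loss(p, edges, lam):
    """Relaxed maximum independent set objective (illustrative choice):
    reward selected nodes, penalize edges with both endpoints selected."""
    return -sum(p) + lam * sum(p[i] * p[j] for i, j in edges)


def solve(n, edges, lam=2.0, steps=4000, anneal_steps=2000, lr=0.1, seed=0):
    """Gradient descent on free logits theta (stand-in for a neural network)
    with the annealed penalty added to the relaxed loss."""
    rng = random.Random(seed)
    theta = [rng.uniform(-0.1, 0.1) for _ in range(n)]
    neighbors = [[] for _ in range(n)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for step in range(steps):
        gamma = gamma_schedule(step, anneal_steps)
        p = [sigmoid(t) for t in theta]
        for i in range(n):
            # d(loss + penalty)/dp_i for the forms assumed above
            grad_p = (-1.0
                      + lam * sum(p[j] for j in neighbors[i])
                      - 4.0 * gamma * (2.0 * p[i] - 1.0))
            # chain rule through the sigmoid: dp/dtheta = p (1 - p)
            theta[i] -= lr * grad_p * p[i] * (1.0 - p[i])
    return [sigmoid(t) for t in theta]


if __name__ == "__main__":
    path_edges = [(0, 1), (1, 2), (2, 3)]  # path graph on 4 nodes
    p = solve(4, path_edges)
    print([round(x, 3) for x in p])
    print("selected:", [i for i, x in enumerate(p) if x > 0.5])
```

With a negative gamma the penalty is convex and pulls each variable towards 0.5, which smooths the landscape early on; once gamma turns positive the penalty is concave and pushes each variable towards 0 or 1, so the final values can be read off by thresholding at 0.5 rather than by a separate rounding procedure. The specific values of `gamma_init`, `gamma_final`, `lam`, and the learning rate are illustrative and were not taken from the paper.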