Unsupervised learning (UL)-based solvers for combinatorial optimization (CO) train a neural network whose output provides a soft solution by directly optimizing the CO objective under a continuous relaxation. These solvers offer several advantages over traditional methods and other learning-based methods, particularly for large-scale CO problems. However, they face two practical issues: (I) an optimization issue, in which the solver is easily trapped in local optima, and (II) a rounding issue, in which the solver requires artificial post-learning rounding from the continuous space back to the original discrete space, undermining the robustness of the results. This study proposes Continuous Relaxation Annealing (CRA), an effective rounding-free learning strategy for UL-based solvers. CRA introduces a penalty term whose effect shifts dynamically during training: it first favors continuous solutions, which smooths the non-convexity of the objective function, and then enforces discreteness, which eliminates the artificial rounding. Experimental results demonstrate that CRA significantly enhances the performance of UL-based solvers, outperforming existing UL-based solvers and greedy algorithms on complex CO problems. It also effectively eliminates the artificial rounding and accelerates learning.
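The abstract does not state the form of the penalty term, so the following is only a minimal sketch of the annealing idea under stated assumptions. It assumes a penalty of the form gamma * sum_i (1 - (2 p_i - 1)^2), with gamma increased linearly from a negative value (which rewards p_i near 0.5 and keeps the landscape smooth) to a positive value (which pushes each p_i toward 0 or 1). It also replaces the neural network with directly optimized logits and uses a relaxed maximum independent set objective as a toy problem. All function names, hyperparameters, and the schedule are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def relaxed_mis_grad(p, edges, lam):
    """Gradient of the relaxed MIS loss: -sum_i p_i + lam * sum_{(i,j)} p_i * p_j."""
    grad = -np.ones_like(p)
    for i, j in edges:
        grad[i] += lam * p[j]
        grad[j] += lam * p[i]
    return grad


def penalty_grad(p, gamma):
    """Gradient of the assumed penalty gamma * sum_i (1 - (2 p_i - 1)^2)."""
    return -4.0 * gamma * (2.0 * p - 1.0)


def anneal(edges, n, steps=2000, lr=0.1, lam=2.0,
           gamma_start=-2.0, gamma_end=2.0, seed=0):
    """Gradient descent on logits while gamma is annealed from negative to positive."""
    rng = np.random.default_rng(seed)
    theta = 0.01 * rng.standard_normal(n)  # logits; p = sigmoid(theta)
    for t in range(steps):
        # Assumed linear schedule for the penalty strength.
        gamma = gamma_start + (gamma_end - gamma_start) * t / (steps - 1)
        p = 1.0 / (1.0 + np.exp(-theta))
        grad_p = relaxed_mis_grad(p, edges, lam) + penalty_grad(p, gamma)
        # Chain rule through the sigmoid: dp/dtheta = p * (1 - p).
        theta -= lr * grad_p * p * (1.0 - p)
    return 1.0 / (1.0 + np.exp(-theta))


if __name__ == "__main__":
    # Toy instance: a path graph on 5 nodes.
    path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
    p_final = anneal(path_edges, n=5)
    print(np.round(p_final, 3))
```

In this sketch the early, negative gamma makes the combined loss convex around p = 0.5, and the later, positive gamma drives the soft solution toward binary values, so a separate rounding step is intended to become unnecessary. Whether that holds on harder instances depends on the schedule and the problem, which this toy example does not establish.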