Unsupervised learning (UL)-based solvers for combinatorial optimization (CO) train a neural network that generates a soft solution by directly optimizing the CO objective through a continuous relaxation strategy. These solvers offer several advantages over traditional methods and other learning-based approaches, particularly for large-scale CO problems. However, they face two practical issues: (I) an optimization issue, where the solvers are easily trapped in local optima, and (II) a rounding issue, where they require artificial post-learning rounding from the continuous space back to the original discrete space, undermining the robustness of the results. This study proposes Continuous Relaxation Annealing (CRA), an effective rounding-free learning strategy for UL-based solvers. CRA introduces a penalty term whose weight is annealed during training: it initially prioritizes continuous solutions, effectively smoothing the non-convexity of the objective function, and later enforces discreteness, eliminating the need for artificial rounding. Experimental results demonstrate that CRA significantly enhances the performance of UL-based solvers, outperforming existing UL-based solvers and greedy algorithms on complex CO problems. Additionally, CRA effectively eliminates artificial rounding and accelerates the learning process.
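The annealing idea above can be sketched in a few lines. The following is a minimal illustrative implementation, not the paper's exact formulation: the quadratic penalty form λ Σᵢ xᵢ(1−xᵢ), the linear schedule for λ, the sigmoid parameterization, the toy QUBO instance, and the function name `cra_minimize` are all assumptions made for this sketch. With λ < 0, the penalty rewards interior (continuous) solutions, smoothing the landscape; as λ is annealed to positive values, it drives the soft solution toward {0, 1}, so no post-learning rounding step is needed.

```python
import numpy as np

def cra_minimize(Q, steps=3000, lr=0.05, lam_start=-1.0, lam_end=3.0, seed=0):
    """Minimize the relaxed QUBO objective x^T Q x over x in [0,1]^n
    with an annealed penalty lam(t) * sum_i x_i (1 - x_i).

    lam < 0 favors interior solutions (smoothing); lam > 0 penalizes
    them, pushing x toward {0, 1}. Penalty form and linear schedule
    are illustrative assumptions, not the paper's exact choices.
    """
    rng = np.random.default_rng(seed)
    n = Q.shape[0]
    theta = rng.normal(scale=0.1, size=n)  # logits; x = sigmoid(theta)
    for t in range(steps):
        lam = lam_start + (lam_end - lam_start) * t / (steps - 1)
        x = 1.0 / (1.0 + np.exp(-theta))
        grad_obj = (Q + Q.T) @ x           # gradient of x^T Q x
        grad_pen = lam * (1.0 - 2.0 * x)   # gradient of lam * sum x(1-x)
        # chain rule through the sigmoid: dx/dtheta = x(1-x)
        theta -= lr * (grad_obj + grad_pen) * x * (1.0 - x)
    return 1.0 / (1.0 + np.exp(-theta))

# Toy instance: maximum independent set on the path graph 0-1-2-3,
# encoded as min x^T Q x with Q = -I + adjacency (optimum: -2).
Q = np.array([[-1.,  1.,  0.,  0.],
              [ 1., -1.,  1.,  0.],
              [ 0.,  1., -1.,  1.],
              [ 0.,  0.,  1., -1.]])
x = cra_minimize(Q)
```

Because the penalty weight ends positive, the returned `x` is already near-binary; rounding it is a no-op rather than an artificial projection, which is the point of the rounding-free claim.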