Reinforcement Learning-based Non-Autoregressive Solver for Traveling Salesman Problems

The Traveling Salesman Problem (TSP) is a well-known combinatorial optimization problem with broad real-world applications. Recently, neural networks have gained popularity in this research area because they provide strong heuristic solutions to TSPs. Compared to autoregressive neural approaches, non-autoregressive (NAR) networks exploit inference parallelism to increase inference speed but suffer from comparatively low solution quality. In this paper, we propose a novel NAR model named NAR4TSP, which incorporates a specially designed architecture and an enhanced reinforcement learning (RL) strategy. To the best of our knowledge, NAR4TSP is the first TSP solver that successfully combines RL and NAR networks. The key lies in incorporating the decoding of the NAR network's output into the training process. NAR4TSP efficiently represents TSP encoded information as rewards and integrates it into the RL strategy, while enforcing the same TSP sequence constraints during both training and testing. Experimental results on both synthetic and real-world TSP instances demonstrate that NAR4TSP outperforms four state-of-the-art models in terms of solution quality, inference speed, and generalization to unseen scenarios.
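The abstract does not specify the architecture or the decoding procedure, so the following is only a minimal sketch of the general idea of decoding a tour from a one-shot network output and scoring it as a reward. It assumes the network emits an n x n matrix of edge scores (called `heatmap` here), that decoding is greedy with visited cities masked out, and that the reward is the negative tour length. All function names and the toy `heatmap` are hypothetical and are not taken from NAR4TSP.

```python
import math


def greedy_decode(heatmap, start=0):
    """Decode a tour from an n x n edge-score matrix, visiting each city once."""
    n = len(heatmap)
    tour = [start]
    visited = {start}
    while len(tour) < n:
        current = tour[-1]
        # Mask visited cities so the result is always a valid permutation.
        candidates = [j for j in range(n) if j not in visited]
        nxt = max(candidates, key=lambda j: heatmap[current][j])
        tour.append(nxt)
        visited.add(nxt)
    return tour


def tour_length(coords, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def reward(coords, heatmap):
    """Assumed reward signal: negative length of the decoded tour."""
    return -tour_length(coords, greedy_decode(heatmap))


# Toy instance: four cities on the corners of a unit square.
coords = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)]
# Stand-in for a network output: score each edge by its negative distance.
heatmap = [[-math.dist(a, b) for b in coords] for a in coords]
tour = greedy_decode(heatmap)
```

Because the same masked decoding routine can be called both when computing the training reward and at test time, a sketch like this illustrates how the sequence constraint (each city visited exactly once) can be kept consistent across the two phases.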

