An Edge-Aware Graph Autoencoder Trained on Scale-Imbalanced Data for Travelling Salesman Problems

Recent years have witnessed a surge in research on machine learning for combinatorial optimization, since learning-based approaches can outperform traditional heuristics and approximate exact solvers at lower computational cost. However, most existing work on supervised neural combinatorial optimization focuses on TSP instances with a fixed number of cities and requires large amounts of training samples to perform well, which makes it less practical for realistic optimization scenarios. This work aims to develop a data-driven graph representation learning method for solving travelling salesman problems (TSPs) with varying numbers of cities. To this end, we propose an edge-aware graph autoencoder (EdgeGAE) model that learns to solve TSPs after being trained on solution data of various sizes with an imbalanced distribution. We formulate the TSP as a link prediction task on sparse connected graphs. A residual gated encoder is trained to learn latent edge embeddings, followed by an edge-centered decoder that outputs link predictions in an end-to-end manner. To improve the model's ability to generalize to large-scale problems, we introduce an active sampling strategy into the training process. In addition, we generate a benchmark dataset containing 50,000 TSP instances ranging in size from 50 to 500 cities and following an extremely scale-imbalanced distribution, which makes it well suited to investigating the model's performance in practical applications. We conduct experiments using different amounts of training data at various scales, and the experimental results demonstrate that the proposed data-driven approach is highly competitive with state-of-the-art learning-based methods for solving TSPs.
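To make the link prediction formulation more concrete, the following is a minimal sketch of how a TSP instance might be turned into a sparse candidate graph with binary edge labels. It is not the EdgeGAE pipeline: the abstract does not say how the sparse graph is built, so the k-nearest-neighbour sparsification, the function names (`knn_edges`, `tour_edges`, `link_labels`), and the placeholder tour are all assumptions made for illustration.

```python
import math
import random


def knn_edges(coords, k):
    """Return undirected candidate edges linking each city to its k nearest cities.

    Assumption: k-nearest-neighbour sparsification stands in for whatever
    sparse graph construction the paper actually uses.
    """
    n = len(coords)
    edges = set()
    for i in range(n):
        # Rank the other cities by Euclidean distance from city i.
        nearest = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(coords[i], coords[j]),
        )[:k]
        for j in nearest:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


def tour_edges(tour):
    """Return the undirected edges of a closed tour given as a city permutation."""
    n = len(tour)
    edges = set()
    for i in range(n):
        a, b = tour[i], tour[(i + 1) % n]
        edges.add((min(a, b), max(a, b)))
    return edges


def link_labels(candidates, tour):
    """Label each candidate edge 1 if it lies on the reference tour, else 0."""
    in_tour = tour_edges(tour)
    return [1 if edge in in_tour else 0 for edge in candidates]


# Hypothetical toy instance: 8 random cities in the unit square.
random.seed(0)
coords = [(random.random(), random.random()) for _ in range(8)]
tour = list(range(8))  # placeholder tour, NOT an optimal solution
candidates = knn_edges(coords, k=3)
labels = link_labels(candidates, tour)
```

In a supervised setting, the reference tour would come from a solver, and a model would be trained to predict `labels` from features of the candidate edges. Note that a k-nearest-neighbour graph is not guaranteed to be connected or to contain every tour edge, so a real pipeline would need to handle both cases.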

