Learning-based methods have become increasingly popular for solving vehicle routing problems (VRPs) because of their near-optimal performance and fast inference. Among them, combining deep reinforcement learning with graph representations makes it possible to abstract node topology and node features in an encoder-decoder style. Such an approach solves routing problems end-to-end, without complicated heuristic operators designed by domain experts. Existing studies have focused on novel encoding and decoding structures, built from various neural network models, to enhance the node embedding representation. Despite the sophistication of these approaches, the graph-theoretic properties inherent to routing problems have received little consideration. Moreover, how interactions between nodes affect the models' decision-making has not been adequately explored. To bridge this gap, we propose an adaptive Graph Attention Sampling with the Edges Fusion framework (GASE), where each node's embedding is computed by attention over a selection of highly correlated neighbourhoods and edges, utilizing a filtered adjacency matrix. In detail, a multi-head attention mechanism guides the selection of particular neighbours and adjacent edges, which contributes directly to message passing and node embedding in graph attention sampling networks. Furthermore, we incorporate an adaptive actor-critic algorithm with policy improvements to speed up training convergence. We then conduct comprehensive experiments against baseline methods on learning-based VRP tasks from different perspectives. Our proposed model outperforms the existing methods by 2.08% to 6.23% and shows stronger generalization ability, achieving state-of-the-art performance on randomly generated instances and real-world datasets.
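To make the idea of attention over a filtered adjacency matrix more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single attention head with identity projections (no learned weights), fuses edge information as an additive distance penalty on the attention score, and builds the filtered adjacency matrix by keeping the top-k scoring neighbours of each node. The function names, the `edge_weight` parameter, and the top-k rule are all hypothetical choices made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points, used here as the edge feature."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def topk_filtered_adjacency(scores, k):
    """Keep, for each node, only its k highest-scoring neighbours (no self-loops)."""
    n = len(scores)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        candidates = [j for j in range(n) if j != i]
        candidates.sort(key=lambda j: scores[i][j], reverse=True)
        for j in candidates[:k]:
            adj[i][j] = 1
    return adj


def attention_sampling_layer(features, k, edge_weight=1.0):
    """One illustrative layer: score, filter, softmax, aggregate.

    Assumptions (not from the paper): identity query/key/value projections,
    one head, and edge fusion as `-edge_weight * distance` added to the score.
    """
    n = len(features)
    if n < 2 or not 1 <= k <= n - 1:
        raise ValueError("need at least 2 nodes and 1 <= k <= n - 1")
    d = len(features[0])

    # Raw attention scores: scaled dot product plus an edge (distance) term.
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(features[i], features[j]))
            dist = euclidean(features[i], features[j])
            scores[i][j] = dot / math.sqrt(d) - edge_weight * dist

    adj = topk_filtered_adjacency(scores, k)

    embeddings = []
    for i in range(n):
        kept = [j for j in range(n) if adj[i][j]]
        # Numerically stable softmax restricted to the kept neighbours.
        top = max(scores[i][j] for j in kept)
        exps = [math.exp(scores[i][j] - top) for j in kept]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Message passing: attention-weighted sum of neighbour features.
        message = [
            sum(w * features[j][c] for w, j in zip(weights, kept))
            for c in range(d)
        ]
        # Residual connection: node feature plus aggregated message.
        embeddings.append([f + m for f, m in zip(features[i], message)])
    return adj, embeddings


# Four customers forming two tight clusters; each node keeps one neighbour.
features = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
adj, emb = attention_sampling_layer(features, k=1)
```

With `k=1`, each node attends only to its single most correlated neighbour, so the filtered adjacency matrix has exactly one nonzero entry per row. In a trained model the projections would be learned and several heads would be combined; the sketch only shows where the filtering step sits relative to scoring and aggregation.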