Deep reinforcement learning has recently shown promising results for learning fast heuristics that solve routing problems. However, most learned solvers generalize poorly to unseen distributions or to distributions with different scales. To address this issue, we propose a novel architecture, the Invariant Nested View Transformer (INViT), which enforces a nested design together with invariant views inside its encoders to improve the generalizability of the learned solver. INViT is trained with a modified policy gradient algorithm enhanced with data augmentations. We demonstrate that INViT achieves dominant generalization performance on both TSP and CVRP across various distributions and problem scales.
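The abstract does not say which data augmentations are used, so the following is only an illustrative sketch of one common choice for routing problems: the eight symmetries of the unit square (axis swap and reflections). It assumes node coordinates lie in [0, 1] x [0, 1]. The names `augment_8fold` and `tour_length` are hypothetical helpers, not taken from INViT. Because these transforms preserve pairwise distances, any tour has the same length in every augmented view, which is what makes such views usable as extra training signal.

```python
import math

def augment_8fold(coords):
    """Return 8 symmetric copies of 2-D coordinates in the unit square.

    Assumption: this is a generic symmetry augmentation, not necessarily
    the one used by INViT.
    """
    views = []
    for swap in (False, True):
        for flip_x in (False, True):
            for flip_y in (False, True):
                view = []
                for x, y in coords:
                    if swap:
                        x, y = y, x      # reflect across the diagonal
                    if flip_x:
                        x = 1.0 - x      # reflect across x = 0.5
                    if flip_y:
                        y = 1.0 - y      # reflect across y = 0.5
                    view.append((x, y))
                views.append(view)
    return views

def tour_length(coords, tour):
    """Length of the closed tour visiting coords in the given order."""
    n = len(tour)
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % n]])
        for i in range(n)
    )

# Small made-up instance, for illustration only.
coords = [(0.1, 0.2), (0.8, 0.3), (0.5, 0.9), (0.2, 0.7)]
tour = [0, 1, 2, 3]

views = augment_8fold(coords)
lengths = [tour_length(v, tour) for v in views]
```

Here `views[0]` is the untransformed instance, and every entry of `lengths` agrees up to floating-point error, since reflections and the axis swap are isometries.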