Neural models produce promising results when solving Vehicle Routing Problems (VRPs), but they often generalize poorly. Recent attempts to improve generalization either incur unnecessarily large training costs or cannot be directly applied to other models that solve different VRP variants. To address these issues, this study approaches generalization from the perspective of model architecture. Specifically, we propose a plug-and-play Entropy-based Scaling Factor (ESF) and a Distribution-Specific (DS) decoder to improve size generalization and distribution generalization, respectively. When solving VRPs of varying sizes, ESF adjusts the model's attention weight pattern toward the patterns it became familiar with during training. The DS decoder explicitly models VRPs drawn from multiple training distribution patterns through multiple auxiliary light decoders, expanding the model's representation space to cover a broader range of distributional scenarios. We conduct extensive experiments on both synthetic and widely recognized real-world benchmark datasets and compare performance against seven baseline models. The results show that ESF and the DS decoder yield a more generalizable model and that they apply to different VRP variants, namely the travelling salesman problem (TSP) and the capacitated VRP (CVRP). Notably, the proposed generic components require minimal computational resources and can be readily combined with conventional generalization strategies to further improve model generalization.
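The abstract does not give the ESF formula, so the following is only a minimal sketch of the general idea of entropy-aware attention scaling, not the authors' method. It assumes i.i.d. Gaussian attention logits as a stand-in for a real model, and it swaps in a simple bisection search for whatever closed form ESF actually uses. The helper `match_entropy_scale` and all parameter values are hypothetical. The sketch shows that softmax attention over more nodes has higher entropy, and that multiplying the logits by a scalar can bring the entropy back to the level seen at the training size.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of attention logits."""
    shifted = logits - np.max(logits)
    weights = np.exp(shifted)
    return weights / weights.sum()


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    nonzero = probs[probs > 0]
    return float(-(nonzero * np.log(nonzero)).sum())


def match_entropy_scale(logits, target_entropy, lo=1e-3, hi=1e3, iters=60):
    """Hypothetical helper (not from the paper): find a scalar s such that
    entropy(softmax(s * logits)) is close to target_entropy.

    For s > 0 the entropy is non-increasing in s, so bisection on a
    log scale converges when the target lies between the two endpoints.
    """
    for _ in range(iters):
        mid = float(np.sqrt(lo * hi))  # geometric midpoint
        if entropy(softmax(mid * logits)) > target_entropy:
            lo = mid  # attention still too diffuse: sharpen further
        else:
            hi = mid  # attention too peaked: soften
    return float(np.sqrt(lo * hi))


# Assumed toy setup: 20 nodes at "training" size, 200 nodes at "test" size.
rng = np.random.default_rng(0)
train_logits = rng.normal(size=20)
test_logits = rng.normal(size=200)

target = entropy(softmax(train_logits))
scale = match_entropy_scale(test_logits, target)

print("training-size entropy:", target)
print("test-size entropy, unscaled:", entropy(softmax(test_logits)))
print("test-size entropy, scaled:", entropy(softmax(scale * test_logits)))
print("scale factor:", scale)
```

In a real attention layer such a factor would multiply the query-key scores before the softmax. Whether it is computed per head, per query, or from the instance size alone is a design choice this sketch leaves open.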