This study addresses a gap in the application of Reinforcement Learning (RL) and Machine Learning (ML) techniques to the Stochastic Vehicle Routing Problem (SVRP), which requires optimizing vehicle routes under uncertain conditions. We propose a novel end-to-end framework that comprehensively addresses the key sources of stochasticity in SVRP and uses an RL agent with a simple yet effective architecture and a tailored training method. In a comparative analysis, the proposed model outperforms a widely adopted state-of-the-art metaheuristic, reducing travel costs by 3.43%. The model is also robust across diverse SVRP settings, which highlights its adaptability and its ability to learn optimal routing strategies in varying environments. The publicly available implementation of our framework serves as a resource for future research on RL-based solutions for SVRP.