Research on Reinforcement Learning (RL) approaches for discrete optimization problems has increased considerably, extending RL to an area classically dominated by Operations Research (OR). Vehicle routing problems are a good example of discrete optimization problems with high practical relevance where RL techniques have had considerable success. Despite these advances, open-source development frameworks remain scarce, which hampers both the testing of algorithms and the objective comparison of results. This ultimately slows progress in the field and limits the exchange of ideas between the RL and OR communities. Here we propose a library of multi-agent environments that simulate classic vehicle routing problems. The library, built on PyTorch, has a flexible, modular architecture that allows easy customization and the incorporation of new routing problems. It follows the Agent Environment Cycle (AEC) games model and has an intuitive API, enabling rapid adoption and easy integration into existing RL frameworks. The library supports the straightforward use of classical OR benchmark instances, narrowing the gap between the test beds used for algorithm benchmarking by the RL and OR communities. Additionally, we provide benchmark instance sets for each environment, as well as baseline RL models and training code.
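To make the AEC idea concrete, the following is a minimal sketch of a turn-based interaction loop in which vehicles act one at a time and the environment updates after each individual action. It is not the proposed library's API: the class `ToyRoutingAEC`, its methods, and the nearest-customer policy are hypothetical names introduced only for illustration, and the sketch uses plain Python rather than PyTorch and omits capacities, time windows, and the return trip to the depot.

```python
import math
import random


class ToyRoutingAEC:
    """Hypothetical toy environment (not the library's API).

    Vehicles take turns; each turn, one vehicle visits one unvisited customer.
    """

    def __init__(self, num_vehicles=2, num_customers=5, seed=0):
        rng = random.Random(seed)
        self.depot = (0.5, 0.5)
        self.customers = [(rng.random(), rng.random()) for _ in range(num_customers)]
        self.agents = [f"vehicle_{i}" for i in range(num_vehicles)]
        self.reset()

    def reset(self):
        # All vehicles start at the depot and no customer has been served.
        self.positions = {agent: self.depot for agent in self.agents}
        self.unvisited = set(range(len(self.customers)))
        self.route_length = {agent: 0.0 for agent in self.agents}
        self._turn = 0

    def agent_iter(self):
        # AEC-style cycle: yield exactly one acting agent per turn
        # until every customer has been served.
        while self.unvisited:
            yield self.agents[self._turn % len(self.agents)]
            self._turn += 1

    def observe(self, agent):
        return {"position": self.positions[agent], "unvisited": sorted(self.unvisited)}

    def step(self, agent, customer):
        # Apply a single agent's action and return its reward
        # (negative travel distance for this move).
        if customer not in self.unvisited:
            raise ValueError(f"customer {customer} was already visited")
        target = self.customers[customer]
        distance = math.dist(self.positions[agent], target)
        self.positions[agent] = target
        self.unvisited.remove(customer)
        self.route_length[agent] += distance
        return -distance


def nearest_customer_policy(env, observation):
    # Greedy placeholder policy: pick the closest unvisited customer.
    return min(
        observation["unvisited"],
        key=lambda c: math.dist(observation["position"], env.customers[c]),
    )


def run_episode(env):
    env.reset()
    total_reward = 0.0
    for agent in env.agent_iter():
        observation = env.observe(agent)
        action = nearest_customer_policy(env, observation)
        total_reward += env.step(agent, action)
    return total_reward


if __name__ == "__main__":
    env = ToyRoutingAEC(num_vehicles=2, num_customers=5, seed=0)
    print("total reward:", run_episode(env))
```

In a learning setting, the greedy policy would be replaced by a trained model that maps each agent's observation to an action; the sequential loop itself would stay the same.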