Deep reinforcement learning offers notable benefits over traditional solvers for combinatorial problems: it reduces the reliance on domain-specific knowledge and expert solutions and improves computational efficiency. Despite the recent surge of interest in neural combinatorial optimization, practitioners often do not have access to a standardized codebase. Moreover, different algorithms are frequently built on fragmented implementations, which hinders reproducibility and fair comparison. To address these challenges, we introduce RL4CO, a unified library for Reinforcement Learning (RL) for Combinatorial Optimization (CO). We employ state-of-the-art software and implementation best practices, such as modularity and configuration management, so that the library is flexible, easily modifiable, and extensible by researchers. Thanks to our unified codebase, we benchmark baseline RL solvers under different evaluation schemes, measuring zero-shot performance, generalization, and adaptability across diverse tasks. Notably, we find that some recent methods may fall behind their predecessors depending on the evaluation settings. We hope RL4CO will encourage the exploration of novel solutions to complex real-world tasks, allowing the community to compare against existing methods through a unified framework that decouples the science from the software engineering. We open-source our library at https://github.com/ai4co/rl4co.