The Adaptive Large Neighborhood Search (ALNS) algorithm has shown considerable success in solving complex combinatorial optimization problems (COPs). ALNS adaptively selects among several heuristics during the search, leveraging their individual strengths to find good solutions. However, the effectiveness of ALNS depends on the proper configuration of its selection and acceptance parameters. To address this limitation, we propose a Deep Reinforcement Learning (DRL) approach that selects heuristics, adjusts parameters, and controls the acceptance criterion during the search. The proposed method learns, based on the state of the search, how to configure the next iteration of ALNS so as to obtain good solutions to the underlying optimization problem. We evaluate the method on a time-dependent orienteering problem with stochastic weights and time windows, used in an IJCAI competition. The results show that our approach outperforms both vanilla ALNS and ALNS tuned with Bayesian Optimization. In addition, it obtains better solutions than two state-of-the-art DRL approaches, the winning methods of that competition, while requiring far fewer training observations. The implementation of our approach will be made publicly available.
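To make the three configuration points named above concrete, the following is a minimal sketch of a vanilla ALNS loop, not the implementation evaluated here. It assumes a toy Euclidean tour-length problem rather than the orienteering problem, and every name in it (`alns`, `random_removal`, `worst_removal`, `greedy_insert`, the scores, and the default parameter values) is illustrative. The comments mark where operator selection, the destroy size, and the acceptance rule are hand-set; these are the decisions a learned policy would take over.

```python
import math
import random


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def random_removal(tour, pts, k, rng):
    """Destroy operator: remove k cities chosen uniformly at random."""
    removed = rng.sample(tour, k)
    return [c for c in tour if c not in removed], removed


def worst_removal(tour, pts, k, rng):
    """Destroy operator: remove the k cities with the largest detour cost."""
    n = len(tour)

    def detour(i):
        a, b, c = tour[i - 1], tour[i], tour[(i + 1) % n]
        return (math.dist(pts[a], pts[b]) + math.dist(pts[b], pts[c])
                - math.dist(pts[a], pts[c]))

    idx = sorted(range(n), key=detour, reverse=True)[:k]
    removed = [tour[i] for i in idx]
    return [c for c in tour if c not in removed], removed


def greedy_insert(partial, removed, pts):
    """Repair operator: reinsert each city at its cheapest position."""
    tour = list(partial)
    for city in removed:
        best_pos, best_delta = 0, float("inf")
        for i in range(len(tour)):
            a, b = tour[i], tour[(i + 1) % len(tour)]
            delta = (math.dist(pts[a], pts[city]) + math.dist(pts[city], pts[b])
                     - math.dist(pts[a], pts[b]))
            if delta < best_delta:
                best_pos, best_delta = i + 1, delta
        tour.insert(best_pos, city)
    return tour


def alns(pts, iters=2000, k=4, decay=0.8, temp=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    destroy_ops = [random_removal, worst_removal]
    weights = [1.0] * len(destroy_ops)
    current = list(range(len(pts)))
    rng.shuffle(current)
    cur_cost = initial_cost = tour_length(current, pts)
    best, best_cost = current, cur_cost
    for _ in range(iters):
        # (1) Operator selection: roulette wheel over adaptive weights.
        i = rng.choices(range(len(destroy_ops)), weights=weights)[0]
        # (2) Operator parameter: the destroy size k is fixed by hand.
        partial, removed = destroy_ops[i](current, pts, k, rng)
        cand = greedy_insert(partial, removed, pts)
        cand_cost = tour_length(cand, pts)
        # (3) Acceptance: simulated annealing with a geometric cooling schedule.
        if cand_cost < best_cost:
            score, best, best_cost = 3.0, cand, cand_cost
        elif cand_cost < cur_cost:
            score = 2.0
        elif rng.random() < math.exp((cur_cost - cand_cost) / temp):
            score = 1.0
        else:
            score = 0.0
        if score > 0:
            current, cur_cost = cand, cand_cost
        # Blend the old weight with the new score; floor it to keep it positive.
        weights[i] = max(1e-3, decay * weights[i] + (1 - decay) * score)
        temp *= cooling
    return best, best_cost, initial_cost, weights


if __name__ == "__main__":
    gen = random.Random(1)
    points = [(gen.random(), gen.random()) for _ in range(20)]
    _, cost, start_cost, w = alns(points)
    print(f"initial {start_cost:.3f} -> best {cost:.3f}, weights {w}")
```

In this sketch the selection weights, `k`, and the temperature schedule are all set before the run and follow fixed update rules. Under the approach described above, a policy conditioned on the search state would instead choose the operator, its parameters, and the acceptance behavior at each iteration.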