The job shop scheduling problem (JSSP) and its solution algorithms have been of enduring interest in both academia and industry for decades. In recent years, machine learning (ML) has played an increasingly important role in improving existing heuristics and building new ones for the JSSP, with the aim of finding better solutions in shorter computation times. In this paper, we build on a state-of-the-art deep reinforcement learning (DRL) agent, called Neural Local Search (NLS), which can efficiently and effectively control a large local neighborhood search on the JSSP. In particular, we develop a method for training the decision transformer (DT) algorithm on search trajectories taken by a trained NLS agent to further improve upon the learned decision-making sequences. Our experiments show that the DT successfully learns local search strategies that differ from, and in many cases outperform, those of the NLS agent itself. Regarding the trade-off between solution quality and acceptable computation time, the DT is particularly well suited to application scenarios where longer computation times are acceptable: it compensates for the longer per-step inference times caused by its larger neural network architecture through better-quality decisions per step. As a result, the DT achieves state-of-the-art results for solving the JSSP with ML-enhanced search.
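To make the training setup concrete, the following is a minimal sketch of how decision-transformer training sequences could be assembled from a recorded local-search trajectory: each step is turned into a (return-to-go, state, action) triple, so the model can later be conditioned on a desired total improvement. The trajectory format, the reward definition (per-step makespan improvement), and all names here are illustrative assumptions, not the paper's actual pipeline.

```python
def returns_to_go(rewards):
    """Suffix sums: rtg[t] = sum of rewards from step t onward."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def build_dt_sequence(states, actions, rewards):
    """Interleave (rtg, state, action) triples as the DT input sequence."""
    assert len(states) == len(actions) == len(rewards)
    rtg = returns_to_go(rewards)
    seq = []
    for r, s, a in zip(rtg, states, actions):
        seq.extend([("rtg", r), ("state", s), ("action", a)])
    return seq


# Example: a 3-step search trajectory. States stand in for schedule
# encodings; actions are indices of chosen neighborhood moves; rewards
# are the makespan improvements achieved after each move (assumptions).
states = ["s0", "s1", "s2"]
actions = [4, 1, 7]
rewards = [2.0, 0.0, 3.0]

seq = build_dt_sequence(states, actions, rewards)
print(seq[0])  # ('rtg', 5.0): total remaining improvement at step 0
```

At inference time, a sequence like this would be fed to the transformer with a high target return-to-go, prompting it to emit the next local-search move; this return conditioning is what lets the DT trade extra per-step computation for better decisions.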