The Flexible Job Shop Scheduling Problem (FJSP) concerns the optimal allocation of a set of jobs to machines. Two primary challenges persist in FJSP: the unpredictable arrival of future jobs and the combinatorial complexity of the problem, which renders it intractable for conventional mixed-integer linear programming solvers. This paper proposes an event-based \gls{DRL} approach to solve FJSP with random job arrivals. Specifically, we employ the Proximal Policy Optimization algorithm and use lightweight Multi-Layer Perceptrons to train the \gls{DRL} agent to minimize the total completion time of all jobs. We design the state representation to be directly accessible from the environment, and restrict the learning agent to selecting from a set of well-established dispatching rules. Simulations show that our \gls{DRL} approach outperforms each individual dispatching rule on datasets with varying heterogeneity and job arrival rates. We also benchmark our \gls{DRL} approach against an arrival-triggered mixed-integer linear programming solution and show that our method achieves good performance, especially when the datasets are heterogeneous.
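To make the idea of an agent that selects among dispatching rules at decision events more concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a single machine rather than a full flexible job shop, a hypothetical rule set (`fifo`, `spt`, `lpt`) that the abstract does not name, and a placeholder policy standing in for the PPO-trained Multi-Layer Perceptron. The sum of job completion times is used as one possible reading of "total completion time".

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    job_id: int
    arrival: float    # time at which the job arrives in the shop
    proc_time: float  # processing time on the (single) machine


# Hypothetical rule set: the abstract does not list the rules actually used.
def fifo(queue):
    """First in, first out: earliest arrival goes first."""
    return min(queue, key=lambda op: op.arrival)


def spt(queue):
    """Shortest processing time first."""
    return min(queue, key=lambda op: op.proc_time)


def lpt(queue):
    """Longest processing time first."""
    return max(queue, key=lambda op: op.proc_time)


RULES = [fifo, spt, lpt]


def random_policy(state, rng):
    """Placeholder for a trained policy: returns an index into RULES."""
    return rng.randrange(len(RULES))


def run_single_machine(ops, policy, seed=0):
    """Event-based loop: a rule is chosen only when the machine is free
    and at least one job is waiting."""
    rng = random.Random(seed)
    pending = sorted(ops, key=lambda op: op.arrival)
    queue, schedule = [], []
    now, total_completion = 0.0, 0.0
    while pending or queue:
        # Release every job that has arrived by the current time.
        while pending and pending[0].arrival <= now:
            queue.append(pending.pop(0))
        if not queue:
            now = pending[0].arrival  # idle until the next arrival event
            continue
        state = (now, len(queue))     # toy state read directly from the environment
        rule = RULES[policy(state, rng)]
        op = rule(queue)
        queue.remove(op)
        now += op.proc_time
        total_completion += now       # sum of completion times (assumed objective)
        schedule.append((op.job_id, now))
    return schedule, total_completion


ops = [Op(0, 0, 3), Op(1, 0, 1), Op(2, 1, 2)]
always_spt = lambda state, rng: RULES.index(spt)
always_fifo = lambda state, rng: RULES.index(fifo)
spt_schedule, spt_total = run_single_machine(ops, always_spt)
fifo_schedule, fifo_total = run_single_machine(ops, always_fifo)
```

On this toy instance the fixed SPT policy yields a lower summed completion time than the fixed FIFO policy. A learned policy would instead choose the rule index per decision event from the observed state, which is the role `random_policy` stands in for here.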