This paper introduces a reinforcement learning (RL) approach to the problem of configuring and tuning genetic algorithms (GAs) for difficult combinatorial or non-linear problems. The proposed RL+GA method is evaluated on the flow shop scheduling problem (FSP). The hybrid algorithm incorporates neural networks (NNs) and uses either the off-policy method Q-learning or the on-policy method Sarsa(0) to control two key GA operators: parent selection and mutation. At each generation, the RL agent's action specifies the selection method, the parent selection probability, and the offspring mutation probability, so that selection and mutation are adjusted dynamically according to the learned policy. The results show that the RL+GA approach improves on the performance of the basic GA. They also demonstrate that the agent learns from, and adapts to, population diversity and solution improvements over time. This adaptability yields better scheduling solutions than static parameter configurations while maintaining population diversity throughout the evolutionary process.
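The difference between the two control methods named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a neural-network agent, whereas this sketch substitutes a plain list of Q-values for one state to keep it short. The names `SELECTION_METHODS`, `SELECTION_PROBS`, `MUTATION_PROBS`, `epsilon_greedy`, `td_target` and `update`, as well as all numeric values, are assumptions chosen for illustration only.

```python
import itertools
import random

# Hypothetical discretization of the agent's action. The abstract only says
# that an action fixes the selection method, the parent selection probability
# and the offspring mutation probability; these concrete values are assumed.
SELECTION_METHODS = ("tournament", "roulette")
SELECTION_PROBS = (0.5, 0.7, 0.9)
MUTATION_PROBS = (0.01, 0.05, 0.1)
ACTIONS = list(itertools.product(SELECTION_METHODS, SELECTION_PROBS, MUTATION_PROBS))


def epsilon_greedy(q_row, epsilon, rng):
    """Return an action index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def td_target(reward, next_q_row, next_action, gamma, on_policy):
    """Compute the one-step temporal-difference target.

    Sarsa(0) (on-policy) bootstraps from the action actually chosen for the
    next generation; Q-learning (off-policy) bootstraps from the greedy one.
    """
    if on_policy:
        return reward + gamma * next_q_row[next_action]
    return reward + gamma * max(next_q_row)


def update(q_row, action, target, alpha):
    """Move Q(s, a) a step of size alpha toward the target, in place."""
    q_row[action] += alpha * (target - q_row[action])
```

In a GA loop, one such update would run per generation: the agent picks an index into `ACTIONS`, the GA applies the corresponding selection method and probabilities, and a reward (for example, one derived from solution improvement and population diversity, as the abstract suggests) feeds `td_target`.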