Constrained Reinforcement Learning for Dynamic Material Handling

As one of the core parts of flexible manufacturing systems, material handling involves the storage and transportation of materials between workstations by automated vehicles. Improvements in material handling can boost the overall efficiency of the manufacturing system. However, dynamic events that occur while task arrangements are being optimised pose a challenge that demands both adaptability and effectiveness. In this paper, we address the scheduling of automated guided vehicles for dynamic material handling. Motivated by real-world scenarios, we regard unknown new tasks and unexpected vehicle breakdowns as the dynamic events in our problem. We formulate the problem as a constrained Markov decision process that treats tardiness as a cumulative constraint and vehicle availability as an instantaneous constraint. To handle these two hybrid constraints, we propose an adaptive constrained reinforcement learning algorithm, named RCPOM, that combines Lagrangian relaxation with invalid action masking. Moreover, we develop a gym-like dynamic material handling simulator, named DMH-GYM, equipped with diverse problem instances that can serve as benchmarks for dynamic material handling. Experimental results on the problem instances demonstrate the outstanding performance of the proposed approach compared with eight state-of-the-art constrained and non-constrained reinforcement learning algorithms, as well as widely used dispatching rules for material handling.
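The two constraint-handling ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' RCPOM implementation: the function names, the learning rate, and the mapping of actions to vehicle choices are assumptions made for illustration only. Invalid action masking gives unavailable actions zero probability (the instantaneous constraint), while Lagrangian relaxation prices a cumulative cost such as tardiness with a non-negative multiplier that grows whenever the cost exceeds its threshold.

```python
import math


def masked_softmax(logits, valid):
    """Softmax over logits that gives invalid actions zero probability."""
    if not any(valid):
        raise ValueError("at least one action must be valid")
    # Invalid actions get a logit of -inf, so exp() maps them to 0.0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, valid)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(m - peak) for m in masked]
    total = sum(exps)
    return [e / total for e in exps]


def penalised_reward(reward, cost, lam):
    """Lagrangian-relaxed reward: the constraint cost is priced by lam."""
    return reward - lam * cost


def update_multiplier(lam, episode_cost, threshold, lr=0.1):
    """Projected gradient ascent on the multiplier, clipped to stay >= 0."""
    return max(0.0, lam + lr * (episode_cost - threshold))


# Hypothetical usage: three candidate actions, the second one unavailable
# (for example, a vehicle that has broken down).
probs = masked_softmax([1.0, 2.0, 0.5], [True, False, True])

# Hypothetical episode whose cumulative tardiness (12) exceeds the limit (10),
# so the multiplier increases and later rewards are penalised more heavily.
lam = update_multiplier(0.0, episode_cost=12.0, threshold=10.0)
shaped = penalised_reward(reward=5.0, cost=1.0, lam=lam)
```

In this sketch the mask enforces the instantaneous constraint exactly at every step, whereas the multiplier only drives the cumulative constraint towards satisfaction over the course of training.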

