Automated segregation is needed in any sector that handles high volumes of material, involves repetitive and exhausting operations, or exposes workers to risk. Automated pick-and-place operations can be learned efficiently by introducing collaborative autonomous systems (e.g., manipulators) into workplaces shared with human operators. In this paper, we propose a deep reinforcement learning strategy for learning the place task, in which dual manipulators move multi-category items from a shared workspace to multiple goal destinations, assuming the pick has already been completed. The strategy relies on two components: a stochastic actor-critic framework that trains an agent's policy network, and a dynamic 3D Gym environment in which both static and dynamic obstacles (e.g., human factors and the partner robot) constitute the state space of a Markov decision process. Learning is conducted in the Gazebo simulator, and experiments show an increase in cumulative reward for the agent located farther from human factors. Future work will aim to improve task performance for both agents simultaneously.
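The abstract does not specify the state layout, the reward, or the policy parameterization, so the following is only a minimal sketch under stated assumptions. It illustrates how obstacle positions might enter the state of a Markov decision process for the place task, and how a stochastic policy samples an action. The names `PlaceState`, `reward`, `sample_action`, the `safe_radius` value, and the fixed proximity penalty are all hypothetical and are not taken from the paper.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class PlaceState:
    """Hypothetical MDP state: end effector, goal, and obstacle positions."""
    effector: Vec3
    goal: Vec3
    obstacles: List[Vec3]  # static and dynamic obstacles (human, partner robot)

    def to_observation(self) -> List[float]:
        """Flatten the state into one vector, as a policy network would expect."""
        obs = list(self.effector) + list(self.goal)
        for obstacle in self.obstacles:
            obs.extend(obstacle)
        return obs


def reward(state: PlaceState, safe_radius: float = 0.3) -> float:
    """Illustrative reward (assumed): approach the goal, stay clear of obstacles."""
    value = -math.dist(state.effector, state.goal)
    for obstacle in state.obstacles:
        if math.dist(state.effector, obstacle) < safe_radius:
            value -= 1.0  # assumed fixed penalty inside an obstacle's safety zone
    return value


def sample_action(mean: Sequence[float], std: float, rng: random.Random) -> List[float]:
    """Stochastic policy step: draw each action component from a Gaussian."""
    return [rng.gauss(m, std) for m in mean]


if __name__ == "__main__":
    near_human = PlaceState((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), [(0.1, 0.0, 0.0)])
    far_from_human = PlaceState((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), [(2.0, 2.0, 2.0)])
    print(reward(near_human), reward(far_from_human))
    print(sample_action([0.0, 0.0, 0.0], 0.1, random.Random(0)))
```

In an actor-critic setup, the mean passed to `sample_action` would come from the actor network evaluated on `to_observation()`, while the critic would estimate the expected return from the same observation. Under this assumed reward, an agent working closer to an obstacle accumulates penalties more often than one working farther away.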