In this work, we address robotic unloading from visual observations, in which robots must autonomously unload stacks of parcels using RGB-D images as their primary input. While supervised and imitation learning have achieved good results on this type of task, they rely heavily on labeled data, which is challenging to obtain in realistic scenarios. Our study aims to develop a sample-efficient controller framework that can learn unloading tasks without labeled data. To this end, we propose a hierarchical controller structure that combines a high-level decision-making module with classical motion control. The high-level module is trained using Deep Reinforcement Learning (DRL), into which we incorporate a safety bias mechanism and a reward function tailored to the task. Our experiments demonstrate that both of these elements play a crucial role in improving learning performance. Furthermore, to ensure reproducibility and establish a benchmark for future research, we provide free access to our code and simulation.