Robotic manipulation tasks, such as object rearrangement, play a crucial role in enabling robots to interact with complex and arbitrary environments. Existing work focuses primarily on single-level rearrangement planning; where multiple levels exist, the dependency relations among substructures are geometrically simple, as in tower stacking. We propose Structural Concept Learning (SCL), a deep learning approach that leverages graph attention networks to perform multi-level object rearrangement planning for scenes with structural dependency hierarchies. SCL is trained on a self-generated simulation dataset of intuitive structures, works on unseen scenes with an arbitrary number of objects and more complex structures, infers independent substructures to allow task parallelization over multiple manipulators, and generalizes to the real world. We compare our method against a range of classical and model-based baselines and show that it leverages its scene understanding to achieve better performance, flexibility, and efficiency. The dataset, supplementary details, videos, and code implementation are available at: https://manavkulshrestha.github.io/scl