Object rearranging is one of the most common deformable manipulation tasks: the robot must rearrange a deformable object into a goal configuration. Previous studies have focused on designing a separate expert system for each specific task, using model-based or data-driven approaches, which limits their application scenarios. Some recent work attempts to design a general framework that provides more advanced manipulation capabilities for deformable rearranging tasks, and it has made considerable progress in simulation. However, transferring from simulation to reality remains difficult because of the limitations of the end-to-end CNN architecture. To address this challenge, we design a local GNN (Graph Neural Network) based learning method that uses two representation graphs to encode keypoints detected from images. Self-attention is applied for graph updating, and cross-attention is applied for generating manipulation actions. Extensive experiments demonstrate that our framework is effective in multiple 1-D (rope, rope ring) and 2-D (cloth) rearranging tasks in simulation and can be easily transferred to a real robot by fine-tuning a keypoint detector.
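The division of labor between the two attention types can be illustrated with a minimal NumPy sketch. It is not the paper's implementation. It assumes that one graph holds the current keypoints and the other the goal keypoints, uses random vectors in place of learned keypoint features, omits all learned projection weights, and reads out a pick/place pair with a hypothetical argmax rule chosen only for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention(queries, keys, values):
    """Scaled dot-product attention without learned projections."""
    dim = queries.shape[-1]
    weights = softmax(queries @ keys.T / np.sqrt(dim), axis=-1)
    return weights @ values, weights


rng = np.random.default_rng(0)
num_keypoints, dim = 8, 16  # assumed sizes, for illustration only

# Stand-ins for node features of the two representation graphs.
current_nodes = rng.normal(size=(num_keypoints, dim))
goal_nodes = rng.normal(size=(num_keypoints, dim))

# Self-attention: each graph updates its nodes from its own nodes
# (residual update; a trained model would add learned weights).
current_updated = current_nodes + attention(current_nodes, current_nodes, current_nodes)[0]
goal_updated = goal_nodes + attention(goal_nodes, goal_nodes, goal_nodes)[0]

# Cross-attention: current-graph nodes attend to goal-graph nodes.
_, cross_weights = attention(current_updated, goal_updated, goal_updated)

# Hypothetical action readout: the most strongly matched pair gives a
# keypoint to pick (current graph) and a keypoint to place at (goal graph).
pick_idx, place_idx = np.unravel_index(np.argmax(cross_weights), cross_weights.shape)
print("pick keypoint", pick_idx, "-> place at goal keypoint", place_idx)
```

Because the graphs are built from detected keypoints rather than raw pixels, a sketch like this never touches the image itself, which is consistent with the abstract's claim that only the keypoint detector needs fine-tuning for a real robot.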