Learning visual-based deformable object rearrangement with local graph neural networks

Goal-conditioned rearrangement of deformable objects (e.g., straightening a rope or folding a cloth) is one of the most common deformable manipulation tasks, in which the robot must rearrange a deformable object into a prescribed goal configuration using only visual observations. These tasks pose two main challenges: the high dimensionality of the deformable configuration space, and the complexity, nonlinearity, and uncertainty inherent in deformable dynamics. To address these challenges, we propose a novel representation strategy that efficiently models deformable object states with a set of keypoints and their interactions. We further propose a local graph neural network (local GNN), a lightweight model that learns to jointly model the deformable rearrangement dynamics and infer the optimal manipulation actions (e.g., pick and place) by constructing and updating two dynamic graphs. Experiments in both simulation and the real world demonstrate that the proposed dynamic graph representation is more expressive in modeling deformable rearrangement dynamics. In simulation experiments, our method reaches much higher success rates on a variety of deformable rearrangement tasks (96.3% on average) than the state-of-the-art method. In addition, our method is much lighter and has a 60% shorter inference time than state-of-the-art methods. We also demonstrate that our method performs well in the multi-task learning scenario and can be transferred to real-world applications, with an average success rate of 95%, by fine-tuning only a keypoint detector.
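
To make the keypoint-graph idea concrete, the following is a minimal sketch of how a set of detected keypoints could be turned into a local graph. It is not the authors' implementation: the abstract does not say how edges are chosen, so the k-nearest-neighbour rule, the function name `build_local_graph`, and the parameter `k` are all assumptions made for illustration. The sketch builds only one graph over the keypoints, not the two dynamic graphs of the proposed method.

```python
import numpy as np


def build_local_graph(keypoints, k=2):
    """Return directed edges linking each keypoint to its k nearest neighbours.

    Hypothetical helper: the k-nearest-neighbour rule is an assumption,
    not the edge construction used in the paper.
    """
    pts = np.asarray(keypoints, dtype=float)
    # Pairwise Euclidean distances between keypoints, shape (n, n).
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    # Exclude self-loops by making each point infinitely far from itself.
    np.fill_diagonal(dists, np.inf)
    # Indices of the k closest keypoints for every keypoint.
    neighbours = np.argsort(dists, axis=1)[:, :k]
    return [(i, int(j)) for i in range(len(pts)) for j in neighbours[i]]


# Four keypoints along a rope (made-up 2D image coordinates).
rope = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (3.0, 0.0)]
edges = build_local_graph(rope, k=1)
```

Keeping edges local means the number of edges grows linearly with the number of keypoints, which is one plausible reason a local graph could be lighter than a fully connected one.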

