GraspLDM: Generative 6-DoF Grasp Synthesis using Latent Diffusion Models

Vision-based grasping of unknown objects in unstructured environments is a key challenge for autonomous robotic manipulation. A practical grasp synthesis system must generate a diverse set of 6-DoF grasps from which a task-relevant grasp can be selected and executed. Although generative models are well suited to learning such complex data distributions, existing models suffer from limited grasp quality, long training times, and a lack of flexibility for task-specific generation. In this work, we present GraspLDM, a modular generative framework for 6-DoF grasp synthesis that uses diffusion models as priors in the latent space of a VAE. GraspLDM learns a generative model of object-centric $SE(3)$ grasp poses conditioned on point clouds. Its architecture lets us train task-specific models efficiently by re-training only a small denoising network in the low-dimensional latent space, whereas existing models require expensive re-training of the full model. Our framework provides robust and scalable models on both full and single-view point clouds. GraspLDM models trained with simulation data transfer well to the real world and achieve an 80% success rate over 80 grasp attempts on diverse test objects, improving over existing generative models. We make our implementation available at https://github.com/kuldeepbrd1/graspldm.
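
To make the idea of "a diffusion model as a prior in the latent space of a VAE" concrete, the following is a minimal sketch of how sampling could proceed. It is not the authors' implementation. It assumes standard DDPM ancestral sampling with a linear noise schedule, a 4-dimensional latent, and a grasp pose parameterized as a translation plus a unit quaternion, none of which is stated in the abstract. The functions `toy_denoiser` and `toy_decoder` are hypothetical stand-ins for the learned denoising network and the VAE decoder.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_denoiser(z_t, t, cond):
    """Hypothetical stand-in for the small learned noise-prediction network.

    A trained network would predict the noise in z_t from the step index t
    and a point-cloud feature cond; this toy version ignores t.
    """
    return 0.1 * z_t + 0.01 * cond.sum()


def toy_decoder(z, cond):
    """Hypothetical stand-in for the VAE decoder: latent -> (translation, quaternion).

    This toy version ignores cond; a real decoder would be conditioned on it.
    """
    rng = np.random.default_rng(0)  # fixed random weights, for illustration only
    weights = rng.standard_normal((z.shape[-1], 7))
    out = np.tanh(z @ weights)
    translation = out[:, :3]
    quat = out[:, 3:]
    # Normalize so each quaternion represents a valid rotation.
    quat = quat / (np.linalg.norm(quat, axis=-1, keepdims=True) + 1e-12)
    return translation, quat


def sample_grasps(cond, num_grasps=8, latent_dim=4, num_steps=50, seed=0):
    """Draw latents by reverse diffusion, then decode them into grasp poses."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    z = rng.standard_normal((num_grasps, latent_dim))  # start from pure noise
    for t in reversed(range(num_steps)):
        eps_hat = toy_denoiser(z, t, cond)
        # DDPM posterior mean computed from the predicted noise.
        mean = (z - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No noise is added at the final step.
        noise = rng.standard_normal(z.shape) if t > 0 else 0.0
        z = mean + np.sqrt(betas[t]) * noise
    return toy_decoder(z, cond)


# Toy conditioning: the mean of a point cloud stands in for a learned encoder.
points = np.random.default_rng(1).standard_normal((256, 3))
cond = points.mean(axis=0)
translations, quaternions = sample_grasps(cond, num_grasps=5)
print(translations.shape, quaternions.shape)
```

Under these assumptions, adapting to a new task would mean swapping or re-training only the denoiser while the encoder and decoder stay fixed, which is the modularity the abstract describes.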

