Constrained-Context Conditional Diffusion Models for Imitation Learning

Offline Imitation Learning (IL) is a powerful paradigm for learning visuomotor skills, especially for high-precision manipulation tasks. However, IL methods are prone to spurious correlations: expressive models may focus on distractors that are irrelevant to action prediction, which makes them fragile in real-world deployment. Prior methods have addressed this challenge by exploring different model architectures and action representations. However, none has been able to balance sample efficiency, robustness against distractors, and the ability to solve high-precision manipulation tasks with complex action spaces. To this end, we present the $\textbf{C}$onstrained-$\textbf{C}$ontext $\textbf{C}$onditional $\textbf{D}$iffusion $\textbf{M}$odel (C3DM), a diffusion model policy for solving 6-DoF robotic manipulation tasks with high precision and the ability to ignore distractors. A key component of C3DM is a fixation step that helps the action denoiser focus on task-relevant regions around the predicted action while ignoring distractors in the context. We empirically show that C3DM consistently achieves a high success rate on a wide array of tasks, ranging from tabletop manipulation to industrial kitting, that require varying levels of precision and robustness to distractors. For details, please visit https://sites.google.com/view/c3dm-imitation-learning
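To make the fixation idea concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a 2D pixel-location action instead of a 6-DoF pose, and it replaces the learned action denoiser with a hypothetical stand-in (`toy_denoiser`) that simply moves the estimate toward the brightest pixel in the crop. The names `fixate`, `toy_denoiser`, and `denoise_with_fixation`, as well as the crop sizes and shrink schedule, are illustrative assumptions. The sketch only shows how constraining the context to a shrinking window around the current action estimate can keep a distractor out of view.

```python
import numpy as np


def fixate(image, center_rc, size):
    """Crop a size x size window around center_rc, clamped to the image bounds.

    Returns the crop and the (row, col) of its top-left corner in the image.
    """
    h, w = image.shape[:2]
    size_r, size_c = min(size, h), min(size, w)
    r = int(round(float(center_rc[0])))
    c = int(round(float(center_rc[1])))
    r0 = int(np.clip(r - size_r // 2, 0, h - size_r))
    c0 = int(np.clip(c - size_c // 2, 0, w - size_c))
    return image[r0:r0 + size_r, c0:c0 + size_c], (r0, c0)


def toy_denoiser(crop, origin, noisy_action):
    """Hypothetical stand-in for a learned denoiser (assumption, not C3DM).

    Moves the action estimate halfway toward the brightest pixel in the crop.
    """
    idx = np.unravel_index(np.argmax(crop), crop.shape)
    target = np.array(origin, dtype=float) + np.array(idx, dtype=float)
    return noisy_action + 0.5 * (target - noisy_action)


def denoise_with_fixation(image, init_action, steps=8, start_size=48, shrink=0.7):
    """Alternate fixation (crop around the estimate) and denoising steps."""
    action = np.asarray(init_action, dtype=float)
    size = float(start_size)
    for _ in range(steps):
        crop, origin = fixate(image, action, int(size))
        action = toy_denoiser(crop, origin, action)
        size = max(8.0, size * shrink)  # narrow the context each step
    return action


# Toy scene: a task-relevant target and a brighter distractor near the border.
image = np.zeros((64, 64))
image[40, 22] = 1.0   # target
image[5, 60] = 2.0    # distractor (brightest pixel in the full image)

action = denoise_with_fixation(image, init_action=(32, 32))
full_image_pick = np.unravel_index(np.argmax(image), image.shape)
```

In this toy scene, an unconstrained pick over the full image (`full_image_pick`) lands on the distractor, while the fixated loop never sees it, because the distractor lies outside every crop, and the estimate converges toward the target. Whether the actual model behaves this way depends on details the abstract does not give.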
