TWIST: Teacher-Student World Model Distillation for Efficient Sim-to-Real Transfer

Model-based RL is a promising approach for real-world robotics because it offers better sample efficiency and generalization than model-free RL. However, effective model-based RL for vision-based real-world applications requires bridging the sim-to-real gap for any learned world model. Because of its significant computational cost, standard domain randomisation does not solve this problem effectively. This paper proposes TWIST (Teacher-Student World Model Distillation for Sim-to-Real Transfer) to achieve efficient sim-to-real transfer of vision-based model-based RL using distillation. TWIST leverages state observations, privileged information that is readily available from a simulator, to significantly accelerate sim-to-real transfer. Specifically, a teacher world model is trained efficiently on state information while a matching dataset of domain-randomised image observations is collected. The teacher world model then supervises a student world model that takes the domain-randomised image observations as input. By distilling the learned latent dynamics model from the teacher to the student, TWIST achieves efficient and effective sim-to-real transfer for vision-based model-based RL tasks. Experiments on simulated and real robotics tasks demonstrate that the approach outperforms naive domain randomisation and model-free methods in both sample efficiency and task performance of sim-to-real transfer.
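As a rough illustration of the teacher-student supervision described in the abstract, the sketch below trains a student encoder on domain-randomised "images" to reproduce the latents that a frozen teacher computes from the paired simulator states. It is not the authors' implementation. Everything in it is an assumption made for illustration: the linear teacher and student, the toy `render` matrix standing in for image rendering, the brightness and noise randomisation, the mean-squared-error loss, and the names `make_paired_batch` and `distill`. It also matches latent representations only, and does not model the latent dynamics that TWIST distills.

```python
import numpy as np


def make_paired_batch(rng, n, state_dim, render):
    """Hypothetical simulator: states plus matching domain-randomised 'images'."""
    states = rng.normal(size=(n, state_dim))
    images = states @ render                    # toy stand-in for rendering
    gain = rng.uniform(0.8, 1.2, size=(n, 1))   # randomised brightness per sample
    images = gain * images + 0.05 * rng.normal(size=images.shape)
    return states, images


def distill(steps=200, lr=0.01, seed=0):
    """Fit a linear student on images to match a frozen linear teacher on states."""
    rng = np.random.default_rng(seed)
    state_dim, image_dim, latent_dim = 4, 16, 3
    render = rng.normal(size=(state_dim, image_dim))
    # Assumption: the teacher is already trained on state input and stays frozen.
    teacher_w = rng.normal(size=(state_dim, latent_dim))
    student_w = np.zeros((image_dim, latent_dim))
    losses = []
    for _ in range(steps):
        states, images = make_paired_batch(rng, 64, state_dim, render)
        target = states @ teacher_w             # teacher latents, used as labels
        err = images @ student_w - target       # student latents minus targets
        losses.append(float(np.mean(err ** 2)))
        grad = 2.0 * images.T @ err / err.size  # gradient of the mean squared error
        student_w -= lr * grad
    return losses


if __name__ == "__main__":
    history = distill()
    print(f"distillation loss: first={history[0]:.3f} last={history[-1]:.3f}")
```

The student never sees the simulator state directly; it only receives the teacher's latents as regression targets. This mirrors the privileged-information setup in the abstract, where state is available in simulation but only images are available on the real robot.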

