Learning in simulation and transferring the learned policy to the real world has the potential to enable generalist robots. The key challenge of this approach is closing simulation-to-reality (sim-to-real) gaps. Previous methods often require domain-specific knowledge a priori. We argue that a straightforward way to obtain such knowledge is to ask humans to observe and assist robot policy execution in the real world. The robots can then learn from humans to close various sim-to-real gaps. We propose TRANSIC, a data-driven approach that enables successful sim-to-real transfer through a human-in-the-loop framework. TRANSIC allows humans to augment simulation policies through intervention and online correction, overcoming various unmodeled sim-to-real gaps holistically. Residual policies can be learned from human corrections and integrated with simulation policies for autonomous execution. We show that our approach achieves successful sim-to-real transfer in complex, contact-rich manipulation tasks such as furniture assembly. Through synergistic integration of policies learned in simulation and from humans, TRANSIC is effective as a holistic approach to addressing various, often coexisting, sim-to-real gaps. It also displays attractive properties such as scaling with human effort. Videos and code are available at https://transic-robot.github.io/
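The abstract states that residual policies learned from human corrections are integrated with simulation policies, but it does not say how. As a rough illustration only, the sketch below assumes the simplest possible integration: the residual's output is added to the simulation policy's action. The names `make_residual_policy`, `sim_policy`, and `human_residual` are hypothetical, and the toy policies stand in for learned models; none of this is taken from the TRANSIC codebase.

```python
from typing import Callable, List, Sequence

Vector = List[float]
BasePolicy = Callable[[Sequence[float]], Vector]
# Assumption: the residual sees both the observation and the base action.
ResidualPolicy = Callable[[Sequence[float], Sequence[float]], Vector]


def make_residual_policy(base: BasePolicy, residual: ResidualPolicy) -> BasePolicy:
    """Compose a base policy with a residual by additive correction (assumed)."""

    def combined(obs: Sequence[float]) -> Vector:
        base_action = base(obs)
        correction = residual(obs, base_action)
        # Element-wise sum of the base action and the learned correction.
        return [a + d for a, d in zip(base_action, correction)]

    return combined


def sim_policy(obs: Sequence[float]) -> Vector:
    """Toy stand-in for a policy trained in simulation (hypothetical)."""
    return [0.5 * x for x in obs]


def human_residual(obs: Sequence[float], base_action: Sequence[float]) -> Vector:
    """Toy stand-in for a residual learned from human corrections (hypothetical)."""
    return [0.1 for _ in base_action]


policy = make_residual_policy(sim_policy, human_residual)
action = policy([1.0, 2.0])
```

Keeping the simulation policy fixed and learning only a correction on top of it is one way to reuse what was learned in simulation while limiting how much real-world human data is needed; whether the composition is additive, gated, or something else is left open here.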