We propose the first framework to learn control policies for vision-based human-to-robot handovers, a critical task for human-robot interaction. While research in Embodied AI has made significant progress in training robot agents in simulated environments, interacting with humans remains challenging because humans are difficult to simulate. Fortunately, recent research has developed realistic simulated environments for human-to-robot handovers. Leveraging this result, we introduce a method that is trained with a human in the loop via a two-stage teacher-student framework that uses motion and grasp planning, reinforcement learning, and self-supervision. We show significant performance gains over baselines on a simulation benchmark, in sim-to-sim transfer, and in sim-to-real transfer.