Scene transfer for vision-based mobile robotics is a highly relevant and challenging problem. The utility of a robot depends greatly on its ability to perform a task in the real world, outside of a well-controlled lab environment. Existing end-to-end policy learning approaches to scene transfer often suffer from poor sample efficiency or limited generalization, making them unsuitable for mobile robotics applications. This work proposes an adaptive multi-pair contrastive learning strategy for visual representation learning that enables zero-shot scene transfer and real-world deployment. Control policies that rely on the learned embedding can operate in unseen environments without fine-tuning in the deployment environment. We demonstrate the performance of our approach on the task of agile, vision-based quadrotor flight. Extensive simulation and real-world experiments show that our approach successfully generalizes beyond the training domain and outperforms all baselines.