We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments. At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images. We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration. Trained on a large offline dataset of prior experience, the model acquires a representation of visual goals that is robust to task-irrelevant distractors. We demonstrate our method on a mobile ground robot in open-world exploration scenarios. Given an image of a goal that is up to 80 meters away, our method leverages its representation to explore and discover the goal in under 20 minutes, even amid previously unseen obstacles and weather conditions. Videos of our experiments and information about the real-world dataset are available on the project website: https://sites.google.com/view/recon-robot.