A fundamental task in robotics is to navigate between two locations. In particular, real-world navigation can require long-horizon planning using high-dimensional RGB images, which poses a substantial challenge for end-to-end learning-based approaches. Current semi-parametric methods instead achieve long-horizon navigation by combining learned modules with a topological memory of the environment, often represented as a graph over previously collected images. However, using these graphs in practice typically involves tuning a number of pruning heuristics to avoid spurious edges, limit runtime memory usage and allow reasonably fast graph queries. In this work, we present One-4-All (O4A), a method leveraging self-supervised and manifold learning to obtain a graph-free, end-to-end navigation pipeline in which the goal is specified as an image. Navigation is achieved by greedily minimizing a potential function defined continuously over the O4A latent space. Our system is trained offline on non-expert exploration sequences of RGB data and controls, and does not require any depth or pose measurements. We show that O4A can reach long-range goals in 8 simulated Gibson indoor environments, and further demonstrate successful real-world navigation using a Jackal UGV platform.
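The greedy potential-minimization step mentioned above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in rather than the O4A implementation: the 2-D latent codes, the `ACTIONS` table, the additive `forward_model`, and the Euclidean `potential` are toy assumptions chosen only to show the control loop, in which the agent repeatedly picks the action whose predicted next latent state has the lowest potential.

```python
import numpy as np

# Hypothetical discrete action set: each action shifts a toy 2-D latent code.
ACTIONS = {
    "forward": np.array([0.0, 1.0]),
    "left": np.array([-1.0, 0.0]),
    "right": np.array([1.0, 0.0]),
}


def potential(z, z_goal):
    """Toy potential (assumption): Euclidean distance to the goal latent."""
    return float(np.linalg.norm(z - z_goal))


def forward_model(z, action):
    """Toy stand-in for a learned model predicting the next latent state."""
    return z + ACTIONS[action]


def greedy_step(z, z_goal):
    """Return the action whose predicted next latent has the lowest potential."""
    return min(ACTIONS, key=lambda a: potential(forward_model(z, a), z_goal))


def navigate(z_start, z_goal, max_steps=50, tol=1e-6):
    """Greedily descend the potential until the goal latent is reached."""
    z = np.asarray(z_start, dtype=float)
    z_goal = np.asarray(z_goal, dtype=float)
    plan = []
    for _ in range(max_steps):
        if potential(z, z_goal) <= tol:
            break  # goal reached in latent space
        action = greedy_step(z, z_goal)
        plan.append(action)
        z = forward_model(z, action)
    return plan, z


plan, z_final = navigate([0.0, 0.0], [2.0, 3.0])
print(plan, z_final)
```

In this toy setting the loop needs no graph or explicit path search: each decision depends only on the current latent, the goal latent, and the potential. A purely greedy descent of this kind can stall in local minima when the potential is poorly shaped, which is why the form of the potential matters in practice.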