Autonomous navigation in unknown environments requires multi-scale spatial understanding that captures geometric details, topological connectivity, and global structure to support high-level decision making under partial observability. Existing approaches struggle to capture this understanding efficiently while keeping computational cost low enough for real-time navigation. We present MacroNav, a learning-based navigation framework with two key components: (1) a lightweight context encoder trained via multi-task self-supervised learning to capture multi-scale, navigation-centric spatial representations; and (2) a reinforcement learning policy that integrates these representations with graph-based reasoning for efficient action selection. Extensive experiments demonstrate that the context encoder provides effective and robust environmental understanding. Real-world deployments further validate MacroNav's effectiveness, yielding significant gains over state-of-the-art navigation methods in both Success Rate (SR) and Success weighted by Path Length (SPL), with superior computational efficiency.
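For readers unfamiliar with the two reported metrics, the sketch below shows how SR and SPL are commonly computed over a set of evaluation episodes. It assumes the standard SPL definition, in which each successful episode contributes the ratio of the shortest-path length to the larger of the agent's path length and the shortest-path length; the abstract does not state the exact evaluation protocol, and the `Episode` class and its field names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Episode:
    """One evaluation run (hypothetical container, not from MacroNav)."""
    success: bool          # True if the agent reached the goal
    shortest_path: float   # shortest start-to-goal path length, assumed > 0
    agent_path: float      # length of the path the agent actually travelled


def success_rate(episodes: Sequence[Episode]) -> float:
    """SR: fraction of episodes in which the goal was reached."""
    if not episodes:
        raise ValueError("need at least one episode")
    return sum(1 for e in episodes if e.success) / len(episodes)


def spl(episodes: Sequence[Episode]) -> float:
    """SPL: success weighted by path efficiency, averaged over all episodes."""
    if not episodes:
        raise ValueError("need at least one episode")
    total = 0.0
    for e in episodes:
        if e.success:
            # Efficiency is 1.0 for an optimal path and shrinks as the agent detours.
            total += e.shortest_path / max(e.agent_path, e.shortest_path)
    # Failed episodes contribute 0 but still count in the denominator.
    return total / len(episodes)
```

Because failed episodes contribute zero, SPL can never exceed SR; a method improves SPL only by succeeding more often or by taking shorter paths when it does succeed.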