Lifelong embodied navigation in dynamic environments requires robots to build a persistent understanding of a scene from fragmentary observations. This remains difficult for existing methods, which rely on explicit maps or scene graphs and struggle to generalize beyond structured settings. We propose AllDayNav, a lifelong self-learning navigation framework that uses reinforcement learning to implicitly encode scene dynamics into the billion-scale parameters of a large model. The framework is powered by a self-evolving multimodal memory that maintains and updates visual keyframes, semantic descriptions, and temporal context, and that autonomously generates open-vocabulary instructions, image goals, and structured rewards. Experiments in synthetic and real-world environments, covering cross-room, cross-episode, and cross-task scenarios, show that AllDayNav achieves success rates approaching $100\%$ and consistently surpasses strong map-based, VLM, and RL baselines in path efficiency and robustness. These results demonstrate that implicit, memory-driven reinforcement learning is a scalable alternative to explicit mapping for reliable lifelong navigation.