Long-range navigation is commonly addressed through hierarchical pipelines in which a global planner generates a path that is decomposed into waypoints and tracked sequentially by a local planner. These systems are sensitive to global path quality: inaccurate remote sensing data can produce locally infeasible waypoints that degrade local execution. At the same time, the limited global context available to the local planner hinders long-range efficiency. To address these issues, we propose a reinforcement learning-based local navigation policy that leverages path information as contextual guidance. The policy is conditioned on reference path observations and trained with a reward function based mainly on goal-reaching objectives, without any explicit path-following reward. Through this implicit conditioning, the policy learns to opportunistically exploit path information while remaining robust to misleading or degraded guidance. Experimental results show that the proposed approach significantly improves navigation efficiency when high-quality paths are available and maintains baseline-level performance when path observations are severely degraded or even absent. These properties make the method particularly well-suited for long-range navigation scenarios in which high-level plans are approximate and local execution must remain adaptive to uncertainty.
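The core design choice above, conditioning the policy on the reference path while rewarding only goal progress, can be illustrated with a minimal sketch. All names and weights here are hypothetical, chosen for illustration; they are not the paper's implementation. The observation concatenates the goal direction with sampled reference-path waypoints in the robot frame, and the reward contains no path-following term, so the policy is free to ignore the path when it is misleading or missing.

```python
import numpy as np

def make_observation(robot_pos, goal_pos, path_points, n_samples=8):
    """Goal direction plus reference-path points, expressed relative to the robot.

    The path enters only through the observation (implicit conditioning),
    never through the reward. All shapes/weights are illustrative assumptions.
    """
    goal_rel = goal_pos - robot_pos
    # Subsample the reference path to a fixed number of waypoints.
    idx = np.linspace(0, len(path_points) - 1, n_samples).astype(int)
    path_rel = (path_points[idx] - robot_pos).reshape(-1)
    return np.concatenate([goal_rel, path_rel])

def reward(prev_goal_dist, curr_goal_dist, collided, reached, w_progress=1.0):
    """Goal-reaching reward: progress toward the goal plus terminal terms.

    Note the absence of any distance-to-path penalty -- path following is
    learned implicitly, only insofar as it helps reach the goal.
    """
    r = w_progress * (prev_goal_dist - curr_goal_dist)  # progress term
    if collided:
        r -= 10.0  # illustrative collision penalty
    if reached:
        r += 10.0  # illustrative goal bonus
    return r
```

With a degraded or empty reference path, the path slice of the observation can simply be zero-padded; because the reward never references the path, training under such corruption is what lets the policy fall back to baseline goal-seeking behavior.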