This paper addresses the acquisition of mapless navigation skills in unknown environments. We introduce the Skill Q-Network (SQN), a novel reinforcement learning method featuring an adaptive skill ensemble mechanism. Unlike existing methods, our model concurrently learns a high-level skill decision process and multiple low-level navigation skills, without requiring prior knowledge. Using a reward function tailored to mapless navigation, the SQN learns adaptive maneuvers that combine exploration and goal-directed skills, enabling effective navigation in new environments. Our experiments show that the SQN navigates complex environments effectively, achieving 40% higher performance than baseline models. Without explicit guidance, the SQN discovers how to combine low-level skill policies, exhibiting both goal-directed navigation to reach destinations and exploration maneuvers to escape local-minimum regions in challenging scenarios. Notably, our adaptive skill ensemble method enables zero-shot transfer to out-of-distribution domains characterized by unseen observations, such as those from non-convex obstacles or uneven, subterranean-like environments.
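To make the two-level structure described in the abstract concrete, the following is a minimal sketch of a high-level decision process choosing between low-level skills. It is an illustration only, not the SQN architecture: the observation layout, the two hand-written skills, the placeholder function `toy_skill_q` (standing in for a learned Q-network), and the hard epsilon-greedy selection are all assumptions introduced here. The actual adaptive skill ensemble may combine skills differently, for example by weighting them rather than picking one.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical observation: (goal distance, goal bearing in rad, nearest-obstacle distance).
Observation = Tuple[float, float, float]
# Hypothetical action: (linear velocity, angular velocity).
Action = Tuple[float, float]


def goal_directed_skill(obs: Observation) -> Action:
    """Low-level skill (assumed): steer toward the goal bearing."""
    _, bearing, _ = obs
    return (0.5, max(-1.0, min(1.0, bearing)))


def exploration_skill(obs: Observation) -> Action:
    """Low-level skill (assumed): slow fixed arc to leave a stuck region."""
    return (0.2, 1.0)


SKILLS: List[Callable[[Observation], Action]] = [goal_directed_skill, exploration_skill]


def toy_skill_q(obs: Observation) -> List[float]:
    """Placeholder for a learned skill Q-function: one value per skill.

    Hand-written rule (assumption): prefer the goal-directed skill when
    clearance is large and the exploration skill when it is small.
    """
    _, _, clearance = obs
    return [clearance - 0.5, 0.5 - clearance]


def select_skill(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy choice over skill indices."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def act(obs: Observation, epsilon: float = 0.0,
        rng: Optional[random.Random] = None) -> Tuple[int, Action]:
    """High-level step: pick a skill index, then let that skill produce the action."""
    rng = rng or random.Random(0)
    skill_index = select_skill(toy_skill_q(obs), epsilon, rng)
    return skill_index, SKILLS[skill_index](obs)
```

In this sketch, calling `act((5.0, 0.3, 2.0))` picks the goal-directed skill because the clearance is large, while `act((5.0, 0.3, 0.1))` switches to the exploration skill. In a learned system, both the Q-values and the skills themselves would be trained jointly rather than hand-coded.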