Map representations learned from expert demonstrations have shown promising research value. However, recent advances in visual navigation are held back by the lack of real-world human datasets for efficient supervised representation learning of environments. We present the Landmark-Aware Visual Navigation (LAVN) dataset to enable supervised learning of human-centric exploration policies and map building. We collect pairs of RGB observations and human point-clicks as a human annotator explores virtual and real-world environments with the goal of full-coverage exploration of the space. The human annotators also provide distinct landmark examples along each trajectory, which we expect will simplify map or graph building and localization. The human point-clicks serve as direct supervision for waypoint prediction when learning to explore environments. Our dataset covers a wide spectrum of scenes, including rooms in indoor environments and outdoor walkways. The dataset is available at DOI: 10.5281/zenodo.10608067.