In autonomous navigation, conveying abstract navigational cues to agents in dynamic environments is challenging, particularly when the navigation information is multimodal. To address this issue, this paper introduces a novel technique termed "Virtual Guidance," which represents non-visual instructional signals visually. These visual cues, rendered as colored paths or spheres, are overlaid onto the agent's camera view and serve as easily comprehensible navigational instructions. We evaluate the proposed method through experiments in both simulated and real-world settings. In the simulated environments, virtual guidance outperforms baseline hybrid approaches on several metrics, including adherence to planned routes and obstacle avoidance. Furthermore, we extend the concept of virtual guidance to transform text-prompt-based instructions into a visually intuitive format for real-world experiments. Our results validate the adaptability of virtual guidance and its efficacy in enabling policy transfer from simulated scenarios to real-world ones.
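The overlay idea described above can be illustrated with a minimal sketch. It assumes the planned route is already available as pixel-space waypoints in the camera image; the function name `overlay_path`, the parameters `radius` and `alpha`, and the use of simple alpha blending are hypothetical choices for illustration, not the paper's implementation.

```python
import numpy as np


def overlay_path(frame, waypoints, color=(0, 255, 0), radius=2, alpha=0.6):
    """Alpha-blend a colored polyline onto an RGB frame of shape (H, W, 3), dtype uint8.

    `waypoints` is a list of (row, col) pixel coordinates (an assumption:
    projecting a planned route into image space is outside this sketch).
    """
    h, w, _ = frame.shape
    mask = np.zeros((h, w), dtype=bool)

    # Rasterize each segment by sampling points along it and stamping a small square.
    for (r0, c0), (r1, c1) in zip(waypoints[:-1], waypoints[1:]):
        n = max(abs(r1 - r0), abs(c1 - c0)) + 1
        rows = np.clip(np.linspace(r0, r1, n).round().astype(int), 0, h - 1)
        cols = np.clip(np.linspace(c0, c1, n).round().astype(int), 0, w - 1)
        for r, c in zip(rows, cols):
            mask[max(r - radius, 0):r + radius + 1,
                 max(c - radius, 0):c + radius + 1] = True

    # Blend the guidance color into the masked pixels; the input frame is not modified.
    out = frame.astype(np.float32)
    out[mask] = (1 - alpha) * out[mask] + alpha * np.array(color, dtype=np.float32)
    return out.round().astype(np.uint8)


# Usage: draw a green guidance path on a blank 48x64 camera frame.
frame = np.zeros((48, 64, 3), dtype=np.uint8)
guided = overlay_path(frame, [(40, 10), (20, 30), (20, 50)])
```

The policy would then consume `guided` in place of the raw frame, so the navigational instruction reaches the agent through the same visual channel as the scene itself.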