Navigation presents a significant challenge for persons with visual impairments (PVI). While traditional aids such as white canes and guide dogs are invaluable, they fall short in delivering detailed spatial information and precise guidance to a destination. Recent advances in large language models (LLMs) and vision-language models (VLMs) open new avenues for assistive navigation. In this paper, we introduce Guide-LLM, an embodied LLM-based agent designed to assist PVI in navigating large indoor environments. Our approach features a novel text-based topological map that lets the LLM plan global paths over a simplified environmental representation restricted to straight segments and right-angle turns, which are easier for PVI to follow. We further leverage the LLM's commonsense reasoning for hazard detection and for personalized path planning based on user preferences. Simulated experiments demonstrate the system's efficacy in guiding PVI, and the results highlight Guide-LLM's ability to offer efficient, adaptive, and personalized navigation assistance, marking a promising advance in assistive technology.
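To make the text-based topological map concrete, the following is a minimal illustrative sketch (all landmark names, coordinates, and helper functions are hypothetical, not taken from the paper): landmarks become graph nodes on an axis-aligned layout, edges are straight segments, a breadth-first search plans the global route, and the route is serialized as plain text of the kind an LLM prompt could consume.

```python
# Hypothetical sketch of a text-based topological map; names and layout are
# illustrative assumptions, not the paper's actual representation.
from collections import deque

# Grid-aligned landmark coordinates (metres); right-angle layout by construction.
NODES = {
    "entrance": (0, 0),
    "hall_junction": (0, 10),
    "elevator": (6, 10),
    "office_312": (6, 18),
}
EDGES = {
    "entrance": ["hall_junction"],
    "hall_junction": ["entrance", "elevator"],
    "elevator": ["hall_junction", "office_312"],
    "office_312": ["elevator"],
}

def plan(start, goal):
    """Breadth-first search over the topological graph; returns the node sequence."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in EDGES[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def to_text(path):
    """Serialize the route as straight-segment instructions for an LLM prompt."""
    steps = []
    for a, b in zip(path, path[1:]):
        (x1, y1), (x2, y2) = NODES[a], NODES[b]
        dist = abs(x2 - x1) + abs(y2 - y1)  # segments are axis-aligned
        steps.append(f"{a} -> {b}: straight {dist} m")
    return "; ".join(steps)

route = plan("entrance", "office_312")
print(to_text(route))
```

Because every edge is a straight, axis-aligned segment, the serialized route reduces to "go straight, then turn" instructions, which is the simplification the abstract describes for PVI-friendly guidance.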