Zero-Shot Object Navigation in unknown environments poses significant challenges for Unmanned Aerial Vehicles (UAVs) because the demands of high-level semantic reasoning conflict with limited onboard computational resources. To address this, we present USS-Nav, a lightweight framework that incrementally constructs a Unified Spatio-Semantic scene graph and enables efficient Large Language Model (LLM)-augmented Zero-Shot Object Navigation in unknown environments. Specifically, we introduce an incremental Spatial Connectivity Graph generation method that uses polyhedral expansion to capture global geometric topology; the resulting graph is dynamically partitioned into semantic regions via graph clustering. Concurrently, open-vocabulary object semantics are instantiated and anchored to this topology to form a hierarchical environmental representation. Leveraging this hierarchical structure, we present a coarse-to-fine exploration strategy: an LLM grounded in the scene graph's semantics determines global target regions, while a local planner optimizes frontier coverage based on information gain. Experimental results demonstrate that our framework outperforms state-of-the-art methods in computational efficiency and achieves a real-time update frequency of 15 Hz on a resource-constrained platform. Furthermore, ablation studies confirm the effectiveness of our framework, showing substantial improvements in Success weighted by Path Length (SPL). The source code will be made publicly available to foster further research.
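For readers unfamiliar with the SPL metric cited above, the following is a minimal sketch of how it is commonly computed in the embodied-navigation literature: each successful episode contributes the ratio of the shortest-path length to the larger of the shortest-path length and the path actually flown, and failed episodes contribute zero. The function name `spl`, its argument names, and the handling of zero-length episodes are illustrative assumptions, not taken from the USS-Nav code.

```python
from typing import Sequence


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    taken_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over all episodes.

    successes[i]        -- whether episode i reached the target
    shortest_lengths[i] -- shortest-path length from start to target
    taken_lengths[i]    -- length of the path the agent actually followed
    """
    if not (len(successes) == len(shortest_lengths) == len(taken_lengths)):
        raise ValueError("all inputs must have the same number of episodes")
    if len(successes) == 0:
        raise ValueError("at least one episode is required")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, taken_lengths):
        if not success:
            continue  # failed episodes contribute zero
        denom = max(taken, shortest)
        # Assumption: an episode that starts at the target (both lengths zero)
        # is scored as perfectly efficient.
        total += shortest / denom if denom > 0 else 1.0
    return total / len(successes)


if __name__ == "__main__":
    # Three hypothetical episodes: a detour, a failure, and an optimal path.
    print(spl([True, False, True], [10.0, 5.0, 8.0], [20.0, 5.0, 8.0]))
```

Because the ratio is capped at 1 and failures score 0, SPL rewards both reaching the target and doing so along a near-shortest path, which is why it is used alongside plain success rate.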