Object Goal Navigation (ObjectNav) in temporally changing indoor environments is challenging because object relocation can invalidate historical scene knowledge. To address this issue, we propose a probabilistic planning framework that combines uncertainty-aware scene priors with online target relevance estimates derived from a Vision Language Model (VLM). The framework comprises a dual-layer semantic mapping module and a real-time planner. The first map layer is an Information Gain Map (IGM), built from a 3D scene graph (3DSG) during prior exploration, which models object co-occurrence relations and provides global guidance on likely target regions. The second layer is a VLM score map (VLM-SM), which fuses confidence-weighted semantic observations into the map for local validation of the current scene. Based on these two cues, we develop a planner, IGV-RRT, that jointly exploits information gain and semantic evidence for online decision making. The planner biases tree expansion toward semantically salient regions with high prior likelihood and strong online relevance, while preserving kinematic feasibility through gradient-based analysis. Simulation and real-world experiments demonstrate that the proposed method mitigates the impact of object rearrangement, achieving higher search efficiency and success rates than representative baselines in complex indoor environments.
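To make the idea of biasing tree expansion with both cues concrete, the following is a minimal sketch and not the IGV-RRT implementation. It assumes that the IGM and the VLM-SM are available as 2D grids with values in [0, 1], and that they are fused by a weighted geometric mean; the abstract specifies neither. The names `fuse_maps`, `sample_cell`, `alpha`, and `uniform_prob` are hypothetical, and the sketch omits the tree itself and the gradient-based feasibility analysis.

```python
import random


def fuse_maps(igm, vlm_sm, alpha=0.5, eps=1e-6):
    """Blend a prior information-gain grid with an online VLM score grid.

    Both grids are lists of rows with values in [0, 1]. The weighted
    geometric mean is an assumption made for illustration: a cell scores
    high only if both the prior and the online evidence support it.
    """
    return [
        [((p + eps) ** alpha) * ((s + eps) ** (1.0 - alpha))
         for p, s in zip(prior_row, score_row)]
        for prior_row, score_row in zip(igm, vlm_sm)
    ]


def sample_cell(weights, rng, uniform_prob=0.2):
    """Pick a grid cell (row, col) as the next tree-expansion target.

    With probability uniform_prob the cell is drawn uniformly, which keeps
    some exploration; otherwise it is drawn in proportion to its fused weight.
    """
    cells = [(r, c) for r, row in enumerate(weights) for c in range(len(row))]
    if rng.random() < uniform_prob:
        return rng.choice(cells)
    flat = [weights[r][c] for r, c in cells]
    return rng.choices(cells, weights=flat, k=1)[0]


if __name__ == "__main__":
    # Toy 2x2 maps: the prior and the VLM scores agree on cell (0, 1).
    igm = [[0.1, 0.8], [0.1, 0.1]]
    vlm_sm = [[0.1, 0.9], [0.1, 0.1]]
    fused = fuse_maps(igm, vlm_sm)
    rng = random.Random(0)
    print([sample_cell(fused, rng) for _ in range(5)])
```

In this sketch, a cell that was likely under the prior but receives a low online score (for example, after the object has been moved) gets a low fused weight, so samples shift toward cells supported by current observations. The uniform fallback is one common way to avoid starving unexplored regions; whether the proposed planner uses such a term is not stated in the abstract.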