In cluttered scenes with inevitable occlusions and incomplete observations, selecting informative viewpoints is essential for building a reliable representation. In this context, 3D Gaussian Splatting (3DGS) offers a distinct advantage, as it can explicitly guide the selection of subsequent viewpoints and then refine the representation with new observations. However, existing approaches rely solely on geometric cues, neglect manipulation-relevant semantics, and tend to prioritize exploitation over exploration. To tackle these limitations, we introduce an instance-aware Next Best View (NBV) policy that prioritizes underexplored regions by leveraging object features. Specifically, our object-aware 3DGS distills instance-level information into one-hot object vectors, which are used to compute a confidence-weighted information gain that guides the identification of regions associated with erroneous and uncertain Gaussians. Furthermore, our method can be easily adapted to an object-centric NBV, which focuses view selection on a target object, thereby improving reconstruction robustness to object placement. Experiments demonstrate that our NBV policy reduces depth error by up to 77.14% on the synthetic dataset and 34.10% on the real-world GraspNet dataset compared to baselines. Moreover, compared to targeting the entire scene, performing NBV on a specific object yields an additional reduction of 25.60% in depth error for that object. We further validate the effectiveness of our approach through real-world robotic manipulation tasks.
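To make the view-selection idea concrete, the following is a minimal illustrative sketch (not the authors' implementation) of scoring candidate viewpoints by a confidence-weighted information gain. It assumes each Gaussian carries a per-object probability vector and a confidence score, and uses entropy as the uncertainty measure; the function and variable names, the entropy-based gain, and the binary visibility mask are all assumptions for illustration.

```python
import numpy as np

def confidence_weighted_gain(object_probs, confidence, visibility):
    """Score candidate views by how much uncertain content they observe.

    object_probs: (G, K) per-Gaussian object distributions (rows sum to 1).
    confidence:   (G,)   per-Gaussian confidence in [0, 1].
    visibility:   (V, G) 0/1 mask of Gaussians visible from each candidate view.
    Returns:      (V,)   gain per view; higher means more erroneous/uncertain
                         Gaussians are seen, favoring exploration.
    """
    eps = 1e-12
    # Entropy of each Gaussian's object distribution (semantic uncertainty).
    entropy = -np.sum(object_probs * np.log(object_probs + eps), axis=1)
    # Down-weight confident Gaussians: low confidence => high potential gain.
    per_gaussian = (1.0 - confidence) * entropy
    # Aggregate the gain of all Gaussians visible from each view.
    return visibility @ per_gaussian

# Tiny usage example: 3 Gaussians, 2 object classes, 2 candidate views.
probs = np.array([[1.0, 0.0],   # certain assignment
                  [0.5, 0.5],   # maximally uncertain
                  [0.9, 0.1]])  # mildly uncertain
conf = np.array([0.9, 0.2, 0.5])
vis = np.array([[1, 1, 0],      # view 0 sees the uncertain Gaussian
                [1, 0, 1]])     # view 1 does not
gains = confidence_weighted_gain(probs, conf, vis)
best_view = int(np.argmax(gains))
```

Under these assumptions, the view covering the low-confidence, high-entropy Gaussian receives the larger gain, so the policy steers the camera toward underexplored regions rather than re-observing well-reconstructed ones.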