Belief Scene Graphs: Expanding Partial Scenes with Objects through Computation of Expectation

In this article, we propose Belief Scene Graphs (BSG), utility-driven extensions of partial 3D scene graphs that enable efficient high-level task planning with partial information. We propose a graph-based learning methodology for computing belief (also referred to as expectation) on any given 3D scene graph; this belief is then used to strategically add new nodes (referred to as blind nodes) that are relevant to a robotic mission. To reasonably approximate the real belief/expectation, we propose Computation of Expectation based on Correlation Information (CECI), which learns histograms from available training data. A novel Graph Convolutional Neural Network (GCN) model is developed to learn CECI from a repository of 3D scene graphs. As no database of 3D scene graphs exists for training the CECI model, we present a methodology for generating a 3D scene graph dataset from semantically annotated real-life 3D spaces. The generated dataset is then used to train the proposed CECI model and to validate the proposed method extensively. We establish Belief Scene Graphs as a core component for integrating expectations into abstract representations. This concept is an evolution of the classical 3D scene graph and aims to enable high-level reasoning for task planning and optimization across a variety of robotic missions. The efficacy of the overall framework has been evaluated in an object search scenario and tested in a real-life experiment that emulates human common sense about unseen objects. A video showcasing the experimental demonstration is available at: https://youtu.be/hsGlSCa12iY
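
To make the idea of expectation and blind nodes concrete, the following is a minimal sketch and not the CECI model itself: it replaces the learned GCN with plain co-occurrence counts, purely for illustration. All names here (`learn_cooccurrence`, `expectation`, `add_blind_nodes`), the toy rooms, and the threshold value are hypothetical assumptions that do not come from the article.

```python
from collections import Counter, defaultdict


def learn_cooccurrence(training_rooms):
    """Count how often each pair of object classes shares a room.

    Assumption: each training scene graph is reduced to a flat list of
    object class names per room (a strong simplification).
    """
    pair_counts = defaultdict(Counter)  # pair_counts[a][b]: rooms containing both a and b
    class_counts = Counter()            # class_counts[a]: rooms containing a
    for room in training_rooms:
        classes = set(room)
        for a in classes:
            class_counts[a] += 1
            for b in classes:
                if a != b:
                    pair_counts[a][b] += 1
    return pair_counts, class_counts


def expectation(observed, target, pair_counts, class_counts):
    """Average the empirical frequency of `target` given each observed class."""
    known = [o for o in set(observed) if class_counts[o] > 0]
    if not known:
        return 0.0
    return sum(pair_counts[o][target] / class_counts[o] for o in known) / len(known)


def add_blind_nodes(observed, candidates, pair_counts, class_counts, threshold=0.6):
    """Return unseen classes whose expectation reaches the (hypothetical) threshold."""
    blind = {}
    for candidate in candidates:
        if candidate in observed:
            continue  # already part of the partial scene
        score = expectation(observed, candidate, pair_counts, class_counts)
        if score >= threshold:
            blind[candidate] = score
    return blind


# Toy data, invented for this example only.
training_rooms = [
    ["bed", "lamp", "pillow"],
    ["bed", "pillow", "wardrobe"],
    ["sofa", "tv", "lamp"],
    ["sofa", "tv", "table"],
]
pair_counts, class_counts = learn_cooccurrence(training_rooms)
blind = add_blind_nodes(["bed"], ["pillow", "tv", "lamp"], pair_counts, class_counts)
print(blind)
```

In this toy setting, a partial room in which only a bed has been observed gains a blind node for a pillow, because pillows accompany beds in every training room, while the lamp and the TV fall below the threshold. The approach described in the abstract learns such correlations with a GCN over full scene graphs rather than from raw counts.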
