While deep learning has significantly advanced robotic object recognition, purely data-driven approaches often lack semantic consistency and fail to leverage valuable, pre-existing knowledge about the environment. This report presents the ExPrIS project, which addresses this challenge by investigating how knowledge-level expectations can serve as to improve object interpretation from sensor data. Our approach is based on the incremental construction of a 3D Semantic Scene Graph (3DSSG). We integrate expectations from two sources: contextual priors from past observations and semantic knowledge from external graphs like ConceptNet. These are embedded into a heterogeneous Graph Neural Network (GNN) to create an expectation-biased inference process. This method moves beyond static, frame-by-frame analysis to enhance the robustness and consistency of scene understanding over time. The report details this architecture, its evaluation, and outlines its planned integration on a mobile robotic platform.
翻译:尽管深度学习已显著推动了机器人物体识别的发展,但纯数据驱动的方法往往缺乏语义一致性,且未能充分利用环境中已有的宝贵先验知识。本报告介绍了ExPrIS项目,该项目通过研究如何将知识层期望作为先验来改进基于传感器数据的物体解释,以应对这一挑战。我们的方法基于增量式构建的三维语义场景图(3DSSG)。我们整合了来自两个来源的期望:来自过去观测的上下文先验,以及来自外部知识图谱(如ConceptNet)的语义知识。这些期望被嵌入到一个异构图神经网络(GNN)中,以创建一个具有期望偏置的推理过程。该方法超越了静态的逐帧分析,从而提升了场景理解在时间维度上的鲁棒性和一致性。报告详细阐述了该架构及其评估,并概述了其在移动机器人平台上的集成计划。