Many applications require robots to operate in environments shared with other agents, such as humans or other robots. Such shared scenes, however, typically undergo several kinds of long-term semantic change, so the ability to model and predict these changes is crucial for robot autonomy. In this work, we formalize the task of semantic scene variability estimation and identify three main types of semantic scene change: changes in the position of an object, in its semantic state, or in the composition of the scene as a whole. To represent this variability, we propose the Variable Scene Graph (VSG), which augments existing 3D Scene Graph (SG) representations with a variability attribute that represents the likelihood of discrete long-term change events. We present a novel method, DeltaVSG, to estimate the variability of VSGs in a supervised fashion. We evaluate our method on the 3RScan long-term dataset and show notable improvements over existing approaches on this novel task. DeltaVSG achieves an accuracy of 77.1% and a recall of 72.3%, and its predictions often mirror human intuition about how indoor scenes change over time. We further show the utility of VSG prediction for active robotic change detection, where it speeds up task completion by 66.0% compared to a scene-change-unaware planner. Our code is available as open source.
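To make the idea of a scene graph with a variability attribute more concrete, the following is a minimal sketch, not the DeltaVSG implementation. All class, field, and method names (`Variability`, `Node`, `VariableSceneGraph`, `any_change`, `rank_by_variability`) are hypothetical. The sketch assumes that each node stores one likelihood per change type named in the abstract (position, semantic state, composition) and that these likelihoods can be combined as if they were independent; the abstract specifies neither.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Variability:
    """Hypothetical per-object change likelihoods, each in [0, 1]."""

    position: float = 0.0     # likelihood that the object is moved
    state: float = 0.0        # likelihood that its semantic state changes
    composition: float = 0.0  # likelihood that it is added or removed

    def __post_init__(self) -> None:
        for name in ("position", "state", "composition"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")

    def any_change(self) -> float:
        """Likelihood of at least one change (independence is an assumption)."""
        unchanged = (
            (1.0 - self.position)
            * (1.0 - self.state)
            * (1.0 - self.composition)
        )
        return 1.0 - unchanged


@dataclass
class Node:
    """An object in the scene, annotated with its variability."""

    node_id: int
    label: str
    variability: Variability = field(default_factory=Variability)


@dataclass
class VariableSceneGraph:
    """Scene graph whose nodes carry a variability attribute."""

    nodes: Dict[int, Node] = field(default_factory=dict)
    # Each edge is (source node id, relation name, target node id).
    edges: List[Tuple[int, str, int]] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, src: int, relation: str, dst: int) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((src, relation, dst))

    def rank_by_variability(self) -> List[Node]:
        """Return nodes ordered from most to least likely to change."""
        return sorted(
            self.nodes.values(),
            key=lambda n: n.variability.any_change(),
            reverse=True,
        )


if __name__ == "__main__":
    graph = VariableSceneGraph()
    graph.add_node(Node(0, "wall"))
    graph.add_node(Node(1, "chair", Variability(position=0.5, state=0.5)))
    graph.add_edge(1, "standing next to", 0)
    for node in graph.rank_by_variability():
        print(node.label, node.variability.any_change())
```

A ranking like `rank_by_variability` is one plausible way a change-aware planner could decide which objects to inspect first; the planner evaluated in the paper may work differently.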