3D scene graph prediction aims to abstract complex 3D environments into structured graphs consisting of objects and their pairwise relationships. Existing approaches typically adopt object-centric graph neural networks, in which relation edge features are iteratively updated by aggregating messages from the connected object nodes. However, this design restricts relation representations to pairwise object context, making it difficult to capture the high-order relational dependencies that are essential for accurate relation prediction. To address this limitation, we propose LEO, a Link-guided Edge-centric relational reasoning framework with Object-aware fusion, which enables progressive reasoning from relation-level context to object-level understanding. Specifically, LEO first predicts potential links between object pairs to suppress irrelevant edges, and then transforms the original scene graph into a line graph in which each relation is treated as a node. A line graph neural network then performs edge-centric relational reasoning over this line graph to capture inter-relation context. The enriched relation features are subsequently fused back into the original object-centric graph to strengthen object-level reasoning and improve relation prediction. Our framework is model-agnostic and can be integrated with any existing object-centric method. Experiments on the 3DSSG dataset with two competitive baselines show consistent improvements, highlighting the effectiveness of the proposed edge-to-object reasoning paradigm.
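The following is a minimal, illustrative sketch (not the authors' implementation) of the edge-to-node transformation that LEO's edge-centric reasoning builds on: each relation edge of the object-centric scene graph becomes a node of a line graph, so that message passing on the line graph aggregates context across relations that share an object rather than across object pairs. It assumes NetworkX is available; the toy scene and the variable name `scene_edges` are hypothetical.

```python
# Sketch of the scene graph -> line graph conversion, assuming NetworkX.
import networkx as nx

# Toy object-centric scene graph: nodes are object instances,
# edges are candidate object-pair relations (illustrative only).
G = nx.Graph()
scene_edges = [
    ("chair", "table"),   # e.g. "standing next to"
    ("lamp", "table"),    # e.g. "standing on"
    ("table", "floor"),   # e.g. "supported by"
]
G.add_edges_from(scene_edges)

# Line graph: every edge of G becomes a node; two relation nodes are
# adjacent whenever the underlying relations share an object. This is the
# high-order, inter-relation context an edge-centric GNN can exploit.
L = nx.line_graph(G)

print("Relation nodes:", list(L.nodes()))
print("Inter-relation edges:", list(L.edges()))
# ("chair", "table") is adjacent to ("lamp", "table") and ("table", "floor")
# because all three relations involve "table".
```

In an actual model, the line-graph nodes would carry the learned relation features, a GNN on `L` would update them using neighboring relations, and the updated features would then be fused back into the object-centric graph as described in the abstract.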