Scene graph generation is a key enabler of environmental understanding for autonomous robotic systems. Most existing methods, however, are often hampered by background complexity, which limits their ability to recover the inherent topological information of the environment. In addition, the rich contextual information carried by depth cues is often left unused, making existing approaches less effective. To address these shortcomings, we present STDG, a depth-guided one-stage scene graph generation method. STDG consists of three purpose-built modules: the Depth Guided HHA Representation Generation Module, the Depth Guided Semi-Teaching Network Learning Module, and the Depth Guided Scene Graph Generation Module. Together, these modules exploit depth information across the whole pipeline, from depth signal generation and depth feature utilization to the final scene graph prediction. Importantly, this is achieved without any additional computational cost at inference time. Experimental results confirm that our method significantly improves the performance of one-stage scene graph generation baselines.