Alongside the rise of generative AI, interest in scene graph generation (SGG), which comprehensively captures the relationships and interactions between objects in an image as a structured graph-based representation, has increased significantly in recent years. However, because they rely on object-centric and dichotomous relationships, existing SGG methods have a limited ability to predict detailed relationships accurately. To address these problems, a new approach to modeling multi-object relationships, called edge dual scene graph generation (EdgeSGG), is proposed herein. EdgeSGG is based on an edge dual scene graph and a Dual Message Passing Neural Network (DualMPNN), which can capture rich contextual interactions between unconstrained objects. To facilitate the learning of edge dual scene graphs with a symmetric graph structure, the proposed DualMPNN learns both object- and relation-centric features to predict relation-aware contexts more accurately, and it allows fine-grained relational updates between objects. A comparative experiment with state-of-the-art (SoTA) methods was conducted using two public SGG datasets and six metrics across three subtasks. Compared with SoTA approaches, the proposed model exhibited substantial performance improvements across all SGG subtasks. Furthermore, an experiment on long-tail distributions revealed that incorporating the relationships between objects effectively mitigates existing long-tail problems.
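To make the notion of an edge dual scene graph concrete, the following is a minimal sketch assuming the standard line-graph construction: each relation triple of the original scene graph becomes a node of the dual graph, and two dual nodes are linked when their triples share an object. The abstract does not specify the exact construction used by EdgeSGG, so the function name `edge_dual`, the triple format, and the linking rule are illustrative assumptions rather than the authors' implementation.

```python
from itertools import combinations


def edge_dual(triples):
    """Build an edge dual (line) graph from (subject, predicate, object) triples.

    Assumption: two relations are adjacent in the dual graph when they share
    at least one object. This is the textbook line-graph rule, not necessarily
    the rule used by EdgeSGG.
    """
    dual_nodes = list(triples)  # one dual node per relation
    dual_edges = []
    for i, j in combinations(range(len(dual_nodes)), 2):
        s_i, _, o_i = dual_nodes[i]
        s_j, _, o_j = dual_nodes[j]
        shared = {s_i, o_i} & {s_j, o_j}  # objects common to both relations
        if shared:
            dual_edges.append((i, j, sorted(shared)))
    return dual_nodes, dual_edges


# Hypothetical toy scene graph, used only for illustration.
scene = [
    ("man", "riding", "horse"),
    ("man", "wearing", "hat"),
    ("horse", "on", "grass"),
]
nodes, edges = edge_dual(scene)
for i, j, shared in edges:
    print(nodes[i][1], "<->", nodes[j][1], "via", shared)
```

In this toy example, "riding" is linked to "wearing" through the shared object "man" and to "on" through "horse", whereas "wearing" and "on" share no object and remain unconnected. A relation-centric message passing step would then exchange information along these dual edges.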