The scene graph generation (SGG) task aims to identify the predicate that holds for each subject-object pair. However, existing datasets generally exhibit two kinds of imbalance: class imbalance among the predicted predicates and context imbalance among the given subject-object pairs, both of which present significant challenges for SGG. Most existing methods address only the predicate imbalance and ignore the imbalance of the subject-object pairs, and therefore fail to achieve satisfactory results. To address both kinds of imbalance, we propose a novel Environment Invariant Curriculum Relation learning (EICR) method, which can be applied in a plug-and-play fashion to existing SGG methods. Concretely, to remove the imbalance of the subject-object pairs, we first construct different distribution environments for the subject-object pairs and learn a model that is invariant to changes in the environment. Then, to remove the predicate imbalance, we construct a class-balanced curriculum learning strategy that balances the different environments. Comprehensive experiments on the VG and GQA datasets demonstrate that our EICR framework can serve as a general strategy for various SGG models and achieves significant improvements.
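The two-stage idea can be illustrated with a minimal sketch. It is not the EICR implementation: the abstract does not specify how environments are built, how invariance is enforced, or how the curriculum is scheduled. The sketch assumes that an environment is a reweighting of the training samples by subject-object pair frequency (controlled by a hypothetical exponent `tau`), that invariance is encouraged by penalising the variance of the per-environment risks (a stand-in in the style of risk extrapolation), and that the curriculum is a simple linear schedule that shifts weight from the least balanced to the most balanced environment. All function and parameter names are hypothetical.

```python
from collections import Counter


def environment_weights(groups, tau):
    """Per-sample weights proportional to (1 / group frequency) ** tau.

    tau = 0 keeps the original distribution; tau = 1 gives every group
    (here: every subject-object pair) the same total weight.
    Hypothetical construction, not taken from the paper.
    """
    counts = Counter(groups)
    raw = [(1.0 / counts[g]) ** tau for g in groups]
    total = sum(raw)
    return [w / total for w in raw]


def environment_risk(losses, weights):
    """Weighted average of per-sample losses under one environment."""
    return sum(l * w for l, w in zip(losses, weights))


def curriculum_mix(progress, n_envs):
    """Mixing weights over environments ordered from least to most balanced.

    progress in [0, 1] moves the weight linearly from the first
    environment to the last one (assumed schedule).
    """
    center = progress * (n_envs - 1)
    raw = [max(0.0, 1.0 - abs(i - center)) for i in range(n_envs)]
    total = sum(raw)
    return [r / total for r in raw]


def sketch_objective(losses, groups, progress, taus=(0.0, 0.5, 1.0), penalty=1.0):
    """Curriculum-weighted risk plus a variance penalty across environments."""
    risks = [environment_risk(losses, environment_weights(groups, t)) for t in taus]
    mix = curriculum_mix(progress, len(taus))
    mixed_risk = sum(m * r for m, r in zip(mix, risks))
    mean_risk = sum(risks) / len(risks)
    variance = sum((r - mean_risk) ** 2 for r in risks) / len(risks)
    return mixed_risk + penalty * variance


# Toy usage: three samples of a frequent pair, one sample of a rare pair.
pairs = [("man", "shirt")] * 3 + [("dog", "frisbee")]
per_sample_losses = [0.2, 0.2, 0.2, 1.0]
early = sketch_objective(per_sample_losses, pairs, progress=0.0)
late = sketch_objective(per_sample_losses, pairs, progress=1.0)
```

In this toy setting the rare pair carries the largest loss, so the objective late in the schedule weights it more heavily than early on. In a real SGG pipeline the per-sample losses would come from the predicate classifier of the underlying model.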