Unbiased Scene Graph Generation (USGG) aims to address biased predictions in Scene Graph Generation (SGG). To that end, data transfer methods convert coarse-grained predicates into fine-grained ones, mitigating the imbalanced predicate distribution. However, they overlook the contextual relevance between transferred labels and subject-object pairs, such as the unsuitability of 'eating' for 'woman-table'. Furthermore, they typically involve a two-stage process with significant computational cost: a model is first pre-trained for data transfer, and a second model is then trained from scratch on the transferred labels. We therefore introduce a plug-and-play method named CITrans, which iteratively trains SGG models with progressively enhanced data. First, we introduce Context-Restricted Transfer (CRT), which imposes subject-object constraints within the predicates' semantic space to achieve fine-grained data transfer. Subsequently, Efficient Iterative Learning (EIL) iteratively trains models and progressively generates enhanced labels that are consistent with the model's learning state, thereby accelerating the training process. Finally, extensive experiments show that CITrans achieves state-of-the-art results with high efficiency.
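To make the idea of a subject-object constraint concrete, the following is a minimal toy sketch of context-restricted label transfer. It is not the CITrans implementation: the function names (`build_context`, `restricted_transfer`), the candidate table, the scores, and the rule "a fine-grained predicate is allowed only if it has been observed with the same subject-object pair" are all assumptions made for illustration. The abstract does not specify how CRT defines its constraints.

```python
from collections import defaultdict


def build_context(triplets):
    """Map each (subject, object) pair to the predicates observed with it.

    Hypothetical stand-in for a subject-object constraint.
    """
    context = defaultdict(set)
    for subj, pred, obj in triplets:
        context[(subj, obj)].add(pred)
    return context


def restricted_transfer(triplet, fine_candidates, scores, context):
    """Replace a coarse predicate with the best-scoring fine-grained
    candidate that is allowed for this subject-object pair.

    Returns the triplet unchanged when no candidate passes the constraint.
    """
    subj, pred, obj = triplet
    allowed_here = context.get((subj, obj), set())
    allowed = [c for c in fine_candidates.get(pred, []) if c in allowed_here]
    if not allowed:
        return triplet
    best = max(allowed, key=lambda c: scores.get(c, 0.0))
    return (subj, best, obj)


# Toy data (all values are illustrative assumptions).
train = [
    ("woman", "near", "table"),
    ("woman", "sitting at", "table"),
    ("woman", "eating", "pizza"),
]
fine_candidates = {"near": ["sitting at", "eating"]}
# Assumed model confidences; in an iterative scheme these would be
# refreshed from the current model state at each round.
scores = {"eating": 0.9, "sitting at": 0.6}

context = build_context(train)
result = restricted_transfer(("woman", "near", "table"),
                             fine_candidates, scores, context)
print(result)  # ('woman', 'sitting at', 'table')
```

Without the constraint, the higher-scoring 'eating' would be chosen for 'woman-table'. With it, 'eating' is filtered out because it was never observed with that pair in the toy data.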