This work focuses on enhancing the training dataset for informative relational triplets in Scene Graph Generation (SGG). Due to the lack of effective supervision, current SGG models predict poorly on informative relational triplets that have few training samples. We therefore propose two novel training dataset enhancement modules: Feature Space Triplet Augmentation (FSTA) and Soft Transfer. FSTA leverages a feature generator trained to generate representations of an object in relational triplets. Its sampling, which is based on biased predictions, efficiently generates artificial triplets with a focus on the challenging ones. In addition, we introduce Soft Transfer, which assigns soft predicate labels to general relational triplets, effectively providing more supervision for informative predicate classes. Experimental results show that integrating FSTA and Soft Transfer achieves high levels of both Recall and mean Recall on the Visual Genome dataset. The average of Recall and mean Recall is the highest among all existing model-agnostic methods.
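To make the idea of a soft predicate label concrete, the following is a minimal sketch and not the method described in this work. The abstract does not specify how the soft labels are computed, so the mixing rule, the `alpha` weight, the `informative_scores` input, and both function names are assumptions made purely for illustration. The sketch keeps part of the label mass on the annotated general predicate and spreads the rest over candidate informative predicates in proportion to some score.

```python
import math


def soft_transfer_label(general_idx, informative_scores, num_classes, alpha=0.5):
    """Build a soft predicate label (hypothetical mixing rule).

    general_idx: index of the annotated general predicate (e.g. "on").
    informative_scores: dict mapping informative class index -> non-negative score.
    alpha: assumed fraction of label mass moved to informative classes.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    label = [0.0] * num_classes
    total = sum(informative_scores.values())
    if total <= 0:
        # No informative candidate: fall back to the original hard label.
        label[general_idx] = 1.0
        return label
    label[general_idx] = 1.0 - alpha
    for idx, score in informative_scores.items():
        label[idx] += alpha * score / total
    return label


def soft_cross_entropy(logits, soft_label):
    """Cross-entropy between a soft label and softmax(logits)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(q * (x - log_z) for q, x in zip(soft_label, logits))


# Illustrative classes: 0 = "on", 1 = "sitting on", 2 = "standing on".
example_label = soft_transfer_label(0, {1: 3.0, 2: 1.0}, num_classes=3, alpha=0.4)
example_loss = soft_cross_entropy([2.0, 1.0, 0.0], example_label)
```

In this toy case the general predicate "on" keeps 60% of the label mass and the two informative predicates share the remaining 40% in a 3:1 ratio, so a triplet annotated only with "on" also contributes a training signal to the informative classes.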