Self-supervised learning (SSL) algorithms based on instance discrimination have shown promising results, performing competitively with, or even outperforming, their supervised counterparts on some downstream tasks. Such approaches employ data augmentation to create two views of the same instance (i.e., a positive pair) and encourage the model to learn good representations by pulling these views closer in the embedding space without collapsing to the trivial solution. However, data augmentation is limited in the positive pairs it can represent, and the repulsion between instances during contrastive learning may discard features that are important for instances belonging to similar categories. To address this issue, we propose an approach that identifies images with similar semantic content and treats them as positive instances, thereby reducing the chance of discarding important features during representation learning and increasing the richness of the latent representation. Our approach is generic and works with any self-supervised instance discrimination framework, such as MoCo and SimSiam. To evaluate our method, we run experiments on three benchmark datasets: ImageNet, STL-10, and CIFAR-10, with different instance discrimination SSL approaches. The experimental results show that our approach consistently outperforms the baseline methods across all three datasets; for instance, we improve upon vanilla MoCo-v2 by 4.1% on ImageNet under a linear evaluation protocol over 800 epochs. We also report results on semi-supervised learning, transfer learning on downstream tasks, and object detection.
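Below is a minimal PyTorch sketch of how semantically similar instances might be folded into a MoCo-style contrastive objective: the memory-queue entries most similar to each query are treated as additional positives rather than negatives. The top-k nearest-neighbour mining rule, the temperature value, and the multi-positive loss form are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch: a MoCo-style InfoNCE loss where the queue entries nearest to
# each query are also attracted (treated as positives) instead of repelled.
# The mining rule (top-k cosine similarity) and loss form are assumptions
# for illustration only.
import torch
import torch.nn.functional as F


def nn_aware_contrastive_loss(q, k, queue, num_extra_pos=5, temperature=0.2):
    """q: (N, D) query embeddings, k: (N, D) momentum-encoder key embeddings,
    queue: (K, D) memory bank of past keys; all assumed L2-normalized."""
    # Logits against the augmented view (column 0) and against the queue.
    l_pos = torch.einsum("nd,nd->n", q, k).unsqueeze(1)        # (N, 1)
    l_queue = torch.einsum("nd,kd->nk", q, queue)              # (N, K)
    logits = torch.cat([l_pos, l_queue], dim=1) / temperature  # (N, 1+K)

    # Mark the most similar queue entries as pseudo-positives.
    _, nn_idx = l_queue.topk(num_extra_pos, dim=1)             # (N, m)
    pos_mask = torch.zeros_like(logits, dtype=torch.bool)
    pos_mask[:, 0] = True                                      # the true augmented view
    rows = torch.arange(q.size(0)).unsqueeze(1)
    pos_mask[rows, nn_idx + 1] = True                          # +1 skips the l_pos column

    # Cross-entropy with multiple positives: average log-softmax mass
    # assigned to all positive columns.
    log_prob = F.log_softmax(logits, dim=1)
    loss = -(log_prob * pos_mask.float()).sum(dim=1) / pos_mask.sum(dim=1)
    return loss.mean()


if __name__ == "__main__":
    N, D, K = 8, 128, 1024
    q = F.normalize(torch.randn(N, D), dim=1)
    k = F.normalize(torch.randn(N, D), dim=1)
    queue = F.normalize(torch.randn(K, D), dim=1)
    print(nn_aware_contrastive_loss(q, k, queue).item())
```

With `num_extra_pos=0` columns beyond the true view, this reduces to the standard MoCo InfoNCE loss; increasing it pulls more semantically similar queue entries toward the query, which is the intuition the abstract describes.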