Self-supervised learning algorithms based on instance discrimination effectively prevent representation collapse and produce promising results in representation learning. However, attracting positive pairs (i.e., two views of the same instance) in the embedding space while repelling all other instances (i.e., negative pairs), irrespective of their categories, can discard important features. To address this issue, we propose an approach that identifies images with similar semantic content and treats them as positive instances, forming a semantic positive pairs set (SPPS), thereby reducing the risk of discarding important features during representation learning. Our approach can work with any contrastive instance discrimination framework, such as SimCLR or MoCo. We evaluate our approach on three datasets: ImageNet, STL-10, and CIFAR-10. The experimental results show that our approach consistently outperforms the baseline, vanilla SimCLR, across all three datasets; for example, under the linear evaluation protocol it improves upon vanilla SimCLR by 4.18% on ImageNet with a batch size of 1024 and 800 epochs.
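The idea of collecting semantically similar images into a set of positive pairs can be illustrated with a minimal sketch. The abstract does not state how similarity is measured, so everything here is an assumption made for illustration: the embeddings are taken to come from some already-trained encoder, similarity is taken to be cosine similarity, and the function name `find_semantic_positive_pairs` and the `threshold` parameter are hypothetical rather than the authors' actual procedure.

```python
import numpy as np


def find_semantic_positive_pairs(embeddings, threshold=0.9):
    """Return index pairs (i, j), i < j, whose cosine similarity is >= threshold.

    Illustrative only: cosine similarity and the threshold value are
    assumptions, not details given in the abstract.
    """
    embeddings = np.asarray(embeddings, dtype=np.float64)
    # L2-normalize each row so that a dot product equals cosine similarity.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)
    similarity = unit @ unit.T
    # Look only at the strict upper triangle: skips self-pairs and duplicates.
    rows, cols = np.triu_indices(len(unit), k=1)
    keep = similarity[rows, cols] >= threshold
    return list(zip(rows[keep].tolist(), cols[keep].tolist()))


# Toy embeddings: the first two vectors point in nearly the same direction.
toy_embeddings = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]]
pairs = find_semantic_positive_pairs(toy_embeddings, threshold=0.9)
print(pairs)
```

In a contrastive framework such as SimCLR, pairs found this way would be treated as additional positives instead of being repelled as negatives; how they enter the loss is not specified in the abstract and is left out of this sketch.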