Contrastive learning predicts whether two images belong to the same category by training a model to pull their feature representations as close together, or push them as far apart, as possible. In this paper, we rethink how samples are mined in contrastive learning. Unlike other methods, our approach is more comprehensive: it takes both positive and negative samples into account and mines potential samples from two aspects. First, for positive samples, we consider both the augmented views obtained by data augmentation and the mined views obtained through data mining, and we weight and combine them using both soft and hard weighting strategies. Second, because the negative samples include uninformative negatives and false negatives, we analyze them from the gradient perspective and mine negatives that are neither too hard nor too easy as potential negative samples, i.e., those negative samples that are close to positive samples. Experiments show clear advantages of our method over some traditional self-supervised methods: it achieves 88.57%, 61.10%, and 36.69% top-1 accuracy on CIFAR10, CIFAR100, and TinyImagenet, respectively.
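The two mining steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the use of cosine similarity, the `low`/`high` similarity band, the `threshold`, and the `temperature` are all assumptions chosen for illustration. The sketch keeps negatives whose similarity to the anchor falls in a middle band, and assigns hard (0/1) or soft (softmax) weights to mined positive views.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mine_negatives(anchor, candidates, low=0.2, high=0.8):
    """Return indices of candidates with mid-range similarity to the anchor.

    `low` and `high` are hypothetical cutoffs (not from the paper):
    below `low` a negative is treated as too easy (uninformative),
    above `high` as too hard (a possible false negative).
    """
    return [
        i for i, c in enumerate(candidates)
        if low <= cosine(anchor, c) <= high
    ]


def positive_weights(anchor, mined, mode="soft", threshold=0.5, temperature=0.1):
    """Weight mined positive views by their similarity to the anchor.

    mode="hard": weight 1.0 if similarity >= threshold, else 0.0.
    mode="soft": softmax over similarity / temperature.
    Both schemes are illustrative assumptions.
    """
    sims = [cosine(anchor, m) for m in mined]
    if not sims:
        return []
    if mode == "hard":
        return [1.0 if s >= threshold else 0.0 for s in sims]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(sims)
    exps = [math.exp((s - peak) / temperature) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]
```

With an anchor of `[1.0, 0.0]`, a candidate such as `[1.0, 1.0]` (similarity about 0.71) falls inside the assumed band and is kept, while a near-duplicate of the anchor or an orthogonal vector is discarded. In practice the band would more plausibly be derived from the gradient analysis the abstract mentions than from fixed constants.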