Self-supervised contrastive learning (CL) has achieved state-of-the-art performance in representation learning by minimizing the distance between positive pairs while maximizing the distance between negative pairs. Recent work has shown that models learn better representations from diversely augmented positive pairs, because such pairs make the model more view-invariant. However, only a few CL studies have considered the difference between augmented views, and those have not gone beyond hand-crafted findings. In this paper, we first observe that the score-matching function can measure how much data has changed from the original through augmentation. Using this property, every pair in CL can be weighted adaptively by the difference of its score values, which boosts the performance of existing CL methods. We show the generality of our method, referred to as ScoreCL, by consistently improving various CL methods (SimCLR, SimSiam, W-MSE, and VICReg) by up to 3 percentage points in k-NN evaluation on CIFAR-10, CIFAR-100, and ImageNet-100. Moreover, we conduct exhaustive experiments and ablations, including results on diverse downstream tasks, comparisons with possible baselines, and improvements when combined with other proposed augmentation methods. We hope our exploration will inspire more research on exploiting score matching for CL.
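The pair-weighting idea described above can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so everything specific here is an assumption made for illustration: the function names (`info_nce_per_pair`, `score_gap_weights`, `weighted_contrastive_loss`), the one-directional InfoNCE-style loss used as the base objective, the rule `1 + gap / max_gap`, and the choice to give larger weights to pairs with larger score differences. The score values are treated as precomputed scalars, one per augmented view; how they are obtained from a score-matching model is outside the scope of this sketch.

```python
import math


def _normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce_per_pair(z1, z2, temperature=0.5):
    """InfoNCE-style loss for each positive pair (z1[i], z2[i]).

    For anchor z1[i], every z2[j] with j != i acts as a negative.
    Returns one non-negative loss value per pair.
    """
    z1 = [_normalize(v) for v in z1]
    z2 = [_normalize(v) for v in z2]
    losses = []
    for i, anchor in enumerate(z1):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [sum(a * b for a, b in zip(anchor, other)) / temperature
                  for other in z2]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_sum_exp - logits[i])
    return losses


def score_gap_weights(score1, score2, eps=1e-8):
    """Hypothetical weighting rule (an assumption, not taken from the paper).

    Maps the absolute score difference of each pair to a weight in [1, 2],
    so that pairs whose views differ more contribute more to the loss.
    """
    gaps = [abs(a - b) for a, b in zip(score1, score2)]
    largest = max(gaps)
    return [1.0 + g / (largest + eps) for g in gaps]


def weighted_contrastive_loss(z1, z2, score1, score2, temperature=0.5):
    """Mean of the per-pair losses, each scaled by its score-gap weight."""
    losses = info_nce_per_pair(z1, z2, temperature)
    weights = score_gap_weights(score1, score2)
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Toy usage: two pairs of 2-D embeddings with made-up score values.
z1 = [[1.0, 0.0], [0.0, 1.0]]
z2 = [[1.0, 0.1], [0.2, 1.0]]
loss = weighted_contrastive_loss(z1, z2, score1=[0.0, 3.0], score2=[0.0, 1.0])
```

When both views of every pair have the same score, all weights equal 1 and the result reduces to the plain mean of the per-pair losses, so in this sketch the weighting only changes the objective where the augmentations actually differ.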