Self-supervised learning makes it possible to train models without labeled data, avoiding expensive annotation. However, most existing self-supervised contrastive learning methods overlook the combination of global and local feature information. This paper proposes a multi-network contrastive learning framework based on global and local representations. Multiple networks introduce global and local feature information into self-supervised contrastive learning, and the model learns image features at different scales by contrasting the embedding pairs these networks generate. The framework also increases the number of samples available for contrast and improves training efficiency. Linear evaluation on three benchmark datasets shows that our method outperforms several classical self-supervised learning methods.
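The idea of contrasting embedding pairs produced by several networks can be illustrated with a minimal sketch. The abstract does not state the loss function or the network layout, so everything below is an assumption made for illustration: the sketch uses the widely used InfoNCE loss, treats each network's output as a batch of plain vectors, and the branch names (`global_a`, `global_b`, `local_a`) are hypothetical. It is not the paper's implementation.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.5):
    """Mean InfoNCE loss over a batch.

    anchors[i] and positives[i] are embeddings of the same image;
    every other positives[j] in the batch serves as a negative.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def multi_pair_loss(embeddings, temperature=0.5):
    """Average InfoNCE over every ordered pair of embedding sets.

    `embeddings` maps a (hypothetical) branch name to a batch of vectors,
    so adding a branch adds more embedding pairs to contrast.
    """
    names = list(embeddings)
    losses = [
        info_nce(embeddings[s], embeddings[t], temperature)
        for s in names
        for t in names
        if s != t
    ]
    return sum(losses) / len(losses)


# Toy batch of two images; each assumed branch emits one 2-D vector per image.
batch = [[1.0, 0.0], [0.0, 1.0]]
branches = {"global_a": batch, "global_b": batch, "local_a": batch}
loss = multi_pair_loss(branches)
```

With three branches the sketch contrasts six ordered pairs instead of the single pair of a two-view setup, which is one plausible reading of how a multi-network design enlarges the set of contrasted samples.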