Nearest neighbor (NN) sampling provides more semantic variation than pre-defined transformations in self-supervised learning (SSL) for image recognition. However, its performance is limited by the quality of the support set, which holds the positive samples for the contrastive loss. In this work, we show that the quality of the support set plays a crucial role in any nearest-neighbor-based SSL method. We then provide a refined baseline (pNNCLR) for the nearest-neighbor-based SSL approach (NNCLR). To this end, we introduce pseudo nearest neighbors (pNN) to control the quality of the support set: rather than sampling the nearest neighbors directly, we sample in the vicinity of hard nearest neighbors by varying the magnitude of the resultant vector, and we employ a stochastic sampling strategy to improve performance. Additionally, to stabilize the effects of uncertainty in NN-based learning, we train the proposed network with a smooth-weight-update approach. Evaluation on multiple public image recognition and medical image recognition datasets shows that the proposed method performs up to 8 percent better than the baseline nearest neighbor method and is comparable to other previously proposed SSL methods.
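The pNN idea described above can be illustrated with a minimal NumPy sketch. This is one possible reading of the abstract, not the authors' implementation: the function names, the `top_k` and `alpha_range` parameters, the use of cosine similarity, the uniform choice among the closest support entries, and the interpretation of the "resultant vector" as the difference between neighbor and query are all assumptions made for illustration. The exponential-moving-average form of the smooth weight update is likewise an assumed stand-in, since the abstract does not specify it.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def pseudo_nearest_neighbor(query, support, rng, top_k=5, alpha_range=(0.5, 1.0)):
    """Return a hypothetical pseudo nearest neighbor for one query embedding.

    Assumptions (not from the source): cosine similarity, a uniform random
    choice among the top_k neighbors, and a uniformly sampled magnitude alpha.
    """
    q = l2_normalize(query)
    s = l2_normalize(support)
    sims = s @ q                               # cosine similarity to every support entry
    top = np.argsort(-sims)[:top_k]            # indices of the top_k most similar entries
    nn = s[rng.choice(top)]                    # stochastic pick among those candidates
    alpha = rng.uniform(*alpha_range)          # magnitude applied to the resultant vector
    return l2_normalize(q + alpha * (nn - q))  # sample in the vicinity of the neighbor


def ema_update(target, online, momentum=0.99):
    """Smoothly move `target` weights toward `online` weights (assumed EMA form)."""
    return momentum * target + (1.0 - momentum) * online


rng = np.random.default_rng(0)
support_set = rng.normal(size=(32, 8))   # toy support set of 32 embeddings
query_emb = rng.normal(size=8)           # toy query embedding
positive = pseudo_nearest_neighbor(query_emb, support_set, rng)
```

With `top_k=1` and `alpha_range=(1.0, 1.0)` the sketch reduces to plain nearest neighbor sampling as in NNCLR, which makes the extra stochasticity and magnitude control easy to switch off for comparison.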