Learning from large amounts of unsupervised data and a small amount of supervision is an important open problem in computer vision. We propose a new semi-supervised learning method, Semantic Positives via Pseudo-Labels (SemPPL), that combines labelled and unlabelled data to learn informative representations. Our method extends self-supervised contrastive learning -- where representations are shaped by distinguishing whether two samples represent the same underlying datum (positives) or not (negatives) -- with a novel approach to selecting positives. To enrich the set of positives, we use the few existing ground-truth labels to predict the missing ones with a $k$-nearest neighbours classifier operating on the learned embeddings of the labelled data. We then extend the set of positives with datapoints that share the same pseudo-label and call these semantic positives. We jointly learn the representation and predict bootstrapped pseudo-labels, which creates a reinforcing cycle: strong initial representations enable better pseudo-label predictions, which in turn improve the selection of semantic positives and lead to even better representations. SemPPL outperforms competing semi-supervised methods, setting a new state of the art of $68.5\%$ and $76\%$ top-$1$ accuracy on ImageNet when using a ResNet-$50$ and training on $1\%$ and $10\%$ of labels, respectively. Furthermore, when using selective kernels, SemPPL significantly outperforms the previous state of the art, achieving $72.3\%$ and $78.3\%$ top-$1$ accuracy on ImageNet with $1\%$ and $10\%$ of labels, respectively, an absolute improvement of $+7.8\%$ and $+6.2\%$ over previous work. SemPPL also exhibits state-of-the-art performance on larger ResNet models, as well as strong robustness, out-of-distribution and transfer performance. We release the checkpoints and the evaluation code at https://github.com/deepmind/semppl.
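To make the pseudo-labelling step concrete, the following is a minimal NumPy sketch of how a $k$-nearest neighbours vote over labelled embeddings could assign pseudo-labels, and how a semantic-positive mask could then be derived from them. It is an illustration under stated assumptions, not the released SemPPL code: the function names `knn_pseudo_labels` and `semantic_positive_mask`, the use of cosine similarity, the majority vote with ties broken towards the lowest class id, and the default `k=3` are all hypothetical choices made for this example.

```python
import numpy as np


def knn_pseudo_labels(unlabelled_emb, labelled_emb, labels, k=3):
    """Assign each unlabelled embedding the majority label of its k nearest
    labelled embeddings (cosine similarity; an assumption of this sketch)."""
    # L2-normalise so that a dot product equals cosine similarity.
    u = unlabelled_emb / np.linalg.norm(unlabelled_emb, axis=1, keepdims=True)
    l = labelled_emb / np.linalg.norm(labelled_emb, axis=1, keepdims=True)
    sims = u @ l.T                                # (n_unlabelled, n_labelled)
    nearest = np.argsort(-sims, axis=1)[:, :k]    # indices of the k most similar
    votes = labels[nearest]                       # (n_unlabelled, k)
    num_classes = int(labels.max()) + 1
    # Count votes per class; argmax breaks ties towards the lowest class id.
    counts = np.stack([np.bincount(v, minlength=num_classes) for v in votes])
    return counts.argmax(axis=1)


def semantic_positive_mask(anchor_pseudo, candidate_pseudo):
    """mask[i, j] is True when candidate j shares anchor i's pseudo-label,
    i.e. candidate j may serve as a semantic positive for anchor i."""
    return anchor_pseudo[:, None] == candidate_pseudo[None, :]


# Toy data: two well-separated classes in a 2-D embedding space.
labelled_emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([0, 0, 1, 1])
unlabelled_emb = np.array([[1.0, 0.2], [0.2, 1.0], [0.8, 0.1]])

pseudo = knn_pseudo_labels(unlabelled_emb, labelled_emb, labels, k=3)
mask = semantic_positive_mask(pseudo, pseudo)
```

On this toy data the pseudo-labels come out as `[0, 1, 0]`, so the first and third unlabelled points are semantic positives of each other. In a training loop the embeddings would be recomputed as the encoder improves, which is the reinforcing cycle the abstract describes.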