Contrastive learning has become a dominant approach in self-supervised visual representation learning. Hard negatives, samples that closely resemble the anchor, are key to enhancing the discriminative power of learned representations. However, leveraging hard negatives efficiently remains challenging. We introduce SynCo (Synthetic Negatives in Contrastive learning), a novel approach that improves model performance by generating synthetic hard negatives in the representation space. Building on the MoCo framework, SynCo introduces six strategies for creating diverse synthetic hard negatives on-the-fly with minimal computational overhead. SynCo achieves faster training and better representation learning, reaching 67.9% top-1 accuracy on ImageNet ILSVRC-2012 linear evaluation after 200 pretraining epochs, surpassing MoCo's 67.5% with the same ResNet-50 encoder. It also transfers more effectively to detection tasks: on PASCAL VOC, it reaches 82.5% AP, outperforming both the supervised baseline and MoCo; on COCO, it sets new benchmarks with 40.9% AP for bounding box detection and 35.5% AP for instance segmentation. Our synthetic hard negative generation approach significantly enhances visual representations learned through self-supervised contrastive learning. Code is available at https://github.com/giakoumoglou/synco.
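To make the idea of generating synthetic hard negatives in the representation space concrete, the following is a minimal NumPy sketch under stated assumptions, not the authors' implementation. The abstract does not specify the six strategies, so the sketch illustrates only one plausible strategy: interpolating between the query and its hardest queue negatives. The function names (`l2_normalize`, `synthesize_hard_negatives`, `info_nce`), the parameters `n_hard` and `alpha_max`, and the temperature of 0.2 are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def synthesize_hard_negatives(query, queue, n_hard=8, alpha_max=0.5, rng=None):
    """Assumed strategy: mix the query into its hardest negatives.

    query: (d,) unit vector; queue: (K, d) unit vectors (MoCo-style memory queue).
    Returns (n_hard, d) unit vectors that lie between each hard negative
    and the query, so they are at least as similar to the query as their sources.
    """
    rng = np.random.default_rng() if rng is None else rng
    sims = queue @ query                       # cosine similarity to the query
    hard_idx = np.argsort(-sims)[:n_hard]      # indices of the most similar negatives
    hard = queue[hard_idx]
    # One mixing coefficient per synthetic sample; alpha < 0.5 keeps the
    # result closer to the negative than to the query.
    alpha = rng.uniform(0.0, alpha_max, size=(n_hard, 1))
    synthetic = alpha * query + (1.0 - alpha) * hard
    return l2_normalize(synthetic)


def info_nce(query, key, negatives, temperature=0.2):
    """InfoNCE loss for one query with one positive key and a set of negatives."""
    pos = np.array([query @ key])
    neg = negatives @ query
    logits = np.concatenate([pos, neg]) / temperature
    logits = logits - logits.max()             # shift for numerical stability
    return -(logits[0] - np.log(np.exp(logits).sum()))


# Toy data standing in for encoder outputs (random, for illustration only).
rng = np.random.default_rng(0)
d, K = 128, 1024
query = l2_normalize(rng.normal(size=d))
key = l2_normalize(query + 0.1 * rng.normal(size=d))
queue = l2_normalize(rng.normal(size=(K, d)))

synthetic = synthesize_hard_negatives(query, queue, rng=rng)
loss_base = info_nce(query, key, queue)
loss_syn = info_nce(query, key, np.concatenate([queue, synthetic]))
```

In this sketch the synthetic negatives are appended to the queue only when computing the loss, which is one way to keep the extra cost small: no additional encoder forward passes are needed, only a top-k selection and a few vector operations per query.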