In the vision domain, large-scale natural datasets typically exhibit a long-tailed distribution, with a large class imbalance between head and tail classes. This imbalance makes it difficult to learn good representations for tail classes. Recent work has shown that a good long-tailed model can be learned by decoupling training into representation learning and classifier balancing. However, these works pay insufficient attention to the effect of the long tail on representation learning. In this work, we propose interpolative centroid contrastive learning (ICCL) to improve long-tailed representation learning. ICCL interpolates two images, one drawn from a class-agnostic sampler and one from a class-aware sampler, and trains the model so that the representation of the interpolated image can be used to retrieve the centroids of both source classes. We demonstrate the effectiveness of our approach on multiple long-tailed image classification benchmarks. Our results show a significant accuracy gain of 2.8% on the iNaturalist 2018 dataset, which has a real-world long-tailed distribution.
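To make the interpolation idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a mixup-style linear pixel interpolation with a given coefficient `lam`, cosine similarity to per-class centroids, and a softmax temperature; none of these details is specified in the abstract. All function names (`sample_class_agnostic`, `sample_class_aware`, `interpolate_images`, `centroid_contrastive_loss`) are hypothetical.

```python
import numpy as np


def sample_class_agnostic(labels, rng):
    """Pick a sample index uniformly over samples (head classes dominate)."""
    return int(rng.integers(len(labels)))


def sample_class_aware(labels, rng):
    """Pick a class uniformly, then a sample uniformly within that class."""
    cls = rng.choice(np.unique(labels))
    return int(rng.choice(np.flatnonzero(labels == cls)))


def interpolate_images(x_agnostic, x_aware, lam):
    """Assumed mixup-style interpolation: lam weights the class-agnostic image."""
    return lam * x_agnostic + (1.0 - lam) * x_aware


def centroid_contrastive_loss(z, centroids, y_agnostic, y_aware, lam,
                              temperature=0.1):
    """Loss that is low when z retrieves the centroids of both source classes.

    z          : (d,) embedding of the interpolated image
    centroids  : (num_classes, d) one centroid per class
    y_agnostic : class label of the class-agnostic sample
    y_aware    : class label of the class-aware sample
    """
    # Cosine similarity between the embedding and every class centroid.
    z = z / np.linalg.norm(z)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    logits = c @ z / temperature

    # Numerically stable log-softmax over classes.
    logits = logits - logits.max()
    log_prob = logits - np.log(np.exp(logits).sum())

    # Weight the two source classes by the interpolation coefficient.
    return -(lam * log_prob[y_agnostic] + (1.0 - lam) * log_prob[y_aware])
```

In a training loop, one index would come from each sampler, the two images would be mixed, the mixed image would be passed through the encoder, and the resulting embedding would be scored against the current class centroids. How the centroids are computed and updated is left open here, since the abstract does not say.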