Contrastive learning (CL) is a predominant technique in image classification, but it shows limited performance on imbalanced datasets. Recently, several supervised CL methods have been proposed to promote an ideal regular-simplex geometric configuration in the representation space, characterized by intra-class feature collapse and uniformly spaced inter-class means, especially for imbalanced datasets. In particular, existing prototype-based methods include class prototypes as additional samples so that every class is represented in each batch. However, existing CL methods suffer from two limitations. First, they do not consider the alignment between the class means/prototypes and the classifier, which can lead to poor generalization. Second, existing prototype-based methods treat each prototype as only one additional sample per class, so a prototype's influence depends on the number of class instances in a batch, causing unbalanced contributions across classes. To address these limitations, we propose Equilibrium Contrastive Learning (ECL), a supervised CL framework designed to promote geometric equilibrium, in which class features, class means, and classifier weights are harmoniously balanced under data imbalance. The proposed ECL framework has two main components. First, ECL promotes representation geometric equilibrium (i.e., a regular-simplex geometry characterized by collapsed class samples and uniformly distributed class means) while balancing the contributions of class-average features and class prototypes. Second, ECL establishes a classifier-class-center geometric equilibrium by aligning classifier weights with class prototypes. We ran experiments on three long-tailed datasets, CIFAR-10(0)-LT and ImageNet-LT, and two imbalanced medical datasets, ISIC 2019 and our constructed LCCT dataset. Results show that ECL outperforms existing state-of-the-art supervised CL methods designed for imbalanced classification.
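The regular-simplex target geometry the abstract refers to can be made concrete. The sketch below (not the paper's implementation; the function name and construction are ours, assuming the standard simplex equiangular tight frame from the neural-collapse literature) builds a set of unit-norm class directions whose pairwise cosine similarity is exactly \(-1/(C-1)\), the maximally and uniformly separated configuration that class means and classifier weights would jointly reach at equilibrium:

```python
import numpy as np

def simplex_etf(num_classes: int, dim: int, seed: int = 0) -> np.ndarray:
    """Return (num_classes, dim) unit vectors forming a simplex
    equiangular tight frame: every pair of class directions has
    cosine similarity -1/(num_classes - 1)."""
    assert dim >= num_classes, "need enough ambient dimensions"
    rng = np.random.default_rng(seed)
    # Random orthonormal columns U in R^{dim x num_classes}.
    U, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    C = num_classes
    # Center the standard basis, then rescale so rows are unit norm:
    # M = sqrt(C/(C-1)) * U (I - (1/C) 11^T).
    centering = np.eye(C) - np.ones((C, C)) / C
    M = np.sqrt(C / (C - 1)) * (U @ centering)
    return M.T  # rows are class directions

protos = simplex_etf(num_classes=4, dim=8)
cos = protos @ protos.T  # 1 on the diagonal, -1/3 off-diagonal
```

Such a fixed frame could serve as the set of class prototypes (and classifier weights) toward which both per-sample features and class-average features are pulled, which is one common way the collapsed-and-uniform geometry described above is operationalized.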