Real-world data often exhibit severe class imbalance and long-tailed distributions, in which minority classes are heavily underrepresented relative to majority classes. Recent work favors multi-expert architectures to reduce model uncertainty on minority classes, using collaborative learning (i.e., online distillation) to aggregate the experts' knowledge. In this paper, we observe that knowledge transfer between experts is itself imbalanced across classes, which limits the performance gains on minority classes. To address this, we propose a re-weighted distillation loss that compares the predictions of two classifiers, one supervised by online distillation and the other by label annotations. We also emphasize that feature-level distillation significantly improves model performance and increases feature robustness. Finally, we propose an Effective Collaborative Learning (ECL) framework that integrates a contrastive proxy task branch to further improve feature quality. Quantitative and qualitative experiments on four standard datasets demonstrate that ECL achieves state-of-the-art performance, and detailed ablation studies confirm the effectiveness of each component.
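The re-weighted distillation loss is described above only at a high level, so the following is a minimal sketch of one possible reading, not the formulation used in ECL. It assumes a per-sample weight equal to the ratio of the true-class probabilities given by the label-supervised classifier and the distillation-supervised classifier, clipped to a maximum, and applies that weight to a temperature-scaled KL divergence between a teacher expert and a student expert. The function names, the ratio-based weight, the clipping value `max_weight`, and the temperature are all hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def sample_weight(kd_head_logits, label_head_logits, label, max_weight=5.0):
    """Hypothetical weight: compare the two classifiers on the true class.

    The weight grows when the label-supervised head is more confident on the
    true class than the distillation-supervised head. This rule is an
    assumption for illustration, not the paper's formula.
    """
    p_kd = softmax(kd_head_logits)[label]
    p_label = softmax(label_head_logits)[label]
    return min(p_label / max(p_kd, 1e-12), max_weight)


def reweighted_distillation_loss(student_logits, teacher_logits,
                                 kd_head_logits, label_head_logits,
                                 labels, temperature=2.0):
    """Batch mean of per-sample weighted KL(teacher || student)."""
    total = 0.0
    for s, t, kd, lb, y in zip(student_logits, teacher_logits,
                               kd_head_logits, label_head_logits, labels):
        w = sample_weight(kd, lb, y)
        kl = kl_divergence(softmax(t, temperature), softmax(s, temperature))
        # The temperature**2 factor is the usual scaling for softened targets.
        total += w * (temperature ** 2) * kl
    return total / len(labels)
```

Under these assumptions, samples on which the distillation-supervised classifier lags the label-supervised one receive a larger distillation weight, which is one way the comparison between the two classifiers could counteract class-imbalanced knowledge transfer.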