Long-tailed image recognition is a computer vision problem that considers a real-world class distribution rather than an artificially uniform one. Existing methods typically sidestep the problem by i) adjusting the loss function, ii) decoupling classifier learning from representation learning, or iii) proposing new multi-head architectures whose heads are called experts. In this paper, we tackle the problem from a different perspective: we augment the training dataset to enhance the sample diversity of minority classes. Specifically, our method, Confusion-Pairing Mixup (CP-Mix), estimates the confusion distribution of the model and addresses the data deficiency problem by augmenting samples from confusion pairs in real time. In this way, CP-Mix trains the model to mitigate its own weaknesses and to distinguish pairs of classes that it frequently confuses. In addition, CP-Mix uses a novel mixup formulation to handle the bias in decision boundaries that originates from the imbalanced dataset. Extensive experiments demonstrate that CP-Mix outperforms existing methods for long-tailed image recognition and successfully relieves the confusion of the classifier.
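The confusion-pairing idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`update_confusion`, `sample_confusion_pairs`, `mixup_pair`) are hypothetical, the confusion distribution is assumed to be a running count matrix of (true class, predicted class) pairs, and standard mixup interpolation stands in for the novel mixup formulation, which the abstract does not specify.

```python
import numpy as np


def update_confusion(confusion, labels, preds):
    """Accumulate counts so that confusion[i, j] is the number of times
    a sample of true class i was predicted as class j."""
    np.add.at(confusion, (np.asarray(labels), np.asarray(preds)), 1)
    return confusion


def sample_confusion_pairs(confusion, n_pairs, rng):
    """Sample (true, predicted) class pairs with probability proportional
    to the off-diagonal (misclassification) counts."""
    k = confusion.shape[0]
    weights = confusion.astype(float)
    np.fill_diagonal(weights, 0.0)  # correct predictions are not confusions
    if weights.sum() == 0:
        # Assumption: with no recorded errors, fall back to uniform pairs.
        weights = 1.0 - np.eye(k)
    probs = (weights / weights.sum()).ravel()
    flat = rng.choice(k * k, size=n_pairs, p=probs)
    return np.stack(np.unravel_index(flat, (k, k)), axis=1)


def mixup_pair(x_a, x_b, y_a, y_b, num_classes, lam):
    """Standard mixup (a stand-in, not the CP-Mix formulation):
    interpolate two inputs and build the matching soft label."""
    x = lam * x_a + (1.0 - lam) * x_b
    y = np.zeros(num_classes)
    y[y_a] += lam
    y[y_b] += 1.0 - lam
    return x, y
```

In a training loop, one would presumably call `update_confusion` on each batch of predictions, draw class pairs with `sample_confusion_pairs`, pick one training image from each class in a pair, and feed the result of `mixup_pair` to the model. How the mixing coefficient `lam` should be chosen to counter the decision-boundary bias is left open here, since the abstract gives no details.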