Recognition problems on long-tailed data, where the number of samples per class is heavily skewed, have recently gained importance because the per-class sample size in a dataset generally follows an exponential distribution unless it is intentionally adjusted. Various approaches have been devised to address these problems. Recently, weight balancing, which combines well-known classical regularization techniques with two-stage training, has been proposed. Despite its simplicity, it is known to perform well compared with a wide variety of existing methods. However, it is not well understood why this approach is effective for long-tailed data. In this study, we analyze weight balancing at each training stage, focusing on neural collapse and the cone effect, and find that its effect can be decomposed into two parts: an increase in the Fisher's discriminant ratio of the feature extractor, caused by weight decay and cross-entropy loss, and an implicit logit adjustment, caused by weight decay and class-balanced loss. Our analysis shows that the training method can be further simplified by reducing the number of training stages to one while increasing accuracy.
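The two quantities named in the abstract can be made concrete with a small sketch. The code below is illustrative only and is not the authors' implementation: `fisher_ratio` uses a trace-ratio form of Fisher's discriminant ratio (between-class scatter divided by within-class scatter), which is an assumption, since the abstract does not state the exact definition used; `adjust_logits` shows the explicit, additive form of logit adjustment with a hypothetical temperature parameter `tau`, whereas the abstract describes an implicit adjustment that arises from training. All function and parameter names are hypothetical.

```python
import math
from collections import defaultdict


def fisher_ratio(features, labels):
    """Trace-ratio Fisher discriminant ratio: tr(S_between) / tr(S_within).

    features: list of equal-length lists of floats; labels: list of class ids.
    Assumed definition for illustration; other forms exist.
    Raises ZeroDivisionError if every sample equals its class mean.
    """
    groups = defaultdict(list)
    for x, y in zip(features, labels):
        groups[y].append(x)

    dim = len(features[0])
    n = len(features)
    global_mean = [sum(x[j] for x in features) / n for j in range(dim)]

    within = 0.0   # summed squared distance of samples to their class mean
    between = 0.0  # size-weighted squared distance of class means to global mean
    for members in groups.values():
        mean = [sum(x[j] for x in members) / len(members) for j in range(dim)]
        within += sum((x[j] - mean[j]) ** 2 for x in members for j in range(dim))
        between += len(members) * sum(
            (mean[j] - global_mean[j]) ** 2 for j in range(dim)
        )
    return between / within


def adjust_logits(logits, class_counts, tau=1.0):
    """Explicit additive logit adjustment: subtract tau * log(class prior)."""
    total = sum(class_counts)
    return [z - tau * math.log(c / total) for z, c in zip(logits, class_counts)]


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    tight = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1]]
    loose = [[0, 0], [2, 0], [0, 2], [1, 1], [3, 1], [1, 3]]
    # Compact, well-separated classes give a larger ratio than overlapping ones.
    print(fisher_ratio(tight, labels), fisher_ratio(loose, labels))

    # A head class with 900 samples and a tail class with 100 samples.
    print(adjust_logits([2.0, 1.9], [900, 100]))
```

In this toy example, a higher ratio corresponds to features whose classes are compact and well separated, and the adjustment raises the tail-class logit relative to the head-class logit, which is the direction of correction the abstract attributes to weight decay combined with class-balanced loss.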