Recognition problems on long-tailed data, where the number of samples per class is heavily skewed, have recently gained importance because the distribution of per-class sample sizes in a dataset is generally exponential unless it is intentionally adjusted. Various approaches have been devised to address these problems. Recently, weight balancing, which combines well-known classical regularization techniques with two-stage training, has been proposed. Despite its simplicity, it is known to achieve high performance compared with existing methods devised in various ways. However, it is not well understood why this approach is effective for long-tailed data. In this study, we analyze the method at each training stage, focusing on neural collapse and the cone effect, and find that its effect can be decomposed into two parts: an increase in the Fisher's discriminant ratio of the feature extractor, caused by weight decay and cross-entropy loss, and implicit logit adjustment, caused by weight decay and class-balanced loss. Our analysis shows that the training method can be further simplified by reducing the number of training stages to one while increasing accuracy.
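To make the two quantities named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the trace-ratio form of Fisher's discriminant ratio, tr(S_B) / tr(S_W), which may differ from the exact definition used in the paper, and it illustrates logit adjustment in its explicit post-hoc form (subtracting a scaled log class prior), whereas the abstract describes an implicit effect arising from training. The function names `fisher_discriminant_ratio` and `adjust_logits` and the parameter `tau` are hypothetical.

```python
import math
from collections import defaultdict


def fisher_discriminant_ratio(features, labels):
    """Trace-ratio form tr(S_B) / tr(S_W); an assumed definition for illustration.

    features: list of equal-length feature vectors (lists of floats).
    labels:   list of class labels, one per feature vector.
    Raises ZeroDivisionError if every sample equals its class mean.
    """
    dim = len(features[0])
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)

    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    global_mean = mean(features)
    between = 0.0  # tr(S_B): class-size-weighted spread of class means
    within = 0.0   # tr(S_W): spread of samples around their own class mean
    for rows in by_class.values():
        mu = mean(rows)
        between += len(rows) * sq_dist(mu, global_mean)
        within += sum(sq_dist(x, mu) for x in rows)
    return between / within


def adjust_logits(logits, class_counts, tau=1.0):
    """Explicit post-hoc logit adjustment: subtract tau * log(class prior).

    Classes with fewer training samples receive a larger boost.
    """
    total = sum(class_counts)
    return [z - tau * math.log(n / total) for z, n in zip(logits, class_counts)]
```

For example, `fisher_discriminant_ratio([[0, 0], [0, 2], [4, 0], [4, 2]], [0, 0, 1, 1])` evaluates to 4.0, and the ratio grows as samples tighten around their class means, which is the direction of change the abstract attributes to weight decay combined with cross-entropy loss. With equal logits and class counts of 90 and 10, `adjust_logits` raises the tail-class logit above the head-class logit.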