Recognition on long-tailed data, in which the number of samples per class is heavily skewed, has gained importance because the per-class sample counts in a dataset generally follow an exponential distribution unless they are intentionally adjusted. Various methods have been devised to address this problem. Recently, weight balancing, which combines well-known classical regularization techniques with two-stage training, has been proposed. Despite its simplicity, it is known to achieve high performance compared with existing methods that take a variety of approaches. However, why this method is effective for long-tailed data is not well understood. In this study, we analyze weight balancing by focusing on neural collapse and the cone effect at each training stage, and find that it can be decomposed into two effects: (1) an increase in Fisher's discriminant ratio of the feature extractor, caused by weight decay and cross-entropy loss, and (2) implicit logit adjustment, caused by weight decay and class-balanced loss. Our analysis enables the training method to be further simplified by reducing the number of training stages to one while increasing accuracy.
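The two quantities named in the abstract can be made concrete with a small sketch. This is an illustration under stated assumptions, not the paper's implementation. `fisher_ratio` uses a simplified scalar form of Fisher's discriminant ratio (trace of between-class scatter divided by trace of within-class scatter). `adjust_logits` uses the common explicit additive form of logit adjustment, which subtracts `tau * log(prior)` from each logit, whereas the abstract describes an adjustment that arises implicitly. The function names and the `tau` parameter are hypothetical.

```python
import math


def fisher_ratio(features, labels):
    """Simplified scalar Fisher discriminant ratio: tr(S_B) / tr(S_W).

    features: list of equal-length lists of floats (one row per sample).
    labels:   list of class ids, same length as features.
    Assumes the within-class scatter is non-zero.
    """
    dim = len(features[0])
    n = len(features)
    global_mean = [sum(f[d] for f in features) / n for d in range(dim)]

    # Group feature rows by class label.
    by_class = {}
    for f, y in zip(features, labels):
        by_class.setdefault(y, []).append(f)

    between = 0.0  # sample-weighted squared distance of class means to global mean
    within = 0.0   # squared distance of samples to their own class mean
    for rows in by_class.values():
        mean = [sum(r[d] for r in rows) / len(rows) for d in range(dim)]
        between += len(rows) * sum((m - g) ** 2 for m, g in zip(mean, global_mean))
        within += sum((r[d] - mean[d]) ** 2 for r in rows for d in range(dim))
    return between / within


def adjust_logits(logits, class_counts, tau=1.0):
    """Additive logit adjustment (assumed form): z_k - tau * log(prior_k)."""
    total = sum(class_counts)
    return [z - tau * math.log(c / total) for z, c in zip(logits, class_counts)]


# Two well-separated 1-D classes give a large ratio.
ratio = fisher_ratio([[0.0], [1.0], [10.0], [11.0]], [0, 0, 1, 1])

# With equal raw logits, the rare class (10 of 100 samples) gets the larger offset.
adjusted = adjust_logits([0.0, 0.0], [90, 10])
```

A larger ratio means class features are more tightly clustered relative to the distance between class means, which is the sense in which the feature extractor improves. The adjustment shifts decisions toward rare classes by an amount that grows as the class prior shrinks.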