In real-world scenarios, the number of training samples across classes usually follows a long-tailed distribution. A conventionally trained network may therefore perform unexpectedly worse on the rare classes than on the frequent ones. Most previous works attempt to rectify the network bias at the data level or the classifier level. Differently, in this paper, we identify that the bias towards the frequent classes may be encoded into features; that is, the rare-specific features, which play a key role in discriminating the rare classes, are much weaker than the frequent-specific features. Based on this observation, we introduce a simple yet effective approach: normalizing the parameters of the Batch Normalization (BN) layer to explicitly rectify the feature bias. To this end, we represent the Weight/Bias parameters of a BN layer as a vector, normalize it into a unit vector, and multiply the unit vector by a scalar learnable parameter. By decoupling the direction and magnitude of the BN parameters during learning, the Weight/Bias exhibits a more balanced distribution, and thus the strength of the features becomes more even. Extensive experiments on various long-tailed recognition benchmarks (i.e., CIFAR-10/100-LT, ImageNet-LT, and iNaturalist 2018) show that our method outperforms previous state-of-the-art methods remarkably. The code and checkpoints are available at https://github.com/yuxiangbao/NBN.
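The parameterization described above can be sketched as follows. This is a minimal NumPy illustration of the general idea, not the authors' released implementation; the function name `normalized_bn_param` and the variable names are hypothetical, and a real training setup would apply this re-parameterization to the per-channel BN Weight/Bias tensors inside the network.

```python
import numpy as np

def normalized_bn_param(v, s):
    """Decouple direction and magnitude of a BN parameter vector.

    v : raw, unconstrained per-channel parameter vector
        (stands in for the BN Weight or Bias; hypothetical name).
    s : learnable scalar controlling the overall magnitude.

    Returns the effective parameter s * v / ||v||_2, so that the
    vector's direction is learned separately from its length.
    """
    return s * v / np.linalg.norm(v)

# A skewed raw vector: a few channels dominate in magnitude.
v = np.array([4.0, 1.0, 0.5, 0.25])
s = 2.0
w = normalized_bn_param(v, s)
# The effective parameter keeps the direction of v, but its overall
# norm is pinned to s, so no single raw entry can inflate it.
```

Pinning the norm to a single learnable scalar is what yields the more balanced Weight/Bias distribution the abstract refers to: relative channel strengths are still learnable through the direction of `v`, but the total magnitude is shared.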