In this paper, we show that differences in the $l_2$ norms of sample features can hinder batch normalization from producing more discriminative inter-class features and more compact intra-class features. To address this issue, we propose an intuitive but effective method that equalizes the $l_2$ norms of sample features. Concretely, we $l_2$-normalize each sample feature before feeding it into batch normalization, so that all features have the same magnitude. Because the proposed method combines $l_2$ normalization with batch normalization, we name it $L_2$BN. $L_2$BN strengthens the compactness of intra-class features and enlarges the discrepancy between inter-class features. It is easy to implement and requires no additional parameters or hyper-parameters, so it can serve as a basic normalization method for neural networks. We evaluate $L_2$BN through extensive experiments with various models on image classification and acoustic scene classification tasks. The results show that $L_2$BN improves the generalization ability of these models and yields considerable performance improvements.
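The two-step idea described above (per-sample $l_2$ normalization followed by batch normalization) can be illustrated with a minimal, framework-free sketch. It assumes that each sample feature is a flat vector and that batch normalization runs in training mode without its learnable scale and shift. The function names and the `eps` values are illustrative assumptions, not taken from the paper.

```python
import math

def l2_normalize(x, eps=1e-12):
    """Scale one sample's feature vector to unit l2 norm (eps is an assumed guard)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / max(norm, eps) for v in x]

def batch_norm(batch, eps=1e-5):
    """Training-mode batch normalization per feature dimension.

    The learnable scale and shift are omitted here for brevity.
    """
    n = len(batch)
    dim = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dim)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dim)
    ]
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dim)]
        for row in batch
    ]

def l2bn(batch):
    """Sketch of L2BN: equalize per-sample norms, then apply batch normalization."""
    return batch_norm([l2_normalize(x) for x in batch])

# The first two samples point in the same direction but differ in magnitude.
batch = [[3.0, 4.0], [30.0, 40.0], [-1.0, 2.0]]
normalized = [l2_normalize(x) for x in batch]
out = l2bn(batch)
```

In this toy batch, the first two samples differ only in scale, so after the $l_2$ step they coincide and batch normalization treats them identically. With plain batch normalization, the larger-norm sample would dominate the batch statistics.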