Batch Normalization (BN) has become an essential technique in contemporary neural network design because it improves training stability. Specifically, BN uses centering and scaling operations to standardize features along the batch dimension and then applies a learnable affine transformation to recover the features. Although standard BN improves the training and convergence of deep neural networks, it still exhibits inherent limitations in certain cases, and existing enhancements to BN typically address only isolated aspects of its mechanism. In this work, we critically examine BN from a feature perspective and identify feature condensation during BN as a factor that is detrimental to test performance. To tackle this problem, we propose a two-stage unified framework called Unified Batch Normalization (UBN). In the first stage, we employ a straightforward feature condensation threshold to mitigate condensation effects, thereby preventing improper updates of statistical norms. In the second stage, we unify various normalization variants to boost each component of BN. Our experimental results show that UBN significantly enhances performance across different visual backbones and vision tasks and expedites network training convergence, particularly in the early training stages. In particular, our method improves accuracy by about 3% on ImageNet classification and mean average precision by about 4% on both object detection and instance segmentation on the COCO dataset, demonstrating the effectiveness of our approach in real-world scenarios.
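As a point of reference for the centering, scaling, and affine steps described in the abstract, the following is a minimal NumPy sketch of a standard BN forward pass in training mode. It illustrates only conventional BN, not UBN: the abstract does not specify the feature condensation threshold or how the normalization variants are unified, so neither is shown. The function name `batch_norm_forward`, the `eps` value, and the toy input shape are assumptions made for illustration.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Standard batch normalization over the batch axis (axis 0), training mode.

    x:     array of shape (batch, features)
    gamma: per-feature scale of the affine transformation, shape (features,)
    beta:  per-feature shift of the affine transformation, shape (features,)
    eps:   small constant for numerical stability (assumed value)
    """
    mean = x.mean(axis=0)                      # centering statistic per feature
    var = x.var(axis=0)                        # scaling statistic (biased variance)
    x_hat = (x - mean) / np.sqrt(var + eps)    # standardize along the batch dimension
    return gamma * x_hat + beta                # affine transformation recovers features

# Toy input: 64 samples, 8 features, deliberately off-center and scaled.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(64, 8))
gamma = np.ones(8)   # identity affine parameters for the demonstration
beta = np.zeros(8)
y = batch_norm_forward(x, gamma, beta)
```

With identity affine parameters, each output feature has approximately zero mean and unit variance across the batch; learned values of `gamma` and `beta` would let the network rescale and shift the standardized features.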