Normalization techniques are widely used in deep learning because they enable higher learning rates and reduce sensitivity to initialization. However, the effectiveness of popular normalization techniques is typically limited to specific domains. Batch Normalization (BN) computes the mean and variance along the (N, H, W) dimensions, and Layer Normalization (LN) computes them along the (C, H, W) dimensions, where N, C, H, and W are the batch, channel, spatial height, and spatial width dimensions, respectively. This paper presents a novel normalization technique called Batch Channel Normalization (BCN). To exploit both channel and batch dependence, and to adaptively combine the advantages of BN and LN for a specific dataset or task, BCN separately normalizes inputs along the (N, H, W) and (C, H, W) axes, then combines the normalized outputs using adaptive parameters. As a basic block, BCN can be easily integrated into existing models for various computer vision applications. Empirical results show that the proposed technique can be seamlessly applied to various versions of CNN or Vision Transformer architectures. The code is publicly available at https://github.com/AfifaKhaled/BatchChannel-Normalization
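The two normalization axes and the combination step described above can be illustrated with a minimal NumPy sketch. It is not taken from the linked repository. The function name `batch_channel_norm`, the mixing weight `alpha`, the convex-combination form, and the scalar `gamma` and `beta` are all assumptions made for illustration; in a trained model the mixing weight and affine parameters would be learned rather than fixed.

```python
import numpy as np


def batch_channel_norm(x, alpha=0.5, gamma=1.0, beta=0.0, eps=1e-5):
    """Illustrative BCN-style forward pass on an (N, C, H, W) array.

    alpha, gamma and beta are hypothetical stand-ins for the adaptive
    parameters mentioned in the abstract; they are fixed here.
    """
    # BN-style statistics: one mean/variance per channel, over (N, H, W).
    mu_b = x.mean(axis=(0, 2, 3), keepdims=True)
    var_b = x.var(axis=(0, 2, 3), keepdims=True)
    x_b = (x - mu_b) / np.sqrt(var_b + eps)

    # LN-style statistics: one mean/variance per sample, over (C, H, W).
    mu_l = x.mean(axis=(1, 2, 3), keepdims=True)
    var_l = x.var(axis=(1, 2, 3), keepdims=True)
    x_l = (x - mu_l) / np.sqrt(var_l + eps)

    # Assumed combination rule: a convex mix of the two normalized outputs,
    # followed by a scalar affine transform.
    y = alpha * x_b + (1.0 - alpha) * x_l
    return gamma * y + beta


# Usage: a random batch of 4 samples, 3 channels, 5x5 spatial size.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(4, 3, 5, 5))
y = batch_channel_norm(x, alpha=0.5)
print(y.shape)
```

Setting `alpha=1.0` reduces the sketch to the BN-style branch alone, and `alpha=0.0` to the LN-style branch alone, which makes the role of the mixing weight easy to inspect.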